Feature Reinforcement Learning: PhiMDP Agents
Phuong Nguyen (SoCS CECS)
CS HDR MONITORING, AI Research Group
DATE: 2010-04-16
TIME: 13:30:00 - 14:00:00
LOCATION: RSISE Seminar Room, ground floor, building 115, cnr. North and Daley Roads, ANU
ABSTRACT:
Reinforcement Learning (RL) investigates the problem in which an agent learns to achieve some goal by interacting with an environment: the agent takes actions, and receives observations and rewards from the environment. If the RL problem admits a Markov decision process (MDP), that is, all information useful for the agent's action choices is subsumed in the current observation or state, then it is generally solvable under certain assumptions. However, most real-world RL problems are non-Markovian, and efficient solutions for them are non-trivial, if not impossible, to find. The feature Markov decision process (PhiMDP) framework develops an automatic learning procedure for a map Phi that reduces real-world RL problems to the class of exact or approximate MDPs, where the task of determining smart and efficient actions is much easier.
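The role of the map Phi can be illustrated with a toy example. The sketch below is not the speaker's implementation; it assumes a simplified coding-style cost in the spirit of the PhiMDP criterion (code length of the induced state sequence plus code length of rewards given states and actions), applied to a hypothetical environment whose reward depends on the last two observations. The history format and the candidate maps `phi_last1` and `phi_last2` are illustrative assumptions.

```python
import math
import random
from collections import Counter, defaultdict

def code_length(symbols):
    # Empirical-entropy estimate (in bits) of coding a symbol sequence.
    counts = Counter(symbols)
    n = len(symbols)
    return -sum(c * math.log2(c / n) for c in counts.values())

def markov_code_length(states):
    # Code length of a state sequence under a first-order Markov model,
    # estimated from empirical transition frequencies.
    groups = defaultdict(list)
    for prev, nxt in zip(states, states[1:]):
        groups[prev].append(nxt)
    return sum(code_length(nxts) for nxts in groups.values())

def phi_cost(phi, history):
    # Toy PhiMDP-style cost: code length of the induced state sequence
    # plus code length of rewards given (state, action) contexts.
    # history: list of (observation_window, action, reward) -- a hypothetical format.
    states = [phi(w) for w, _, _ in history]
    cost = markov_code_length(states)
    by_context = defaultdict(list)
    for s, (_, a, r) in zip(states, history):
        by_context[(s, a)].append(r)
    return cost + sum(code_length(rs) for rs in by_context.values())

# Toy non-Markov environment: the reward depends on the last TWO observations.
random.seed(0)
obs = [random.randint(0, 1) for _ in range(500)]
history = [((obs[t - 1], obs[t]), 0, int(obs[t - 1] == obs[t]))
           for t in range(1, len(obs))]

phi_last1 = lambda w: w[-1]  # state = most recent observation (loses information)
phi_last2 = lambda w: w      # state = last two observations (Markov for this env)

print(phi_cost(phi_last1, history))  # higher: rewards look random given the state
print(phi_cost(phi_last2, history))  # lower: rewards are deterministic given the state
```

Under this criterion the richer map `phi_last2` wins: it pays slightly more to code a larger state space, but the rewards become a deterministic function of the state, so their coding cost vanishes. This is the kind of trade-off a PhiMDP cost function is meant to resolve automatically.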
The main objective of this presentation is to briefly review results obtained from experiments on simulated PhiMDP agents. These experiments are designed to illustrate different aspects of PhiMDP cost functions. It is shown that in almost all cases the PhiMDP cost functions are capable of correctly identifying the optimal state set for predicting rewards, which is potentially beneficial for the agent's ultimate goal of learning smart policies. In addition, some limitations of PhiMDP agents under the current cost criteria are also exhibited.
BIO:
PhD student in Computer Science. http://nmphuong.wordpress.com


