An empirical evaluation of PhiMDP Agents
Mayank Daswani (ANU)
NICTA SML SEMINAR
DATE: 2010-11-18
TIME: 11:00:00 - 12:30:00
LOCATION: CSIT Seminar Room, N101
ABSTRACT:
The ideal agent is one that can behave intelligently in a wide range of environments. PhiMDP is an approach aimed at building an agent that can automatically extract useful state representations from its history of actions, observations and rewards. The agent can then use well-developed reinforcement learning techniques for Markov Decision Processes. This project is the first implementation and evaluation of PhiMDP in the active setting; various design considerations and implementation details are presented. The evaluation focuses on issues of exploration vs. exploitation and compares the performance of model-based and model-free PhiMDP agents on several problems. A small comparison with U-Tree is also performed. One main issue unique to PhiMDP is identified: optimal policies on optimal maps may be unstable in some environments. Steps toward rectifying this issue are considered.