Scaling Feature Reinforcement Learning
Mayank Daswani (ANU)
CS HDR MONITORING, AI group
DATE: 2012-11-14
TIME: 11:30:00 - 12:00:00
LOCATION: RSISE Seminar Room, ground floor, building 115, cnr. North and Daley Roads, ANU
ABSTRACT:
A reinforcement learning agent learns via trial-and-error interactions with its environment. Traditional reinforcement learning assumes that the environment is given to the agent as a Markov Decision Process (MDP). The purpose of feature reinforcement learning (FRL) is to automatically extract this MDP from the agent's raw observation-reward-action history. This talk will explain the difficulties and benefits of using dynamic Bayesian networks in model-based FRL. I will then present a model-free alternative that is potentially scalable via function approximation.
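
For readers unfamiliar with the setup, the following Python sketch illustrates the general idea behind the model-free variant: map the raw observation-reward-action history to a compact state with a feature map and run standard reinforcement learning on top of it. It is not the speaker's method; the environment, the hand-coded feature map phi, and all hyperparameters are invented for illustration, the feature map is fixed here rather than learned, and a lookup table stands in where function approximation would be used for scalability.

import random
from collections import defaultdict

class MemoryEnv:
    # Toy partially observable environment (invented for illustration): the
    # reward for an action depends on the observation seen one step earlier,
    # so the most recent observation alone is not a sufficient state.
    actions = [0, 1]

    def reset(self):
        self.t = 0
        self.prev = None                    # observation from the previous step
        self.cur = random.randint(0, 1)
        return self.cur

    def step(self, action):
        reward = 1.0 if action == self.prev else 0.0
        self.prev, self.cur = self.cur, random.randint(0, 1)
        self.t += 1
        return self.cur, reward, self.t >= 20   # observation, reward, done

def phi(history, k=2):
    # Hypothetical feature map: summarise the raw history by its last k
    # observations. In FRL such a map would be found automatically.
    return tuple(o for (o, r, a) in history[-k:])

def q_learning(env, episodes=2000, alpha=0.1, gamma=0.9, eps=0.1):
    # Ordinary tabular Q-learning run on the abstracted states produced by phi.
    Q = defaultdict(float)                  # Q[(state, action)]
    for _ in range(episodes):
        history = [(env.reset(), 0.0, None)]
        s, done = phi(history), False
        while not done:
            # epsilon-greedy action selection over the abstracted state
            if random.random() < eps:
                a = random.choice(env.actions)
            else:
                a = max(env.actions, key=lambda a_: Q[(s, a_)])
            obs, r, done = env.step(a)
            history.append((obs, r, a))
            s2 = phi(history)
            target = r if done else r + gamma * max(Q[(s2, a_)] for a_ in env.actions)
            Q[(s, a)] += alpha * (target - Q[(s, a)])
            s = s2
    return Q

if __name__ == "__main__":
    Q = q_learning(MemoryEnv())
    print(sorted(Q.items())[:4])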
