Conditional Random Fields for Multi-Agent Reinforcement Learning
Phillip Zhang (ANU, College of Engineering and Computer Science)
CSL PHD MONITORING
DATE: 2007-03-30
TIME: 15:00:00 - 15:30:00
LOCATION: RSISE Seminar Room, ground floor, building 115, cnr. North and Daley Roads, ANU
ABSTRACT:
Conditional random fields (CRFs) are state-of-the-art graphical models for modeling the probability of labels given the observations. They have traditionally been trained either in batch or online mode. Underlying all CRFs is the assumption that, conditioned on the training data, the labels are independent and identically distributed (iid). In this paper we explore the use of CRFs in a class of temporal learning algorithms, namely policy-gradient reinforcement learning (RL). In this setting the labels are no longer iid: they are actions that update the environment and affect the next observation. From an RL point of view, CRFs provide a natural way to model joint actions in a decentralized Markov decision process. They define how agents can communicate with each other to choose the optimal joint action. Our experiments cover a synthetic network alignment problem, a distributed sensor network, and road traffic control. In these experiments the CRF approach clearly outperforms RL methods that do not model the proper joint policy.
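As a rough illustration of the idea in the abstract, the sketch below treats a two-agent CRF as a joint policy: each joint action is scored by per-agent (node) potentials plus a pairwise (edge) potential, and the log-probability gradient that a policy-gradient method would use is the observed edge feature minus its expectation. This is a minimal sketch under stated assumptions, not the speaker's implementation: the function names, the two-agent setup, the linear potentials, and the brute-force enumeration of joint actions are all hypothetical choices made for illustration.

```python
import itertools
import numpy as np

def joint_policy(node_w, edge_w, obs):
    """Joint-action distribution of a (hypothetical) two-agent CRF policy.

    node_w[i] has shape (n_actions, obs_dim): per-agent potentials.
    edge_w has shape (n_actions, n_actions): coupling between the agents.
    obs[i] is the observation vector of agent i.
    """
    n_actions = edge_w.shape[0]
    # Enumerate joint actions in row-major order: (0,0), (0,1), (1,0), ...
    actions = list(itertools.product(range(n_actions), repeat=2))
    scores = np.array([
        node_w[0][a0] @ obs[0] + node_w[1][a1] @ obs[1] + edge_w[a0, a1]
        for a0, a1 in actions
    ])
    scores -= scores.max()          # subtract the max for numerical stability
    probs = np.exp(scores)
    probs /= probs.sum()            # normalise over all joint actions
    return actions, probs

def edge_log_gradient(probs, n_actions, chosen):
    """Gradient of log P(chosen | obs) with respect to edge_w.

    For a log-linear model this is the observed feature (an indicator on the
    chosen joint action) minus the expected feature under the policy.
    """
    grad = -probs.reshape(n_actions, n_actions).copy()
    grad[chosen] += 1.0
    return grad

# Toy example: zero node potentials, edge potential favouring agreement.
node_w = [np.zeros((2, 3)), np.zeros((2, 3))]
edge_w = np.array([[2.0, 0.0], [0.0, 2.0]])
obs = [np.ones(3), np.ones(3)]

actions, probs = joint_policy(node_w, edge_w, obs)
chosen = (0, 0)
grad = edge_log_gradient(probs, 2, chosen)

# One REINFORCE-style step with an assumed reward and step size.
reward, step_size = 1.0, 0.1
new_edge_w = edge_w + step_size * reward * grad
```

Enumerating every joint action only scales to a handful of agents; for larger graphs the normaliser and expectations would have to come from an inference routine such as belief propagation, which this sketch does not attempt.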
BIO:
