Lex Weaver

Publications

Some of the following papers have been deposited in CoRR, the Computing Research Repository. Postscript and PDF versions of these papers can be obtained from the CoRR website.

Lex Weaver and Nigel Tao. The Variance Minimizing Constant Reward Baseline for Gradient-Based Reinforcement Learning. Technical report, Department of Computer Science, Australian National University, May 2001 (being updated). [PDF][Postscript]

Lex Weaver and Nigel Tao. The Optimal Reward Baseline for Gradient-Based Reinforcement Learning. In Uncertainty in Artificial Intelligence: Proceedings of the Seventeenth Conference (2001), University of Washington, Seattle WASHINGTON, August 2-5 2001, Morgan Kaufmann Publishers, San Francisco CALIFORNIA, ISBN 1-55860-800-1, pages 538-545. [PDF] [Postscript]

Lex Weaver and Jonathan Baxter. STD(lambda): learning state differences with TD(lambda). In Proceedings of the Post-graduate ADFA Conference on Computer Science 2001 (PACCS'01), ADFA Monographs in Computer Science Series (1), ISBN 0-7317-0507-6, Canberra ACT, July 14 2001, pages 63-70. [PDF] [Postscript]

Nigel Tao, Jonathan Baxter, and Lex Weaver. A Multi-Agent, Policy-Gradient approach to Network Routing. In ICML 2001: 18th International Conference on Machine Learning, ISBN 1558607781, Morgan Kaufmann Publishers, Williamstown MA, July 2001. [PDF] [Postscript]

Jonathan Baxter, Peter Bartlett, and Lex Weaver. Experiments with Infinite-Horizon, Policy-Gradient Estimation. In JAIR (Journal of Artificial Intelligence Research), November 2001, Vol. 15, pages 351-381, ISSN 1076-9757. [PDF][Postscript]

Jonathan Baxter, Andrew Tridgell, and Lex Weaver. Learning to Play Chess Using Temporal-Differences. In MACHINE LEARNING, ISSN 0885-6125, Vol. 40 No. 3, September 2000, pages 243-263. [PDF][Postscript]

Jonathan Baxter, Lex Weaver, and Peter Bartlett. Direct Gradient-Based Reinforcement Learning: II. Gradient Ascent Algorithms and Experiments. Technical report, CSL, Australian National University, 1999. [PDF][Postscript]

Lex Weaver and Jonathan Baxter. Reinforcement Learning From State and Temporal Differences. Technical report, Department of Computer Science, Australian National University, May 1999 (updated September 1999). [PDF][Postscript]

Jonathan Baxter, Andrew Tridgell, and Lex Weaver. KnightCap: A chess program that learns by combining TD(lambda) with game-tree search. In MACHINE LEARNING Proceedings of the Fifteenth International Conference (ICML '98), ISBN 1-55860-556-8, ISSN 1049-1910, Madison WISCONSIN, July 24-27 1998, pages 28-36. [CoRR: cs.LG/9901002]

J. Baxter, A. Tridgell, and L. Weaver. Experiments in Parameter Learning using Temporal Differences. In the ICCA JOURNAL (Journal of the International Computer Chess Association), ISSN 0920-234X, Vol. 21 No. 2, June 1998, pages 84-99. (The published version has a different format to the postscript and PDF versions provided here, but the content is the same.)[PDF][Postscript]

Jonathan Baxter, Andrew Tridgell, and Lex Weaver. TDLeaf(lambda): Combining Temporal Difference Learning with Game-Tree Search. Australian Journal of Intelligent Information Processing Systems, ISSN 1321-2133, Vol. 5 No. 1, Autumn 1998, pages 39-43 (invited paper).
Also in the Proceedings of the Ninth Australian Conference on Neural Networks (ACNN'98), Brisbane QLD, February 1998, pages 168-172. [PDF][Postscript][CoRR: cs.LG/9901001]

J. Baxter, A. Tridgell, and L. Weaver. KnightCap: A chess program that learns by combining TD(lambda) with minimax search. Technical Report, Department of Systems Engineering, Australian National University, November 1997, 16 pages. [PDF][Postscript]

L. Weaver and T. Bossomaier. Evolution of Neural Networks to Play the Game of Dots-and-Boxes. In Artificial Life V: Poster Presentations, May 16-18 1996, pages 43-50. [PDF][Postscript][CoRR: cs.NE/9809111]

Lex Weaver. Design and Evaluation of Mechanisms for a Multicomputer Object Store. Unpublished Honours thesis. November 1994, 134 pages. [PDF][Postscript][DVI][CoRR: cs.DC/0004010]

Lex Weaver and Chris Johnson. Pre-fetching tree-structured data in distributed memory. In Proceedings of the Third Fujitsu Parallel Computing Workshop, pages P1-L-1 to P1-L11, Kawasaki, Japan, November 1994. Fujitsu Laboratories Ltd. [PDF][Postscript][CoRR: cs.DC/9810002]

Lex Weaver and Andrew Lynes. Sorting Integers on the AP1000. Unpublished project report. May 1994, 23 pages. [PDF][Postscript][CoRR: cs.DC/0004013]

Seminars & Presentations

2001 July 14, STD(lambda): learning state differences with TD(lambda), presentation at the Post-graduate ADFA Conference on Computer Science 2001 (PACCS'01), Canberra ACT. [Microsoft Powerpoint][PDF][Postscript]

1998 August 5, Looking into TD(lambda) with Function Approximation, seminar given to Adaptive Network Laboratory, Department of Computer Science, University of Massachusetts, Amherst, MA, USA. [Microsoft Powerpoint (doesn't include graphs)]

1998 February 13, TDLeaf(lambda): Combining Temporal Difference Learning with Game-Tree Search, presentation at the Ninth Australian Conference on Neural Networks (ACNN '98), Brisbane QLD. [Microsoft Powerpoint]

1996 March 22, Evolving Neural Networks with Strategic Intelligence, invited seminar given to the School of Information Technology at Charles Sturt University (CSU) - Bathurst Campus. [Postscript]


 



[Comments: Lex.Weaver@cs.anu.edu.au]
December 8, 2001