Peter Bailey
What I do
I am working on the WAR
project as a researcher. We are doing lots of fun and exciting
things at the nexus between information retrieval and Web search.
I used to work in this area while
trying to avoid doing my PhD a few years ago. Now I get to do it
full time.
The other people in the project are
Dave Hawking,
Nick Craswell, and
Francis Crimmins.
Over the last year, we've done a number of things, including
but not limited to the following:
- Developed a demonstration of our research work in Web search
technology. We are currently providing search services for the whole of the
ANU. Check this out at
http://search.anu.edu.au/. The current name is P@NOPTIC search,
though it used to be called S@NITY search. My contributions have
been mainly on the front end UI, the crawler (a parallel
version of GNU's wget), and integration with the PADRE
indexer/query processor. Since the IP is all owned by ACSys,
there are attempts to commercialise the software. We are soon to
be providing Web search services for CSIRO's CMIS.
- Organised the Web track for TREC,
which we are doing again this year. We also participate in TREC
each year, both in the Web track and others.
- Wrote several papers, some of which have been published and
others submitted to conferences.
- Supervised an Honours student investigating dynamic spidering
techniques.
- Ran the Document Technologies course for the Honours
students. As a project, we got them to build various
components for an information retrieval system, the goal being
to be able to plug and play different document scanners,
indexers and query processors.
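The plug-and-play goal of that course project can be sketched in a few lines. This is an illustration only, not the interfaces the students actually built: the names `Scanner`, `Indexer`, `WhitespaceScanner`, `InvertedIndex`, `BooleanAndQueryProcessor` and `build_system` are all assumptions made for this example, and the toy components stand in for real ones.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Protocol, Set


class Scanner(Protocol):
    """Hypothetical interface: turns raw document text into terms."""

    def scan(self, text: str) -> List[str]: ...


class Indexer(Protocol):
    """Hypothetical interface: stores and looks up term postings."""

    def add(self, doc_id: str, terms: Iterable[str]) -> None: ...

    def postings(self, term: str) -> Set[str]: ...


class WhitespaceScanner:
    """Toy scanner: lower-cases the text and splits on whitespace."""

    def scan(self, text: str) -> List[str]:
        return text.lower().split()


class InvertedIndex:
    """Toy in-memory inverted index mapping each term to document ids."""

    def __init__(self) -> None:
        self._postings: Dict[str, Set[str]] = defaultdict(set)

    def add(self, doc_id: str, terms: Iterable[str]) -> None:
        for term in terms:
            self._postings[term].add(doc_id)

    def postings(self, term: str) -> Set[str]:
        # Return a copy so callers cannot modify the index by accident.
        return set(self._postings.get(term, set()))


class BooleanAndQueryProcessor:
    """Toy query processor: returns documents containing every query term."""

    def __init__(self, scanner: Scanner, indexer: Indexer) -> None:
        self.scanner = scanner
        self.indexer = indexer

    def search(self, query: str) -> List[str]:
        terms = self.scanner.scan(query)
        if not terms:
            return []
        result = self.indexer.postings(terms[0])
        for term in terms[1:]:
            result &= self.indexer.postings(term)
        return sorted(result)


def build_system(docs: Dict[str, str], scanner: Scanner,
                 indexer: Indexer) -> BooleanAndQueryProcessor:
    """Index every document, then wire the same parts into a query processor."""
    for doc_id, text in docs.items():
        indexer.add(doc_id, scanner.scan(text))
    return BooleanAndQueryProcessor(scanner, indexer)
```

Because each stage only depends on the small interface of the stage before it, a different scanner or indexer can be passed to `build_system` without touching the query processor.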
This year there is more of the same, and some new stuff as well. I'm
particularly keen to spend some solid time software engineering a new,
robust and configurable Web crawler. We'd like to do this in Java,
both to learn the language and because it should allow us to handle
distribution and platform-independence in a nice fashion.
We are also committed to building a new Web test collection of 10 GB of
Web data for this year's Web track in TREC-9. There are some
interesting challenges in constructing this, and it has to be done by
March 2000.
Prior to working with the WAR project, I spent 18 months doing
some very real-world programming.
In search of industry experience after years at ANU, in 1997 I went
and worked as a software engineer in Sydney for
Object Technology International up until
September 1998. Our team of four produced a product called Server
Smalltalk, which comes as part of IBM's
VisualAge for
Smalltalk v4.5. I worked a lot on the distributed garbage
collection and CORBA/IIOP components.
Research interests ...
1998-present
My main research interests presently are in building better retrieval
systems for real world data. I also like to think about the nature
of the Web in general, and how we can make more use of its structure
to do a better job of retrieval. We've written a paper examining
issues around what doesn't get found on the Web. We are currently
deciding where to submit it.
We've also submitted papers about our experiences with
building P@NOPTIC and some of the implications to SIGIR'2000 and
Advances in Digital Libraries. Some of Nick's particular interests
in server selection formed the basis of a paper submitted to the Digital
Libraries conference as well.
1997-1998
I learned a whole lot about object oriented computing while at OTI,
but there was little time to pursue individual research interests.
I also developed a healthy interest in software engineering
practices, rapid application development, guerrilla programming
techniques, and distributed object
technologies.
pre-1997
I finished my PhD in the
Department of Computer Science,
The Australian National University,
in January 1997. I worked on the
CAP project at the same time
as doing my PhD just to keep busy.
My PhD research interests were in the development of an extension
of ML called
paraML, which explored parallel programming languages and
models. The work involved looking at formal semantics for the extensions,
building implementations of data parallelism, algorithmic skeletons, and
object stores to illustrate the efficacy of paraML, and keeping up with
SML/NJ
compiler releases and developments with message passing systems
such as MPI
for the backend. I retain an interest in modern programming languages
and semantics, but more from a practical viewpoint these days.
I also spent happy times working with Dave Hawking on older versions of
PADRE during those years. Back then, we were very keen on
having entire text bases in main memory, since the AP1000 had 2 GB of
RAM. Nowadays, of course, you can have that in your desktop PC, and we
are trying to search hundreds of gigabytes of data.
A not completely up to date
list of publications is available.
Administrivia: Contact Details
My official DCS home page
http://cs.anu.edu.au/personnel/staffDisplayNames.html?lastName=bailey
Email
Peter.Bailey@cs.anu.edu.au
Airmail, Voicemail, Faxmail
- Peter Bailey
- Department of Computer Science
- Australian National University
- Canberra, ACT, 0200, AUSTRALIA
- Phone: +61 6 249 3460
- Fax: +61 6 249 0010
Last modified: Sun Feb 13 16:10:29 EST 2000