Modelling Retrieval Models in a Probabilistic Relational Algebra with a new Operator: The Relational Bayes
Thomas Roelleke (Queen Mary University of London)
CSIRO ICT
DATE: 2008-04-09
TIME: 14:00:00 - 15:00:00
LOCATION: CSIRO Seminar Room, S206, CSIT Building (Building 108)
ABSTRACT:
Work on probabilistic DB technology has led to results that feed into DB+IR technology. The talk will survey research on probabilistic DB and reasoning, including Cavallo/Pittarelli:VLDB:87 (theory of probabilistic DB), Fuhr/Roelleke:TOIS:97, Chaudhuri...Weikum:04/06 (probabilistic ranking of tuples), Dalvi/Suciu:04/05 (efficient processing of safe expressions), and our recent contribution, the relational Bayes.
The relational Bayes is a new probabilistic relational operator. Traditional database technology is based on the five basic operators of relational algebra (selection, projection, product, union, and difference). Probabilistic extensions based on those five capture only probability aggregation, not probability estimation. The Bayes operator embeds probability estimation conceptually into the probabilistic relational paradigm.
Through the relational Bayes, IR models such as tf-idf,
binary independence retrieval, and language modelling can
be expressed in probabilistic logical models. This will be
illustrated with examples and a system demo. The outlook
addresses optimisation, design and verification of
probabilistic logical programs, and applications such as
RSS retrieval.
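
The distinction between probability estimation and probability aggregation can be illustrated with a minimal sketch. It assumes that a Bayes-style operator normalises tuple weights by the total weight of the tuples sharing the same evidence-key values; the function names `relational_bayes` and `project_disjoint`, the `evidence_key` parameter, and the toy relation are hypothetical and are not taken from the talk or from HySpirit.

```python
from collections import defaultdict


def relational_bayes(relation, evidence_key):
    """Estimation step (assumed semantics): divide each tuple weight by the
    total weight of all tuples that share the same evidence-key values.

    relation: list of (weight, row) pairs, where row is a tuple of values.
    evidence_key: column indices that form the evidence key (hypothetical).
    """
    totals = defaultdict(float)
    for weight, row in relation:
        totals[tuple(row[i] for i in evidence_key)] += weight
    return [
        (weight / totals[tuple(row[i] for i in evidence_key)], row)
        for weight, row in relation
    ]


def project_disjoint(relation, columns):
    """Aggregation step: sum the probabilities of tuples that coincide on the
    projected columns, treating them as disjoint events."""
    summed = defaultdict(float)
    for prob, row in relation:
        summed[tuple(row[i] for i in columns)] += prob
    return dict(summed)


# Toy non-probabilistic relation term_doc(term, doc): one tuple per occurrence.
term_doc = [
    (1.0, ("sailing", "doc1")),
    (1.0, ("sailing", "doc1")),
    (1.0, ("boats", "doc1")),
    (1.0, ("sailing", "doc2")),
]

# Estimate occurrence-level probabilities with doc as the evidence key,
# then aggregate them into P(term | doc).
occurrence_probs = relational_bayes(term_doc, evidence_key=[1])
p_term_given_doc = project_disjoint(occurrence_probs, columns=[0, 1])
```

Here the first function derives probabilities from plain occurrence data, and the second only combines probabilities that already exist, which is the kind of aggregation the five traditional operators can express.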
BIO:
Thomas Roelleke is a researcher and lecturer at Queen Mary
University of London (QMUL). Thomas started his IT career
at Nixdorf Computer as a product manager for Unix/DB/4GL
technology. While studying Computer Science & Engineering
in Dortmund, he consulted for Nixdorf in Europe as a Unix/4GL/C
tutor. From 1994 to 1999, he was a researcher/lecturer at the
University of Dortmund. After his PhD, he was
appointed strategic IT consultant at comdirect bank
Germany, and he continued his research as a research
fellow at Queen Mary University of London.
Thomas' research contributions include:
- a probabilistic relational algebra (PRA, TOIS 97),
- a probabilistic object-oriented logic (POOL, SIGIR 96/98, chapter in the 2002 book Intelligent Exploration of the Web, PhD thesis "POOL: A Probabilistic Object-Oriented Logic", Shaker Verlag 99),
- the probabilistic inference engine HySpirit (EDBT 98, FQAS 98, BTW 97),
- the probability of being informative (SIGIR 03),
- an idf formulation of the probabilistic retrieval model (SIGIR 05),
- a general matrix framework for information retrieval (IP&M 06),
- a parallel derivation of probabilistic retrieval models (SIGIR 06), and
- Probabilistic SQL and the relational Bayes (TREC 05, VLDB Journal 08).
Thomas was a co-organiser of the DB+IR workshop at SIGIR
04, a member of the panel "DB and IR: Rethinking the great
divide" at SIGMOD 05, and he is the founder of a spin-out
to exploit innovative DB+IR technology for information
management. He serves as a reviewer for numerous
conferences and journals in the field.
