Segmented Topic Modeling
Lan Du (SoCS CECS)
CS HDR MONITORING AI Research Group
DATE: 2010-04-16
TIME: 09:00:00 - 09:30:00
LOCATION: RSISE Seminar Room, ground floor, building 115, cnr. North and Daley Roads, ANU
ABSTRACT:
In the machine learning community, topic modeling has become an active research area. It has been broadly applied to text analysis, information retrieval, computer vision, and other fields. Here, we present the Segmented Topic Model, a four-level generative Bayesian probabilistic model of a corpus based on the two-parameter Poisson-Dirichlet process (PDP). The basic idea is that documents are composed of meaningful segments, each of which has its own topic distribution, generated from the document's topic distribution by means of a PDP. We develop an efficient collapsed Gibbs sampling algorithm for approximate inference and parameter estimation. We will report results on document modeling, comparing the model with previous topic models, i.e., Latent Dirichlet Allocation (LDA) and Latent Dirichlet Co-clustering (LDCC).
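The generative idea in the abstract (each segment's topic distribution is drawn from a PDP whose base is the document's topic distribution) can be illustrated with a rough sketch. This is not the speaker's implementation. It assumes Dirichlet priors on the document topic distributions and on the topic-word distributions, which the abstract does not state, and it approximates the PDP draw with truncated stick-breaking. The function names, the truncation level, and all parameter values are hypothetical. The collapsed Gibbs sampler used for inference is not shown.

```python
import numpy as np


def sample_pdp_discrete(rng, a, b, base, truncation=200):
    """Approximate draw from PDP(a, b, base) over a finite set of topics.

    Uses truncated stick-breaking (an approximation chosen for illustration):
    V_k ~ Beta(1 - a, b + k * a), with atoms drawn i.i.d. from `base`.
    Requires 0 <= a < 1 and b > -a.
    """
    k = np.arange(1, truncation + 1)
    v = rng.beta(1.0 - a, b + a * k)
    # Stick left over before each break: prod_{j<k} (1 - V_j).
    remaining = np.concatenate(([1.0], np.cumprod(1.0 - v[:-1])))
    weights = v * remaining
    weights /= weights.sum()  # renormalize the truncated sticks
    atoms = rng.choice(len(base), size=truncation, p=base)
    # Sum the stick weights that landed on each topic.
    return np.bincount(atoms, weights=weights, minlength=len(base))


def generate_document(rng, phi, alpha, a, b, segment_lengths):
    """Generate one document as a list of segments of word ids.

    phi: (K, V) topic-word distributions; alpha: Dirichlet prior over K topics
    (assumed prior, not stated in the abstract).
    """
    num_topics, vocab_size = phi.shape
    mu = rng.dirichlet(alpha)  # document-level topic distribution
    segments = []
    for n in segment_lengths:
        nu = sample_pdp_discrete(rng, a, b, mu)  # segment-level distribution
        topics = rng.choice(num_topics, size=n, p=nu)
        words = np.array([rng.choice(vocab_size, p=phi[t]) for t in topics])
        segments.append(words)
    return segments


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    phi = rng.dirichlet(np.ones(50), size=5)  # 5 topics, 50-word vocabulary
    doc = generate_document(rng, phi, np.ones(5), a=0.5, b=1.0,
                            segment_lengths=[8, 12, 6])
    for i, seg in enumerate(doc):
        print(f"segment {i}: {seg.tolist()}")
```

In this sketch the discount `a` and strength `b` control how closely each segment's topic distribution follows the document's: a larger `b` keeps segments close to the document-level distribution, while a smaller `b` lets them vary more.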
BIO:
PhD student in Computer Science.
