The Australian National University

Student research opportunities

Unsupervised Part-of-Speech Training with Non-parametric Methods

Project Code: CECS_818

This project is available at the following levels:
Honours, Masters

Keywords:

non-parametric statistics, part of speech models, machine learning

Supervisor:

Dr Wray Buntine

Outline:

One of the classic problems in natural language processing is the unsupervised inference of parts of speech, i.e., attempting to infer word classes such as noun and verb without access to tagged text. Recent research on this problem by Blunsom and Cohn uses Bayesian non-parametric methods, specifically Pitman-Yor processes. The project is to implement and test these methods, and to try various extensions using techniques developed in our group.
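For orientation, here is a minimal sketch of the Chinese restaurant process view of a single Pitman-Yor process, which is the basic building block behind these methods. It is not Blunsom and Cohn's model, which is a hierarchical HMM. The function name, parameter names, and the simplifying restriction to a positive concentration are assumptions made for illustration only.

```python
import random


def sample_pitman_yor_crp(n_customers, discount, concentration, seed=0):
    """Seat customers one at a time and return the resulting table sizes.

    Hypothetical helper for illustration. With n customers at k tables,
    the next customer joins table j with probability
    (size_j - discount) / (n + concentration) and opens a new table with
    probability (concentration + discount * k) / (n + concentration).
    """
    if not 0.0 <= discount < 1.0:
        raise ValueError("discount must lie in [0, 1)")
    if concentration <= 0.0:
        # Simplifying assumption: the general condition is
        # concentration > -discount.
        raise ValueError("concentration must be positive in this sketch")

    rng = random.Random(seed)
    table_sizes = []
    for _ in range(n_customers):
        k = len(table_sizes)
        # Unnormalised weights: one per existing table, then the new table.
        weights = [size - discount for size in table_sizes]
        weights.append(concentration + discount * k)
        choice = rng.choices(range(k + 1), weights=weights)[0]
        if choice == k:
            table_sizes.append(1)
        else:
            table_sizes[choice] += 1
    return table_sizes


if __name__ == "__main__":
    sizes = sample_pitman_yor_crp(1000, discount=0.5, concentration=1.0)
    print(len(sizes), "tables; largest:", max(sizes))
```

A larger discount tends to produce more small tables, giving the heavy-tailed, power-law-like behaviour that makes Pitman-Yor processes a common choice for word frequencies. A Gibbs sampler for a tagging model would repeatedly remove and reseat customers using the same weights, combined with the likelihood terms of the model.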

Requirements/Prerequisites

Familiarity with Gibbs sampling. Good programming skills. Basic exposure to natural language processing.

Background Literature

"A Hierarchical Pitman-Yor Process HMM for Unsupervised Part of Speech Induction", Phil Blunsom and Trevor Cohn. ACL 2011, Portland, Oregon.

Links

Blunsom and Cohn's paper
My website with pointers to literature

Updated: 8 May 2013