Student research opportunities
Patent collections mining
Project Code: CECS_909
This project is available at the following levels:
Masters, PhD
Please note that this project is only for higher degree (postgraduate) applicants.
Keywords:
Natural Language Processing, Machine Learning, Patent Mining.
Supervisor:
Dr Gabriela Ferraro
Outline:
Several patent retrieval tools let users save query results into collections.
Moreover, users may run several queries using different criteria, keywords, etc. and
save the results in the same collection because they share some property in which the user
is interested. Although collections are personalized, they still require expert evaluation to
assess the relevance of the retrieved patents. This process is currently done manually,
which means that users have to read and analyze each document themselves. The research
questions underlying this problem are: How can we speed up the comprehension of large
text collections? How can we reduce manual human intervention? And to what extent is it
possible to simulate what a patent user does manually?
Goals of this project
The goal of this project is to develop and evaluate tools for assessing patent collections by
applying Natural Language Processing and Machine Learning techniques.
Requirements/Prerequisites
Good coding skills in Java.
Background Literature
Foundations of Statistical Natural Language Processing by Christopher D. Manning and Hinrich Schütze



