Automated Data Cleaning Via Logic

Research areas


Data is usually collected as tables, but these tables often contain errors caused by mistyping or by respondents misunderstanding the questions. For example, a record may claim that a 5-year-old is married. The Fellegi-Holt method of data cleaning is a standard way to find the minimal changes required to correct a record. We have shown that the essence of the Fellegi-Holt method is propositional resolution, a long-established method from automated deduction.
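To make "minimal changes" concrete, here is a small sketch of the problem being solved. It is not the Fellegi-Holt algorithm itself, and it uses neither resolution nor a SAT solver: it simply tries every set of fields in order of increasing size and reports the smallest sets whose values can be changed to make the record consistent. The field names, domains, and rules (DOMAINS, EDITS) are hypothetical and chosen only to mirror the married 5-year-old example.

```python
from itertools import combinations, product

# Hypothetical field domains, for illustration only.
DOMAINS = {
    "age": ["child", "adult"],
    "marital_status": ["single", "married"],
    "employment": ["none", "employed"],
}

# Each rule lists a combination of values that must not occur together.
# These rules are assumptions, not taken from any real survey.
EDITS = [
    {"age": "child", "marital_status": "married"},
    {"age": "child", "employment": "employed"},
]


def violated(record, edits=EDITS):
    """Return the rules that the record matches, i.e. fails."""
    return [e for e in edits if all(record[f] == v for f, v in e.items())]


def minimal_repairs(record, domains=DOMAINS, edits=EDITS):
    """Brute-force search for the smallest sets of field changes
    that make the record satisfy every rule."""
    fields = sorted(domains)
    for k in range(len(fields) + 1):
        found = []
        for subset in combinations(fields, k):
            for values in product(*(domains[f] for f in subset)):
                change = dict(zip(subset, values))
                # Only count assignments that really change each chosen field.
                if any(record[f] == v for f, v in change.items()):
                    continue
                if not violated({**record, **change}, edits):
                    found.append(change)
        if found:
            return found  # all repairs of the smallest size k
    return []


record = {"age": "child", "marital_status": "married", "employment": "employed"}
print(minimal_repairs(record))
```

For this record, changing only the marital status or only the employment status still leaves one rule violated, so the single smallest repair is to change the age field. The search is exponential in the number of fields, which is why faster logical methods are of interest.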


The project is to implement a prototype of the Fellegi-Holt method of data cleaning using fast SAT solvers or fast consequence finders.


A good background in maths will be useful.


There is a high chance that this could lead to a conference publication and/or a PhD here working on data cleaning via logic.


data cleaning, constraint satisfaction, resolution

Updated:  10 February 2019/Responsible Officer:  Head of School/Page Contact:  CECS Marketing