Context-Oriented Information Integration
Mukesh Mohania (IBM India Research Lab)
CSIRO ICT
DATE: 2006-05-22
TIME: 13:30:00 - 14:30:00
LOCATION: CSIRO ICT Centre Seminar Room Level 2, Building 108, North Road, ANU
ABSTRACT:
With critical business information distributed across both structured and unstructured data sources, enterprises are increasingly realizing the importance of seamlessly integrating relevant structured and unstructured data. Existing information integration solutions typically address this issue by providing a single point of access for both structured and unstructured data sources. This is not enough, since the application still needs to formulate the SQL logic to retrieve the needed structured data on the one hand, and identify a set of keywords to retrieve the related unstructured data on the other. This is a limitation since (a) the same information need must be formulated using two disparate paradigms, which is redundant effort, and (b) in many cases, it is hard (even impossible) for the application to identify the appropriate keywords needed to retrieve the related unstructured data. The SCORE project addresses this limitation by following a novel approach to information integration. In this approach, the application specifies its information needs using only a SQL query on the structured data, and the system automatically "translates" this query into a set of keywords that can be used to retrieve relevant unstructured data. In this talk, we describe the techniques used in SCORE for this query translation, and also present an experimental study that illustrates the effectiveness of these techniques.
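To make the idea of query translation concrete, the following is a minimal illustrative sketch and not SCORE's actual technique, which the abstract does not detail. It assumes a naive heuristic: run the SQL query, collect the text values in the result rows, and take the most frequent terms as keywords. The `orders` table, its contents, and the names `keywords_from_query` and `demo` are all hypothetical.

```python
import sqlite3
from collections import Counter


def keywords_from_query(conn, sql, top_k=3):
    """Run `sql` and return the most frequent text terms in its result rows.

    Naive stand-in for query-to-keyword translation (an assumption made
    for illustration; not the method used in SCORE).
    """
    counts = Counter()
    for row in conn.execute(sql):
        for value in row:
            # Only text columns contribute candidate keywords.
            if isinstance(value, str):
                counts.update(value.lower().split())
    # Sort by descending count, then alphabetically, so ties are deterministic.
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [term for term, _ in ranked[:top_k]]


def demo():
    # Hypothetical structured data held in an in-memory SQLite database.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (customer TEXT, product TEXT, qty INTEGER)")
    conn.executemany(
        "INSERT INTO orders VALUES (?, ?, ?)",
        [
            ("acme", "laser printer", 4),
            ("acme", "inkjet printer", 2),
            ("globex", "laptop", 7),
        ],
    )
    # The application states its need only as SQL; keywords are derived from it.
    return keywords_from_query(
        conn, "SELECT product FROM orders WHERE customer = 'acme'", top_k=1
    )
```

In this sketch, `demo()` derives the single keyword `printer` from the query result, which could then be passed to a keyword search over unstructured documents. A realistic system would presumably need a better notion of term relevance than raw frequency.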
BIO:
Mukesh Mohania received his Ph.D. in Computer Science & Engineering from the Indian Institute of Technology, Bombay, India in 1995. He was a faculty member at the University of South Australia and Western Michigan University from 1995 to 2001. He was also associated with Kyoto University and Purdue University as a Senior Research Fellow from 1996 to 2001. Currently, he is a manager at IBM India Research Lab, where he leads a database and autonomic computing research group. He has worked extensively in the areas of rule processing in distributed databases, data warehousing, semi-structured and unstructured databases, XML data integration, data mining, and autonomic computing. He was awarded the Technical Achievement Award in the area of Web Database Management and Data Warehousing by the Association of Database and Expert Systems Applications in Greenwich, U.K., in 2000. His work on Dutch auctions, XML data mining, and context-oriented information integration received the best paper award at EC-Web 2002, CIKM 2004, and CIKM 2005, respectively.
