This is an event in the CSIRO IR & Friends series, in conjunction with RSCS HCC & Friends.
About three quarters of a billion people are functionally illiterate, meaning that they have no more than a very basic ability to read or write. Modern search engines are powerful tools for much of the world’s population, but if we are to build search engines for illiterate and low-literacy users we will need to come at the problem differently. I’ll begin by describing two lines of work on this problem in the broad area known as Information and Communication Technology for Development (variously, ICTD or ICT4D), one that seeks to leverage visual interfaces, numeracy, and limited literacy, and a second that seeks to leverage speech. I’ll then focus the rest of the talk on the work that we have been doing on speech-to-speech retrieval.
The key challenge that we have sought to address is that most illiterate and low-literacy users don’t speak any language for which we have the sorts of highly engineered Large-Vocabulary Continuous Speech Recognition (LVCSR) systems on which much of the recent work on speech retrieval depends. A shared-task evaluation in MediaEval started to tackle that challenge in 2011 using a Spoken Term Detection (STD) evaluation. The results there were promising, showing that systems could often recognize single terms in continuous speech based on examples, without any foreknowledge of the language. In our work, we have sought to build on one of these MediaEval systems to apply this STD capability to perform ad hoc ranked retrieval (i.e., finding recorded content that is most likely to satisfy a user’s information need).
I’ll describe the “Query by Babbling” interaction paradigm that we have been exploring, which asks what would happen if, instead of the short queries and long result sets that are appropriate for text, we had long queries and short result sets, perhaps a better fit for speech.
I’ll then describe a test collection we have built using spoken content from a voice forum site used by farmers in Gujarat, India (speaking in Gujarati), some ranked retrieval systems that we have evaluated using that collection, and the results that we have obtained.
I’ll finish up with a few thoughts on where the remaining hard spots are with this technology, and what I see as next steps to address those challenges. This is joint work with Jerome White (NYU Abu Dhabi), Nitendra Rajput (IBM India Research Lab) and Aren Jansen (at the time at the Johns Hopkins HLTCOE).
Douglas Oard is a Professor at the University of Maryland, College Park, with joint appointments in the College of Information Studies (Maryland’s iSchool) and the University of Maryland Institute for Advanced Computer Studies (UMIACS). Dr. Oard earned his Ph.D. in Electrical Engineering from the University of Maryland. His research interests center around the use of emerging technologies to support information seeking by end users. Additional information is available at http://terpconnect.umd.edu/~oard/.