The Australian National University

Unsupervised Semantic Feature Discovery for Object-Based Image Application

Dr. Wen-Huang Cheng (Academia Sinica, Taiwan)

ARTIFICIAL INTELLIGENCE SEMINAR

DATE: 2012-07-03
TIME: 11:00:00 - 12:00:00
LOCATION: RSISE Seminar Room, ground floor, building 115, cnr. North and Daley Roads, ANU

ABSTRACT:
Most of us are used to sharing personal photos on social media services such as Flickr and Facebook. More and more users are also willing to contribute related tags (e.g., geo-location and capture time) and comments on their photos for photo management and social communication. Such user-contributed contextual information provides promising research opportunities for understanding images in social media. To obtain desired images, users usually issue a query to a search engine using either an image or keywords; accordingly, existing solutions for image retrieval rely only on either the image content (e.g., low-level features) or the surrounding text (e.g., descriptions, tags). These solutions usually suffer from low recall rates, because small changes in lighting conditions, viewpoints, or occlusions, as well as missing or noisy tags, can degrade performance significantly. In this talk, we tackle the problem by leveraging both the image content and the associated textual information in social media, and we propose a general framework that augments each image with relevant semantic (visual and textual) features by exploiting textual and visual image graphs in an unsupervised manner. The proposed framework can be applied directly to various applications, such as keyword-based image search, image object retrieval, and tag refinement.

Meanwhile, visual objects in real-world images often exhibit great visual diversity (e.g., appearing at arbitrary sizes) under complex environmental conditions (e.g., foreground and background clutter), and such real-world characteristics are lacking in most existing image data sets. In this talk, we therefore also present a large-scale image data set of real-world objects (commercial business signs) collected from Google's Street View, containing more than 4,500 images of 62 different businesses with pixel-level object annotations, to promote related multimedia research.
BIO:
Wen-Huang Cheng is an Assistant Professor (Assistant Research Fellow) at the Research Center for Information Technology Innovation (CITI), Academia Sinica, Taipei, Taiwan. He is the founding leader of the Multimedia Computing Laboratory (MCLab) at CITI: http://mclab.citi.sinica.edu.tw/. His current research interests include multimedia content analysis, computer vision, mobile multimedia applications, and human-computer interaction. His research has received best paper awards from the Taipei Chapters of the IEEE Consumer Electronics Society and the IEEE Signal Processing Society. He received the B.S. and M.S. degrees in computer science and information engineering from National Taiwan University, Taipei, Taiwan, in 2002 and 2004, respectively, and the Ph.D. degree from the university's Graduate Institute of Networking and Multimedia in 2008. From 2009 to 2010, he was a principal researcher at MagicLabs, HTC Corporation, Taoyuan, Taiwan. He has served on the technical program committees of various conferences, including ACM MM, CVPR, ECCV, and ICMR.

Updated: 26 June 2012