The Australian National University

Modelling spatial layout for image classification

Jakob Verbeek (LEAR of INRIA Rhone Alpes, Grenoble)

NICTA SML SEMINAR

DATE: 2011-05-26
TIME: 11:00:00 - 12:00:00
LOCATION: NICTA - 7 London Circuit

ABSTRACT:
In this talk I'll present some recent work on modelling the spatial layout of local image descriptors for image classification. Current state-of-the-art methods for image classification are based on bag-of-word image representations. These are obtained by quantizing local image descriptors (features of small image patches) into visual words and representing an image by aggregate statistics of local appearance. In the simplest case, this is a frequency histogram that counts how many local descriptors are assigned to each visual word. In such a representation all information on the spatial layout of the image patches is lost. To encode layout, Lazebnik et al. (2006) proposed Spatial Pyramid Matching, an approach that concatenates visual word histograms computed over different spatial cells. In our work we instead define Gaussian (mixture) models over the spatial locations of visual words in the image. We use the Fisher kernel framework of Jaakkola & Haussler to compute the image representation as the gradient of the log-likelihood with respect to the parameters of these generative models, given the appearance of local image patches and their locations. Our experimental results show that our image representations are more effective than the spatial pyramid approach for bag-of-word appearance modelling.
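
To make the three representations mentioned in the abstract concrete, here is a minimal NumPy sketch. It is an illustration under stated assumptions, not the speaker's implementation: the function names are hypothetical, each descriptor is assumed to be already hard-assigned to a visual word, patch locations are assumed to be normalized to [0, 1], and the spatial model is simplified to a single diagonal Gaussian per visual word (rather than a mixture) with no Fisher information normalization.

```python
import numpy as np

def bow_histogram(words, k):
    """Count how many descriptors are assigned to each of k visual words."""
    return np.bincount(words, minlength=k).astype(float)

def spatial_pyramid(words, xy, k, levels=(1, 2)):
    """Concatenate visual word histograms over g x g grids of spatial cells.

    xy holds patch locations normalized to [0, 1] (an assumption of this sketch).
    """
    parts = []
    for g in levels:
        # Map each location to a grid cell; clip so that 1.0 falls in the last cell.
        cell = np.minimum((xy * g).astype(int), g - 1)
        cell_id = cell[:, 1] * g + cell[:, 0]
        for c in range(g * g):
            parts.append(bow_histogram(words[cell_id == c], k))
    return np.concatenate(parts)

def spatial_fisher_vector(words, xy, mu, sigma):
    """Gradient of the log-likelihood of patch locations w.r.t. the mean and
    standard deviation of one diagonal Gaussian per visual word.

    Simplification for illustration: hard assignments, one Gaussian per word,
    and no normalization by the Fisher information matrix.
    """
    k = mu.shape[0]
    g_mu = np.zeros((k, 2))
    g_sigma = np.zeros((k, 2))
    for w in range(k):
        z = (xy[words == w] - mu[w]) / sigma[w]            # standardized offsets
        g_mu[w] = (z / sigma[w]).sum(axis=0)               # d log N / d mu
        g_sigma[w] = ((z ** 2 - 1) / sigma[w]).sum(axis=0)  # d log N / d sigma
    return np.concatenate([g_mu.ravel(), g_sigma.ravel()])

# Toy data: four patches, three visual words.
words = np.array([0, 1, 1, 2])
xy = np.array([[0.1, 0.1], [0.9, 0.1], [0.9, 0.9], [0.5, 0.5]])
mu = np.full((3, 2), 0.5)   # hypothetical spatial means
sigma = np.ones((3, 2))     # hypothetical spatial standard deviations

hist = bow_histogram(words, 3)
pyramid = spatial_pyramid(words, xy, 3)
fisher = spatial_fisher_vector(words, xy, mu, sigma)
```

In this sketch the pyramid grows with the number of cells (here 3 x (1 + 4) = 15 dimensions), while the spatial gradient adds four values per visual word (12 dimensions) regardless of how finely layout is modelled.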

Updated: 24 May 2011