A Multi-Scale Generalization of the HoG and HMAX Image Descriptors for Object Detection

Unknown author (2008-04-09)

Recently, several powerful image features have been proposed whichcan be described as spatial histograms of oriented energy. Forinstance, the HoG, HMAX C1, SIFT, and Shape Context feature allrepresent an input image using with a discrete set of bins whichaccumulate evidence for oriented structures over a spatial regionand a range of orientations. In this work, we generalize thesetechniques to allow for a foveated input image, rather than arectilinear raster. It will be shown that improved object detectionaccuracy can be achieved via inputting a spectrum of imagemeasurements, from sharp, fine-scale image sampling within a smallspatial region within the target to coarse-scale sampling of a widefield of view around the target. Several alternative featuregeneration algorithms are proposed and tested which suitably makeuse of foveated image inputs. In the experiments we show thatfeatures generated from the foveated input format produce detectorsof greater accuracy, as measured for four object types from commonlyavailable data-sets. Finally, a flexible algorithm for generatingfeatures is described and tested which is independent of inputtopology and uses ICA to learn appropriate filters.