Pixel Club: PLAYING 20 QUESTIONS WITH CORRUPTED ANSWERS: Applications to Object Localization and Tracking

Speaker:
Raphael Sznitman (École Polytechnique Fédérale de Lausanne, Switzerland)
Date:
Tuesday, 19.6.2012, 11:30
Place:
EE Meyer Building 1061

The problem of search is ubiquitous in computer vision, and perhaps most common in object detection and localization. While the last few decades have produced a wealth of methods to evaluate the presence of a target at a given location, the search problem remains particularly difficult. This is in large part because detection methods are far from perfect and because solutions are demanded under ever more challenging constraints. For example, more complex objects must be found in increasingly large images, or real-time solutions are needed on computationally limited systems.

To this end, we present a Bayesian formulation of the traditional “twenty questions” game for locating an object in images. By sequentially asking a knowledgeable oracle “questions”, and treating the received answers as noisy, our goal is to determine a policy, or sequence of questions, that reduces the uncertainty about the target location as much as possible. We will show that principles from dynamic programming and information theory can be used to characterize an optimal policy when minimizing the expected entropy of the distribution of target locations. In particular, we show one solution to this problem that is greedy, Bayes-optimal, and simple to compute. We will then present an embodiment of this greedy and optimal policy in the context of two applications: (i) tool tracking during retinal microsurgery, and (ii) face detection and localization. In both cases, we show significant speedups over state-of-the-art methods, while maintaining similar accuracy levels.
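
As a rough illustration of the greedy policy described above, the following Python sketch plays the game on a one-dimensional grid of candidate locations. It is not the speaker's implementation: the symmetric noise model (each yes/no answer is flipped with a known probability `eps`), the restriction to questions of the form "is the target among the first k locations?", and all function names are assumptions made for this example. Under those assumptions, the question that maximizes the expected one-step entropy reduction is the one whose set has posterior mass closest to one half; the posterior is then updated with Bayes' rule.

```python
import math
import random


def entropy(p):
    """Shannon entropy, in bits, of a discrete distribution."""
    return -sum(x * math.log2(x) for x in p if x > 0)


def choose_question(posterior):
    """Greedy step: pick k so that the set {0, ..., k-1} has posterior
    mass as close to 1/2 as possible (assumed symmetric noise)."""
    best_k, best_gap, mass = 1, float("inf"), 0.0
    for k in range(1, len(posterior)):
        mass += posterior[k - 1]
        gap = abs(mass - 0.5)
        if gap < best_gap:
            best_k, best_gap = k, gap
    return best_k


def update(posterior, k, answer, eps):
    """Bayes update after asking 'is the target in {0, ..., k-1}?'.
    The answer is assumed to be flipped with probability eps."""
    weighted = []
    for i, p in enumerate(posterior):
        truth = i < k
        likelihood = (1 - eps) if truth == answer else eps
        weighted.append(p * likelihood)
    total = sum(weighted)
    return [w / total for w in weighted]


def locate(target, n, eps=0.2, n_questions=60, seed=0):
    """Simulate the game with a noisy oracle and return the final posterior
    over the n candidate locations (hypothetical helper for this sketch)."""
    rng = random.Random(seed)
    posterior = [1.0 / n] * n  # uniform prior over locations
    for _ in range(n_questions):
        k = choose_question(posterior)
        truth = target < k
        # The simulated oracle lies with probability eps.
        answer = truth if rng.random() > eps else not truth
        posterior = update(posterior, k, answer, eps)
    return posterior
```

For instance, `locate(target=5, n=16, eps=0.2)` returns a posterior over the 16 locations, and `entropy` can be applied to it to see how much uncertainty remains. With noisy answers the mass typically concentrates on the true location as more questions are asked, though a fixed number of questions gives no guarantee.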
