Ehud Barnea (Ben-Gurion University)
Tuesday, 28.11.2017, 11:30
The recurring context in which objects appear holds valuable information that can be employed to predict their existence. This intuitive observation has led many researchers to endow appearance-based detection results with explicit reasoning about context. The underlying thesis suggests that the stronger the contextual relations, the greater the improvement in detection capacity one can expect from such a combined approach. In practice, however, the observed improvement in many cases is modest at best, and often only marginal. In this work we seek to understand this phenomenon better, in part by pursuing an opposite approach. Instead of going from context to detection score, we try to formulate the score as a function of standard detector results and a contextual relation. This approach allows us to treat the utility of context as an optimization problem, in order to obtain the largest gain possible from considering context in the first place. Analyzing different types of context reveals the most helpful ones and shows that in many cases including context can help, while in other cases a great improvement is simply impossible or impractical. To better understand these results, we then analyze the ability of context to separate correct detections from different types of false detections, revealing that contextual information cannot ameliorate localization errors, which in turn also diminishes the observed improvement obtained by correcting other types of errors. These insights provide further explanations and a better understanding of the success or failure of utilizing context for object detection.