Idan Schwartz, Ph.D. Thesis Seminar
Wednesday, 10.3.2021, 16:30
For password to lecture, please contact: firstname.lastname@example.org
Advisors: Prof. Tamir Hazan and Prof. Alexander Schwing
The quest for algorithms that enable cognitive abilities is an integral part of machine learning and appears in many facets, such as virtual assistants and visual reasoning. A cognitive system requires an effective approach to extract details and nuances from the multiple sensors that feed the device's computational engine. To this end, we propose a novel attention mechanism, namely Factor Graph Attention, that operates over arbitrary data utilities and distinguishes useful signals from distracting ones.
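To make the idea of attending over data utilities concrete, here is a minimal, generic attention sketch: a query scores each utility (e.g., image regions, dialog history, question tokens) and returns a convex combination of them. This is an illustration only, not the thesis' Factor Graph Attention; the function and variable names are the author's own for this sketch.

```python
import numpy as np

def softmax(x):
    # Numerically stable softmax over a 1-D score vector.
    e = np.exp(x - x.max())
    return e / e.sum()

def utility_attention(query, utilities):
    """Toy single-head attention over a list of 'utility' feature vectors.

    Scores each utility by its dot product with the query, normalizes the
    scores with a softmax, and returns the attention weights together with
    the attended (weighted-sum) representation. NOT the thesis' Factor
    Graph Attention -- only a generic attention illustration.
    """
    scores = np.array([query @ u for u in utilities])
    weights = softmax(scores)
    attended = sum(w * u for w, u in zip(weights, utilities))
    return weights, attended

# Usage: three 4-dim utilities; the query is aligned with the first one,
# so most of the attention mass should land on it.
utilities = [np.array([1.0, 0.0, 0.0, 0.0]),
             np.array([0.0, 1.0, 0.0, 0.0]),
             np.array([0.0, 0.0, 1.0, 0.0])]
query = np.array([2.0, 0.1, 0.1, 0.0])
weights, attended = utility_attention(query, utilities)
```

The same scoring-then-normalizing pattern extends to multiple interacting utilities, which is where a factor-graph structure over the attention becomes useful.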
Our model won the Visual Dialog challenge and achieved state-of-the-art performance on various tasks, such as Visual Question Answering (VQA), Video Dialog, and Visual Storytelling. Despite the substantial improvements that attention mechanisms enable, strong classifiers are prone to exploiting biases and finding shortcuts. As a consequence, current methods may solve the dataset rather than the underlying task. To address this concern, we introduce perceptual scores, which assess the degree to which a model relies on different subsets of the input features (i.e., modalities). We also propose methods to increase a model's perceptiveness, such as sample re-weighting and information-based regularization. We validate our methods' efficacy on various datasets, such as VQA, SocialIQ, and SNLI.
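One simple way to probe how much a model relies on a given modality is to permute that modality across samples (breaking its association with the labels) and measure the accuracy drop. The sketch below is a simplified permutation-based proxy under that assumption, not the exact perceptual-score definition from the thesis; all names here are hypothetical.

```python
import numpy as np

def accuracy(predict, X_a, X_b, y):
    # Fraction of correct predictions given two modalities A and B.
    return np.mean(predict(X_a, X_b) == y)

def perceptual_score(predict, X_a, X_b, y, modality, n_perm=20, seed=0):
    """Permutation-based proxy for a perceptual score.

    Permutes one modality across samples and reports the mean accuracy
    drop: a large drop means the model genuinely relies on that modality,
    while a near-zero drop means the modality is effectively ignored.
    Simplified illustration, not the thesis' exact definition.
    """
    rng = np.random.default_rng(seed)
    base = accuracy(predict, X_a, X_b, y)
    drops = []
    for _ in range(n_perm):
        perm = rng.permutation(len(y))
        if modality == "a":
            drops.append(base - accuracy(predict, X_a[perm], X_b, y))
        else:
            drops.append(base - accuracy(predict, X_a, X_b[perm], y))
    return float(np.mean(drops))

# Toy classifier that only looks at modality A (sign of its first feature),
# so permuting A should hurt it while permuting B should change nothing.
predict = lambda A, B: (A[:, 0] > 0).astype(int)
X_a = np.array([[1.0], [-1.0], [2.0], [-2.0]] * 10)
X_b = np.random.default_rng(1).normal(size=(40, 1))
y = (X_a[:, 0] > 0).astype(int)
score_a = perceptual_score(predict, X_a, X_b, y, "a")
score_b = perceptual_score(predict, X_a, X_b, y, "b")
```

In this toy setup the score for the ignored modality is exactly zero, while the relied-upon modality shows a substantial drop; a real perceptual score would be computed over held-out data with a trained multi-modal model.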