Ofer Givoli, M.Sc. Thesis Seminar
Wednesday, 26.7.2017, 10:00
Semantic parsing is the task of mapping natural language sentences into a formal representation of their meaning, often expressed as logical forms. One prominent use of semantic parsing is interpreting natural language instructions in natural language interfaces (NLIs) for various types of software applications. In this work, we present a novel task: parsing instructions from simple domains that are unseen during training into logical forms with deep compositionality. Previous work on parsing natural language instructions either did not support unseen domains or did not support mapping instructions to deeply compositional logical forms.
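To make "deep compositionality" concrete, here is a minimal sketch of what a deeply compositional logical form might look like for a toy calendar instruction. The instruction, operator names, and nested-tuple representation are all invented for illustration and are not taken from the thesis's actual formalism.

```python
# Hypothetical illustration: a deeply compositional logical form nests
# operators (action over superlative over filter) rather than a flat
# intent-plus-slots frame. All names here are assumptions for this sketch.

def parse_sketch():
    # Toy calendar-domain instruction:
    instruction = "delete the longest meeting on Friday"
    # The superlative ("longest" -> argmax by duration) applies to a
    # filtered set of meetings, and the result feeds the delete action.
    logical_form = (
        "delete",
        ("argmax", "duration",
            ("filter", "meetings", ("day", "=", "Friday"))),
    )
    return instruction, logical_form
```

The nesting is the point: each linguistic phenomenon (action, superlative, restriction) contributes one layer of the structure.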
We constructed a new dataset for this task, covering linguistic phenomena such as superlatives, comparatives, and spatial and temporal language. The dataset includes annotated examples from seven simple domains (e.g., a toy calendar application). To facilitate the collection and use of the dataset, we developed a framework that aims to minimize the effort of adding an NLI to simple Java applications.
Using our new dataset, we evaluate a log-linear model tailored to this task, implemented with the SEMPRE toolkit. We present a novel training approach in which AdaGrad weight updates are conditioned on evidence that the update is beneficial for multiple domains. We also present a training method in which AdaGrad training is split into two steps, each using examples from different domains and learning a different subset of the weights.
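The conditioned-update idea can be sketched as follows. This is a minimal illustration, not the thesis's implementation: the specific agreement test used here (a coordinate's per-domain gradients sharing a sign in at least two domains) is an assumed stand-in for whatever evidence criterion the actual method uses.

```python
import numpy as np

# Sketch of AdaGrad where a coordinate's update is applied only when there
# is evidence it benefits multiple domains. The sign-agreement criterion
# below is an assumption made for this illustration.

def conditioned_adagrad_step(w, G, domain_grads, lr=0.1, eps=1e-8):
    """w: weight vector; G: accumulated squared gradients (AdaGrad state);
    domain_grads: list of gradient vectors, one per domain."""
    g = np.mean(domain_grads, axis=0)        # combined gradient
    signs = np.sign(domain_grads)            # per-domain gradient signs
    # Evidence of cross-domain benefit: at least two domains agree in sign.
    agree = np.abs(signs.sum(axis=0)) >= 2
    G += g ** 2                              # AdaGrad accumulator
    step = lr * g / (np.sqrt(G) + eps)       # per-coordinate scaled step
    w -= np.where(agree, step, 0.0)          # update only where evidence holds
    return w
```

The two-step variant described above would then run this loop twice, each pass drawing examples from a different set of domains and freezing the weight coordinates not assigned to that pass.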