Morphological Disambiguation of Hebrew Using a Combination of Simple Classifiers

Speaker: Danny Shacham (Haifa University)

Abstract:

Morphological analysis is a crucial stage in a variety of natural language processing applications. For languages with complex morphology, even shallow applications, such as search and information retrieval engines, require morphological analysis and disambiguation as a first step. The unique word formation machinery of Hebrew, along with the standard Hebrew orthography, which leaves most of the vowels unspecified, makes morphological disambiguation of Hebrew a much more complex endeavor than the parallel POS tagging task for English.

We propose a machine learning approach to complex morphological problems involving a large number of targets, where a structure can clearly be imposed on the tags. Given the large number of potential targets, we address the problem as one of combining several classifiers, each predicting the value of one of the components of the analysis. The results of the naïve classifiers are combined in a sophisticated manner, taking into account the constraints that hold among the various components.
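The idea of combining per-component classifiers under constraints can be illustrated with a minimal sketch. This is not the method presented in the talk. It assumes that each simple classifier outputs a probability for every value of its component, that a morphological analyzer supplies the set of valid candidate analyses for a word (which is how the constraints enter), and that the candidate with the highest product of component probabilities is chosen. The function name `pick_analysis`, the component names, and all scores are hypothetical.

```python
from math import log


def pick_analysis(candidates, component_scores, floor=1e-6):
    """Return the valid candidate with the highest combined log-score.

    candidates: list of dicts mapping component name -> value; only
        analyses that satisfy the constraints appear here (assumption).
    component_scores: dict mapping component name -> {value: probability},
        one entry per simple classifier (hypothetical scores).
    floor: small probability used for values a classifier did not score.
    """
    def score(analysis):
        # Sum of log-probabilities, i.e. the log of their product.
        return sum(
            log(component_scores[comp].get(value, floor))
            for comp, value in analysis.items()
        )

    return max(candidates, key=score)


# Hypothetical candidate analyses for one ambiguous word.
candidates = [
    {"pos": "noun", "gender": "masculine", "number": "singular"},
    {"pos": "verb", "gender": "masculine", "number": "singular"},
    {"pos": "noun", "gender": "feminine", "number": "plural"},
]

# Hypothetical outputs of three independent simple classifiers.
component_scores = {
    "pos": {"noun": 0.6, "verb": 0.4},
    "gender": {"masculine": 0.3, "feminine": 0.7},
    "number": {"singular": 0.8, "plural": 0.2},
}

best = pick_analysis(candidates, component_scores)
print(best)
```

Taking the best value of each component independently would give noun, feminine, singular, which is not among the valid candidates here. Restricting the choice to the candidate list keeps the combined prediction consistent with the constraints among components.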




Back to ISCOL'05 homepage