Technical Report CS760

TR#: CS760
Title: MORPHOLOGICAL DISAMBIGUATION IN HEBREW USING A PRIORI PROBABILITIES.
Authors: M. Levinger, U. Ornan and A. Itai
PostScript: Not Available
Abstract:

This paper describes a new approach for morphological disambiguation in Hebrew using an untagged corpus. This approach demonstrates a way to extract very useful and nontrivial information from an untagged corpus, which otherwise would require laborious tagging of large corpora. The suggested method depends primarily on the following property: a lexical entry in Hebrew may have many different word forms, some of which are ambiguous while others are not. Thus, disambiguation of a given word can be achieved using other word forms of the same lexical entry. Even though it was originally devised and implemented for dealing with the problem in Hebrew, the basic idea can be extended and used to handle similar problems in other languages with rich morphology.
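The abstract does not give the estimation procedure, so the following is only a minimal sketch of the general idea, under stated assumptions: for each candidate analysis of an ambiguous word, count how often other, unambiguous word forms of the same lexical entry occur in an untagged corpus, and normalize those counts into approximate a priori probabilities. The function name, the data layout, and the uniform fallback are all hypothetical illustrations and are not taken from the report.

```python
from collections import Counter


def estimate_priors(similar_forms, corpus_counts):
    """Approximate a priori probabilities for the analyses of one ambiguous word.

    similar_forms: maps each candidate analysis to a list of unambiguous
        word forms assumed to belong to the same lexical entry (hypothetical
        input; the report may select these forms differently).
    corpus_counts: Counter of word-form frequencies in an untagged corpus.
    """
    # Sum the corpus frequency of the unambiguous forms tied to each analysis.
    totals = {
        analysis: sum(corpus_counts[form] for form in forms)
        for analysis, forms in similar_forms.items()
    }
    grand_total = sum(totals.values())
    if grand_total == 0:
        # Assumption: with no evidence, fall back to a uniform distribution.
        return {analysis: 1 / len(totals) for analysis in totals}
    return {analysis: count / grand_total for analysis, count in totals.items()}


# Toy untagged corpus with placeholder tokens instead of real Hebrew words.
corpus = "form_a1 form_a2 form_b1 form_a1".split()
counts = Counter(corpus)

# Two candidate analyses, "A" and "B", of some ambiguous word.
priors = estimate_priors(
    {"A": ["form_a1", "form_a2"], "B": ["form_b1"]},
    counts,
)
print(priors)
```

In this toy run the unambiguous forms tied to analysis "A" occur three times and those tied to "B" once, so "A" would be preferred when no other information is available.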

Copyright: The above paper is copyrighted by the Technion, the author(s), or others. Please contact the author(s) for more information.

Remark: Any link to this technical report should be to this page (http://www.cs.technion.ac.il/users/wwwb/cgi-bin/tr-info.cgi/1992/CS/CS0760), rather than to the URL of the PDF or PS files directly. The latter URLs may change without notice.


Computer science department, Technion