ALIGNMENT

1.: Simard Foster Isabelle
Using Cognates to Align Sentences in Bilingual Corpora
Finds strings that are obvious translations of each other (numbers, punctuation, and long words that share their first 4 letters). Alignment based on shared cognates alone works poorly, but it helps in finding alignments where length counts are inconclusive.
2.: Kay Roscheisen
Text-Translation Alignment
Computational Linguistics 19(1) 121-142
Aligns some sentences first, then aligns words that appear in the aligned sentences, then uses those words as anchors to align additional sentences,...
3.: Gale and Church
A Program for Aligning Sentences in Bilingual Corpora
Aligns English-German bilingual texts by sentence length, measured as a word count or a letter count. Includes the C program.
4.: Dagan Church Gale
Robust Bilingual Word Alignment for Machine Aided Translation
Refines char_align to get word_align. Does not rely on sentence alignment. Works well on noisy texts.
5.: Gale and Church
Identifying Word Correspondence in Parallel Texts
Looks for word correspondences, favoring high precision over recall. Uses an alternative to the EM algorithm, which requires too much space.
6.: Church Dagan Gale Fung Helfman Satish
Aligning Parallel Texts: Do Methods Developed for English-French Generalize to Asian Languages?
Discusses the difficulties posed by Japanese. Uses char_align and refinements.
7.: Kupiec
An algorithm for finding noun phrase correspondences in bilingual corpora
ACL 93
Finds English-French noun phrase correspondences. Extracts the noun phrases from a tagged corpus using a finite-state automaton (FSA), then applies an algorithm similar to Baum-Welch. Also works for sparse data.
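
Entry 1 describes cognate candidates as numbers, punctuation, and long words that share their first 4 letters. The sketch below is one possible reading of that rule, not the authors' implementation. The tokenizer, the function names, and the choice of 4 letters as the minimum length of a "long" word are assumptions made for illustration.

```python
import re


def tokens(sentence):
    # Assumed tokenizer: lowercase, then split into word-like runs
    # and single punctuation marks.
    return re.findall(r"\w+|[^\w\s]", sentence.lower())


def cognate_key(token, min_len=4):
    """Return the key under which a token can match a cognate, or None."""
    if token.isdigit():
        return token              # numbers must match exactly
    if not token.isalnum():
        return token              # punctuation must match exactly
    if token.isalpha() and len(token) >= min_len:
        return token[:min_len]    # long words match on their first letters
    return None                   # short or mixed tokens are ignored


def shared_cognates(src_sentence, tgt_sentence):
    """Return the set of cognate keys that occur in both sentences."""
    src_keys = {cognate_key(t) for t in tokens(src_sentence)} - {None}
    tgt_keys = {cognate_key(t) for t in tokens(tgt_sentence)} - {None}
    return src_keys & tgt_keys


if __name__ == "__main__":
    english = "The government approved 1993 regulations."
    french = "Le gouvernement a approuve 1993 reglements."
    print(sorted(shared_cognates(english, french)))
```

A sentence pair that shares many keys is a better alignment candidate than one that shares none, which is how such a score could break ties when sentence lengths are inconclusive.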
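
Entry 3 aligns sentences by comparing their lengths. The sketch below shows the general idea with dynamic programming over letter counts. It is a simplification under stated assumptions: the cost is the absolute length difference plus fixed penalties, which stands in for the scoring used by the program that accompanies the paper. The function name, the penalty values, and the set of allowed groupings are all hypothetical.

```python
def align_by_length(src, tgt, skip_penalty=10, merge_penalty=5):
    """Align two lists of sentences by letter count.

    Returns a list of (src_indices, tgt_indices) groups in document order.
    """
    # Allowed groupings (source sentences, target sentences) and the
    # assumed fixed penalty for each.
    shapes = {
        (1, 1): 0,
        (1, 0): skip_penalty,
        (0, 1): skip_penalty,
        (2, 1): merge_penalty,
        (1, 2): merge_penalty,
    }
    n, m = len(src), len(tgt)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    back = [[None] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0

    # Forward pass: extend every reachable prefix pair by one grouping.
    for i in range(n + 1):
        for j in range(m + 1):
            if cost[i][j] == inf:
                continue
            for (di, dj), penalty in shapes.items():
                ni, nj = i + di, j + dj
                if ni > n or nj > m:
                    continue
                src_len = sum(len(s) for s in src[i:ni])
                tgt_len = sum(len(t) for t in tgt[j:nj])
                candidate = cost[i][j] + abs(src_len - tgt_len) + penalty
                if candidate < cost[ni][nj]:
                    cost[ni][nj] = candidate
                    back[ni][nj] = (i, j)

    # Backward pass: recover the cheapest sequence of groupings.
    groups = []
    i, j = n, m
    while (i, j) != (0, 0):
        pi, pj = back[i][j]
        groups.append((list(range(pi, i)), list(range(pj, j))))
        i, j = pi, pj
    return groups[::-1]
```

Replacing `len(s)` with `len(s.split())` would switch the measure from letter count to word count.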

Alon &
2002-04-11