SODA

Creating Bilingual Lexica Using Reference Wordlists for Alignment of Monolingual Semantic Vector Spaces

Holmlund, Jon and Sahlgren, Magnus and Karlgren, Jussi (2005) Creating Bilingual Lexica Using Reference Wordlists for Alignment of Monolingual Semantic Vector Spaces. In: 15th Nordic Conference of Computational Linguistics, 20-21 May 2005, Joensuu.

Full text not available from this repository.

Abstract

This paper proposes a novel method for automatically acquiring multi-lingual lexica from non-parallel data and reports some initial experiments to prove the viability of the approach. Using established techniques for building mono-lingual vector spaces, two independent semantic vector spaces are built from textual data. These vector spaces are related to each other using a small reference word list of manually chosen reference points taken from available bi-lingual dictionaries. Other words can then be related to these reference points first in the one language and then in the other. In the present experiments, we apply the proposed method to comparable but non-parallel English-German data. The resulting bi-lingual lexicon is evaluated using an online English-German lexicon as gold standard. The results clearly demonstrate the viability of the proposed methodology.
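
The idea of relating words through shared reference points can be illustrated with a minimal sketch. It is not taken from the paper: the toy vectors, the function names (`cosine`, `reference_profile`, `translate`), and the choice of cosine similarity for both the within-space and cross-language comparison are all assumptions made for illustration. Each word is described by its similarities to the reference words in its own space, and those similarity profiles are then compared across the two languages.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def reference_profile(word_vec, space, reference_words):
    """Describe a word by its similarity to each reference word in its own space."""
    return [cosine(word_vec, space[r]) for r in reference_words]


def translate(word, src_space, tgt_space, reference_pairs, top_n=1):
    """Rank target-language words by how closely their reference profile
    matches the reference profile of the source word (hypothetical helper)."""
    src_refs = [s for s, _ in reference_pairs]
    tgt_refs = [t for _, t in reference_pairs]
    profile = reference_profile(src_space[word], src_space, src_refs)
    scored = []
    for candidate, vec in tgt_space.items():
        candidate_profile = reference_profile(vec, tgt_space, tgt_refs)
        scored.append((cosine(profile, candidate_profile), candidate))
    scored.sort(reverse=True)
    return [candidate for _, candidate in scored[:top_n]]


# Toy, hand-made vectors (assumed data). The two spaces are independent:
# they have different dimensionality and unrelated axes.
english = {
    "dog": [1.0, 0.1],
    "cat": [0.9, 0.3],
    "car": [0.1, 1.0],
    "puppy": [0.95, 0.05],
}
german = {
    "Hund": [0.2, 1.0, 0.0],
    "Katze": [0.4, 0.9, 0.0],
    "Auto": [1.0, 0.1, 0.0],
    "Welpe": [0.15, 1.0, 0.0],
}

# The small reference word list: pairs assumed to come from a bilingual dictionary.
reference_pairs = [("dog", "Hund"), ("car", "Auto")]

print(translate("puppy", english, german, reference_pairs))
print(translate("cat", english, german, reference_pairs))
```

Because the comparison happens only between similarity profiles, the two vector spaces never need to share dimensions or training data. A real reference list would presumably contain many more than two pairs.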

Item Type: Conference or Workshop Item (Poster)
Subjects: I. Computing Methodologies > I.2 ARTIFICIAL INTELLIGENCE
I. Computing Methodologies > I.7 DOCUMENT AND TEXT PROCESSING (H.4, H.5)
ID Code: 28
Deposited By: Userware Researcher
Deposited On: 20 Oct 2005
Last Modified: 18 Nov 2009 15:51