SODA

Vector-based semantic analysis: representing word meanings based on random labels

Sahlgren, Magnus (2001) Vector-based semantic analysis: representing word meanings based on random labels. In: Semantic Knowledge Acquisition and Categorisation Workshop at ESSLLI XIII (European Summer School in Logic, Language and Information), 13-17 Aug 2001, Helsinki, Finland.

Full text not available from this repository.

Abstract

Vector-based semantic analysis is the practice of representing word meanings as semantic vectors, calculated from the co-occurrence statistics of words in large text data. This paper discusses the theoretical presumptions behind this practice, and a representational scheme based on the Distributional Hypothesis is identified as the rationale for vector-based semantic analysis. A new method for calculating semantic word vectors is then described. The method uses random labelling of words in narrow context windows to calculate semantic context vectors for each word type in the text data. The method is evaluated with a standardised synonym test, and it is shown that incorporating linguistic information in the context vectors can enhance the results.
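The random-labelling idea described in the abstract can be illustrated with a minimal sketch. This is not the paper's implementation: the vector dimensionality (`dim`), the number of non-zero label entries (`nonzeros`), the window size (`window`), and all function names are assumptions chosen for illustration only. Each word type receives a sparse random label of +1 and -1 entries, and a word's context vector is the sum of the labels of the words occurring within a narrow window around each of its occurrences.

```python
import math
import random


def random_label(dim, nonzeros, rng):
    """Build a sparse random label: a few +1/-1 entries, zeros elsewhere."""
    label = [0] * dim
    positions = rng.sample(range(dim), nonzeros)
    for i, pos in enumerate(positions):
        # Alternate signs so that +1 and -1 entries are balanced.
        label[pos] = 1 if i % 2 == 0 else -1
    return label


def context_vectors(tokens, dim=100, nonzeros=4, window=2, seed=0):
    """Sum the random labels of neighbouring words into one vector per word type.

    All parameter defaults are illustrative assumptions, not values from the paper.
    """
    rng = random.Random(seed)
    labels = {}
    for word in tokens:
        if word not in labels:
            labels[word] = random_label(dim, nonzeros, rng)

    vectors = {word: [0] * dim for word in labels}
    for i, word in enumerate(tokens):
        start = max(0, i - window)
        end = min(len(tokens), i + window + 1)
        for j in range(start, end):
            if j == i:
                continue  # a word does not count as its own context
            label = labels[tokens[j]]
            vec = vectors[word]
            for k in range(dim):
                vec[k] += label[k]
    return vectors


def cosine(u, v):
    """Cosine similarity between two vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


tokens = "the cat sat on the mat the dog sat on the rug".split()
vectors = context_vectors(tokens)
similarity = cosine(vectors["cat"], vectors["dog"])
```

Under these assumptions, words that occur in similar contexts accumulate similar sums of labels, so their context vectors can be compared with cosine similarity. A synonym test of the kind mentioned in the abstract could then pick, among the candidate answers, the word whose vector is closest to the target word's vector.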

Item Type: Conference or Workshop Item (Paper)
ID Code: 3094
Deposited By: INVALID USER
Deposited On: 16 Jul 2008
Last Modified: 18 Nov 2009 16:17