Karlgren, Jussi and Sahlgren, Magnus and Cöster, Rickard (2006) Weighting Query Terms Based on Distributional Statistics. In: Accessing Multilingual Information Repositories: 6th Workshop of the Cross-Language Evaluation Forum, CLEF 2005, Vienna, Austria, 21-23 September, 2005, Revised Papers, September, 2005, Wien, Austria. (In Press)
Full text not available from this repository.
Official URL: http://www.springeronline.com/lncs
This year, the SICS team has concentrated on query processing and on the internal topical structure of the query, specifically compound translation. Compound translation is non-trivial because of dependencies between compound elements. We have also investigated topical dependencies between query terms: if a query term is non-topical or noise, it should be discarded or given a low weight when ranking retrieved documents; if a query term shows high topicality, its weight should be boosted. The two experiments described here are based on an analysis of the distributional character of query terms: one uses the similarity of occurrence contexts between query terms globally across the entire collection; the other uses the likelihood that an individual term appears topically in individual texts. The two boosting schemes are complementary, and both delivered improved results.
Item Type: Conference or Workshop Item (Paper)
Deposited By: Userware Researcher
Deposited On: 20 Nov 2005
Last Modified: 18 Nov 2009 15:54