Karlgren, Jussi and Sahlgren, Magnus and Cöster, Rickard (2006) Weighting Query Terms Based on Distributional Statistics. In: Accessing Multilingual Information Repositories: 6th Workshop of the Cross-Language Evaluation Forum, CLEF 2005, Vienna, Austria, 21-23 September, 2005, Revised Papers, September, 2005, Wien, Austria. (In Press)
Full text not available from this repository.
Official URL: http://www.springeronline.com/lncs
Abstract
This year, the SICS team has concentrated on query processing and on the internal topical structure of the query, specifically compound translation. Compound translation is non-trivial due to dependencies between compound elements. This year, we have investigated topical dependencies between query terms: if a query term happens to be non-topical or noise, it should be discarded or given a low weight when ranking retrieved documents; if a query term shows high topicality, its weight should be boosted. The two experiments described here are based on the analysis of the distributional character of query terms: one using similarity of occurrence context between query terms globally across the entire collection; the other using the likelihood of individual terms to appear topically in individual texts. The two boosting schemes are complementary, and both delivered improved results.
| Item Type: | Conference or Workshop Item (Paper) |
|---|---|
| ID Code: | 151 |
| Deposited By: | Userware Researcher |
| Deposited On: | 20 Nov 2005 |
| Last Modified: | 18 Nov 2009 15:54 |

