Exploiting Syntax when Detecting Protein Names in Text

Eriksson, Gunnar and Franzén, Kristofer and Olsson, Fredrik and Asker, Lars and Lidén, Per (2002) Exploiting Syntax when Detecting Protein Names in Text. In: EFMI Workshop on Natural Language Processing in Biomedical Applications, March 8-9, 2002, Nicosia, Cyprus.


This paper presents a method for detecting protein names in running text. Our system, Yapex, combines lexical and syntactic knowledge, heuristic filters, and a local dynamic dictionary. The syntactic information supplied by a general-purpose, off-the-shelf parser supports correct identification of protein name boundaries, and the local dynamic dictionary finds protein names in positions that the parser analyses incompletely. We present the steps involved in our approach to protein tagging and show how combinations of them influence recall and precision. We evaluate the system on a corpus of MEDLINE abstracts and compare it with the KeX system (Fukuda et al., 1998) under four different notions of correctness.
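
As a rough illustration of how recall and precision can shift with the notion of correctness, the sketch below scores predicted name spans against gold spans under two matching criteria. The criteria shown (exact boundaries and any overlap), the function names, and the sample offsets are all assumptions made for illustration; the abstract does not define the four notions used in the paper, and this is not the Yapex evaluation code.

```python
from typing import Callable, Set, Tuple

Span = Tuple[int, int]  # (start, end) character offsets, end exclusive


def exact(pred: Span, gold: Span) -> bool:
    """Hypothetical strict criterion: both boundaries must agree."""
    return pred == gold


def overlap(pred: Span, gold: Span) -> bool:
    """Hypothetical sloppy criterion: the spans share at least one character."""
    return pred[0] < gold[1] and gold[0] < pred[1]


def precision_recall(
    predicted: Set[Span],
    gold: Set[Span],
    match: Callable[[Span, Span], bool],
) -> Tuple[float, float]:
    """Return (precision, recall) under the given matching criterion."""
    # Predicted spans that match some gold span count towards precision.
    correct_pred = sum(1 for p in predicted if any(match(p, g) for g in gold))
    # Gold spans matched by some predicted span count towards recall.
    found_gold = sum(1 for g in gold if any(match(p, g) for p in predicted))
    precision = correct_pred / len(predicted) if predicted else 0.0
    recall = found_gold / len(gold) if gold else 0.0
    return precision, recall


# Made-up offsets: one exact hit, one boundary error, one spurious name.
gold = {(0, 5), (10, 18), (30, 36)}
predicted = {(0, 5), (12, 18), (40, 44)}

strict = precision_recall(predicted, gold, exact)
sloppy = precision_recall(predicted, gold, overlap)
print("strict:", strict)
print("sloppy:", sloppy)
```

In this toy data the prediction with a misplaced left boundary counts as an error under the strict criterion but as a hit under the sloppy one, which is why boundary identification matters for the reported scores.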

Item Type: Conference or Workshop Item (Paper)
ID Code: 2596
Deposited On: 10 Dec 2007
Last Modified: 18 Nov 2009 16:11
