SODA

A literature survey of active machine learning in the context of natural language processing

Olsson, Fredrik (2009) A literature survey of active machine learning in the context of natural language processing. [SICS Report]

Abstract

Active learning is a supervised machine learning technique in which the learner controls the data used for learning. The learner uses that control to ask an oracle, typically a human with extensive knowledge of the domain at hand, for the classes of the instances on which the model learned so far makes unreliable predictions. The active learning process takes as input a set of labeled examples and a larger set of unlabeled examples, and produces a classifier and a relatively small set of newly labeled data. The overall goal is to create as good a classifier as possible without having to mark up and supply the learner with more data than necessary. The learning process aims to keep the human annotation effort to a minimum, asking for advice only where the training utility of the answer to a query is high. Active learning has been successfully applied to a number of natural language processing tasks, such as information extraction, named entity recognition, text categorization, part-of-speech tagging, parsing, and word sense disambiguation. This report is a literature survey of active learning from the perspective of natural language processing.
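
The process described in the abstract can be illustrated with a minimal sketch of pool-based active learning that queries the instance the current model is least sure about. This is not taken from the report: the nearest-centroid learner, the margin-based uncertainty score, the toy 2D data, and all names (`active_learn`, `margin`, `toy_oracle`, `budget`) are assumptions chosen for illustration, and the seed set is assumed to contain at least two classes.

```python
from typing import Callable, Dict, List, Tuple

Point = Tuple[float, float]
Labeled = Tuple[Point, int]


def train(labeled: List[Labeled]) -> Dict[int, Point]:
    """Fit a nearest-centroid model: one mean point per class (illustrative learner)."""
    sums: Dict[int, Tuple[float, float, int]] = {}
    for (x, y), label in labeled:
        sx, sy, n = sums.get(label, (0.0, 0.0, 0))
        sums[label] = (sx + x, sy + y, n + 1)
    return {label: (sx / n, sy / n) for label, (sx, sy, n) in sums.items()}


def sq_dist(a: Point, b: Point) -> float:
    return (a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2


def predict(model: Dict[int, Point], p: Point) -> int:
    """Return the class whose centroid is closest to p."""
    return min(model, key=lambda label: sq_dist(p, model[label]))


def margin(model: Dict[int, Point], p: Point) -> float:
    """Gap between the two nearest centroids; a small gap means an unreliable prediction."""
    d = sorted(sq_dist(p, c) for c in model.values())
    return d[1] - d[0]


def active_learn(
    seed: List[Labeled],
    pool: List[Point],
    oracle: Callable[[Point], int],
    budget: int,
) -> Tuple[Dict[int, Point], List[Labeled]]:
    """Return the final classifier and the newly labeled examples."""
    labeled = list(seed)
    unlabeled = list(pool)
    newly_labeled: List[Labeled] = []
    for _ in range(min(budget, len(unlabeled))):
        model = train(labeled)
        # Query the instance with the least reliable prediction.
        query = min(unlabeled, key=lambda p: margin(model, p))
        unlabeled.remove(query)
        label = oracle(query)  # stands in for the human annotator
        labeled.append((query, label))
        newly_labeled.append((query, label))
    return train(labeled), newly_labeled


def toy_oracle(p: Point) -> int:
    """Hypothetical oracle: class 1 above the line x + y = 1, else class 0."""
    return 1 if p[0] + p[1] > 1 else 0


seed = [((0.0, 0.0), 0), ((1.0, 1.0), 1)]
pool = [(0.1, 0.1), (0.9, 0.9), (0.5, 0.6), (0.4, 0.45), (0.2, 0.0)]
model, newly_labeled = active_learn(seed, pool, toy_oracle, budget=2)
print(newly_labeled)
```

With a budget of two queries, the loop spends both on pool points near the class boundary and leaves the clear-cut points unlabeled, which is the effort-saving behavior the abstract describes. A real system would replace the toy learner and oracle with an actual classifier and a human annotator.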

Item Type: SICS Report
ID Code: 3600
Deposited By: Vicki Carleson
Deposited On: 06 May 2009
Last Modified: 18 Nov 2009 16:24
