ScholarGate

Part-of-Speech Tagging and Sequence Labeling

Assigning a label to each token in a sentence — its part of speech, named-entity type, or chunk tag — using probabilistic sequence models such as hidden Markov models and conditional random fields.

Definition

Sequence labeling is the task of assigning a categorical label to each element of an input sequence, with part-of-speech tagging as its canonical instance.

Scope

Covers sequence-labeling tasks central to shallow analysis: part-of-speech tagging, named-entity recognition, and chunking. It includes the standard models — hidden Markov models, maximum-entropy Markov models, conditional random fields, and neural sequence taggers — and tagsets such as the Penn Treebank and Universal POS. Full parsing is covered in sibling topics.

Core questions

  • How do hidden Markov models assign the most likely tag sequence?
  • Why do conditional random fields outperform locally normalized models?
  • How are tagsets designed and standardized across languages?
  • How does sequence labeling support downstream parsing and extraction?

Key concepts

  • part-of-speech tag
  • hidden Markov model
  • Viterbi algorithm
  • conditional random field
  • named-entity recognition
  • chunking
  • tagset
  • BIO encoding
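
BIO encoding, listed above, turns span annotations (named entities or chunks) into one label per token, so that span tasks can be handled by the same taggers as part-of-speech tagging. The following is a minimal sketch in Python; the function names, the (start, end, label) span format with an exclusive end, and the example sentence are assumptions made for illustration, not part of any particular toolkit.

```python
def spans_to_bio(tokens, spans):
    """Convert non-overlapping (start, end, label) spans, end exclusive, to BIO tags."""
    tags = ["O"] * len(tokens)
    for start, end, label in spans:
        tags[start] = f"B-{label}"          # first token of the span
        for i in range(start + 1, end):
            tags[i] = f"I-{label}"          # remaining tokens inside the span
    return tags


def bio_to_spans(tags):
    """Recover (start, end, label) spans from BIO tags; stray I- tags are ignored."""
    spans = []
    start, label = None, None
    for i, tag in enumerate(list(tags) + ["O"]):   # sentinel closes a final span
        continues = tag.startswith("I-") and label == tag[2:]
        if start is not None and not continues:
            spans.append((start, i, label))
            start, label = None, None
        if tag.startswith("B-"):
            start, label = i, tag[2:]
    return spans


# Hypothetical example sentence and annotation.
TOKENS = ["Ada", "Lovelace", "visited", "London"]
SPANS = [(0, 2, "PER"), (3, 4, "LOC")]
BIO_TAGS = spans_to_bio(TOKENS, SPANS)
```

Here `BIO_TAGS` is `["B-PER", "I-PER", "O", "B-LOC"]`, and `bio_to_spans(BIO_TAGS)` returns the original spans. The B- prefix is what lets two adjacent entities of the same type be told apart.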

Key theories

Hidden Markov model tagging
Modeling a tag sequence as a Markov chain emitting observed words, with the Viterbi algorithm recovering the most probable tag sequence efficiently.
Conditional random fields
Globally normalized discriminative models for sequence labeling that condition on the whole input and avoid the label bias of locally normalized models.
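
To make the hidden Markov model entry concrete, here is a minimal Viterbi sketch in Python for a first-order HMM. The tag inventory and every probability below are invented toy values, and the small probability floor used in place of proper smoothing is an assumption; a real tagger would estimate these parameters from an annotated corpus.

```python
import math


def viterbi(words, tags, start_p, trans_p, emit_p, floor=1e-12):
    """Return the most probable tag sequence for `words` under a first-order HMM."""
    if not words:
        return []

    def lp(p):
        # Log-probability with a floor so unseen events do not yield log(0).
        return math.log(max(p, floor))

    # best[i][t]: log-probability of the best tag path ending in tag t at position i.
    best = [{t: lp(start_p.get(t, 0.0)) + lp(emit_p[t].get(words[0], 0.0))
             for t in tags}]
    back = []  # back[i][t]: best predecessor of tag t at position i + 1
    for word in words[1:]:
        scores, pointers = {}, {}
        for t in tags:
            prev = max(tags, key=lambda s: best[-1][s] + lp(trans_p[s].get(t, 0.0)))
            scores[t] = (best[-1][prev] + lp(trans_p[prev].get(t, 0.0))
                         + lp(emit_p[t].get(word, 0.0)))
            pointers[t] = prev
        best.append(scores)
        back.append(pointers)

    # Follow the back-pointers from the best final tag.
    path = [max(tags, key=lambda t: best[-1][t])]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    return path[::-1]


# Toy parameters (invented for illustration only).
TAGS = ["DET", "PRON", "NOUN", "VERB"]
START_P = {"DET": 0.4, "PRON": 0.4, "NOUN": 0.1, "VERB": 0.1}
TRANS_P = {
    "DET": {"NOUN": 0.9, "VERB": 0.1},
    "PRON": {"VERB": 0.8, "NOUN": 0.2},
    "NOUN": {"VERB": 0.5, "NOUN": 0.3, "DET": 0.2},
    "VERB": {"DET": 0.4, "NOUN": 0.4, "PRON": 0.2},
}
EMIT_P = {
    "DET": {"the": 1.0},
    "PRON": {"i": 1.0},
    "NOUN": {"book": 0.5, "flights": 0.5},
    "VERB": {"book": 1.0},
}
```

With these toy values, `viterbi(["the", "book"], TAGS, START_P, TRANS_P, EMIT_P)` tags "book" as NOUN, while `viterbi(["i", "book", "flights"], TAGS, START_P, TRANS_P, EMIT_P)` tags it as VERB. The dynamic program keeps one best score per tag per position, so the cost grows linearly in sentence length and quadratically in the number of tags rather than exponentially.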

History

POS tagging was an early success of statistical NLP once the Penn Treebank (1993) provided large annotated data. Hidden Markov model taggers gave way to discriminative models, first maximum-entropy taggers in the mid-1990s and then conditional random fields from 2001, which were in turn absorbed into neural sequence labelers in the 2010s.

Debates

Generative versus discriminative sequence models
Whether to model the joint distribution of words and tags (HMMs) or to condition labels directly on the input (CRFs); discriminative models generally win on accuracy when rich features are available.
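
The contrast can be written out for words w_1 ... w_n and tags t_1 ... t_n. The notation below is the usual textbook form and is an assumption of this sketch rather than something defined on this page: t_0 is a start symbol, f_k are feature functions, and lambda_k are their weights.

```latex
% HMM: joint distribution over words and tags
P(w_{1:n}, t_{1:n}) = \prod_{i=1}^{n} P(t_i \mid t_{i-1}) \, P(w_i \mid t_i)

% Linear-chain CRF: conditional distribution with one global normalizer
P(t_{1:n} \mid w_{1:n}) = \frac{1}{Z(w_{1:n})}
  \exp\!\Big( \sum_{i=1}^{n} \sum_{k} \lambda_k \, f_k(t_{i-1}, t_i, w_{1:n}, i) \Big)

Z(w_{1:n}) = \sum_{t'_{1:n}}
  \exp\!\Big( \sum_{i=1}^{n} \sum_{k} \lambda_k \, f_k(t'_{i-1}, t'_i, w_{1:n}, i) \Big)
```

In the HMM each word depends only on its own tag, whereas every CRF feature may inspect the whole input w_{1:n}, which is what makes rich, overlapping features usable. The single normalizer Z(w_{1:n}) sums over all tag sequences, in contrast to the per-position normalization of locally normalized models.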

Key figures

  • Mitchell Marcus
  • John Lafferty
  • Andrew McCallum
  • Fernando Pereira

Seminal works

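  • Marcus, M. P., Santorini, B., and Marcinkiewicz, M. A. (1993). Building a Large Annotated Corpus of English: The Penn Treebank. Computational Linguistics, 19(2).
  • Lafferty, J., McCallum, A., and Pereira, F. (2001). Conditional Random Fields: Probabilistic Models for Segmenting and Labeling Sequence Data. Proceedings of ICML.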

Frequently asked questions

Why is part-of-speech tagging not trivial?
Many words are ambiguous — 'book' can be a noun or a verb — so the correct tag depends on context. Sequence models resolve this by considering surrounding words and tags jointly.
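
A small sketch of why context matters: a baseline that always assigns each word its most frequent tag cannot tag 'book' correctly in both of its uses. The three-sentence corpus, the function names, and the NOUN default for unknown words are all invented for illustration.

```python
from collections import Counter, defaultdict


def train_most_frequent_tag(tagged_sentences):
    """Map each lowercased word to the tag it carries most often in the corpus."""
    counts = defaultdict(Counter)
    for sentence in tagged_sentences:
        for word, tag in sentence:
            counts[word.lower()][tag] += 1
    return {word: tag_counts.most_common(1)[0][0]
            for word, tag_counts in counts.items()}


def tag_with_baseline(words, lexicon, default="NOUN"):
    """Tag each word independently of its neighbours; unknown words get `default`."""
    return [lexicon.get(word.lower(), default) for word in words]


# Toy corpus (invented): 'book' appears twice as a noun and once as a verb.
TOY_CORPUS = [
    [("the", "DET"), ("book", "NOUN"), ("fell", "VERB")],
    [("a", "DET"), ("book", "NOUN"), ("arrived", "VERB")],
    [("they", "PRON"), ("book", "VERB"), ("flights", "NOUN")],
]
LEXICON = train_most_frequent_tag(TOY_CORPUS)
BASELINE_TAGS = tag_with_baseline(["They", "book", "flights"], LEXICON)
```

Because the baseline ignores neighbouring words and tags, `BASELINE_TAGS` labels 'book' as NOUN even after the pronoun 'They', where the verb reading is correct. A sequence model can recover the verb reading because it scores the tag of 'book' together with the tags around it.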
