
Speech and Language Applications

The applied face of computational linguistics: converting between speech and text, extracting structured information from documents, and building systems that answer questions and hold conversations.

Definition

Speech and language applications are end-user systems that perceive, understand, or produce human language, built by composing the methods of computational linguistics.

Scope

Covers the major application areas of speech and language technology: automatic speech recognition, text-to-speech synthesis, information extraction, question answering, and dialogue systems. It presents these as integrative tasks that draw on the field's foundations, parsing, semantics, and learning methods. Component techniques are covered in their respective areas.

Core questions

  • How is spoken language converted to and from text?
  • How is structured information extracted from unstructured documents?
  • How do systems answer natural-language questions and sustain dialogue?
  • How are application systems evaluated for real-world use?

Key concepts

  • automatic speech recognition
  • text-to-speech
  • information extraction
  • named-entity recognition
  • question answering
  • dialogue system
  • acoustic model
  • evaluation

Key theories

Noisy-channel speech recognition
Framing recognition as recovering the most probable word sequence given an acoustic signal by combining an acoustic model and a language model.
Pipeline of language understanding
Applications compose tokenization, parsing, semantics, and retrieval into pipelines or end-to-end models that map user input to useful responses.
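
The noisy-channel framing above can be written as W* = argmax over W of P(A | W) · P(W), where A is the acoustic signal and W a candidate word sequence. The following is a minimal sketch of that decision rule in Python, not a real recognizer: the candidate transcripts, the probability values, and the names decode and lm_weight are all hypothetical, chosen only to show how the two models interact.

```python
import math

# Hypothetical toy scores. A real recognizer would compute these from audio
# and from a trained language model; the numbers here are made up.
ACOUSTIC_LOGPROB = {  # log P(A | W): how well each transcript matches the signal
    "recognize speech": math.log(0.30),
    "wreck a nice beach": math.log(0.35),
}
LM_LOGPROB = {  # log P(W): how plausible each transcript is as language
    "recognize speech": math.log(0.010),
    "wreck a nice beach": math.log(0.0001),
}


def decode(candidates, acoustic, lm, lm_weight=1.0):
    """Return the candidate maximizing log P(A|W) + lm_weight * log P(W)."""
    def score(words):
        return acoustic[words] + lm_weight * lm[words]
    return max(candidates, key=score)


best = decode(list(ACOUSTIC_LOGPROB), ACOUSTIC_LOGPROB, LM_LOGPROB)
print(best)
```

With these toy numbers the acoustic model alone slightly prefers "wreck a nice beach", but the language model outweighs it, so the combined score selects "recognize speech". Setting lm_weight to 0 removes the language model and flips the choice.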

History

Speech recognition drove much of early statistical NLP, with shared corpora such as the Wall Street Journal collection enabling rigorous comparison. Information extraction and question answering grew through evaluation campaigns in the 1990s and 2000s, and dialogue systems became consumer products as neural methods and large language models matured.

Debates

Pipelines versus end-to-end systems
Whether to build applications from modular linguistic components or to train end-to-end neural systems; end-to-end approaches dominate where data is plentiful but offer less interpretability.
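
To make the modular side of this debate concrete, here is a small sketch in Python of a pipeline whose intermediate outputs stay inspectable. It is an illustration under stated assumptions, not a description of any particular system: the Trace record, the stage functions, and the tiny GAZETTEER lookup are all hypothetical stand-ins for real components.

```python
from dataclasses import dataclass, field


@dataclass
class Trace:
    """Carries the input plus every intermediate result, so each stage can be inspected."""
    text: str
    tokens: list = field(default_factory=list)
    entities: list = field(default_factory=list)


# Hypothetical lookup table standing in for a trained named-entity recognizer.
GAZETTEER = {"Paris": "LOCATION", "Jelinek": "PERSON"}


def tokenize(trace):
    # Naive whitespace tokenizer that splits off a trailing question mark.
    trace.tokens = trace.text.replace("?", " ?").split()
    return trace


def tag_entities(trace):
    # Label any token found in the gazetteer.
    trace.entities = [(t, GAZETTEER[t]) for t in trace.tokens if t in GAZETTEER]
    return trace


def run_pipeline(text, stages):
    """Apply each stage in order, passing the shared trace along."""
    trace = Trace(text)
    for stage in stages:
        trace = stage(trace)
    return trace


result = run_pipeline("Where did Jelinek work?", [tokenize, tag_entities])
print(result.tokens)
print(result.entities)
```

Because every stage writes to a named field, an error can be traced to the component that produced it. An end-to-end model replaces these stages with a single learned mapping and gives up that visibility.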

Key figures

  • Daniel Jurafsky
  • James H. Martin
  • Frederick Jelinek
  • Janet Baker

Seminal works

  • paul1992
  • manning1999
  • jurafsky2025

Frequently asked questions

Why group speech and text applications together?
They share the same probabilistic and neural foundations — language models, sequence modeling, and evaluation — so techniques developed for one, such as language modeling in speech recognition, transfer readily to the other.
