ScholarGate

Latent Class Analysis

Latent class analysis explains associations among categorical observed variables by positing an unobserved categorical variable whose classes account for the observed patterns.

Definition

Latent class analysis is a latent variable model in which a categorical latent variable with a small number of classes accounts for the joint distribution of observed categorical indicators, which are assumed independent given class membership.
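
The definition can be written compactly. The notation below is an assumption made for illustration, not taken from the source: Y_1, ..., Y_J are the observed indicators, K is the number of classes, \pi_c is the probability of belonging to class c, and \rho_{j, r | c} is the probability that indicator j takes category r given class c (with R_j categories for indicator j).

```latex
P(Y_1 = y_1, \dots, Y_J = y_J)
  = \sum_{c=1}^{K} \pi_c \prod_{j=1}^{J} \rho_{j,\, y_j \mid c},
\qquad
\sum_{c=1}^{K} \pi_c = 1,
\qquad
\sum_{r=1}^{R_j} \rho_{j,\, r \mid c} = 1 .
```

The product over j is where independence given class membership enters: within a class, the probability of a full response pattern is the product of the per-indicator probabilities.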

Scope

This topic covers the latent class model as a finite mixture for categorical data, the assumption of conditional independence of indicators within a class, estimation of class sizes and item-response probabilities by maximum likelihood through the expectation-maximization algorithm, posterior classification of cases into classes, and selection of the number of classes.

Core questions

  • How can categorical indicators be explained by an unobserved grouping?
  • What does conditional independence within classes imply?
  • How are class probabilities and item-response probabilities estimated?
  • How is the number of latent classes chosen?

Key theories

Conditional independence within classes
Latent class analysis assumes that, given the latent class, the observed categorical indicators are independent, so all of their observed association is attributed to the latent class structure.
Categorical finite mixture
The model is a finite mixture over categorical responses, estimated by maximum likelihood via the expectation-maximization algorithm, yielding class proportions and class-conditional response probabilities.
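
As a rough illustration of the estimation step, the following is a minimal sketch of expectation-maximization for the special case of binary (0/1) indicators. The function names (`fit_lca`, `_e_step`), the random starting values, and the stopping rule are assumptions made for this example, not a reference implementation; dedicated software additionally handles multi-category items, missing data, and multiple random starts.

```python
import numpy as np


def _e_step(X, pi, rho):
    """Return posterior class probabilities and the log-likelihood."""
    # log P(x_i, class c): under conditional independence the
    # class-conditional likelihood is a product over items.
    log_joint = (np.log(pi)[None, :]
                 + X @ np.log(rho).T
                 + (1.0 - X) @ np.log(1.0 - rho).T)
    log_marg = np.logaddexp.reduce(log_joint, axis=1)  # log P(x_i)
    post = np.exp(log_joint - log_marg[:, None])       # P(class c | x_i)
    return post, float(log_marg.sum())


def fit_lca(X, n_classes, n_iter=500, tol=1e-8, seed=0):
    """EM for a latent class model with binary indicators.

    X is an (n_cases, n_items) array of 0/1 responses. Returns class
    proportions pi, item-response probabilities rho with
    rho[c, j] = P(item j = 1 | class c), posteriors, and log-likelihood.
    """
    X = np.asarray(X, dtype=float)
    n_cases, n_items = X.shape
    rng = np.random.default_rng(seed)
    pi = np.full(n_classes, 1.0 / n_classes)
    rho = rng.uniform(0.25, 0.75, size=(n_classes, n_items))  # random start

    post, ll = _e_step(X, pi, rho)
    for _ in range(n_iter):
        # M-step: posterior-weighted proportions and response rates.
        nk = post.sum(axis=0) + 1e-12          # guard against empty classes
        pi = nk / nk.sum()
        rho = np.clip((post.T @ X) / nk[:, None], 1e-6, 1.0 - 1e-6)
        # E-step with the updated parameters.
        post, new_ll = _e_step(X, pi, rho)
        converged = new_ll - ll < tol
        ll = new_ll
        if converged:
            break
    return pi, rho, post, ll
```

Calling `fit_lca(X, 2)` on a 0/1 response matrix returns the estimated class proportions, the class-conditional response probabilities, and one row of posterior class probabilities per case. Because the result can depend on the starting values, it is common to rerun with several values of `seed` and keep the solution with the highest log-likelihood.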

Clinical relevance

Latent class analysis is used to identify unobserved subgroups from categorical survey or diagnostic data, such as symptom profiles or response typologies, and underlies latent transition models for change over time.

History

Latent class analysis originated in Lazarsfeld's mid-twentieth-century work on latent structure in attitude measurement. Goodman later placed it on a maximum-likelihood footing, and it became the categorical counterpart of factor analysis and a standard mixture-based clustering tool for discrete data.

Debates

Choosing the number of classes
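Selection of the number of latent classes relies on information criteria and likelihood-ratio tests. The likelihood-ratio statistic for comparing models with different numbers of classes does not follow its usual chi-square reference distribution, because the smaller model lies on the boundary of the larger model's parameter space. The chosen number can therefore depend on which criterion is used and, because the likelihood can have several local maxima, on the starting values used in estimation.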
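
To make the information-criterion route concrete, here is a small sketch that counts the free parameters of a latent class model and computes AIC and BIC from a fitted log-likelihood. The function names are hypothetical, and the log-likelihood is assumed to come from whatever estimation routine was used.

```python
import math


def lca_free_parameters(n_classes, categories_per_item):
    """Count free parameters of a latent class model.

    (K - 1) class proportions plus, for each class, (R_j - 1) response
    probabilities per item j, where R_j is the number of categories.
    """
    return (n_classes - 1) + n_classes * sum(r - 1 for r in categories_per_item)


def information_criteria(log_lik, n_classes, categories_per_item, n_cases):
    """Return the parameter count, AIC, and BIC for one fitted model."""
    k = lca_free_parameters(n_classes, categories_per_item)
    aic = -2.0 * log_lik + 2.0 * k
    bic = -2.0 * log_lik + k * math.log(n_cases)
    return {"n_params": k, "aic": aic, "bic": bic}
```

In practice one would fit models with 1, 2, 3, ... classes, each from several random starts, compute these criteria for the best solution at each size, and prefer the model with the lowest value. AIC and BIC penalize parameters differently, so they need not agree.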

Key figures

  • Paul Lazarsfeld
  • Leo Goodman

Seminal works

  • bartholomew2011
  • collins2010
  • mclachlan2000

Frequently asked questions

How does latent class analysis relate to clustering?
It is a model-based clustering method for categorical data, where each latent class is a cluster and cases receive posterior probabilities of belonging to each class.
What is the local independence assumption?
It is the assumption that the observed indicators are statistically independent within each latent class, so that any observed association between them is explained entirely by the latent classes.
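
The posterior class probabilities mentioned in the first answer follow from Bayes' rule combined with local independence. The sketch below assumes binary indicators and already-estimated parameters; the function name and the parameter values in the usage note are hypothetical and chosen only for illustration.

```python
import numpy as np


def posterior_class_probabilities(x, pi, rho):
    """P(class c | response pattern x) for binary (0/1) indicators.

    x   : (n_items,) observed 0/1 responses of one case
    pi  : (n_classes,) class proportions
    rho : (n_classes, n_items), rho[c, j] = P(item j = 1 | class c)
    """
    x = np.asarray(x, dtype=float)
    pi = np.asarray(pi, dtype=float)
    rho = np.asarray(rho, dtype=float)
    # Local independence: the likelihood of the whole pattern within a
    # class is the product of the per-item probabilities.
    lik = np.prod(rho ** x * (1.0 - rho) ** (1.0 - x), axis=1)
    joint = pi * lik               # P(x, class c)
    return joint / joint.sum()     # normalize over classes
```

For example, with two equally sized classes, `rho = [[0.9, 0.9], [0.1, 0.1]]`, and the pattern `x = [1, 1]`, the class likelihoods are 0.81 and 0.01, so the posterior probability of the first class is 0.81 / 0.82. Assigning each case to its most probable class (`np.argmax` of the result) gives the usual modal classification.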
