Logistic Discrimination
Logistic discrimination classifies observations by modeling the posterior probability of each class directly as a logistic function of a linear combination of the features.
Definition
Logistic discrimination is a classification approach that models the conditional probability of class membership given the features through a logistic (or softmax) link, fitting the model by maximum likelihood without assuming a distribution for the features.
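For the binary case, the definition can be written out as follows. The notation is introduced here for illustration and is not taken from the source: a class label $y \in \{0, 1\}$, a feature vector $x$, an intercept $\beta_0$, a coefficient vector $\beta$, and $n$ independent observations $(x_i, y_i)$.

```latex
\begin{align}
  P(y = 1 \mid x) &= \frac{1}{1 + \exp\{-(\beta_0 + \beta^\top x)\}}, \\
  \log \frac{P(y = 1 \mid x)}{P(y = 0 \mid x)} &= \beta_0 + \beta^\top x, \\
  \ell(\beta_0, \beta) &= \sum_{i=1}^{n} \Bigl[ y_i (\beta_0 + \beta^\top x_i)
      - \log\bigl(1 + \exp(\beta_0 + \beta^\top x_i)\bigr) \Bigr].
\end{align}
```

The conditional log-likelihood $\ell$ involves only the distribution of $y$ given $x$, which is why no model for the features themselves is required. The decision boundary at probability one half is the hyperplane $\beta_0 + \beta^\top x = 0$.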
Scope
This topic covers binary and multinomial logistic models as discriminative classifiers, maximum-likelihood estimation of their coefficients, the linearity of the resulting log-odds and decision boundary, the contrast with generative discriminant analysis, and the interpretation of coefficients as log-odds effects.
Core questions
- How can class-membership probabilities be modeled directly from features?
- What is the shape of the decision boundary implied by the logistic model?
- How does logistic discrimination differ from Gaussian discriminant analysis?
- How are estimated coefficients interpreted?
Key theories
- Direct modeling of posterior probabilities
- Logistic discrimination specifies the log-odds of class membership as a linear function of the features and estimates it by maximum likelihood, making no assumption about the marginal distribution of the features.
- Generative-discriminative correspondence
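- Under equal-covariance Gaussian classes the posterior log-odds are exactly linear in the features, so logistic regression and linear discriminant analysis posit the same boundary form. They differ in estimation: logistic regression maximizes the conditional likelihood of the class given the features, whereas linear discriminant analysis maximizes the joint likelihood of features and class under the Gaussian assumption.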
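The linearity claim for equal-covariance Gaussian classes can be checked with a short derivation. The symbols are assumptions made for this sketch: two classes with prior probabilities $\pi_0, \pi_1$, class means $\mu_0, \mu_1$, and a shared covariance matrix $\Sigma$, so that $x \mid y = k \sim N(\mu_k, \Sigma)$.

```latex
\begin{align}
  \log \frac{P(y = 1 \mid x)}{P(y = 0 \mid x)}
  &= \log \frac{\pi_1}{\pi_0}
     - \tfrac{1}{2} (x - \mu_1)^\top \Sigma^{-1} (x - \mu_1)
     + \tfrac{1}{2} (x - \mu_0)^\top \Sigma^{-1} (x - \mu_0) \\
  &= \underbrace{\log \frac{\pi_1}{\pi_0}
     - \tfrac{1}{2} (\mu_1 + \mu_0)^\top \Sigma^{-1} (\mu_1 - \mu_0)}_{\text{intercept}}
     + \underbrace{(\mu_1 - \mu_0)^\top \Sigma^{-1}}_{\text{slope}} \, x .
\end{align}
```

The quadratic term $x^\top \Sigma^{-1} x$ cancels only because both classes share $\Sigma$. With unequal covariances the log-odds would be quadratic in $x$.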
Clinical relevance
Logistic discrimination is among the most widely used classifiers in applied research because it yields class probabilities that are well calibrated when the linear log-odds form is adequate, its coefficients are interpretable as log-odds effects, and it does not depend on the features being normally distributed.
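As a minimal sketch of how a coefficient is read as a log-odds effect, the snippet below uses a single feature with made-up values. The numbers `intercept` and `coef` and the helper names are hypothetical and chosen only for illustration.

```python
import math

# Hypothetical fitted values, chosen only for illustration.
intercept = -2.0
coef = 0.7  # change in log-odds per one-unit increase in the feature


def predict_proba(x):
    """Posterior probability of the positive class under the logistic model."""
    return 1.0 / (1.0 + math.exp(-(intercept + coef * x)))


def odds(p):
    """Convert a probability to odds."""
    return p / (1.0 - p)


# exp(coef) is the multiplicative change in the odds per unit of the feature.
odds_ratio = math.exp(coef)

# The same ratio recovered from predicted probabilities one unit apart.
ratio_check = odds(predict_proba(3.0)) / odds(predict_proba(2.0))
```

The two quantities agree for any pair of feature values one unit apart, because the odds ratio is constant on the log-odds scale even though the change in probability is not.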
History
The logistic model for binary outcomes was developed in mid-twentieth-century statistics and adapted to the classification setting as logistic discrimination, providing a discriminative counterpart to the generative discriminant-analysis tradition.
Debates
- Discriminative versus generative estimation
- Logistic discrimination optimizes the conditional likelihood and tends to be more robust to misspecification of the feature distribution, whereas generative discriminant analysis can be statistically more efficient, reaching a given accuracy with fewer observations, when its Gaussian assumptions hold.
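The two estimation routes can be compared on simulated data. The sketch below is illustrative only: the function names, the Newton-Raphson fitting loop, the fixed iteration count, and the simulated equal-covariance Gaussian data are all assumptions made for this example, not a method taken from the source.

```python
import numpy as np


def fit_logistic_newton(X, y, n_iter=25):
    """Maximize the conditional log-likelihood by Newton-Raphson."""
    Xd = np.column_stack([np.ones(len(X)), X])  # prepend intercept column
    beta = np.zeros(Xd.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-Xd @ beta))           # fitted probabilities
        grad = Xd.T @ (y - p)                          # score vector
        hess = Xd.T @ (Xd * (p * (1.0 - p))[:, None])  # observed information
        beta = beta + np.linalg.solve(hess, grad)
    return beta


def fit_lda(X, y):
    """Plug-in linear log-odds under equal-covariance Gaussian classes."""
    X0, X1 = X[y == 0], X[y == 1]
    mu0, mu1 = X0.mean(axis=0), X1.mean(axis=0)
    # Pooled within-class covariance estimate.
    S = ((X0 - mu0).T @ (X0 - mu0) + (X1 - mu1).T @ (X1 - mu1)) / (len(X) - 2)
    w = np.linalg.solve(S, mu1 - mu0)
    b = np.log(len(X1) / len(X0)) - 0.5 * (mu1 + mu0) @ w
    return np.concatenate([[b], w])


# Simulated data that satisfy the Gaussian assumptions (identity covariance).
rng = np.random.default_rng(0)
n = 2000
y = (rng.random(n) < 0.5).astype(float)
means = np.where(y[:, None] == 1, [1.0, 0.5], [0.0, 0.0])
X = means + rng.standard_normal((n, 2))

beta_logit = fit_logistic_newton(X, y)  # [intercept, slope_1, slope_2]
beta_lda = fit_lda(X, y)                # same layout
```

When the Gaussian assumptions hold, as in this simulation, both estimators target the same linear log-odds and their estimates should be close. Replacing the Gaussian features with a skewed or heavy-tailed distribution is one way to explore how the two diverge.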
Key figures
- David Cox
- Geoffrey McLachlan
Related topics
Seminal works
- hastie2009
- mclachlan1992
- johnson2007
Frequently asked questions
- Does logistic discrimination assume the features are normally distributed?
- No. It models the conditional probability of the class given the features and makes no distributional assumption about the features themselves, which is one reason for its robustness.
- How is logistic discrimination extended to more than two classes?
- Through the multinomial (softmax) logistic model, which specifies the log-odds of each class relative to a baseline class as a linear function of the features, so that each class probability is a normalized exponential of its linear combination.
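The multinomial extension in the last answer can be sketched as a small function. The name `multinomial_proba`, the coefficient layout, and the example numbers are hypothetical. The baseline class is taken to be class 0, with its coefficients fixed at zero.

```python
import numpy as np


def multinomial_proba(X, B):
    """Class probabilities under a baseline-category multinomial logistic model.

    X: (n, p) feature matrix.
    B: (K - 1, p + 1) coefficients, one row per non-baseline class, with the
       intercept in the first column. The baseline class has zero coefficients.
    """
    Xd = np.column_stack([np.ones(len(X)), X])
    eta = Xd @ B.T                                   # log-odds versus baseline
    eta = np.column_stack([np.zeros(len(X)), eta])   # baseline log-odds are 0
    eta -= eta.max(axis=1, keepdims=True)            # guard against overflow
    expo = np.exp(eta)
    return expo / expo.sum(axis=1, keepdims=True)    # normalize each row


# Illustrative coefficients for K = 3 classes and p = 2 features.
B = np.array([[0.5, 1.0, -1.0],
              [-0.2, 0.3, 0.8]])
X = np.array([[0.0, 0.0],
              [1.0, 2.0]])
P = multinomial_proba(X, B)  # shape (2, 3); each row sums to 1
```

Subtracting the row maximum before exponentiating does not change the probabilities, since the softmax is invariant to adding a constant to every log-odds in a row. With two classes the function reduces to the binary logistic model.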