Dimension Reduction

Dimension reduction comprises the multivariate methods that summarize many correlated variables with a small number of derived quantities, preserving as much structure as possible while easing interpretation and visualization.

Definition

Dimension reduction is the construction of a lower-dimensional representation of multivariate data that is optimal under a chosen criterion, such as maximal retained variance, minimal reconstruction error, preserved pairwise distances, or maximal inter-set correlation.

Scope

This area covers techniques that map high-dimensional observations into a lower-dimensional space. It includes variance-maximizing linear projections (principal component analysis), latent-factor models for shared covariance (factor analysis), distance-preserving embeddings (multidimensional scaling), and methods that reduce two variable sets jointly by maximizing cross-correlation (canonical correlation analysis). The emphasis is on linear and classical approaches that form the foundation of the discipline; nonlinear manifold learning is treated as an extension.
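As a minimal sketch of the distance-preserving case, classical (Torgerson) multidimensional scaling can be written in a few lines of NumPy. The function names `classical_mds` and `pairwise_distances`, the synthetic data, and the choice of two output dimensions are illustrative assumptions rather than part of the source. The example only checks that points lying on a 2-D plane inside a 5-D space can be embedded in 2-D with their pairwise distances intact.

```python
import numpy as np

def pairwise_distances(X):
    """Euclidean distance between every pair of rows of X."""
    diff = X[:, None, :] - X[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))

def classical_mds(D, k):
    """Embed n points in k dimensions from an n x n Euclidean distance matrix."""
    n = D.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n        # centering matrix
    B = -0.5 * J @ (D ** 2) @ J                # double-centered squared distances
    eigvals, eigvecs = np.linalg.eigh(B)       # eigenvalues in ascending order
    top = np.argsort(eigvals)[::-1][:k]        # indices of the k largest eigenvalues
    lam = np.clip(eigvals[top], 0.0, None)     # guard against tiny negative values
    return eigvecs[:, top] * np.sqrt(lam)      # coordinates, one row per point

# Synthetic data (assumed for illustration): a 2-D point cloud placed in 5-D.
rng = np.random.default_rng(0)
plane = rng.normal(size=(30, 2))
Q, _ = np.linalg.qr(rng.normal(size=(5, 2)))   # 5 x 2 matrix with orthonormal columns
X_high = plane @ Q.T                           # 30 points in 5-D, intrinsically 2-D

D = pairwise_distances(X_high)
Y = classical_mds(D, k=2)                      # 2-D embedding from distances only
D_hat = pairwise_distances(Y)
max_abs_error = np.abs(D - D_hat).max()        # largest distortion of any distance
print(Y.shape, max_abs_error)
```

Because the data here are exactly two-dimensional, the embedding reproduces the distances up to floating-point error; with genuinely higher-dimensional data the same procedure yields an approximation whose quality depends on the discarded eigenvalues.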

Core questions

How can a large set of correlated measurements be replaced by a few uncorrelated derived variables with minimal loss of information?
When should variance preservation, distance preservation, or latent-factor explanation be the reduction criterion?
How many dimensions are needed to adequately represent the data?
How do reduced representations support visualization, denoising, and downstream modeling?
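One common heuristic for the question of how many dimensions to keep is a cumulative explained-variance threshold. The sketch below assumes PCA as the reduction method and a hypothetical 90% threshold; the function name `n_components_for_variance` and the simulated data (three latent sources behind ten observed variables) are illustrative. Other rules, such as scree plots or cross-validation, may suit a given problem better.

```python
import numpy as np

def n_components_for_variance(X, threshold=0.90):
    """Smallest number of principal components whose cumulative share of
    variance reaches `threshold` (an assumed, illustrative cutoff)."""
    Xc = X - X.mean(axis=0)                        # center each variable
    s = np.linalg.svd(Xc, compute_uv=False)        # singular values, descending
    explained = s ** 2 / np.sum(s ** 2)            # variance share per component
    cumulative = np.cumsum(explained)
    # First index whose cumulative share is >= threshold, converted to a count.
    k = int(np.searchsorted(cumulative, threshold)) + 1
    return min(k, len(s)), explained

# Simulated data (assumed): 10 observed variables driven by 3 latent sources.
rng = np.random.default_rng(0)
latent = rng.normal(size=(500, 3))
loadings = rng.normal(size=(3, 10))
X = latent @ loadings + 0.05 * rng.normal(size=(500, 10))

k, explained = n_components_for_variance(X, threshold=0.90)
print(k, np.round(np.cumsum(explained)[:4], 4))
```

With only three latent sources and little noise, the first three components account for nearly all of the variance, so the selected number of components cannot exceed three in this setup.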

Key theories

Variance-maximizing linear projection: The leading principal axes are the orthonormal directions that successively capture maximal variance. They are the eigenvectors of the covariance matrix ordered by decreasing eigenvalue, and projecting the centered data onto the first k of them gives the best rank-k least-squares approximation of the data.
Latent common-factor model: Observed correlations among variables are explained by a smaller number of unobserved common factors plus variable-specific uniqueness, decomposing the covariance structure into shared and unique parts.
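The equivalences stated for the variance-maximizing projection can be checked numerically. The following NumPy sketch, on simulated data with an arbitrarily chosen rank k = 2, compares the eigenvectors of the sample covariance matrix with the right singular vectors of the centered data. It also checks that the component scores are uncorrelated, that the PCA reconstruction error equals the discarded variance, and that this error is no larger than that of a random projection of the same rank. All variable names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5)) @ rng.normal(size=(5, 5))  # correlated columns
Xc = X - X.mean(axis=0)                                  # center each variable
n, p = Xc.shape
k = 2                                                    # assumed target rank

# Route 1: eigen-decomposition of the sample covariance matrix.
S = Xc.T @ Xc / (n - 1)
eigvals, eigvecs = np.linalg.eigh(S)                     # ascending order
order = np.argsort(eigvals)[::-1]                        # reorder to descending
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

# Route 2: singular value decomposition of the centered data.
U, s, Vt = np.linalg.svd(Xc, full_matrices=False)

# Same variances, and same axes up to the sign of each direction.
same_variances = np.allclose(eigvals, s ** 2 / (n - 1))
same_axes = np.allclose(np.abs(eigvecs.T @ Vt.T), np.eye(p), atol=1e-6)

# Scores on the first k axes are uncorrelated, with variances eigvals[:k].
scores = Xc @ eigvecs[:, :k]
score_cov = scores.T @ scores / (n - 1)
uncorrelated = np.allclose(score_cov, np.diag(eigvals[:k]))

# Rank-k reconstruction error equals the variance in the discarded axes.
err_pca = np.sum((Xc - scores @ eigvecs[:, :k].T) ** 2)
discarded = (n - 1) * eigvals[k:].sum()

# A random rank-k orthogonal projection cannot do better in least squares.
Q, _ = np.linalg.qr(rng.normal(size=(p, k)))
err_random = np.sum((Xc - Xc @ Q @ Q.T) ** 2)

print(same_variances, same_axes, uncorrelated, err_pca <= err_random)
```

Working from the SVD of the centered data rather than from the covariance matrix is usually preferred in practice for numerical stability, although both routes give the same axes here.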

Practical relevance

Dimension reduction underpins exploratory data analysis, data visualization, signal denoising, compression, and the preprocessing of features for regression and classification across fields from genomics to econometrics and image analysis.

History

Principal component analysis originated with Pearson's 1901 geometric formulation of lines and planes of closest fit, and Hotelling developed it into the modern variance-based statistical method of principal components in 1933. Factor analysis grew in parallel from psychometrics, and distance-based scaling and canonical correlation followed, consolidating into the unified treatment of dimension reduction found in mid-twentieth-century multivariate texts.

Key figures

Karl Pearson
Harold Hotelling

Seminal works

pearson1901
mardia1979
johnson2007

Frequently asked questions

What is the difference between dimension reduction and variable selection?
Variable selection keeps a subset of the original variables, whereas dimension reduction typically constructs new derived variables (such as components or factors) that are combinations of all the originals.

Is dimension reduction always linear?
No. The classical core methods are linear, but the same goals are pursued by nonlinear manifold-learning and embedding techniques; the linear methods remain foundational and interpretable.