Gene Expression Signatures and Prognostic Markers

A gene expression signature is a defined set of genes whose combined expression pattern carries information about a biological state, such as the likely course of a disease. In oncology, multigene signatures derived from quantitative expression measurements are used as prognostic markers that summarise tumour biology into a risk estimate.

Definition

A gene expression signature is a multigene pattern of expression, measured quantitatively, that is associated with a defined outcome or phenotype; when used to estimate the natural course of disease it serves as a prognostic marker.

Scope

This topic covers how multigene signatures are derived from expression data, the distinction between prognostic and predictive markers, the validation steps required before a signature is clinically credible, and landmark breast-cancer signatures as illustrations of the concept. It addresses signatures as a methodological and evidence topic, not as a source of individual treatment decisions.

Core questions

How is a multigene signature derived and reduced to a risk score?
What is the difference between a prognostic and a predictive marker?
What validation is needed before a signature is trustworthy?
How do signatures relate to the underlying expression-measurement platform?

Key concepts

Multigene signature
Prognostic versus predictive markers
Risk score and classification
Training and independent validation
Overfitting and reproducibility

Mechanisms

Signatures are built by measuring the expression of many genes in samples with known outcomes, then using statistical learning to select a subset of genes, and weights for them, that best separates the outcome groups. The result is a classifier or a continuous risk score. Because such models can fit noise in the training data, validation in independent cohorts is essential. A marker that estimates outcome regardless of therapy (prognostic) is also distinguished from one that predicts benefit from a specific treatment (predictive). The 70-gene breast-cancer signature (van 't Veer et al., 2002) and the 21-gene recurrence score (Paik et al., 2004) are foundational examples of this approach, and the prospective evaluation of the 70-gene signature (Cardoso et al., 2016) illustrates the level of evidence needed to support clinical use.
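The derivation steps described above can be sketched on simulated data. This is a minimal illustration, not the method used by any published signature: the cohort simulator, the choice of a standardised mean difference for gene selection, the median cutoff, and all names (`simulate_cohort`, `derive_signature`, `risk_score`) are assumptions made for the example.

```python
import random
import statistics


def simulate_cohort(n_samples, n_genes, informative, effect, rng):
    """Toy cohort: only genes in `informative` are shifted in poor-outcome samples."""
    rows, labels = [], []
    for i in range(n_samples):
        label = i % 2  # 1 = poor outcome, 0 = good outcome (balanced by construction)
        row = [rng.gauss(effect * label if g in informative else 0.0, 1.0)
               for g in range(n_genes)]
        rows.append(row)
        labels.append(label)
    return rows, labels


def derive_signature(rows, labels, n_select):
    """Select the genes with the largest standardised mean difference; keep it as the weight."""
    scored = []
    for g in range(len(rows[0])):
        poor = [r[g] for r, y in zip(rows, labels) if y == 1]
        good = [r[g] for r, y in zip(rows, labels) if y == 0]
        spread = statistics.pstdev(poor + good) or 1.0
        weight = (statistics.mean(poor) - statistics.mean(good)) / spread
        scored.append((abs(weight), g, weight))
    scored.sort(reverse=True)
    return {g: w for _, g, w in scored[:n_select]}


def risk_score(row, signature):
    """Continuous risk score: weighted sum of the signature genes."""
    return sum(w * row[g] for g, w in signature.items())


def accuracy(rows, labels, signature, cutoff):
    """Fraction of samples whose high/low risk call matches the known outcome."""
    calls = [1 if risk_score(r, signature) > cutoff else 0 for r in rows]
    return sum(c == y for c, y in zip(calls, labels)) / len(labels)


rng = random.Random(0)
informative = set(range(10))  # hypothetical truly outcome-related genes
train_rows, train_labels = simulate_cohort(100, 200, informative, 1.2, rng)
valid_rows, valid_labels = simulate_cohort(200, 200, informative, 1.2, rng)

# Derive genes, weights and cutoff on the training cohort only.
signature = derive_signature(train_rows, train_labels, n_select=10)
cutoff = statistics.median(risk_score(r, signature) for r in train_rows)

# Apply the frozen signature, unchanged, to the independent cohort.
train_acc = accuracy(train_rows, train_labels, signature, cutoff)
valid_acc = accuracy(valid_rows, valid_labels, signature, cutoff)
print(f"training accuracy:   {train_acc:.2f}")
print(f"validation accuracy: {valid_acc:.2f}")
```

The point of the design is that genes, weights and cutoff are all fixed on the training cohort before the validation cohort is touched. Tuning any of them on the validation data would make it part of the training set.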

Clinical relevance

Expression signatures are reported as prognostic information in some cancers, and understanding their derivation and validation is part of appraising such reports. This entry explains the methodology and evidence behind signatures; it does not recommend specific tests, thresholds, or treatments, which depend on validated assays and clinical guidelines applied by clinicians.

Evidence & guidelines

The concept is anchored by the 70-gene signature (van 't Veer et al., 2002) and the 21-gene recurrence score (Paik et al., 2004), with prospective evaluation of a signature-guided strategy reported in a randomised trial (Cardoso et al., 2016). These works also illustrate the progression from discovery to independent and prospective validation.

History

The idea that expression patterns could predict outcome was demonstrated in breast cancer in the early 2000s, first by the 70-gene signature (van 't Veer et al., 2002) and then by the 21-gene recurrence score (Paik et al., 2004). Subsequent prospective trials, such as the evaluation of the 70-gene signature (Cardoso et al., 2016), addressed whether acting on a signature improves decision-making.

Debates

How much validation is enough before a signature is reliable?: Signatures derived by statistical learning risk overfitting and can fail to reproduce in new cohorts, so independent, and ideally prospective, validation is required before a signature is considered credible. How much evidence suffices remains an open question.
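The overfitting risk can be demonstrated with a short simulation. In the sketch below, expression values are pure noise with no relation to outcome, yet a signature selected on a small training cohort appears to separate the outcome groups in that cohort. All sizes, names, and the selection rule are illustrative assumptions, not a description of any published analysis.

```python
import random
import statistics


def noise_cohort(n_samples, n_genes, rng):
    """Expression values and outcome labels that are unrelated by construction."""
    rows = [[rng.gauss(0.0, 1.0) for _ in range(n_genes)] for _ in range(n_samples)]
    labels = [i % 2 for i in range(n_samples)]
    return rows, labels


def fit_signature(rows, labels, n_select):
    """Keep the genes with the largest mean difference between outcome groups."""
    diffs = []
    for g in range(len(rows[0])):
        poor = statistics.mean(r[g] for r, y in zip(rows, labels) if y == 1)
        good = statistics.mean(r[g] for r, y in zip(rows, labels) if y == 0)
        diffs.append((abs(poor - good), g, 1.0 if poor > good else -1.0))
    diffs.sort(reverse=True)
    return {g: sign for _, g, sign in diffs[:n_select]}


def accuracy(rows, labels, signature, cutoff=0.0):
    """Call a sample high risk when its signed sum over signature genes exceeds the cutoff."""
    scores = [sum(s * r[g] for g, s in signature.items()) for r in rows]
    return sum((sc > cutoff) == (y == 1) for sc, y in zip(scores, labels)) / len(labels)


rng = random.Random(1)
train_rows, train_labels = noise_cohort(40, 1000, rng)   # few samples, many genes
valid_rows, valid_labels = noise_cohort(400, 1000, rng)  # independent cohort

signature = fit_signature(train_rows, train_labels, n_select=20)
train_acc = accuracy(train_rows, train_labels, signature)
valid_acc = accuracy(valid_rows, valid_labels, signature)
print(f"apparent accuracy in training cohort:  {train_acc:.2f}")
print(f"accuracy in independent noise cohort:  {valid_acc:.2f}")
```

With many more genes than samples, some genes differ between groups by chance alone, and selecting on them produces an optimistic training result. In the independent cohort the same signature is expected to perform at about chance level.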

Key figures

Laura van 't Veer
Soonmyung Paik
Fatima Cardoso

Seminal works

van 't Veer et al. (2002)
Paik et al. (2004)
Cardoso et al. (2016)

Frequently asked questions

What is the difference between a prognostic and a predictive marker?: A prognostic marker estimates the likely course of disease independent of treatment, while a predictive marker estimates whether a patient is likely to benefit from a particular therapy; a single signature may be one, the other, or both, and this must be established by appropriate study designs.
Why must a gene signature be validated in independent cohorts?: Signatures are derived by fitting models to data, which can capture noise specific to the training set; testing the signature in separate cohorts shows whether its association with outcome is real and reproducible.
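
The prognostic and predictive questions can be separated with simple arithmetic on a randomised comparison. The counts below are invented purely for illustration, and the names (`Arm`, `benefit`) are hypothetical. In practice the predictive question is assessed with a formal test of the treatment-by-marker interaction, which this sketch does not perform.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Arm:
    events: int    # e.g. recurrences during follow-up (made-up numbers)
    patients: int

    @property
    def risk(self) -> float:
        return self.events / self.patients


# Hypothetical randomised cohort, split by marker status and treatment arm.
cohort = {
    ("low", "untreated"): Arm(10, 100),
    ("low", "treated"): Arm(8, 100),
    ("high", "untreated"): Arm(30, 100),
    ("high", "treated"): Arm(15, 100),
}


def benefit(marker: str) -> float:
    """Absolute risk reduction from treatment within one marker group."""
    return cohort[(marker, "untreated")].risk - cohort[(marker, "treated")].risk


# Prognostic question: does the marker separate outcomes without treatment?
prognostic_effect = cohort[("high", "untreated")].risk - cohort[("low", "untreated")].risk

# Predictive question: does the treatment benefit differ by marker status?
interaction = benefit("high") - benefit("low")

print(f"prognostic effect (untreated risk difference): {prognostic_effect:.2f}")
print(f"treatment benefit, marker low:  {benefit('low'):.2f}")
print(f"treatment benefit, marker high: {benefit('high'):.2f}")
print(f"difference in benefit (interaction): {interaction:.2f}")
```

In this invented example the marker is both prognostic (higher untreated risk in the marker-high group) and predictive (larger treatment benefit in that group). A marker could show the first pattern without the second, which is why the two claims need separate evidence.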