Metagenomic and Whole-Genome Pathogen Identification
Metagenomic and whole-genome approaches use high-throughput sequencing to characterise pathogens at genome scale. Metagenomic sequencing reads nucleic acids directly from clinical samples without targeting a specific organism, while whole-genome sequencing reads the complete genome of a cultured isolate, supporting high-resolution identification, typing, and surveillance.
Definition
Metagenomic sequencing is the untargeted sequencing of all nucleic acids in a clinical sample to detect any organism present, whereas whole-genome sequencing is the sequencing of a single organism's complete genome, typically from a cultured isolate, for detailed characterisation.
Scope
The topic covers culture-independent metagenomic next-generation sequencing for unbiased pathogen detection and whole-genome sequencing of isolates for identification, typing, and outbreak investigation. It also notes the analytical, interpretive, and cost considerations these methods raise. It is presented as a laboratory and reference topic without treatment guidance.
Core questions
- What organisms are present in a sample when the cause is unknown or culture has failed?
- What does the complete genome of an isolate reveal about its identity, typing, and resistance?
- How are sequencing reads interpreted to separate true pathogens from background and contamination?
- When do the benefits of genome-scale sequencing justify its cost and complexity?
Key concepts
- Metagenomic next-generation sequencing (mNGS)
- Whole-genome sequencing (WGS)
- Culture-independent (untargeted) detection
- Genomic epidemiology
- Read interpretation, background, and contamination
- Bioinformatic pipelines and reference databases
- Cost-effectiveness of genome-scale methods
Mechanisms
Metagenomic sequencing extracts and sequences nucleic acids directly from a clinical sample, then uses bioinformatic pipelines to assign reads to organisms. In principle this detects bacteria, viruses, fungi, and parasites without a prior hypothesis, including agents that culture poorly, as in the diagnosis of neuroleptospirosis from cerebrospinal fluid (Wilson et al., 2014). Because samples also contain host and environmental nucleic acids, interpretation must distinguish genuine pathogens from background and contamination, a central challenge in clinical use (Miller & Chiu, 2020). Whole-genome sequencing instead reads the full genome of a cultured isolate, providing the highest resolution for identification, typing, and resistance characterisation and underpinning genomic epidemiology of outbreaks (Deng et al., 2016).
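The interpretation step described above can be illustrated with a minimal sketch. It assumes a classifier has already assigned reads to taxa, and that a negative control (for example a no-template control processed alongside the sample) is available for comparison. The function names, the thresholds `min_ratio` and `min_reads`, and all read counts are hypothetical and chosen only for illustration; they are not validated cut-offs or part of any cited pipeline.

```python
def reads_per_million(counts, total_reads):
    """Normalise raw per-taxon read counts to reads per million (RPM)."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    return {taxon: n * 1_000_000 / total_reads for taxon, n in counts.items()}


def flag_candidates(sample_counts, sample_total, control_counts, control_total,
                    min_ratio=10.0, min_reads=3, host_taxa=("Homo sapiens",)):
    """Return (taxon, ratio) pairs whose sample RPM exceeds the control RPM.

    min_ratio and min_reads are illustrative assumptions, not validated values.
    """
    sample_rpm = reads_per_million(sample_counts, sample_total)
    control_rpm = reads_per_million(control_counts, control_total)
    # RPM equivalent of one control read, used as a floor so that taxa
    # absent from the control do not cause a division by zero.
    floor = 1_000_000 / control_total
    flagged = []
    for taxon, rpm in sample_rpm.items():
        if taxon in host_taxa:
            continue  # host reads are background by definition
        if sample_counts[taxon] < min_reads:
            continue  # too few reads to interpret
        ratio = rpm / max(control_rpm.get(taxon, 0.0), floor)
        if ratio >= min_ratio:
            flagged.append((taxon, round(ratio, 1)))
    return sorted(flagged, key=lambda item: item[1], reverse=True)


# Hypothetical read counts for a sample and its negative control.
sample_counts = {
    "Homo sapiens": 1_900_000,
    "Leptospira": 500,
    "Cutibacterium acnes": 40,
    "Ralstonia pickettii": 2,
}
control_counts = {"Cutibacterium acnes": 30, "Ralstonia pickettii": 5}

candidates = flag_candidates(sample_counts, 2_000_000, control_counts, 1_000_000)
print(candidates)  # [('Leptospira', 250.0)]
```

In this toy data the skin commensal is as abundant in the control as in the sample and is therefore treated as background, while the organism absent from the control is flagged. A flagged taxon is only a candidate: deciding whether it is the causative agent still requires clinical interpretation.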
Clinical relevance
Genome-scale sequencing allows laboratories to detect unexpected or unculturable pathogens and to reconstruct outbreaks at high resolution, informing the diagnosis of difficult cases and infection-prevention surveillance. This topic explains how such evidence is generated; it is not a basis for individual diagnostic or treatment decisions.
Epidemiology
Whole-genome sequencing has become a primary tool of genomic epidemiology, enabling fine-grained surveillance and outbreak investigation of bacterial pathogens, including foodborne and healthcare-associated organisms (Deng et al., 2016). Economic evaluations have examined whether such surveillance is cost-effective relative to traditional methods (Price et al., 2023).
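As a rough illustration of how genome-scale resolution supports outbreak investigation, the sketch below counts single-nucleotide differences between isolates and groups those within a chosen distance. It assumes the genomes have already been aligned to a common coordinate system (real workflows first map or assemble reads and call variants). The isolate names, the toy sequences, and the `max_snps` threshold are hypothetical; appropriate thresholds depend on the pathogen and setting.

```python
from itertools import combinations


def snp_distance(seq_a, seq_b):
    """Count differing positions in two aligned sequences, ignoring gaps and N."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    ignore = {"N", "-"}
    return sum(
        1
        for a, b in zip(seq_a.upper(), seq_b.upper())
        if a != b and a not in ignore and b not in ignore
    )


def cluster_isolates(alignment, max_snps):
    """Single-linkage clustering: link isolates that are at most max_snps apart."""
    parent = {name: name for name in alignment}

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in combinations(alignment, 2):
        if snp_distance(alignment[a], alignment[b]) <= max_snps:
            parent[find(a)] = find(b)

    clusters = {}
    for name in alignment:
        clusters.setdefault(find(name), []).append(name)
    return sorted(sorted(members) for members in clusters.values())


# Toy aligned sequences; real genomes are millions of bases long.
alignment = {
    "isolate_A": "ACGTACGTAC",
    "isolate_B": "ACGTACGTAT",
    "isolate_C": "ACGAACGTAT",
    "isolate_D": "TTGAACCTGG",
}

clusters = cluster_isolates(alignment, max_snps=1)
print(clusters)  # [['isolate_A', 'isolate_B', 'isolate_C'], ['isolate_D']]
```

Single linkage is used here because it is simple to state: isolates A and C differ by two positions but are joined through B. Other clustering or phylogenetic approaches may be preferred in practice.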
Evidence & guidelines
Evidence on these methods includes proof-of-concept clinical applications of metagenomic sequencing (Wilson et al., 2014), critical appraisals of its clinical role (Miller & Chiu, 2020), reviews of whole-genome surveillance (Deng et al., 2016), and a systematic review of economic evaluations of such surveillance (Price et al., 2023). Validation and reporting standards for clinical sequencing assays are set by professional and regulatory bodies and are not reproduced here.
History
Genome-scale microbiology became practical as the cost of high-throughput sequencing fell. Whole-genome sequencing of isolates was adopted for surveillance and outbreak investigation (Deng et al., 2016), and untargeted metagenomic sequencing demonstrated its diagnostic potential in cases such as the identification of a hard-to-culture pathogen from cerebrospinal fluid (Wilson et al., 2014), prompting ongoing debate about how and when to deploy it clinically (Miller & Chiu, 2020).
Debates
- Should metagenomic sequencing be used routinely in the clinical laboratory?
- Metagenomic sequencing can detect pathogens that other methods miss, but high cost, interpretive complexity, and the difficulty of separating true signal from background keep its routine clinical role contested.
- Is whole-genome surveillance cost-effective?
- Whole-genome sequencing offers superior resolution for surveillance, but its value relative to cheaper conventional methods depends on setting and pathogen, and economic evidence is still being assembled.
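One common way to summarise a cost-effectiveness comparison is the incremental cost-effectiveness ratio (ICER): the extra cost of a new strategy divided by its extra effect. The source does not state which measures the cited evaluations use, so the sketch below is only an illustration of the general idea. The function name, the costs, and the effect unit (infections averted) are all hypothetical.

```python
def icer(cost_new, cost_old, effect_new, effect_old):
    """Incremental cost-effectiveness ratio: extra cost per extra unit of effect."""
    delta_cost = cost_new - cost_old
    delta_effect = effect_new - effect_old
    if delta_effect == 0:
        raise ValueError("no difference in effect; the ICER is undefined")
    return delta_cost / delta_effect


# Hypothetical annual figures for one hospital: sequencing-based surveillance
# versus conventional typing, with effect measured as infections averted.
ratio = icer(cost_new=150_000, cost_old=100_000, effect_new=30, effect_old=10)
print(ratio)  # 2500.0 (extra cost per additional infection averted)
```

Whether such a ratio counts as cost-effective depends on what a setting is willing to pay per unit of effect, which is one reason conclusions vary by setting and pathogen.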
Related topics
Seminal works
- wilson-2014
- deng-2016
- miller-2020
Frequently asked questions
- How does metagenomic sequencing differ from whole-genome sequencing?
- Metagenomic sequencing reads all nucleic acids in a sample to detect any organism present without targeting one, while whole-genome sequencing reads the complete genome of a single organism, usually a cultured isolate, for detailed characterisation.
- Why is interpreting metagenomic results challenging?
- Clinical samples contain host, environmental, and contaminant nucleic acids alongside any pathogen, so distinguishing a true causative organism from background requires careful bioinformatic and clinical interpretation.