ScholarGate
The library

Explore science by method, field & evidence.

One catalogue of research methods — learn how each one works, when to use it, and what it can’t do.

8,178 methods · 11 fields · 7 method families · 40 languages
Science atlas: Map the structure of science before you use it. Fields · methods · evidence routes.
A content-first reference library for research methods: what each one is, how it works, and where it comes from.

Open data (CC-BY)

Entries are compiled from published sources for reference. Verifying the accuracy and suitability of any information for your own use remains your responsibility.

© 2026 ScholarGate · A research-method reference library

Field: Health & Medicine 716 · Psychology 570 · Business & Finance 410 · Engineering 330 · Life Sciences 263 · Education 261 · Research Practice 248 · Natural Sciences 236 · Social Sciences 185 · Environment & Sustainability 160 · Law 30

Method: Statistics 1,836 · AI & ML 1,661 · Decision Sciences 932 · Research Methods 1,354 · Measurement 1,745 · Causal & Evidence 532 · Research Practice 118
112 methods in Life Sciences · AI & ML
Methods at the intersection of your two filters.
genetics

Admixture Analysis

Admixture analysis is a population genetics method that infers population structure and individual ancestry from multilocus genotype data. Originally developed by Pritchard, Stephens, and Donnelly (2000) and refined by Alexander, Novembre, and Lange (2009), admixture analysis reveals how genetic variation is distribute

3 sources · 2009
genetics

Ancestral State Reconstruction

Ancestral state reconstruction (ASR) is a phylogenetic method that infers the character states (trait values or evolutionary features) of extinct ancestors by analyzing patterns of variation in extant (living) species. Developed by Wayne Maddison and colleagues in the 1990s, ASR uses the phylogenetic tree and observed

3 sources · 1991
genetics

ATAC-seq Analysis

ATAC-seq (Assay for Transposase-Accessible Chromatin using sequencing) is a method for profiling chromatin accessibility genome-wide. Developed by Buenrostro and colleagues in 2013, ATAC-seq uses a hyperactive Tn5 transposase to tag open, accessible chromatin regions, enabling rapid and sensitive identificat

3 sources · 2013
bioinformatics

ChIP-seq Peak Calling

ChIP-seq peak calling is a computational pipeline that identifies genomic regions where a protein of interest — a transcription factor or histone modification — is enriched, based on sequencing reads from chromatin immunoprecipitation experiments. It converts raw sequencing data into a set of high-confidence binding or

2 sources · 2007
genetics

Coalescent Theory

Coalescent theory is a probabilistic framework that traces the genealogical history of DNA sequences backward in time to their most recent common ancestor. Developed by John Kingman in 1982, this method forms the foundation of modern population genetics, enabling researchers to understand demographic events, estimate g

3 sources · 1982
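
The backward-in-time tracing described above can be sketched as a small simulation. This is a minimal illustration, assuming the standard Kingman coalescent for a sample from a constant-size population, with time measured in coalescent units; the function names are hypothetical and not taken from the entry. While k lineages remain, the waiting time to the next merger is exponential with rate k(k-1)/2, and the sum of those waiting times is the time to the most recent common ancestor (TMRCA).

```python
import random

def simulate_tmrca(n_samples, rng):
    """Simulate one genealogy and return its TMRCA in coalescent units."""
    t = 0.0
    k = n_samples
    while k > 1:
        pair_rate = k * (k - 1) / 2      # number of lineage pairs that could merge
        t += rng.expovariate(pair_rate)  # exponential waiting time to next merger
        k -= 1                           # one merger removes one lineage
    return t

def mean_tmrca(n_samples, replicates, seed=0):
    """Average TMRCA over many independent simulated genealogies."""
    rng = random.Random(seed)
    total = sum(simulate_tmrca(n_samples, rng) for _ in range(replicates))
    return total / replicates

if __name__ == "__main__":
    n = 10
    # Under this model the expected TMRCA is 2 * (1 - 1/n).
    print(mean_tmrca(n, 20000), 2 * (1 - 1 / n))
```

The simulated mean should sit close to the theoretical expectation 2(1 - 1/n), which is a quick sanity check on the waiting-time logic.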
bioinformatics

Copy Number Variation Analysis

Copy number variation (CNV) analysis is a genomic pipeline for detecting regions where individuals carry fewer or more copies of a DNA segment than the reference genome. CNVs span kilobases to megabases and are a major class of structural variation implicated in cancer, neurodevelopmental disorders, and population dive

2 sources · 1998
bioinformatics

CRISPR Screen Analysis

CRISPR screen analysis processes data from pooled genetic screens using CRISPR-Cas9 to identify genes required for cell growth, survival, or phenotype in specific conditions. Developed by Zhang, Sanjana, and others, this computational pipeline transforms sequencing readouts of guide RNA abundances into ranked lists of

3 sources · 2013
bioinformatics

Cryo-EM Reconstruction

Cryo-electron microscopy (cryo-EM) determines three-dimensional macromolecular structures at atomic or near-atomic resolution by imaging proteins frozen in vitreous ice. Pioneered by Frank, Henderson, and others, this technique has revolutionized structural biology by enabling visualization of large, non-crystallizable

3 sources · 1975
bioinformatics

De Novo Transcriptome Assembly

De novo transcriptome assembly reconstructs full-length messenger RNA sequences directly from sequencing reads without requiring a reference genome. Pioneered by Regev, Haas, and colleagues, this pipeline enables transcript discovery in non-model organisms and detection of novel isoforms, fusion genes, and splice varia

3 sources · 2011
bioinformatics

Differential ChIP-seq peak calling

Differential ChIP-seq peak calling identifies genomic loci where a protein of interest — typically a transcription factor or histone mark — shows significantly altered binding or occupancy between two or more biological conditions. By combining standard ChIP-seq peak detection with count-based statistical testing, the

2 sources · 2011
bioinformatics

Differential Copy Number Variation Analysis

Differential copy number variation (dCNV) analysis identifies genomic regions where DNA copy numbers differ systematically between two conditions — such as tumor versus normal tissue, case versus control cohorts, or treated versus untreated cells. By combining probe-level read-depth or array-intensity data with statist

2 sources · 2004
bioinformatics

Differential Epigenome-Wide Association Study

A Differential Epigenome-Wide Association Study (Differential EWAS) scans hundreds of thousands of CpG methylation sites across the genome to identify those whose methylation levels differ significantly between two or more comparison groups — such as cases vs. controls, exposed vs. unexposed, or distinct developmental

2 sources · 2009
bioinformatics

Differential eQTL Analysis

Differential eQTL analysis identifies genetic variants — expression quantitative trait loci — whose regulatory effect on gene expression varies systematically across biological conditions such as tissue types, disease states, developmental stages, or treatment groups. By testing for statistical interactions between gen

2 sources · 2007
bioinformatics

Differential Metabolomics Analysis

Differential metabolomics analysis is a computational pipeline that identifies metabolites whose abundance levels differ significantly between two or more biological conditions — such as disease versus control, treated versus untreated, or different developmental stages. By integrating mass spectrometry or NMR data wit

2 sources · 2000
bioinformatics

Differential pathway enrichment analysis

Differential pathway enrichment analysis identifies biological pathways whose enrichment signals differ significantly between two or more experimental conditions — for example, between two diseases, two treatments, or two cell types. Rather than asking which pathways are enriched in one condition, it asks which pathway

2 sources · 2004
bioinformatics

Differential proteomics analysis

Differential proteomics analysis is a quantitative pipeline that identifies proteins whose abundance levels change significantly between two or more biological conditions — such as healthy versus diseased tissue, treated versus untreated cells, or different developmental stages. By combining mass spectrometry-based det

2 sources · 1990
bioinformatics

Differential single-cell RNA-seq analysis

Differential single-cell RNA-seq (scRNA-seq) analysis is a computational pipeline that compares transcriptomic profiles across biological conditions — such as treated versus untreated, disease versus healthy, or time points — at single-cell resolution. It identifies which genes, cell types, and cell states change betwe

2 sources · 2015
bioinformatics

Differential Variant Calling

Differential variant calling is a bioinformatics pipeline that identifies genetic variants — single nucleotide variants (SNVs), small insertions/deletions (indels), and structural variants — that are present in one biological sample or condition but absent (or significantly enriched) in a paired reference sample. The c

2 sources · 2009
bioinformatics

Epigenome-wide association study

An epigenome-wide association study (EWAS) is a hypothesis-free, genome-scale method that systematically tests whether epigenetic marks — predominantly CpG-site DNA methylation — differ between individuals with and without a trait, disease, or exposure. By scanning hundreds of thousands of genomic positions simultaneou

2 sources · 2008
bioinformatics

Epigenome-wide association study in educational research

An epigenome-wide association study (EWAS) applied to educational research scans DNA methylation levels at hundreds of thousands of CpG sites across the genome to identify loci whose methylation is statistically associated with educational attainment, cognitive ability, or related learning outcomes. By linking blood- o

2 sources · 2011
bioinformatics

eQTL Analysis

eQTL analysis identifies genomic loci (variants, typically SNPs) whose genotype statistically associates with variation in the expression level of one or more genes. By jointly profiling DNA-level variation and RNA-level expression in the same individuals, eQTL studies decode the regulatory grammar of the genome — reve

2 sources · 2001
genetics

F-statistics (FST)

F-statistics are a family of measures developed by Sewall Wright to quantify population genetic structure and the degree of genetic differentiation between populations. FST, the most widely used F-statistic, measures the proportion of total genetic variation attributable to differences between populations versus within

3 sources · 1951
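
The "proportion of total variation between populations" reading of FST can be made concrete with a short sketch. It assumes a single biallelic locus, equally weighted subpopulations, and the heterozygosity-based definition FST = (H_T - H_S) / H_T; the function name and the example frequencies are illustrative, not taken from the entry.

```python
def fst_from_allele_freqs(freqs):
    """Heterozygosity-based FST for one biallelic locus.

    freqs: frequency of the same allele in each subpopulation
    (subpopulations are weighted equally, an assumption of this sketch).
    """
    if not freqs:
        raise ValueError("need at least one subpopulation")
    # H_S: mean expected heterozygosity within subpopulations
    h_s = sum(2 * p * (1 - p) for p in freqs) / len(freqs)
    # H_T: expected heterozygosity of the pooled population
    p_bar = sum(freqs) / len(freqs)
    h_t = 2 * p_bar * (1 - p_bar)
    if h_t == 0:
        return 0.0  # allele fixed or absent everywhere: no variation to partition
    return (h_t - h_s) / h_t

if __name__ == "__main__":
    print(fst_from_allele_freqs([0.2, 0.8]))  # strongly differentiated pair
    print(fst_from_allele_freqs([0.5, 0.5]))  # identical frequencies
```

Identical allele frequencies give an FST of zero, and the value rises toward one as subpopulations approach fixation for different alleles.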
genetics

GCTA

GCTA (Genome-wide Complex Trait Analysis) is a computational toolkit for estimating heritability and genetic correlations from genome-wide genotype and phenotype data. Developed by Yang and Visscher in 2011, GCTA uses genome-wide restricted maximum likelihood (GREML) to partition phenotypic variance into components exp

3 sources · 2011
bioinformatics

Gene Set Enrichment Analysis

Gene Set Enrichment Analysis (GSEA) is a computational method that determines whether a predefined set of genes — representing a biological pathway, process, or function — shows statistically significant, coordinated differences between two biological conditions. Unlike simple fold-change filtering, GSEA operates on al

2 sources · 2005
bioinformatics

Genome-wide association study

A genome-wide association study (GWAS) systematically tests hundreds of thousands to millions of single-nucleotide polymorphisms (SNPs) across the human genome for statistical association with a trait or disease. By comparing allele frequencies between cases and controls — or by regressing SNP genotypes on a quantitati

2 sources · 2005
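
The case-versus-control allele frequency comparison mentioned above can be sketched for a single SNP as follows. This is a simplified illustration, assuming a basic allelic chi-square test on a 2x2 table of allele counts with one degree of freedom; the function name and counts are hypothetical, and a real GWAS would also handle covariates, population structure, and multiple-testing correction.

```python
import math

def allelic_chi_square(case_alt, case_ref, ctrl_alt, ctrl_ref):
    """Chi-square test (1 df) on allele counts for one SNP.

    Returns (chi2, p_value). Counts are alleles, not individuals.
    """
    a, b, c, d = case_alt, case_ref, ctrl_alt, ctrl_ref
    n = a + b + c + d
    row_case, row_ctrl = a + b, c + d
    col_alt, col_ref = a + c, b + d
    denom = row_case * row_ctrl * col_alt * col_ref
    if denom == 0:
        raise ValueError("a row or column total is zero; test is undefined")
    chi2 = n * (a * d - b * c) ** 2 / denom
    # Survival function of a chi-square with 1 df, via the complementary error function
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value

if __name__ == "__main__":
    # Illustrative counts: 100 cases and 100 controls, two alleles each
    chi2, p = allelic_chi_square(60, 140, 40, 160)
    print(round(chi2, 3), p)
```

Because a GWAS repeats this kind of test across hundreds of thousands to millions of SNPs, a per-SNP p-value only becomes meaningful against a genome-wide significance threshold.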
bioinformatics

Genome-wide association study in educational research

A genome-wide association study (GWAS) applied to educational research scans millions of single-nucleotide polymorphisms (SNPs) across the human genome to identify genetic variants statistically associated with educational outcomes such as years of schooling, degree attainment, or cognitive test scores. Large consortia

2 sources · 2013
genetics

Hi-C Analysis

Hi-C (high-throughput chromosome conformation capture) is a technique, with associated computational methods, for mapping the 3D architecture of the genome within cells. Developed by Lieberman-Aiden and Dekker in 2009, Hi-C identifies physical interactions between genomic regions that may be distant in linear sequence but spatially

3 sources · 2009
genetics

HKA Test

The Hudson-Kreitman-Aguadé (HKA) test is a statistical method that tests for neutral evolution by comparing levels of within-species polymorphism and between-species divergence at multiple loci. Developed by Hudson, Kreitman, and Aguadé in 1987, this test uses the principle that neutral loci should show expected

3 sources · 1987
bioinformatics

HMMER Profile Search

HMMER profile search identifies distant protein sequence homologs using probabilistic models of protein families, known as profile Hidden Markov Models (HMMs). Developed by Eddy and colleagues, this method captures sequence variation patterns within protein families and detects homologs with far greater sensitivity tha

3 sources · 1994
bioinformatics

Homology Modeling

Homology modeling, also called comparative modeling, predicts the three-dimensional structure of a protein using an experimentally solved structure of a homologous protein as a template. Introduced by Sali and Blundell in 1993, this method exploits the principle that homologous proteins share similar spatial structures

3 sources · 1993
genetics

IBD Mapping

Identity-by-descent (IBD) mapping is a genetic mapping technique that identifies disease loci in consanguineous families or isolated populations by detecting homozygous chromosomal segments shared among affected individuals. Developed by Lander and Botstein in 1987, this method exploits the fact that rare disease allel

3 sources · 1987
genetics

LD Block Analysis

Linkage disequilibrium (LD) block analysis is a genomic method that partitions the human genome into distinct haplotype blocks—regions of limited recombination where variants are in strong statistical association. First systematically described by Gabriel and colleagues in 2002, this approach reveals the underlying str

3 sources · 2002
bioinformatics

Machine learning-assisted ChIP-seq peak calling

Machine learning-assisted ChIP-seq peak calling extends classical statistical peak detection with supervised or unsupervised learning models that distinguish genuine protein-binding sites from background noise. By training on sequence composition, read coverage profiles, and epigenomic features, these methods improve s

2 sources · 2008
bioinformatics

Machine learning-assisted copy number variation analysis

Machine learning-assisted CNV analysis applies supervised, unsupervised, or deep learning algorithms to detect genomic regions that are duplicated or deleted relative to a reference genome. Rather than relying on fixed statistical thresholds, ML models learn discriminative patterns from read-depth signals, allele frequ

2 sources · 2010
bioinformatics

Machine learning-assisted epigenome-wide association study

Machine learning-assisted EWAS integrates conventional epigenome-wide association testing with machine learning models to identify DNA methylation sites associated with a phenotype of interest. By combining the statistical rigour of EWAS with the pattern-recognition power of algorithms such as elastic net, random fores

2 sources · 2010
bioinformatics

Machine learning-assisted expression quantitative trait loci analysis

Machine learning-assisted eQTL analysis integrates supervised learning models — ranging from elastic-net regression to deep neural networks — into the classical eQTL framework to predict and map genetic variants that regulate gene expression. By training predictive models on reference panels (e.g., GTEx), the approach

2 sources · 2015
bioinformatics

Machine learning-assisted gene set enrichment analysis

Machine learning-assisted gene set enrichment analysis (ML-GSEA) extends the classical GSEA framework by incorporating supervised or unsupervised ML models — such as random forests, neural networks, or deep learning architectures — to improve the detection, ranking, and biological interpretation of enriched gene sets f

2 sources · 2005
bioinformatics

Machine learning-assisted genome-wide association study

Machine learning-assisted GWAS integrates classical genome-wide association testing with machine learning models to improve the detection of genetic variants associated with complex traits. Where traditional GWAS tests each single nucleotide polymorphism (SNP) independently using linear or logistic regression, ML-GWAS

2 sources · 2015
bioinformatics

Machine learning-assisted metabolomics analysis

Machine learning-assisted metabolomics analysis is an integrative bioinformatics pipeline that couples untargeted or targeted metabolite profiling — via mass spectrometry or NMR — with supervised and unsupervised ML algorithms to discover biomarkers, classify phenotypes, and model metabolic states. By handling the extr

2 sources · 2000
bioinformatics

Machine learning-assisted microbiome diversity analysis

Machine learning-assisted microbiome diversity analysis integrates classical alpha and beta diversity metrics with supervised or unsupervised ML models to classify host phenotypes, identify discriminant taxa, and uncover community-level signatures from 16S rRNA or shotgun metagenomic data. It extends traditional divers

2 sources · 2011
bioinformatics

Machine learning-assisted pathway enrichment analysis

Machine learning-assisted pathway enrichment analysis integrates classical statistical pathway enrichment methods — such as over-representation analysis or gene set enrichment analysis — with machine learning algorithms to improve sensitivity, handle high-dimensional omics data, and uncover non-linear biological patter

2 sources · 2010
bioinformatics

Machine learning-assisted phylogenetic analysis

Machine learning-assisted phylogenetic analysis integrates supervised, unsupervised, or deep learning models into the evolutionary tree inference workflow to improve speed, accuracy, or scalability beyond what classical maximum-likelihood and Bayesian methods achieve alone. Applications range from substitution model se

2 sources · 2000
bioinformatics

Machine learning-assisted RNA-seq differential expression

Machine learning-assisted RNA-seq differential expression analysis augments classical statistical DE testing (DESeq2, edgeR, limma-voom) with ML models — including neural networks, random forests, and variational autoencoders — to better handle the high dimensionality, zero-inflation, and batch effects inherent in RNA-

2 sources · 2015
bioinformatics

Machine learning-assisted sequence alignment

Machine learning-assisted sequence alignment uses statistical learning models — including deep neural networks and protein language models — to compute biologically meaningful alignments between nucleotide or amino acid sequences. By learning substitution patterns and structural constraints from large training corpora,

2 sources · 2010
bioinformatics

Machine learning-assisted single-cell RNA-seq analysis

Machine learning-assisted single-cell RNA sequencing (scRNA-seq) analysis integrates supervised, unsupervised, and deep generative models into the standard scRNA-seq workflow to handle the unique challenges of single-cell data: extreme sparsity, high dimensionality, technical noise, and batch effects across experiments

2 sources · 2015
bioinformatics

Machine learning-assisted variant calling

Machine learning-assisted variant calling uses statistical learning models — most notably convolutional neural networks — to distinguish genuine genomic variants (SNPs, indels) from sequencing artifacts in aligned short- or long-read data. Unlike heuristic callers that rely on hand-crafted filters, ML-based approaches

2 sources · 2018
genetics

McDonald-Kreitman Test

The McDonald-Kreitman (MK) test is a statistical method for detecting adaptive evolution by comparing the ratio of nonsynonymous to synonymous changes within a species (polymorphism) with the same ratio between species (divergence). Developed by John McDonald and Martin Kreitman in 1991, this test exploits the key insight that neutral mutations accumulate at similar

3 sources · 1991
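
The within-species versus between-species comparison above reduces to a 2x2 table of counts, which can be sketched as follows. This is a minimal illustration using only the standard library: it computes the neutrality index, the derived quantity alpha = 1 - NI (often read as the fraction of adaptive substitutions), and a two-sided Fisher exact p-value. The function names and the counts in the example are illustrative assumptions, not values from the entry.

```python
from math import comb

def neutrality_index(dn, ds, pn, ps):
    """NI = (Pn/Ps) / (Dn/Ds); values below 1 suggest excess nonsynonymous divergence."""
    return (pn / ps) / (dn / ds)

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1, n = a + b, c + d, a + c, a + b + c + d

    def prob(x):
        # Hypergeometric probability of x in the top-left cell, margins fixed
        return comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum every table at least as unlikely as the observed one
    return sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-9))

if __name__ == "__main__":
    # Illustrative counts: fixed differences (Dn, Ds) and polymorphisms (Pn, Ps)
    dn, ds, pn, ps = 7, 17, 2, 42
    ni = neutrality_index(dn, ds, pn, ps)
    print("NI:", ni, "alpha:", 1 - ni)
    print("p:", fisher_exact_two_sided(dn, ds, pn, ps))
```

A small p-value together with NI below 1 points toward more nonsynonymous divergence than neutrality predicts; zero counts in Ps, Dn, or Ds would need separate handling, which this sketch omits.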
bioinformatics

Metabolomics analysis

Metabolomics analysis is the large-scale, systematic measurement of small-molecule metabolites in a biological sample to characterise the metabolome — the complete set of metabolic intermediates and products present under defined conditions. By coupling high-throughput analytical platforms such as mass spectrometry (MS

2 sources · 1998
bioinformatics

Metagenomic Binning

Metagenomic binning partitions assembled contigs from complex microbial communities into distinct genome bins, each representing an individual organism or strain. Pioneered by Banfield and colleagues, this pipeline isolates single-organism genomes (metagenome-assembled genomes or MAGs) from environmental samples withou

3 sources · 2011
bioinformatics

Molecular Docking

Molecular docking predicts the preferred binding orientation and affinity of a ligand (small molecule) within a protein binding pocket. Pioneered by Kuntz and colleagues in 1982, this computational method searches conformational space to find energetically favorable ligand-protein complexes, enabling rapid screening of

3 sources · 1982
bioinformatics

Multi-omics epigenome-wide association study

A multi-omics epigenome-wide association study (multi-omics EWAS) systematically scans the entire epigenome — typically DNA methylation at CpG sites — for associations with a phenotype of interest, then integrates findings across additional omics layers such as transcriptomics, genomics, proteomics, or metabolomics. By

2 sources · 2011
bioinformatics

Multi-omics eQTL analysis

Multi-omics eQTL analysis maps genetic variants (SNPs or structural variants) to molecular phenotypes simultaneously across multiple omics layers — transcriptome, epigenome, proteome, and metabolome — in the same cohort. By linking genotype to gene expression and then tracing those effects through downstream molecular

2 sources · 2010
bioinformatics

Multi-omics gene set enrichment analysis

Multi-omics gene set enrichment analysis (multi-omics GSEA) is a computational pipeline that applies GSEA logic simultaneously across two or more molecular measurement layers — such as transcriptomics, proteomics, and metabolomics — to identify biological pathways or gene sets that are coordinately dysregulated across

2 sources · 2005
bioinformatics

Multi-omics metabolomics analysis

Multi-omics metabolomics analysis integrates metabolite profiling data — derived from mass spectrometry or NMR spectroscopy — with genomic, transcriptomic, and/or proteomic datasets to build a system-level view of biological phenotypes. By anchoring integration on the metabolome, which reflects the downstream functiona

2 sources · 2000
bioinformatics

Multi-omics microbiome diversity analysis

Multi-omics microbiome diversity analysis integrates two or more omic data layers — such as metagenomics, metatranscriptomics, metabolomics, and metaproteomics — to characterise both the composition and functional activity of microbial communities. By linking taxonomic diversity metrics with molecular phenotype data, t

2 sources · 2010
bioinformatics

Multi-omics Pathway Enrichment Analysis

Multi-omics pathway enrichment analysis is a bioinformatics pipeline that integrates molecular data from two or more omics layers — such as transcriptomics, proteomics, metabolomics, and epigenomics — and tests whether the combined signal from those layers converges on specific biological pathways more than expected by

2 sources · 2014
bioinformatics

Multi-omics Phylogenetic Analysis

Multi-omics phylogenetic analysis reconstructs evolutionary relationships among organisms by integrating sequence data from multiple molecular layers — genomes, transcriptomes, and proteomes — rather than relying on a single marker gene. By combining thousands of orthologous loci across omics layers, the approach drama

2 sources · 1990
bioinformatics

Multi-omics proteomics analysis

Multi-omics proteomics analysis integrates protein abundance data from mass spectrometry with at least one additional omics layer — such as genomics, transcriptomics, or metabolomics — to build a systems-level view of biological regulation. Rather than analyzing proteins in isolation, this approach correlates proteomic

2 sources · 2010
bioinformatics

Multi-omics RNA-seq differential expression

Multi-omics RNA-seq differential expression analysis combines transcript-level count data from RNA sequencing with one or more additional omics layers — such as proteomics, metabolomics, epigenomics, or genomic variant data — to identify genes, proteins, or metabolites that differ systematically between biological cond

2 sources · 2010
bioinformatics

Multi-omics single-cell RNA-seq analysis

Multi-omics single-cell RNA-seq analysis integrates two or more molecular layers — such as gene expression (scRNA-seq), chromatin accessibility (scATAC-seq), or surface protein abundance (CITE-seq) — measured simultaneously or co-profiled in the same individual cells. By aligning these modalities in a shared low-dimens

2 sources · 2015