ScholarGate
The library

Explore science by method, field & evidence.

One catalogue of research methods — learn how each one works, when to use it, and what it can’t do.

8,178 methods · 11 fields · 7 method families · 40 languages
Open data (CC-BY)

Entries are compiled from published sources for reference. Verifying the accuracy and suitability of any information for your own use remains your responsibility.

© 2026 ScholarGate · A research-method reference library

Field: Health & Medicine 716 · Psychology 570 · Business & Finance 410 · Engineering 330 · Life Sciences 263 · Education 261 · Research Practice 248 · Natural Sciences 236 · Social Sciences 185 · Environment & Sustainability 160 · Law 30
Method: Statistics 1,836 · AI & ML 1,661 · Decision Sciences 932 · Research Methods 1,354 · Measurement 1,745 · Causal & Evidence 532 · Research Practice 118
261 methods in Education
psychometrics

2PL IRT

The two-parameter logistic item response model, formalised by Frederic Lord (1980), describes the probability that a respondent answers a binary test item correctly as a smooth S-shaped function of the respondent's latent ability. By estimating a separate discrimination parameter for each item alongside a difficulty pa

2 sources · 1980
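
As a rough illustration of the S-shaped response curve this entry describes, the sketch below evaluates a two-parameter logistic probability. The function name `p_correct_2pl` and the argument names `theta` (ability), `a` (discrimination), and `b` (difficulty) are illustrative choices, not taken from the entry, and the plain logistic metric (no 1.7 scaling constant) is an assumption.

```python
import math


def p_correct_2pl(theta: float, a: float, b: float) -> float:
    """Probability of a correct answer under a 2PL model.

    theta: respondent's latent ability (hypothetical name)
    a:     item discrimination (hypothetical name)
    b:     item difficulty (hypothetical name)
    Assumes the plain logistic metric, without a scaling constant.
    """
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))
```

When ability equals difficulty the probability is 0.5, and a larger `a` makes the curve steeper around that point.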
psychometrics

3PL IRT

The three-parameter logistic (3PL) model, introduced by Allan Birnbaum in 1968, is an item response theory model that describes the probability of a correct response to a binary test item as a function of three item-level parameters — difficulty, discrimination, and a lower asymptote representing guessing — and one per

2 sources · 1968
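
To make the role of the lower asymptote concrete, here is a minimal sketch of a three-parameter logistic probability. The names `p_correct_3pl`, `theta`, `a`, `b`, and `c` are hypothetical, and the plain logistic metric is assumed.

```python
import math


def p_correct_3pl(theta: float, a: float, b: float, c: float) -> float:
    """Probability of a correct answer under a 3PL model.

    c is the lower asymptote (guessing); a and b are discrimination
    and difficulty. All names are illustrative assumptions.
    """
    logistic = 1.0 / (1.0 + math.exp(-a * (theta - b)))
    # Even a very low-ability respondent succeeds with probability near c.
    return c + (1.0 - c) * logistic
```

With `c = 0` the expression reduces to the two-parameter form.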
educational psychology

Academic Burnout Scale

The Academic Burnout Scale measures three dimensions of student burnout: emotional exhaustion, cynicism toward studies, and reduced academic efficacy. Developed by Schaufeli and colleagues in 2002, it adapts the Maslach Burnout Inventory framework to the academic context, providing researchers and educators with a vali

2 sources · 2002
educational psychology

Academic Help-Seeking Scale

The Academic Help-Seeking Scale measures students' inclination to seek academic help, their preferred sources of assistance (instructors, peers, tutors), and barriers that inhibit help-seeking (fear of judgment, embarrassment, preference for independence). Developed by Karabenick and colleagues in the 1990s, the AHSS r

2 sources · 1990
educational psychology

Academic Integrity Scale

The Academic Integrity Scale measures students' attitudes, values, and likelihood of engaging in academic dishonesty including cheating, plagiarism, and unauthorized collaboration. Multiple validated versions exist, each assessing different facets of academic integrity such as personal integrity commitment, perceived c

2 sources · 2000
educational psychology

Academic Motivation Scale

The Academic Motivation Scale (AMS) is a 28-item self-report instrument developed by Vallerand et al. (1992) to assess the quality of students' academic motivation. It distinguishes between intrinsic motivation (motivation for knowledge, accomplishment, and stimulation), extrinsic motivation (external regulation, intro

2 sources · 1992
educational psychology

Academic Resilience Scale

The Academic Resilience Scale measures the capacity of students to withstand and recover from academic adversity, including setbacks, failures, and difficult transitions. Developed by Cassidy in 2016, the ARS-30 conceptualizes resilience as a dynamic, multidimensional process involving perseverance, adaptive help-seeki

2 sources · 2016
educational psychology

Academic Self-Efficacy Scale

The Academic Self-Efficacy Scale (ASES) measures students' beliefs about their capability to succeed in academic tasks. Grounded in Bandura's social cognitive theory, the instrument assesses perceived competence in diverse academic domains—understanding lectures, completing assignments, performing on exams, and engagin

2 sources · 1977
psychometrics

Anchor-Based Minimal Important Difference

The anchor-based method for establishing Minimal Clinically Important Difference (MCID) is a technique for determining the smallest change in a patient-reported outcome (PRO) that patients or clinicians perceive as meaningful or important. Pioneered by Guyatt, Jaeschke, and Singer in 1989, this approach anchors changes

3 sources · 1989
education

Angoff Standard Setting

The Angoff method is a test-centered procedure for establishing a passing score (cut score) on an examination. A panel of content experts conceptualizes a 'borderline' or minimally competent examinee and, for each item, estimates the probability that such an examinee would answer it correctly. Summing those probabiliti

2 sources · 1971
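
The summing step in the Angoff entry can be sketched as follows. This is one common way to aggregate a panel, assumed here for illustration: each judge's item probabilities are summed, and the judges' sums are averaged. The function name `angoff_cut_score` and the nested-list layout of `ratings` are hypothetical.

```python
from statistics import mean


def angoff_cut_score(ratings: list[list[float]]) -> float:
    """Return a raw cut score from Angoff-style ratings.

    ratings[j][i] is judge j's estimated probability that a borderline
    examinee answers item i correctly (hypothetical layout).
    """
    if not ratings:
        raise ValueError("at least one judge is required")
    # Each judge's sum is the number of items they expect a borderline
    # examinee to answer correctly; the panel value is the mean of those sums.
    judge_sums = [sum(judge) for judge in ratings]
    return mean(judge_sums)
```

For example, two judges rating two items as `[[0.5, 0.25], [0.75, 0.5]]` give sums of 0.75 and 1.25, so the sketch returns a cut score of 1.0 out of 2 items.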
psychometrics

Bayesian Confirmatory Factor Analysis

Bayesian confirmatory factor analysis tests a pre-specified factor structure using Bayesian inference. Instead of point estimates with p-values, it produces full posterior distributions for loadings, factor correlations, and residual variances, allowing the researcher to incorporate prior knowledge and propagate parame

2 sources · 2007
psychometrics

Bayesian Construct Validity

Bayesian construct validity assessment uses Bayesian confirmatory factor analysis and related Bayesian structural equation models to evaluate whether a scale or test measures the intended latent construct. It yields full posterior distributions for factor loadings, structural coefficients, and model-fit indices rather

2 sources · 1955
psychometrics

Bayesian Convergent Validity

Bayesian convergent validity applies Bayesian statistical inference to assess whether different measures of the same construct converge as theory predicts. Rather than a single-point correlation estimate, it yields a full posterior distribution over the convergent correlation, enabling probability statements about the

2 sources · 2000
psychometrics

Bayesian Cronbach's alpha

Bayesian Cronbach's alpha applies Bayesian inference to estimate the classical internal-consistency coefficient, yielding a full posterior distribution over alpha rather than a single point estimate. This allows researchers to quantify uncertainty with credible intervals and incorporate prior knowledge, making reliabil

2 sources · 2011
psychometrics

Bayesian Differential Item Functioning

Bayesian differential item functioning analysis detects whether a test item behaves differently across demographic or cultural groups — such as males vs. females — after accounting for the underlying ability or trait being measured. It applies Bayesian IRT estimation to obtain posterior distributions of item parameters

2 sources · 1990
psychometrics

Bayesian Discriminant Validity

Bayesian discriminant validity assessment evaluates whether two theoretically distinct latent constructs are empirically separable, using posterior distributions and credible intervals rather than single-point null-hypothesis tests. It is applied within Bayesian confirmatory factor analysis or via the Bayesian heterotr

2 sources · 2020
psychometrics

Bayesian EFA

Bayesian exploratory factor analysis applies a full probabilistic framework to the common factor model. By placing prior distributions over factor loadings and unique variances, it yields posterior distributions rather than point estimates, quantifies uncertainty around every loading, and can treat the number of factor

2 sources · 2004
psychometrics

Bayesian Item Analysis

Bayesian item analysis applies Bayesian inference to estimate item-level statistics — difficulty, discrimination, and distractor effectiveness — by combining observed response data with prior knowledge. It produces full posterior distributions over item parameters rather than single point estimates, providing richer un

2 sources · 1990
education

Bayesian Knowledge Tracing

Bayesian knowledge tracing (BKT) is a model that estimates, after each problem a student attempts, the probability that the student has mastered the underlying skill. Introduced by Corbett and Anderson for intelligent tutoring systems, it is a two-state hidden Markov model: the latent variable is whether the skill is l

2 sources · 1994
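
The per-attempt mastery update in the Bayesian knowledge tracing entry can be sketched as below. The sketch assumes the usual formulation with learn, slip, and guess probabilities and no forgetting; the entry text is truncated before it names its parameters, so `p_transit`, `p_slip`, `p_guess`, and the function name `bkt_update` are illustrative assumptions.

```python
def bkt_update(p_known: float, correct: bool,
               p_transit: float, p_slip: float, p_guess: float) -> float:
    """Return the updated mastery probability after one attempt.

    p_known:   prior probability that the skill is already learned
    p_transit: probability of learning the skill on this opportunity
    p_slip:    probability of an error despite mastery
    p_guess:   probability of a correct answer without mastery
    All parameter names are assumptions made for this sketch.
    """
    if correct:
        evidence_known = p_known * (1.0 - p_slip)
        evidence_unknown = (1.0 - p_known) * p_guess
    else:
        evidence_known = p_known * p_slip
        evidence_unknown = (1.0 - p_known) * (1.0 - p_guess)
    # Bayes' rule: condition the mastery belief on the observed response.
    posterior = evidence_known / (evidence_known + evidence_unknown)
    # Then allow for learning during this opportunity (no forgetting assumed).
    return posterior + (1.0 - posterior) * p_transit
```

Starting from `p_known = 0.5` with `p_transit = 0.1`, `p_slip = 0.1`, and `p_guess = 0.2`, a correct answer raises the estimate and an incorrect answer lowers it.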
psychometrics

Bayesian McDonald's omega

Bayesian McDonald's omega applies Bayesian statistical estimation to the omega reliability coefficient, yielding a full posterior distribution over omega rather than a single point estimate. This provides credible intervals and probabilistic uncertainty quantification for the reliability of a composite or scale score,

2 sources · 1999
psychometrics

Bayesian Measurement Invariance

Bayesian measurement invariance testing evaluates whether a scale's factor loadings and item intercepts are equivalent across groups, using a Bayesian framework that allows parameters to deviate from strict equality by a small, probabilistically specified amount rather than imposing an exact constraint.

2 sources · 2013
psychometrics

Bayesian Scale Development

Bayesian scale development applies Bayesian statistical inference to the construction and evaluation of psychometric scales. Rather than relying on single point estimates of item and person parameters, it produces full posterior distributions that quantify uncertainty, incorporate prior knowledge, and support principle

2 sources · 1990
psychometrics

Bifactor Model

The bifactor measurement model specifies that every indicator loads simultaneously on a single general factor and on one of several specific (group) factors. Formally introduced by Holzinger and Swineford in 1937 and brought into mainstream psychometrics by Reise (2012), it is now the standard tool for evaluating wheth

2 sources · 1937
education

Bookmark Standard Setting

The Bookmark method is an item-response-theory-based standard-setting procedure in which test items are arranged in a booklet ordered from easiest to hardest. Panelists page through this ordered item booklet and place a 'bookmark' at the point separating items a borderline examinee would likely master from those they w

2 sources · 2001
psychometrics

Case-Cohort Design

Case-cohort design is an epidemiological study design developed by Prentice (1986) that efficiently combines features of case-control and cohort studies. Researchers enroll an entire cohort, follow it for outcomes, then measure exposures only on cases and a random subcohort, reducing measurement costs while maintaining

3 sources · 1986
psychometrics

CAT Cronbach's Alpha

Cronbach's alpha applied to computerized adaptive test (CAT) data estimates internal consistency reliability under the special condition that different examinees receive different subsets of items. Because the classic formula assumes every respondent answers the same items, its direct application to CAT data violates c

2 sources · 1984
psychometrics

CAT Generalizability Theory

Generalizability theory (G-theory) applied to computerized adaptive testing (CAT) evaluates the dependability of adaptive test scores by decomposing score variance across measurement facets such as persons, items, and occasions. Unlike classical test theory, G-theory quantifies multiple simultaneous sources of measurem

2 sources · 1972
psychometrics

CAT McDonald's Omega

McDonald's omega adapted for computerized adaptive testing (CAT) quantifies the reliability of ability or trait estimates when different examinees answer different subsets of items. Unlike Cronbach's alpha, omega is grounded in a factor model, making it suitable for the heterogeneous item pools and variable test length

2 sources · 1999
psychometrics

CAT Scale Development

Computerized adaptive test (CAT) scale development is the process of constructing, calibrating, and validating a large item bank such that the assessment algorithm can select items tailored to each examinee's estimated ability or trait level in real time. The result is a measurement instrument that achieves high precis

2 sources · 1970
psychometrics

CAT Test-Retest Reliability

Computerized adaptive test (CAT) test-retest reliability quantifies the consistency of ability estimates obtained when the same examinees complete a CAT on two separate occasions. Because adaptive algorithms tailor each examinee's item set individually, traditional reliability frameworks must be adapted to account for

2 sources · 1970
psychometrics

CAT-DIF

CAT-DIF identifies items in a computerized adaptive test that behave differently across demographic or group subpopulations after controlling for overall ability. Because adaptive algorithms select items non-randomly based on each examinee's estimated proficiency, standard DIF detection methods require adjustment befor

2 sources · 1990
psychometrics

CFA — Scale Validation

Confirmatory factor analysis is a measurement modelling technique that tests whether a hypothesised factor structure — typically derived from theory or an earlier exploratory analysis — fits observed data from a new sample. Developed by Karl Jöreskog in 1969, it became the dominant tool for validating psychological sca

2 sources · 1969
educational psychology

Classroom Environment Scale

The Classroom Environment Scale is a comprehensive instrument measuring the social, emotional, and organizational climate of educational settings. Developed by Moos and Trickett in 1974, the CES assesses students' or teachers' perceptions of classroom relationships, instructional climate, and classroom management. By p

2 sources · 1974
education

Classroom Observation Protocol

A classroom observation protocol is a standardized instrument for measuring teaching by having trained observers rate lessons against defined dimensions of practice. Unlike informal walkthroughs, validated protocols such as the Classroom Assessment Scoring System (CLASS) and the Danielson Framework specify what to look

2 sources · 2009
health education

CLES+T

The CLES+T is a 34-item self-report questionnaire measuring nursing students' perceptions of their clinical learning environment and the quality of supervision received from their clinical preceptor or teacher. Originally developed by Saarikoski and colleagues in 2007 and expanded in 2008 to include a specific teacher

2 sources · 2007
psychometrics

Cognitive Diagnosis Model

Cognitive Diagnosis Models (CDMs) are a family of latent variable models designed to classify examinees according to their mastery of a set of discrete cognitive attributes or skills. The Generalized DINA (G-DINA) framework, introduced by Jimmy de la Torre in 2011, provides a unifying structure that encompasses many sp

1 source · 2011
psychometrics

Cognitive Diagnostic Computerized Adaptive Testing

Cognitive Diagnostic Computerized Adaptive Testing (CD-CAT) combines computerized adaptive testing (CAT) with cognitive diagnostic models (CDMs) to efficiently assess students' specific skill profiles. Rather than producing a single overall ability score, CD-CAT adaptively selects items to quickly identify which skills

3 sources · 2007
education

Cognitive Diagnostic Modeling

Cognitive diagnostic models (CDMs), also called diagnostic classification models, are restricted latent class models that report not a single ability score but a profile of which discrete skills or attributes a student has mastered. Each item is linked to the attributes it requires through a Q-matrix, and the model cla

2 sources · 2010
psychometrics

Computerized Adaptive Test Construct Validity

Construct validity in computerized adaptive testing evaluates whether the latent trait estimates produced by a CAT instrument genuinely measure the intended psychological or educational construct. Because adaptive algorithms select items individually for each examinee, the validity evidence gathered must account for th

2 sources · 1989
psychometrics

Computerized Adaptive Test Content Validity

Content validity in computerized adaptive testing (CAT) ensures that an adaptively administered assessment adequately samples the intended content domain despite delivering only a subset of items to each examinee. It integrates classical content validity methods with CAT-specific item bank design and content balancing

2 sources · 1975
psychometrics

Computerized Adaptive Test Convergent Validity

Convergent validity assessment for computerized adaptive tests (CATs) examines whether the ability or trait estimates produced by an adaptive algorithm correlate substantially with scores from other measures of the same construct. Because each examinee receives a different subset of items in a CAT, demonstrating that t

2 sources · 1989
psychometrics

Computerized Adaptive Test Discriminant Validity

Discriminant validity in computerized adaptive testing (CAT) is the evaluation process confirming that a CAT-administered scale measures its intended construct distinctly from related but conceptually different constructs. Despite the adaptive item-selection mechanism varying each respondent's item set, evidence must b

2 sources · 1959
psychometrics

Computerized Adaptive Test Item Analysis

Computerized adaptive test item analysis evaluates and calibrates items intended for use in adaptive testing environments. Unlike fixed-form analysis, it accounts for the non-random item exposure inherent in adaptive administration, using item response theory to estimate item parameters, information functions, and expo

2 sources · 1970
psychometrics

Computerized Adaptive Test Item Response Theory

Computerized adaptive testing based on item response theory is a sequential measurement procedure in which a computer algorithm selects successive test items tailored to each examinee's estimated ability level. Drawing on IRT to model item characteristics and ability estimation, CAT delivers precise scores with far few

2 sources · 1970
psychometrics

Computerized Adaptive Test Measurement Invariance

Computerized adaptive test measurement invariance evaluates whether a CAT instrument measures the same latent construct with the same psychometric properties across different groups (e.g., gender, language, clinical vs. community) or time points. It combines IRT-based adaptive test frameworks with measurement equivalen

2 sources · 1990
psychometrics

Computerized Adaptive Test Rasch Model

Computerized adaptive testing with the Rasch model selects items in real time based on each examinee's evolving ability estimate, so that every person receives a test precisely calibrated to their proficiency level. The result is a shorter, more efficient measurement instrument that loses none of the precision of a ful

2 sources · 1960
psychometrics

Computerized Adaptive Test Reliability Analysis

CAT reliability analysis quantifies measurement precision in computerized adaptive tests where each examinee receives a unique, individually tailored subset of items. Rather than a single classical coefficient, it uses item response theory to express precision as conditional standard error of measurement at each abilit

2 sources · 1970
psychometrics

Computerized Adaptive Testing

Computerized Adaptive Testing (CAT) is an individualized assessment methodology in which a computer algorithm selects successive test items based on a running estimate of each examinee's latent ability. Grounded in Item Response Theory, CAT dynamically tailors the item sequence so that each question is optimally inform

1 source · 2000
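
One way to picture the item-selection step in the Computerized Adaptive Testing entry is a maximum-information rule. The sketch below assumes a two-parameter logistic item bank stored as `(a, b)` tuples and picks the unused item with the highest Fisher information at the current ability estimate. The function names, the bank layout, and the choice of maximum information as the selection rule are all assumptions for illustration; operational CATs typically add exposure control and content balancing.

```python
import math


def item_information(theta: float, a: float, b: float) -> float:
    """Fisher information of a 2PL item at ability theta (assumed model)."""
    p = 1.0 / (1.0 + math.exp(-a * (theta - b)))
    return a * a * p * (1.0 - p)


def select_next_item(theta: float,
                     item_bank: list[tuple[float, float]],
                     administered: set[int]) -> int:
    """Return the index of the most informative item not yet administered.

    item_bank holds (discrimination, difficulty) pairs; this layout is
    hypothetical. Raises ValueError when every item has been used.
    """
    candidates = [i for i in range(len(item_bank)) if i not in administered]
    if not candidates:
        raise ValueError("item bank exhausted")
    return max(candidates, key=lambda i: item_information(theta, *item_bank[i]))
```

With equally discriminating items, the rule chooses the item whose difficulty lies closest to the current ability estimate.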
education

Concept Mapping Assessment

Concept mapping assessment uses student-generated diagrams of concepts and their relationships to evaluate the structure of knowledge, not just its quantity. A concept map represents ideas as labeled nodes connected by labeled links that form meaningful propositions, often arranged hierarchically with cross-links betwe

2 sources · 1984
education

Conditional Standard Error of Measurement

The conditional standard error of measurement (CSEM) describes how much measurement error a test score carries at each point along the score scale, rather than as a single average. A test typically measures more precisely in some score ranges than others — often best near the middle and worst at the extremes — and the

2 sources · 1980
psychometrics

Confirmatory Factor Analysis

Confirmatory factor analysis tests a researcher-specified factor structure against observed data. Unlike exploratory approaches, the researcher decides in advance which indicators load on which latent factor, and the model is evaluated by how closely the implied covariance matrix reproduces the sample covariance matrix

2 sources · 1969
psychometrics

Confirmatory Factor Analysis for Scales

Confirmatory Factor Analysis (CFA) is a statistical method for testing whether a hypothesized factorial structure fits empirical data. Developed by Karl G. Jöreskog in 1969, CFA is the standard approach for validating psychometric scales by evaluating whether items load onto theoretically specified latent factors as ex

3 sources · 1969
psychometrics

Construct Validity

Construct validity is the degree to which a test or scale actually measures the theoretical construct it is intended to measure. Introduced by Cronbach and Meehl in 1955, it is the central validity concern in psychological and educational measurement, evaluated by accumulating multiple lines of empirical and logical ev

2 sources · 1955
psychometrics

Content Validity

Content validity is evidence that a measurement instrument adequately samples the full domain of the construct it is intended to measure. It is established through systematic expert review and quantified with indices such as Lawshe's Content Validity Ratio (CVR) and Lynn's Content Validity Index (CVI), making it the fo

2 sources · 1975
psychometrics

Content Validity Ratio

The Content Validity Ratio (CVR) is a quantitative method developed by Charles Lawshe in 1975 for evaluating the extent to which items in a measurement instrument are relevant and representative of a target construct. The method aggregates expert panel judgments into a single validity coefficient for each item, enablin

3 sources · 1975
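
The aggregation of expert judgments in the Content Validity Ratio entry can be sketched with Lawshe's formula, CVR = (n_e - N/2) / (N/2), where n_e is the number of panelists rating an item "essential" and N is the panel size. The function and argument names below are illustrative, not taken from the entry.

```python
def content_validity_ratio(n_essential: int, n_panelists: int) -> float:
    """Lawshe's CVR for a single item.

    n_essential: panelists who rated the item 'essential'
    n_panelists: total panelists who rated the item
    Argument names are hypothetical.
    """
    if n_panelists <= 0:
        raise ValueError("n_panelists must be positive")
    if not 0 <= n_essential <= n_panelists:
        raise ValueError("n_essential must lie between 0 and n_panelists")
    half = n_panelists / 2
    return (n_essential - half) / half
```

The value ranges from -1 (no panelist rates the item essential) through 0 (exactly half do) to +1 (all do).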
psychometrics

Convergent Validity

Convergent validity is the degree to which multiple indicators that are theoretically expected to measure the same construct actually correlate with one another. It is one of the two complementary forms of construct validity identified by Campbell and Fiske (1959) and is now routinely assessed via factor loadings and t

2 sources · 1959
educational psychology

Course Experience Questionnaire

The Course Experience Questionnaire (CEQ) is an institutional assessment tool measuring students' perceptions of their learning environment and educational experience in a course. Developed by Wilson, Lizzio, and Ramsden (1997), it assesses dimensions including good teaching, clear goals, appropriate workload, appropri

2 sources · 1997
educational psychology

Critical Thinking Dispositions Scale

The Critical Thinking Dispositions Scale (CTDS), exemplified by the California Critical Thinking Disposition Inventory (CCTDI), measures the extent to which individuals exhibit cognitive dispositions conducive to critical thinking. Developed by Facione (1992), it assesses dimensions including truth-seeking, open-minded

2 sources · 1992
education

Cross-Classified Multilevel Models in Education

Cross-classified multilevel models extend hierarchical linear modeling to situations where units belong to two or more groupings that do not nest neatly inside one another. In education, students are often classified by both school and neighborhood, or by primary and secondary school across time — classifications that

2 sources · 1993
health education

CTQS

The CTQS is a self-report questionnaire measuring students' perceptions of their clinical educator's (preceptor, clinical instructor, or mentor) teaching quality and effectiveness. Developed by Ohrling, Hallberg, and Gaberson in the early 2000s, the CTQS evaluates dimensions of clinical teaching including role modeling

2 sources · 2001