Grey Literature and Searching Beyond Databases
Unpublished and hard-to-find sources
A substantial portion of scientific evidence resides outside indexed journals. Theses, conference papers, government and NGO reports, working papers, preprints, and datasets together make up grey literature. Relying solely on bibliographic databases risks publication bias; a thorough search must also draw on institutional repositories, trial registries, and backward and forward citation tracking to capture the full picture.
What Is Grey Literature?
Grey literature encompasses any document produced and distributed outside the control of commercial publishers. Doctoral and master's theses, conference proceedings, government policy reports, NGO reports, technical documents, working papers, preprints, and research datasets all fall into this category. Although less visible than peer-reviewed journal articles, these sources can contain original and current findings, and they are especially valuable as primary data sources in applied policy research.
How to Search and Use It
An effective grey literature search has several layers. Thesis databases such as ProQuest Dissertations, DART-Europe, and national thesis repositories provide access to dissertations. Official websites of government ministries and international organizations (WHO, World Bank, OECD) host policy reports. Trial registries such as ClinicalTrials.gov make completed but unpublished studies visible, and PROSPERO, a register of systematic review protocols, shows reviews that are planned or under way. arXiv, SSRN, and OSF Preprints host preprints. In addition, key journals should be hand-searched issue by issue, and citation snowballing should trace references backward and citing works forward from each relevant study.
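Citation snowballing can be pictured as a breadth-first walk over a citation graph. The sketch below is illustrative only: the `snowball` function, the `REFERENCES` map, and every study ID in it are hypothetical, and in practice the links would come from reference lists and a citation index rather than a hand-written dictionary.

```python
from collections import deque


def snowball(seeds, references, max_rounds=2):
    """Breadth-first backward and forward citation snowballing.

    `references` maps a study ID to the IDs it cites (backward links).
    Forward links ("cited by") are derived by inverting that map.
    Returns the newly discovered study IDs, excluding the seeds.
    """
    # Invert the reference map to obtain forward (citing) links.
    cited_by = {}
    for study, refs in references.items():
        for ref in refs:
            cited_by.setdefault(ref, set()).add(study)

    found = set(seeds)
    queue = deque((seed, 0) for seed in seeds)
    while queue:
        study, depth = queue.popleft()
        if depth == max_rounds:
            continue  # do not expand beyond the chosen number of rounds
        neighbours = set(references.get(study, ())) | cited_by.get(study, set())
        for neighbour in sorted(neighbours):
            if neighbour not in found:
                found.add(neighbour)
                queue.append((neighbour, depth + 1))
    return found - set(seeds)


# Hypothetical toy data: each key cites the studies in its list.
REFERENCES = {
    "seed_2020": ["thesis_2015", "report_2017"],
    "preprint_2022": ["seed_2020"],
    "thesis_2015": ["report_2010"],
    "unrelated_2019": ["other_2001"],
}

one_round = snowball(["seed_2020"], REFERENCES, max_rounds=1)
two_rounds = snowball(["seed_2020"], REFERENCES, max_rounds=2)
```

One round finds the two works the seed cites plus the preprint that cites it; a second round also reaches the older report cited by the thesis. Every study found this way still has to pass the same screening as database hits.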
A Concrete Example
A researcher wants to conduct a meta-analysis on the effectiveness of school-based nutrition interventions. Relying only on PubMed and Web of Science searches, she misses a large share of small-sample studies with null or negative findings, since those studies often go unpublished. By additionally searching ProQuest for theses, WHO and UNICEF reports, the conference proceedings of a school health association, and ClinicalTrials.gov registrations, she retrieves 12 additional studies. Including them substantially shifts the effect-size estimate and improves the generalizability of the findings.
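To see why adding null-result studies moves the pooled estimate, consider a minimal sketch using fixed-effect inverse-variance weighting. This is a simplification chosen for brevity (applied meta-analyses often use a random-effects model), and every number below is hypothetical rather than taken from the example above.

```python
import math


def pooled_effect(studies):
    """Fixed-effect inverse-variance pooled estimate and its standard error.

    `studies` is a list of (effect_size, standard_error) pairs.
    Each study is weighted by 1 / standard_error**2.
    """
    weights = [1.0 / se ** 2 for _, se in studies]
    total_weight = sum(weights)
    estimate = sum(w * effect for w, (effect, _) in zip(weights, studies)) / total_weight
    standard_error = math.sqrt(1.0 / total_weight)
    return estimate, standard_error


# Hypothetical data: (effect size, standard error).
published = [(0.40, 0.10), (0.50, 0.10)]  # larger studies found in databases
grey = [(0.00, 0.20), (0.10, 0.20)]       # smaller unpublished studies

before, _ = pooled_effect(published)
after, _ = pooled_effect(published + grey)
print(f"database-only estimate: {before:.2f}")
print(f"with grey literature:   {after:.2f}")
```

With these made-up inputs the pooled estimate drops from about 0.45 to about 0.37. The smaller studies carry less weight because of their larger standard errors, yet they still pull the estimate toward zero.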
Common Pitfalls and Best Practice
The most common mistake is ignoring grey literature entirely on the assumption that doing so does not affect research quality; in fact it directly feeds publication bias, the overrepresentation of positive results. At the same time, many grey sources have not undergone formal peer review, so quality appraisal becomes the researcher's responsibility. Best practice requires reporting the search strategy transparently in line with a reporting guideline such as PRISMA-S, recording access dates, and applying systematic inclusion and exclusion criteria to grey sources just as to journal articles.
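One way to keep a search auditable is to log every search with its access date and to run every candidate study through the same screening function. The sketch below is a hypothetical structure, not the PRISMA-S checklist itself: the field names, the example criteria (`min_year`, `reports_outcome`), and all entries are assumptions made for illustration.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class SearchRecord:
    source: str       # database, registry, or website that was searched
    source_type: str  # "database" or "grey"
    query: str        # the search string exactly as it was run
    accessed: date    # access date, recorded for reproducibility
    hits: int         # number of records retrieved


@dataclass(frozen=True)
class Candidate:
    title: str
    source_type: str  # "database" or "grey"
    year: int
    reports_outcome: bool  # does the study report the outcome of interest?


def include(candidate, min_year=2010):
    """Apply identical criteria to every candidate.

    `source_type` is deliberately not consulted, so grey sources are
    screened exactly like journal articles.
    """
    return candidate.year >= min_year and candidate.reports_outcome


# Hypothetical log entries and candidates.
search_log = [
    SearchRecord("PubMed", "database", "school nutrition intervention", date(2024, 1, 15), 120),
    SearchRecord("ClinicalTrials.gov", "grey", "school nutrition intervention", date(2024, 1, 16), 37),
]

candidates = [
    Candidate("Journal trial", "database", 2018, True),
    Candidate("Doctoral thesis", "grey", 2016, True),
    Candidate("Agency report", "grey", 2008, True),
    Candidate("Conference abstract", "grey", 2019, False),
]

included = [c.title for c in candidates if include(c)]
```

Here the thesis is included alongside the journal trial, while the older report and the abstract without outcome data are excluded on the same grounds that would exclude a journal article.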
Key terms
- Grey literature: Documents such as theses, reports, and preprints produced outside commercial publishing channels.
- Publication bias: The tendency for studies with positive results to be published at higher rates than null or negative findings.
- Citation snowballing: Discovering new studies by tracing a work's references backward and works citing it forward.
- Trial registry: An official platform where studies are registered before they begin and outcomes are tracked.
- Hand-searching: Manually reviewing journal issues one by one rather than relying on automated database searches.