Literature Search Strategies
Searching databases systematically
A good literature search is systematic and documented. It involves translating a research question into keywords and controlled vocabulary, selecting appropriate databases such as PubMed, Scopus, Web of Science, and Google Scholar, and combining terms using Boolean operators, truncation, and field tags. Recording the exact search strings, databases consulted, search dates, and result counts makes the process transparent and reproducible — a requirement for any rigorous literature review or systematic review.
What Is a Literature Search?
A literature search is the systematic process of identifying published studies that can help answer a specific research question. It differs from casual browsing or typing terms into a general search engine; it requires pre-defined keywords, specific databases, and a reproducible procedure. A systematic approach ensures that the researcher does not miss important evidence, that the search is free from undue bias, and that the process can be verified by others. While valuable for all types of literature review, it is an absolute requirement for systematic reviews and meta-analyses.
How It Works: Core Steps
The first step is structuring the research question with a framework such as PICO (Population, Intervention, Comparison, Outcome), then generating a list of keywords and synonyms for each component. Next, appropriate databases are selected: PubMed/MEDLINE for medicine, PsycINFO or Scopus for social sciences, IEEE Xplore for engineering, and so on. The search string is built with Boolean operators: OR combines synonyms within a concept, AND requires every concept to be present, and NOT excludes off-topic content. Truncation (for example random*, which matches random, randomly, and randomized) captures different word endings. Controlled-vocabulary subject headings such as MeSH in PubMed, and field tags such as PubMed's [tiab] for title and abstract searching, increase precision. Every search must be logged with its date and result count.
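The way OR and AND fit together can be sketched in a few lines of Python. This is only an illustration: the helper names `quote_term` and `build_query` are hypothetical, the concept lists are borrowed from the exercise and depression example in this guide, and quoting multi-word phrases is an assumption about the target database's syntax, which should be checked against that database's help pages.

```python
def quote_term(term: str) -> str:
    """Wrap a multi-word term in double quotes so it is searched as a phrase."""
    term = term.strip()
    if " " in term and not term.startswith('"'):
        return f'"{term}"'
    return term


def build_query(concepts: list[list[str]]) -> str:
    """OR the synonyms inside each concept, then AND the concepts together."""
    groups = []
    for synonyms in concepts:
        terms = [quote_term(t) for t in synonyms if t.strip()]
        if terms:  # skip concepts that have no usable terms
            groups.append("(" + " OR ".join(terms) + ")")
    return " AND ".join(groups)


# Illustrative concept lists: one inner list per concept.
concepts = [
    ["aerobic exercise", "physical activity", "running"],
    ["depression", "depressive disorder"],
    ["adult*"],  # truncation symbol passed through unchanged
]

query = build_query(concepts)
print(query)
```

Keeping one list per concept makes it easy to add or remove a synonym and regenerate the string, so the logged query always matches the one that was actually run.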
A Concrete Example
Suppose the research question is: Does aerobic exercise reduce depression in adults? Using the PICO framework, P = adults, I = aerobic exercise, C = control group, O = depression. A search string might read: ("aerobic exercise" OR "physical activity" OR running) AND (depression OR "depressive disorder" OR "major depressive disorder") AND (adult* OR grown-up). Multi-word terms are quoted so that they are searched as phrases, and the comparison component (C) is left out of the string, which is common practice. The string is run in PubMed twice, once with MeSH terms ("Depression"[MeSH], "Exercise"[MeSH]) and once with free-text terms, and the two result sets are combined. Notes such as "search conducted 15 March 2025, 847 records retrieved" are recorded and reported in the methods section or a PRISMA flow diagram.
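One possible way to keep such notes is a small structured log, sketched below. The `SearchRecord` class, its field names, and the CSV layout are assumptions made for illustration, not a reporting standard; the values come from the example above, with multi-word phrases quoted.

```python
import csv
import io
from dataclasses import asdict, dataclass, fields
from datetime import date


@dataclass(frozen=True)
class SearchRecord:
    """One row of a search log (hypothetical field names)."""
    database: str
    search_date: date
    search_string: str
    records_retrieved: int


def write_log(records, handle) -> None:
    """Write search records as CSV rows, preceded by a header line."""
    writer = csv.DictWriter(handle, fieldnames=[f.name for f in fields(SearchRecord)])
    writer.writeheader()
    for record in records:
        row = asdict(record)
        row["search_date"] = record.search_date.isoformat()  # e.g. 2025-03-15
        writer.writerow(row)


record = SearchRecord(
    database="PubMed",
    search_date=date(2025, 3, 15),
    search_string=(
        '("aerobic exercise" OR "physical activity" OR running) AND '
        '(depression OR "depressive disorder" OR "major depressive disorder") AND '
        "(adult* OR grown-up)"
    ),
    records_retrieved=847,
)

buffer = io.StringIO()  # an in-memory stand-in for a real log file
write_log([record], buffer)
log_text = buffer.getvalue()
print(log_text)
```

Writing to a file opened with `open("search_log.csv", "w", newline="")` instead of the in-memory buffer would produce a log that can be attached to the review as a supplementary file.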
Common Pitfalls and Good Practice
Common mistakes include searching only a single database, restricting the search to English-language publications (language bias), failing to record search strings, and not recording the search date, which leaves the gap between running the search and accessing the articles undocumented. Poor keyword choices can produce too many irrelevant results (low precision) or miss key studies (low sensitivity). Good practice is to use database-specific thesauri and controlled vocabulary, supplement database searches with hand-searching and citation tracking, collaborate with a librarian on complex topics, and make the full search transparent by registering the protocol, including the search strategy, in PROSPERO or on OSF, or by including the strategy as a supplementary file in the published article.
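Precision and sensitivity can be made concrete with a small calculation. The sketch below uses invented record identifiers and assumes the set of relevant studies is known; in practice it is not, so a benchmark set of already-known key papers is often used as a stand-in.

```python
def precision(retrieved: set[str], relevant: set[str]) -> float:
    """Share of retrieved records that are relevant."""
    if not retrieved:
        return 0.0
    return len(retrieved & relevant) / len(retrieved)


def sensitivity(retrieved: set[str], relevant: set[str]) -> float:
    """Share of relevant records that the search retrieved (also called recall)."""
    if not relevant:
        return 0.0
    return len(retrieved & relevant) / len(relevant)


# Hypothetical identifiers: the search returned 8 records; 4 studies are relevant.
retrieved = {"r1", "r2", "r3", "r4", "r5", "r6", "r7", "r8"}
relevant = {"r1", "r2", "r9", "r10"}

print(f"precision:   {precision(retrieved, relevant):.2f}")    # 2 of 8 retrieved are relevant
print(f"sensitivity: {sensitivity(retrieved, relevant):.2f}")  # 2 of 4 relevant were found
```

Here the search finds only half of the relevant studies, and three quarters of what it returns is off-topic. Adding synonyms joined with OR tends to raise sensitivity at the cost of precision, while adding concepts joined with AND tends to do the opposite.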
Key terms
- Boolean Operators: Logical connectors (AND, OR, NOT) used to combine or exclude search terms.
- Controlled Vocabulary: Standardized database-specific terms; MeSH headings in PubMed are a prime example.
- Truncation: Adding * to a word stem to retrieve multiple endings in a single search.
- Search String: The exact query entered into a database, comprising keywords and operators.
- Reproducibility: The ability of another researcher to replicate the search and obtain the same results.