Systematic Literature Review: Steps & Examples - ResearchProspect
Published on June 17, 2026. Revised on June 17, 2026.

A systematic literature review is a rigorous, protocol-driven and reproducible method for answering one focused research question by identifying, appraising and synthesising all relevant studies according to pre-specified, transparent rules. Unlike an ordinary literature review, a systematic literature review (often shortened to “systematic review”) follows a written protocol, uses an exhaustive and documented search strategy, applies explicit inclusion and exclusion criteria, and appraises the quality of every included study, so that another researcher could repeat your process and reach the same body of evidence.

Use a systematic literature review when you need a trustworthy, defensible answer to a precise question — for example, “Does cognitive behavioural therapy reduce exam anxiety in undergraduates?” — rather than a broad overview of a topic. Its strength is that it minimises bias and cherry-picking; its cost is time, structure and discipline.

What is a systematic literature review?

A systematic literature review is a secondary research method: your “data” are the published (and sometimes unpublished) studies that already exist. What makes it systematic — rather than just a long reading list — is that every decision is planned in advance, documented, and reproducible. You decide before you start which databases you will search, which search terms you will use, which studies qualify, and how you will judge their quality. You then report exactly how many records you found, screened, excluded and included, so the reader can audit your reasoning.

This rigour is what separates it from a traditional literature review, where the author selects sources at their own discretion and the search process is rarely documented. A systematic review treats the search-and-selection process itself as data to be reported. Because it draws exclusively on existing studies, it is a flagship example of secondary research — no new participants are recruited, yet a well-conducted review can sit at the very top of the evidence hierarchy.

“A systematic review attempts to collate all empirical evidence that fits pre-specified eligibility criteria in order to answer a specific research question. It uses explicit, systematic methods that are selected with a view to minimizing bias.” (Source: Higgins & Green, Cochrane Handbook)

When should you use a systematic review?

A systematic literature review is the right method when:

  • Your question is focused and answerable — ideally a question about effectiveness, prevalence, association or experience, not a sprawling topic.
  • A meaningful body of primary studies already exists, so there is something to synthesise.
  • You need a defensible, low-bias answer — for a dissertation chapter, a clinical guideline, a policy decision or a journal article.
  • Decisions or further research will rest on the totality of the evidence, not on one or two convenient studies.

It is the wrong choice when the literature is too sparse to synthesise, when your question is genuinely exploratory (a scoping review may fit better), or when you have time only for a quick orientation (a narrative review will do).

Systematic review vs narrative review, scoping review and meta-analysis

These four terms are routinely confused. They differ in their purpose, their method and how reproducible they are. The table below is the clearest way to keep them apart.

| Feature | Narrative review | Scoping review | Systematic review | Meta-analysis |
|---|---|---|---|---|
| Question | Broad topic overview | “What evidence exists?” / map the field | One focused, pre-specified question | One focused question, quantified |
| Protocol | None | Recommended | Required (e.g. PROSPERO) | Required |
| Search | Selective, undocumented | Systematic, documented | Exhaustive, documented | Exhaustive, documented |
| Quality appraisal | Rarely | Optional | Mandatory (risk of bias) | Mandatory |
| Synthesis | Narrative, author’s judgement | Descriptive map / themes | Narrative or statistical | Statistical pooling (effect size) |
| Reproducible? | No | Largely | Yes | Yes |

Note the key relationship: a meta-analysis is the statistical synthesis step that can sit inside a systematic review when the included studies are similar enough to pool numerically. A systematic review that pools results becomes a systematic review with meta-analysis; one that synthesises in words is a systematic review with narrative synthesis. The two are not alternatives at the same level — see our dedicated guide to meta-analysis for the pooling mathematics.

The systematic review process: step by step

A systematic review is procedural, and the order matters. Follow these eight steps.

  1. Define a focused question (PICO / PICOS). Frame your question with a structured tool so it is precise and searchable. For effectiveness questions, PICO works well: Population, Intervention, Comparison, Outcome. Adding S for Study design gives PICOS. For example: in undergraduates (P), does mindfulness training (I) compared with no intervention (C) reduce exam anxiety (O), in randomised or controlled studies (S)? A vague question dooms the whole review, so this step deserves real care — it also defines your research aims and objectives.
  2. Write a protocol and register it (PRISMA / PROSPERO). Before you search, write a protocol that states your question, search strategy, eligibility criteria, appraisal method and synthesis plan. Registering it — for health and social-care topics, on PROSPERO — time-stamps your intentions and guards against changing the rules after seeing the results. Plan to report against the PRISMA (Preferred Reporting Items for Systematic Reviews and Meta-Analyses) checklist from the outset.
  3. Design the search strategy. Choose your databases (e.g. MEDLINE/PubMed, Scopus, Web of Science, PsycINFO, ERIC, CINAHL), translate each PICO concept into keywords plus controlled vocabulary (such as MeSH terms), and combine them with Boolean operators. Within a concept use OR (“anxiety” OR “worry” OR “stress”); between concepts use AND. Use truncation (anxiet*) and record the exact strings and dates so the search is repeatable. This is a specialised form of data collection in which your data are studies, not people.
  4. Set inclusion and exclusion criteria. Specify, in advance, what qualifies: population, intervention, study designs, outcomes, language, publication years and publication status. Anything that does not meet a criterion is excluded, and you record the reason at full-text stage.
  5. Screen in two stages. First, screen titles and abstracts against the criteria to remove the obviously irrelevant. Then retrieve the full texts of the survivors and screen those in full. Ideally two reviewers screen independently and resolve disagreements, reducing selection bias.
  6. Appraise quality / risk of bias. Critically appraise every included study using a recognised tool — for example the Cochrane Risk of Bias 2 (RoB 2) tool for trials, the Newcastle–Ottawa Scale for observational studies, or CASP checklists. This tells the reader how much weight the evidence can bear.
  7. Extract the data. Using a piloted extraction form, pull the same fields from each study: design, sample size, setting, intervention details, outcomes, effect estimates and any limitations. Consistent extraction makes the studies comparable.
  8. Synthesise the evidence. If the studies are similar enough, pool them statistically in a meta-analysis (calculating a combined effect size). If they are too heterogeneous — different designs, outcomes or measures — synthesise narratively, grouping findings by theme and explaining the pattern. Either way, the synthesis must answer the original question.
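Steps 4 and 5 can be sketched in code. The snippet below is only an illustration, not a prescribed tool: the `Record` fields, the `CRITERIA` list and the four sample records are hypothetical, chosen to show how criteria fixed in advance produce a documented exclusion reason for every rejected full text.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Record:
    """A hypothetical full-text record; the field names are illustrative."""
    title: str
    has_control_group: bool
    measures_exam_anxiety: bool
    language: str


# Criteria are fixed before screening starts. Each entry pairs the reason
# logged on failure with a test the record must pass. The order decides
# which reason is recorded when a record fails more than one test.
CRITERIA = [
    ("no control group", lambda r: r.has_control_group),
    ("wrong outcome", lambda r: r.measures_exam_anxiety),
    ("not in English", lambda r: r.language == "en"),
]


def exclusion_reason(record: Record) -> Optional[str]:
    """Return the first failed criterion, or None if the record is eligible."""
    for reason, passes in CRITERIA:
        if not passes(record):
            return reason
    return None


# Hypothetical records, invented for this sketch.
records = [
    Record("Trial A (hypothetical)", True, True, "en"),
    Record("Trial B (hypothetical)", False, True, "en"),
    Record("Trial C (hypothetical)", True, False, "en"),
    Record("Trial D (hypothetical)", True, True, "de"),
]

included = [r for r in records if exclusion_reason(r) is None]
excluded = {
    r.title: exclusion_reason(r)
    for r in records
    if exclusion_reason(r) is not None
}

print("Included:", [r.title for r in included])
for title, reason in excluded.items():
    print(f"Excluded: {title} ({reason})")
```

In a real review each of two reviewers would apply the same criteria independently and the decisions would then be compared; the point of the sketch is only that the reason for every exclusion is captured at the moment the decision is made.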

Reporting with a PRISMA flow diagram

The hallmark of a transparent systematic review is the PRISMA flow diagram: a simple chart that accounts for every record from the initial search through to final inclusion, showing how many were removed at each stage and why. Reviewers and examiners look for it first, because it proves your selection process was honest and reproducible. The figure below shows the standard flow.

[Figure: PRISMA flow diagram, “Selection of Studies”. Records identified (n = 1,240) → after duplicates removed (n = 940) → title / abstract screened (n = 940; excluded n = 868) → full text assessed (n = 72; excluded n = 54) → studies included in review (n = 18). Stage labels: Screening, Eligibility, Included.]
Figure 1: A PRISMA-style flow diagram tracing 1,240 identified records down to 18 included studies, with exclusions reported at each stage.
Example: Suppose a health-psychology student asks: “In undergraduate students (P), does mindfulness training (I) versus no intervention (C) reduce exam anxiety (O)?” Working the funnel with the numbers in Figure 1: the searches of PubMed, PsycINFO and Scopus return 1,240 records. Removing duplicate hits across the three databases leaves 940 unique records (1,240 − 300 duplicates). Screening titles and abstracts against the criteria excludes 868 that are clearly off-topic (wrong population, wrong outcome, not empirical), leaving 72 for full-text reading (940 − 868 = 72). At full text, 54 are excluded with documented reasons — e.g. 21 had no control group, 18 measured the wrong outcome, 9 were conference abstracts with no extractable data, 6 were not in English. That leaves 72 − 54 = 18 studies in the final review. Of those 18, 12 report comparable standardised anxiety scores, so they are pooled in a meta-analysis; the remaining 6, using different measures, are synthesised narratively. Every number in this chain is reported, so an examiner can reconstruct exactly how 1,240 became 18.
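The arithmetic in this example can be checked mechanically. The short sketch below uses only the counts quoted above; the variable names are our own, and the assertions simply confirm that each stage of the funnel adds up before the diagram is drawn.

```python
# Counts taken from the worked example above.
identified = 1240
duplicates = 300
excluded_title_abstract = 868
full_text_exclusions = {
    "no control group": 21,
    "wrong outcome": 18,
    "conference abstract, no extractable data": 9,
    "not in English": 6,
}
pooled_in_meta_analysis = 12
synthesised_narratively = 6

# Derive each stage from the one before it.
screened = identified - duplicates
full_text_assessed = screened - excluded_title_abstract
excluded_full_text = sum(full_text_exclusions.values())
included = full_text_assessed - excluded_full_text

# Internal consistency checks: every record must be accounted for.
assert excluded_full_text == 54
assert pooled_in_meta_analysis + synthesised_narratively == included

print(f"Records identified:        {identified}")
print(f"After duplicates removed:  {screened}")
print(f"Full texts assessed:       {full_text_assessed}")
print(f"Studies included:          {included}")
```

Running a check like this before submission catches the most common PRISMA error, a flow diagram whose boxes do not sum.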

Building and documenting the search strategy

The search is the engine of a systematic review: if it misses studies, every later step inherits the gap. Aim for a search that is sensitive (it finds nearly all the relevant studies) even at the cost of returning some irrelevant ones you will screen out later. Start by searching at least three major bibliographic databases so that no single index’s coverage limits you. Scopus and Web of Science are broad, multidisciplinary citation databases; PubMed/MEDLINE is essential for health and biomedical topics; and subject databases such as PsycINFO, CINAHL, ERIC or IEEE Xplore add depth in psychology, nursing, education and engineering respectively. Each database indexes a partly different set of journals, so searching several is what makes the review comprehensive rather than convenient.

Translate every PICO concept into two layers of search terms: free-text keywords (the words authors actually use in titles and abstracts) and controlled vocabulary (a database’s own subject headings, such as MeSH in PubMed or the Thesaurus in PsycINFO). Combine them with Boolean operators: use OR to gather synonyms within one concept (“adolescent” OR “teenager” OR “youth”), AND to require that separate concepts co-occur, and NOT sparingly to exclude a clearly unwanted set — NOT is risky because it can silently discard relevant records. Truncation and wildcards widen a term efficiently: a stem with an asterisk (depress* → depression, depressive, depressed) captures word variants, while a wildcard inside a word (behavio?r) catches British and American spellings. Use phrase quotation marks for fixed expressions (“exam anxiety”) so the database does not split them.
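The OR-within, AND-between logic can be made concrete with a few lines of code. This is a minimal sketch under stated assumptions: the helper names `or_group` and `build_query` are our own, the term lists are illustrative rather than a validated strategy, and the output is a generic Boolean string. Real databases differ in syntax (field tags, wildcard symbols, how truncation behaves inside quoted phrases), so the string would need adapting for each interface.

```python
def or_group(terms):
    """Join the synonyms for one concept with OR, quoting multi-word phrases."""
    quoted = [f'"{t}"' if " " in t else t for t in terms]
    return "(" + " OR ".join(quoted) + ")"


def build_query(concepts):
    """Combine the per-concept OR groups with AND."""
    return " AND ".join(or_group(terms) for terms in concepts.values())


# Illustrative terms only, loosely based on the exam-anxiety example.
concepts = {
    "population": ["undergraduate*", "university students"],
    "intervention": ["mindfulness"],
    "outcome": ["exam anxiety", "anxiet*"],
}

query = build_query(concepts)
print(query)
# (undergraduate* OR "university students") AND (mindfulness) AND ("exam anxiety" OR anxiet*)
```

Keeping the concepts in a structure like this, rather than typing strings by hand, makes it easier to record exactly which terms were used for each database.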

Database searching alone is not enough. Chase the grey literature — theses, conference proceedings, government and NGO reports, trial registries and preprints — because findings that never reach a journal are exactly the ones most affected by publication bias. Hand-search the reference lists of the studies you include (backward citation chasing) and check which newer papers cite them (forward citation chasing). Finally, document the search so it can be repeated: record the full search string for every database, the interface and date searched, any filters applied, and the number of hits returned. A reader who reruns your strings should obtain the same set of records — that reproducibility is the whole point.
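One way to keep that documentation consistent is a structured search log. The sketch below is an assumption about how such a log might look, not a required format: the `SearchLogEntry` class, the interface labels, the search date and the per-database hit counts are all hypothetical (the counts are an invented split of the 1,240 records in the worked example).

```python
import csv
import io
from dataclasses import asdict, dataclass, fields
from datetime import date


@dataclass
class SearchLogEntry:
    """One row of the search log: everything needed to rerun the search."""
    database: str
    interface: str
    date_searched: date
    search_string: str
    filters: str
    hits: int


def to_csv(entries):
    """Serialise the log to CSV text so it can go in an appendix."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=[f.name for f in fields(SearchLogEntry)]
    )
    writer.writeheader()
    for entry in entries:
        writer.writerow(asdict(entry))
    return buffer.getvalue()


# Hypothetical values throughout; only the total of 1,240 comes from the example.
QUERY = '(undergraduate*) AND (mindfulness) AND ("exam anxiety" OR anxiet*)'
SEARCH_DATE = date(2026, 1, 15)
log = [
    SearchLogEntry("PubMed", "web interface", SEARCH_DATE, QUERY, "none", 520),
    SearchLogEntry("PsycINFO", "web interface", SEARCH_DATE, QUERY, "none", 410),
    SearchLogEntry("Scopus", "web interface", SEARCH_DATE, QUERY, "none", 310),
]

csv_text = to_csv(log)
print(csv_text)
print("Total records identified:", sum(e.hits for e in log))
```

The total from the log is the number that belongs in the first box of the PRISMA flow diagram.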

Appraising quality and risk of bias

Once you have your included studies, you must judge how much trust each one can bear — a study can be relevant yet methodologically weak. Critical appraisal examines whether a study’s design and conduct could have biased its results, and the right tool depends on the study design. Use a recognised instrument rather than your own impression so the judgement is consistent and reportable:

  • Cochrane Risk of Bias 2 (RoB 2) — the standard for randomised controlled trials, assessing bias across domains such as randomisation, deviations from intended interventions, missing data, outcome measurement and selective reporting.
  • CASP checklists — the Critical Appraisal Skills Programme offers a friendly, design-specific checklist (for trials, cohort studies, qualitative research and more) that is popular in student and nursing reviews.
  • JBI critical appraisal tools — the Joanna Briggs Institute provides a comprehensive suite covering a wide range of designs, widely used for both quantitative and qualitative evidence.
  • AMSTAR 2 — used to appraise the quality of other systematic reviews, which matters when your evidence base includes existing reviews (an overview of reviews).
  • Newcastle–Ottawa Scale — a star-based tool for non-randomised observational studies (cohort and case-control).

Record each appraisal result, ideally in a table, and let it inform — not just decorate — your synthesis: studies at high risk of bias should carry less weight in your conclusions, and a sensitivity analysis that removes them tests how robust your findings really are. As with screening, two reviewers appraising independently reduces subjective error.

Data extraction and synthesis in depth

Consistent data extraction turns a pile of papers into comparable evidence. Build a piloted extraction form — test it on a handful of studies first — and pull the same fields from every paper: citation details, study design, setting and country, sample size and population, the intervention and comparator, the outcomes measured and how, the effect estimates with their confidence intervals, funding source and noted limitations. Extracting into a structured spreadsheet (and, ideally, having a second reviewer verify it) prevents the selective, memory-based summarising that creeps into ordinary reviews.
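A simple completeness check helps enforce "the same fields from every paper". The following is a sketch only: the field names in `REQUIRED_FIELDS` are our own shorthand for the items listed above, and the study row is invented for illustration.

```python
# Shorthand names for the extraction fields listed above (hypothetical labels).
REQUIRED_FIELDS = [
    "citation", "design", "setting", "sample_size", "intervention",
    "comparator", "outcome", "effect_estimate", "ci_lower", "ci_upper",
    "funding", "limitations",
]


def missing_fields(row):
    """Return the required fields that are absent or left blank in a row."""
    return [f for f in REQUIRED_FIELDS if row.get(f) in (None, "")]


# An invented extraction row; every value here is a placeholder.
complete_row = {
    "citation": "Study A (hypothetical)",
    "design": "randomised controlled trial",
    "setting": "university, country not stated",
    "sample_size": 120,
    "intervention": "mindfulness training",
    "comparator": "no intervention",
    "outcome": "exam anxiety score",
    "effect_estimate": -0.30,
    "ci_lower": -0.50,
    "ci_upper": -0.10,
    "funding": "not reported",
    "limitations": "self-reported outcome",
}

# The same row with the confidence interval left unextracted.
incomplete_row = {**complete_row, "ci_lower": None, "ci_upper": ""}

print(missing_fields(complete_row))    # []
print(missing_fields(incomplete_row))  # ['ci_lower', 'ci_upper']
```

Running the check over the whole spreadsheet before synthesis flags gaps while the paper is still to hand, rather than at write-up.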

How you synthesise depends on how similar the studies are. When studies measure the same outcome in compatible ways, a meta-analysis pools their results into a single weighted effect size, with a forest plot displaying each study’s estimate and the combined result, and a statistic such as I² quantifying heterogeneity. When studies are too diverse in design, population or outcome measures to pool numerically, use narrative synthesis: group findings by theme or outcome, tabulate them, and explain the pattern, the inconsistencies and the likely reasons in structured prose. Narrative synthesis is a legitimate result, not a fallback — forcing incomparable studies into a meta-analysis produces a precise-looking number that means nothing.
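To make "a single weighted effect size" and I² less abstract, here is a minimal sketch of inverse-variance pooling. It assumes a fixed-effect model for simplicity (a random-effects model is a common alternative when heterogeneity is expected), and the three effect sizes, standard errors and risk-of-bias labels are invented for illustration, not drawn from any real study. The last step reruns the pooling without the high-risk study, which is the sensitivity analysis described in the appraisal section.

```python
import math


def pool_fixed_effect(effects, std_errors):
    """Inverse-variance fixed-effect pooling with Cochran's Q and I-squared."""
    weights = [1.0 / se ** 2 for se in std_errors]
    total_weight = sum(weights)
    pooled = sum(w * y for w, y in zip(weights, effects)) / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    # Cochran's Q: weighted squared deviations from the pooled estimate.
    q = sum(w * (y - pooled) ** 2 for w, y in zip(weights, effects))
    df = len(effects) - 1
    # I-squared: share of variation beyond what chance alone would produce.
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    return {
        "pooled": pooled,
        "ci": (pooled - 1.96 * pooled_se, pooled + 1.96 * pooled_se),
        "q": q,
        "i2": i2,
    }


# Hypothetical studies: (standardised mean difference, standard error, risk of bias).
studies = [
    (-0.30, 0.10, "low"),
    (-0.50, 0.20, "high"),
    (-0.10, 0.10, "low"),
]

result = pool_fixed_effect([s[0] for s in studies], [s[1] for s in studies])
print(f"Pooled effect: {result['pooled']:.3f}, I2 = {result['i2']:.0f}%")

# Sensitivity analysis: drop studies judged at high risk of bias and re-pool.
low_risk = [s for s in studies if s[2] != "high"]
sensitivity = pool_fixed_effect([s[0] for s in low_risk], [s[1] for s in low_risk])
print(f"Without high-risk studies: {sensitivity['pooled']:.3f}")
```

If the pooled estimate shifts noticeably once the high-risk study is removed, the conclusion depends on weak evidence and should be worded accordingly. In practice a dedicated meta-analysis package would be used; the sketch only shows what the headline numbers mean.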

Whichever route you take, report it against the PRISMA 2020 statement. The updated checklist (a 27-item list plus the flow diagram) reflects modern practice — including the now-standard expectation that authors report their search strategies in full, declare protocol registration, and assess the certainty of the overall body of evidence (for example with the GRADE approach). Reporting transparently against PRISMA 2020 is what lets a reader move from trusting your conclusion to verifying it.

Strengths and limitations

A systematic literature review earns its place near the top of the evidence hierarchy, but it is not a magic wand. Weigh both sides.

Strengths

  • Minimises bias. Pre-specified criteria and an exhaustive search stop you from cherry-picking studies that suit your argument.
  • Reproducible and transparent. The protocol, search strings and PRISMA diagram let anyone audit or repeat your work.
  • Comprehensive. It aims to capture all relevant evidence, not a convenient sample of it.
  • Strong basis for decisions. Synthesised evidence is more reliable than any single study for guidelines, policy or a dissertation argument.

Limitations

  • Time- and labour-intensive — a full review can take months, and dual screening doubles the effort.
  • Only as good as the underlying studies; pooling weak or biased trials yields a precise but misleading answer (“garbage in, garbage out”).
  • Vulnerable to publication bias — positive results are published more readily, so the visible literature can overstate an effect.
  • Heterogeneous studies may resist meaningful synthesis, forcing a narrative account that is harder to interpret.

Common mistakes to avoid

  • A vague or shifting question. Without a tight PICO, the search balloons and the criteria become arbitrary.
  • No protocol, or changing it after seeing results. Post-hoc tweaks reintroduce the very bias the method exists to remove.
  • Searching one database. Relying on Google Scholar or a single index misses large swathes of the literature; search several and document each.
  • Skipping quality appraisal. Treating a flawed study as equal to a robust one corrupts the synthesis.
  • No PRISMA diagram or counts. Without the numbers, the review is not reproducible and reviewers cannot trust it.
  • Forcing a meta-analysis on incomparable studies. Pooling apples and oranges produces a tidy number that means nothing.

How to do it well

Start narrow and stay disciplined. Pin down a focused PICO question, write and register the protocol before you touch a database, and keep a dated log of every search string. Use two reviewers wherever you can, appraise every study with a recognised risk-of-bias tool, and report your flow honestly in a PRISMA diagram. If the studies are comparable, pool them; if not, synthesise narratively and say why. Done properly, a systematic literature review is among the most respected pieces of evidence a student researcher can produce — and a model chapter for any literature review in a thesis.

Frequently Asked Questions

What is the difference between a literature review and a systematic literature review?

A traditional literature review surveys a topic at the author’s discretion, with a selective and usually undocumented search. A systematic literature review answers one focused question using a pre-registered protocol, an exhaustive and documented search, explicit inclusion and exclusion criteria, formal quality appraisal and a reported PRISMA flow — so the whole process is transparent and reproducible. In short, the systematic review treats the search and selection itself as auditable data.

Is a systematic review the same as a meta-analysis?

No. A systematic review is the whole method of finding, appraising and synthesising studies to answer a question. A meta-analysis is one possible synthesis step within it: the statistical pooling of comparable studies into a single effect size. A systematic review can be done with a meta-analysis (when studies are similar enough to pool) or with a narrative synthesis (when they are not).

What is PICO?

PICO is a framework for building a focused review question: Population (who), Intervention (what is done), Comparison (versus what), and Outcome (what is measured). Adding S for Study design gives PICOS. Framing your question this way makes it precise, searchable and directly translatable into database keywords.

What is a PRISMA flow diagram?

PRISMA (Preferred Reporting Items for Systematic Reviews and Meta-Analyses) is a reporting standard, and its flow diagram is a chart that accounts for every record from the initial search to the final included studies. It shows how many records were identified, how many remained after duplicates were removed, how many were screened, how many full texts were assessed, and how many were excluded at each stage and why, which makes the selection process transparent.

Which databases should I search?

Search several, not one. For most health, psychology and social-science reviews you would search a combination such as MEDLINE/PubMed, Scopus, Web of Science, PsycINFO, ERIC or CINAHL, depending on the topic, and often supplement with reference-list checking and grey-literature sources. Relying on a single database (or only Google Scholar) misses relevant studies and weakens the review’s claim to be comprehensive.

How long does a systematic review take?

A full systematic review is labour-intensive and commonly takes several months, because exhaustive searching, dual independent screening, quality appraisal and synthesis all take time. For a tighter timeframe, students sometimes conduct a more limited rapid review or scoping review, but these trade some rigour for speed and should be described honestly as such.

About Aadam Mae

Aadam Mae is an academic researcher and author with a PhD in NLP (Natural Language Processing) at ResearchProspect. Mae's work delves into the intricacies of language and technology, delivering profound insights in concise prose. Pioneering the future of communication through scholarship.