Data Extraction Forms: What to Include and How to Design Them

By Angel Reyes. Last updated March 24, 2026. 4 min read.

Data extraction is where good systematic reviews are made or broken. A weak form produces inconsistent data; inconsistent data produces synthesis you cannot trust. But over-engineered forms — fifty fields no reviewer fills in consistently — fail in the opposite direction. This article covers what to actually include on an extraction form, how to pilot it, and how to run dual extraction without losing your mind.

What extraction is for

Extraction converts each included study into structured data. Done well, you end up with a dataset you can synthesize narratively, tabularly, or statistically (via meta-analysis). See our data extraction process page for the full method.

The Cochrane Handbook (Chapter 5) and JBI methodology both specify extraction as a dual-reviewer process for systematic reviews. Extraction is not optional. If you did not extract, you did not review.

Core fields every form needs

Every extraction form — regardless of review type — should include:

Study ID (first author + year + letter for multiple reports, e.g., Smith 2020a)
Full citation
Country and setting
Study design
Aim or research question
Population (inclusion criteria, sample size, demographics)
Intervention or exposure (for intervention reviews)
Comparator (for comparative designs)
Outcomes measured (with measurement tool and time point)
Results (effect estimates with confidence intervals; for qualitative, key themes)
Funding source and declared conflicts
Risk of bias judgments (per the tool you are using)
Reviewer notes (anything unusual, ambiguous, or requiring clarification)

Use our data extraction form template as a starting point.
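If you keep your form in code or export it from a spreadsheet, the core fields above can be sketched as a single record type. This is a minimal illustration, not the template itself: the class name, field names, and the blank_fields helper are assumptions made for the example, and "Country and setting" is split into two fields.

```python
from dataclasses import dataclass, field, fields


@dataclass
class ExtractionRecord:
    """One included study (or report). Field names are illustrative only."""
    study_id: str                      # e.g. "Smith 2020a"
    citation: str
    country: str = ""
    setting: str = ""
    study_design: str = ""
    aim: str = ""
    population: str = ""
    intervention: str = ""
    comparator: str = ""
    outcomes: list = field(default_factory=list)      # tool and time point per outcome
    results: str = ""
    funding_and_conflicts: str = ""
    risk_of_bias: dict = field(default_factory=dict)  # domain -> judgment
    reviewer_notes: str = ""


def blank_fields(record):
    """Return the names of fields still empty (reviewer notes may stay empty)."""
    return [
        f.name
        for f in fields(record)
        if f.name != "reviewer_notes"
        and getattr(record, f.name) in ("", [], {}, None)
    ]


# Hypothetical study, partly extracted.
draft = ExtractionRecord(study_id="Smith 2020a", citation="Smith (2020), placeholder citation")
still_to_fill = blank_fields(draft)
```

Running blank_fields before a record is marked complete gives each reviewer a list of fields that still need a value.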

Add review-specific fields

Beyond the core, add fields that match your synthesis plan:

Meta-analysis: effect size, variance, sample size per arm, outcome scale, time point, adjustment variables
Qualitative synthesis: analytic approach, theoretical framework, raw themes, author interpretations
Scoping review: concepts mapped, gaps noted, stakeholder involvement
Intervention fidelity review: dose, duration, setting, provider training

Do not include fields "in case we need them." If you have no plan to use a field in synthesis, cut it.
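To see why the meta-analysis fields matter, here is a small sketch of what they feed into: a mean difference and its standard error computed from per-arm sample size, mean, and SD. The ArmSummary type and the numbers are hypothetical, and the formula assumes two independent arms with a continuous outcome on the same scale.

```python
import math
from dataclasses import dataclass


@dataclass
class ArmSummary:
    n: int        # sample size for this arm
    mean: float   # outcome mean at the chosen time point
    sd: float     # outcome standard deviation


def mean_difference(treatment, control):
    """Return (mean difference, standard error) for two independent arms."""
    md = treatment.mean - control.mean
    se = math.sqrt(treatment.sd ** 2 / treatment.n + control.sd ** 2 / control.n)
    return md, se


# Hypothetical values for illustration only.
md, se = mean_difference(ArmSummary(50, 12.0, 4.0), ArmSummary(50, 10.0, 4.0))
```

If any one of these inputs was not extracted per arm, the effect estimate cannot be computed without going back to the paper, which is the practical test for whether a field earns its place on the form.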

Form format: paper, Word, Excel, or software?

Paper or Word: only for piloting and single-study examples. Unusable at scale.
Excel: fine for small reviews (< 40 studies), cheap, portable, no vendor lock-in.
Covidence or EPPI-Reviewer: purpose-built, supports dual extraction with automatic conflict flagging, recommended for most systematic reviews.
REDCap: excellent for large teams; requires institutional setup.

Choose based on review size and team size, not familiarity.

Pilot before you extract

Pilot the form on three to five included studies. Both reviewers extract the same studies independently. Then:

Compare every field, every study
Where you disagreed, diagnose: was the field ambiguous? The source data ambiguous? The operational definition loose?
Revise the form: rename fields, add response options, tighten definitions
Re-pilot if the first round produced major disagreement

A two-hour pilot saves two weeks of re-extraction.
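The comparison step can be sketched as a per-field agreement rate across the pilot studies. This is an illustration under assumptions: each reviewer's extractions are held as a dictionary keyed by study ID, the study data are invented, and the 0.8 cut-off for flagging a field is an arbitrary placeholder rather than a recommended threshold.

```python
from collections import defaultdict


def normalise(value):
    """Ignore case and surrounding whitespace; keep a missing value distinct."""
    return None if value is None else str(value).strip().lower()


def field_agreement(reviewer_a, reviewer_b):
    """Return {field name: share of pilot studies where both reviewers agree}."""
    agreed = defaultdict(int)
    compared = defaultdict(int)
    for study_id in reviewer_a.keys() & reviewer_b.keys():
        form_a, form_b = reviewer_a[study_id], reviewer_b[study_id]
        for name in form_a.keys() | form_b.keys():
            compared[name] += 1
            if normalise(form_a.get(name)) == normalise(form_b.get(name)):
                agreed[name] += 1
    return {name: agreed[name] / compared[name] for name in compared}


# Hypothetical pilot data: three studies, two fields.
pilot_a = {
    "Smith 2020a": {"study_design": "RCT", "setting": "primary care"},
    "Lee 2019": {"study_design": "cohort", "setting": "hospital"},
    "Okafor 2021": {"study_design": "RCT", "setting": "community"},
}
pilot_b = {
    "Smith 2020a": {"study_design": "rct", "setting": "outpatient clinic"},
    "Lee 2019": {"study_design": "cohort", "setting": "hospital"},
    "Okafor 2021": {"study_design": "RCT", "setting": "school"},
}

rates = field_agreement(pilot_a, pilot_b)
fields_to_revise = sorted(name for name, rate in rates.items() if rate < 0.8)
```

A low rate only tells you where to look. Deciding whether the field, the source data, or the definition is at fault still takes a conversation between the reviewers.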

Dual extraction in practice

For systematic reviews, the Cochrane Handbook requires dual extraction: two independent reviewers extract each study, then compare and reconcile. In practice:

Both reviewers extract independently into their own forms
A third reviewer (or the original two together) compares each field
Disagreements are resolved by discussion, consulting the full text, or consulting a third reviewer
Final locked form is the reconciled version

Document every resolved disagreement. Reviewing that log afterwards can reveal systematic errors, such as a field both reviewers repeatedly read differently, that you would not notice study by study.
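If you reconcile in a spreadsheet or script and not in purpose-built software, the workflow above might look like the following sketch. The Disagreement record, the function names, and the example values are all hypothetical. The point is that the final form cannot be locked while any disagreement lacks a recorded resolution.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Disagreement:
    study_id: str
    field_name: str
    value_a: str
    value_b: str
    resolved_value: Optional[str] = None   # filled in after discussion
    resolution_note: str = ""              # how and why it was resolved


def find_disagreements(study_id, form_a, form_b):
    """List every field where the two independent forms differ."""
    return [
        Disagreement(study_id, name, form_a.get(name, ""), form_b.get(name, ""))
        for name in sorted(form_a.keys() | form_b.keys())
        if form_a.get(name, "") != form_b.get(name, "")
    ]


def reconcile(form_a, form_b, disagreements):
    """Build the final form; refuse to lock it while anything is unresolved."""
    unresolved = [d.field_name for d in disagreements if d.resolved_value is None]
    if unresolved:
        raise ValueError(f"Unresolved fields: {unresolved}")
    final = {**form_b, **form_a}   # agreed fields are identical in both forms
    for d in disagreements:
        final[d.field_name] = d.resolved_value
    return final


# Hypothetical example: the reviewers disagree on sample size.
form_a = {"sample_size": "120", "study_design": "RCT"}
form_b = {"sample_size": "112", "study_design": "RCT"}
log = find_disagreements("Smith 2020a", form_a, form_b)
log[0].resolved_value = "120"
log[0].resolution_note = "120 randomised, 112 analysed; form asks for randomised."
final_form = reconcile(form_a, form_b, log)
```

Keeping the log entries, not just the final values, is what makes a later review of the extraction process possible.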

Extracting results: the hardest part

Results extraction is where most errors happen. Practical tips:

Record what the paper reports, not what you wish it reported
If the paper reports a median and IQR and your synthesis needs mean and SD, note both and flag the need to convert
Extract outcomes by time point, not averaged across time points
When both intention-to-treat and per-protocol analyses are reported, extract both and note which you will use
Record any data the paper does not report as "not reported"; never leave a field blank
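The median and IQR tip above can be sketched as follows: keep the reported values, add the converted ones alongside, and flag them as derived. The conversion shown is one commonly used quartile-based approximation (mean as the average of the three quartiles, SD as IQR divided by 1.35, the width of the IQR in SD units for a normal distribution). It assumes a roughly symmetric distribution and a reasonably large sample, so treat it as a placeholder for whatever method your protocol specifies. The ContinuousOutcome type and the values are hypothetical.

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass
class ContinuousOutcome:
    time_point: str
    median: Optional[float] = None
    q1: Optional[float] = None
    q3: Optional[float] = None
    mean: Optional[float] = None
    sd: Optional[float] = None
    mean_sd_derived: bool = False   # True when mean/SD were approximated


def with_approximate_mean_sd(outcome):
    """Return a copy with mean/SD approximated from quartiles, flagged as derived."""
    if outcome.mean is not None and outcome.sd is not None:
        return outcome   # the paper reported them; nothing to convert
    if None in (outcome.q1, outcome.median, outcome.q3):
        raise ValueError("Need q1, median and q3 to approximate mean and SD")
    mean = (outcome.q1 + outcome.median + outcome.q3) / 3
    sd = (outcome.q3 - outcome.q1) / 1.35
    return replace(outcome, mean=mean, sd=sd, mean_sd_derived=True)


# Hypothetical reported values: median 12, IQR 10 to 17, at 12 weeks.
reported = ContinuousOutcome("12 weeks", median=12.0, q1=10.0, q3=17.0)
converted = with_approximate_mean_sd(reported)
```

Because the original median and quartiles stay on the record, anyone checking the synthesis can see what the paper actually reported and which numbers were derived.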

Contacting authors

For missing data critical to synthesis, email the corresponding author. Keep the email short, specific, and polite:

Dear Dr. X, I am leading a systematic review of [topic]. Your 2022 paper reported [outcome] at 12 weeks. Could you share the standard deviation for the intervention arm? I will acknowledge the data in our review.

Expect a 30–50% response rate. Document every attempt and its outcome so you can report them when you describe your data collection process under PRISMA.

Five design principles

One question per field. "Population and setting" is two fields.
Use closed response options where you can. "Design: [RCT / quasi-experiment / cohort / case-control / ...]" beats a free text box.
Define every field operationally. Attach a one-line definition and an example.
Order fields to match the paper's order. Usually: methods, sample, intervention, outcomes, results.
Leave a comments field per section. Reviewers will have context that does not fit the structure.
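Principles two and three, closed response options and operational definitions, can be sketched as a small field specification with a validator. The FieldSpec type, the definition text, and the "other" option are assumptions for illustration. The source list of designs is open-ended, so extend the options to match your own protocol.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FieldSpec:
    name: str
    definition: str       # one-line operational definition
    example: str          # one worked example for reviewers
    options: tuple = ()   # empty tuple means free text


STUDY_DESIGN = FieldSpec(
    name="study_design",
    definition="Design as judged by the reviewer from the methods section.",
    example="RCT",
    # "other" is a hypothetical catch-all; add the designs your review needs.
    options=("RCT", "quasi-experiment", "cohort", "case-control", "other"),
)


def check_value(spec, value):
    """Return an error message, or None if the value is acceptable."""
    if not value.strip():
        return f"{spec.name}: blank; enter a value or 'not reported'"
    if spec.options and value not in spec.options and value != "not reported":
        return f"{spec.name}: {value!r} is not one of {spec.options}"
    return None


ok = check_value(STUDY_DESIGN, "cohort")
problem = check_value(STUDY_DESIGN, "randomised trial")
```

Rejecting "randomised trial" in favor of the listed "RCT" looks pedantic, but it is exactly the kind of inconsistency that otherwise has to be cleaned up by hand before synthesis.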

A form that takes 15 minutes per study is achievable with piloting and practice. A form that takes 45 minutes per study usually has too many fields or definitions that are too loose, and is probably also producing inconsistent data.