11. Risk of bias in individual studies

Specify the methods used to assess risk of bias in the included studies, including details of the tool(s) used, how many reviewers assessed each study and whether they worked independently, and, if applicable, details of automation tools used in the process.

Essential elements

  • Specify the tool(s) (and version) used to assess risk of bias in the included studies.

  • Specify the methodological domains/components/items of the risk of bias tool(s) used.

  • Report whether an overall risk of bias judgment that summarised across domains/components/items was made, and if so, what rules were used to reach an overall judgment.

  • If any adaptations to an existing tool to assess risk of bias in studies were made (such as omitting or modifying items), specify the adaptations.

  • If a new risk of bias tool was developed for use in the review, describe the content of the tool and make it publicly accessible.

  • Report how many reviewers assessed risk of bias in each study, whether multiple reviewers worked independently (such as assessments performed by one reviewer and checked by another), and any processes used to resolve disagreements between assessors.

  • Report any processes used to obtain or confirm relevant information from study investigators.

  • If an automation tool was used to assess risk of bias in studies, report how the automation tool was used (such as machine learning models to extract sentences from articles relevant to risk of bias[1]), how the tool was trained, and details on the tool’s performance and internal validation.

Explanation

Users of reviews need to know the risk of bias in the included studies to appropriately interpret the evidence. Numerous tools have been developed to assess study limitations for various designs.[2] However, many tools have been criticised because of their content (which may extend beyond assessing study limitations that have the potential to bias findings) and the way in which the items are combined (such as scales where items are combined to yield a numerical score) (see below).[3] Reporting details of the selected tool enables readers to assess whether the tool focuses solely on items that have the potential to bias findings. Reporting details of how studies were assessed (such as by one or two authors) allows readers to assess the potential for errors in the assessments.[4] Reporting how risk of bias assessments were incorporated into the analysis is addressed in Items 13e and 13f.

Assessment of risk of bias in studies and bias due to missing results

Terminology

The terms “quality assessment” and “critical appraisal” are often used to describe the process of evaluating the methodological conduct or reporting of studies.[2] In PRISMA 2020, we distinguish “quality” from “risk of bias” and have focused the relevant items and elaborations on the latter. Risk of bias refers to the potential for study findings to systematically deviate from the truth due to methodological flaws in the design, conduct or analysis.[3] Quality is not well defined, but has been shown to encompass constructs beyond those that may bias the findings, including, for example, imprecision, reporting completeness, ethics, and applicability.[5-7] In systematic reviews, focus should be given to the design, conduct, and analysis features that may lead to important bias in the findings.

Different types of risk of bias

In PRISMA 2020, two aspects of risk of bias are considered. The first aspect is risk of bias in the results of the individual studies included in a systematic review. Empirical evidence and theoretical considerations suggest that several features of study design are associated with larger intervention effect estimates in studies; these features include inadequate generation and concealment of a random sequence to assign participants to groups, substantial loss to follow-up of participants, and unblinded outcome assessment.[8]

The second aspect is risk of bias in the result of a synthesis (such as meta-analysis) due to missing studies or results within studies. Missing studies/results may introduce bias when the decision to publish a study/result is influenced by the observed P value or magnitude or direction of the effect.[9] For example, studies with statistically non-significant results may not have been submitted for publication (publication bias), or particular results that were statistically non-significant may have been omitted from study reports (selective non-reporting bias).[10,11]

Tools for assessing risk of bias

Many tools have been developed to assess the risk of bias in studies[2,6,7] or bias due to missing results.[12] Existing tools typically take the form of composite scales or domain-based tools.[6,13] Composite scales include multiple items, each with a numeric score attached, from which an overall summary score might be calculated. Domain-based tools require users to judge risk of bias within specific domains, and to record the information on which each judgment was based.[3,14,15] Specifying the components/domains in the tool used in the review can help readers determine whether the tool focuses on risk of bias only or addresses other “quality” constructs. Presenting assessments for each component/domain in the tool is preferable to reporting a single “quality score” because it enables users to understand the specific components/domains that are at risk of bias in each study.
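To illustrate why per-domain reporting is preferable to a summary score, here is a hypothetical sketch (the item names and scoring are invented for illustration only) showing how two studies with identical composite scores can be at risk of bias in quite different domains:

```python
# Hypothetical two studies scored with a three-item composite scale
# (1 = adequate, 0 = inadequate); item names are illustrative only.
study_a = {"randomisation": 1, "blinding": 0, "attrition": 1}
study_b = {"randomisation": 0, "blinding": 1, "attrition": 1}

def summary_score(study):
    """Composite-scale approach: collapse all items into one number."""
    return sum(study.values())

def domains_at_risk(study):
    """Domain-based approach: report which domains are problematic."""
    return [domain for domain, adequate in study.items() if not adequate]

# Identical summary scores hide the fact that different domains are at risk:
print(summary_score(study_a), summary_score(study_b))      # → 2 2
print(domains_at_risk(study_a), domains_at_risk(study_b))  # → ['blinding'] ['randomisation']
```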

Incorporating assessments of risk of bias in studies into the analysis

The risk of bias in included studies should be considered in the presentation and interpretation of results of individual studies and syntheses. Different analytic strategies may be used to examine whether risk of bias in the studies may influence the results: (i) restricting the primary analysis to studies judged to be at low risk of bias (sensitivity analysis); (ii) stratifying studies according to risk of bias using subgroup analysis or meta-regression; or (iii) adjusting the result from each study in an attempt to remove the bias. Further details about each approach are available elsewhere.[3]
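Strategies (i) and (ii) amount to simple filtering and stratification steps over study records, which can be sketched as follows; the study names, effect estimates, and judgements below are hypothetical:

```python
from collections import defaultdict

# Hypothetical study records: effect estimates with risk-of-bias judgements.
studies = [
    {"id": "Study A", "effect": 0.42, "rob": "low"},
    {"id": "Study B", "effect": 0.61, "rob": "high"},
    {"id": "Study C", "effect": 0.47, "rob": "some concerns"},
    {"id": "Study D", "effect": 0.40, "rob": "low"},
]

# Strategy (i): sensitivity analysis restricted to low risk-of-bias studies.
low_rob = [s for s in studies if s["rob"] == "low"]

# Strategy (ii): stratify studies by risk-of-bias level for subgroup analysis.
strata = defaultdict(list)
for s in studies:
    strata[s["rob"]].append(s["effect"])

print([s["id"] for s in low_rob])  # → ['Study A', 'Study D']
print(sorted(strata))              # → ['high', 'low', 'some concerns']
```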

Example

“We assessed risk of bias in the included studies using the revised Cochrane ‘Risk of bias’ tool for randomised trials (RoB 2.0) (Higgins 2016a), employing the additional guidance for cluster-randomised and cross-over trials (Eldridge 2016; Higgins 2016b). RoB 2.0 addresses five specific domains: (1) bias arising from the randomisation process; (2) bias due to deviations from intended interventions; (3) bias due to missing outcome data; (4) bias in measurement of the outcome; and (5) bias in selection of the reported result. Two review authors independently applied the tool to each included study, and recorded supporting information and justifications for judgements of risk of bias for each domain (low; high; some concerns). Any discrepancies in judgements of risk of bias or justifications for judgements were resolved by discussion to reach consensus between the two review authors, with a third review author acting as an arbiter if necessary. Following guidance given for RoB 2.0 (Section 1.3.4) (Higgins 2016a), we derived an overall summary 'Risk of bias' judgement (low; some concerns; high) for each specific outcome, whereby the overall RoB for each study was determined by the highest RoB level in any of the domains that were assessed.”[16]
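The overall rule quoted above (overall risk of bias determined by the highest level reached in any domain) can be sketched as a small function. Note that this reflects only the rule as stated in the example; the full RoB 2.0 guidance includes additional nuances, such as when multiple "some concerns" judgements may warrant an overall "high".

```python
# Judgement levels ordered from least to most concern.
LEVELS = ["low", "some concerns", "high"]

def overall_judgement(domain_judgements):
    """Overall judgement = most severe judgement in any single domain."""
    return max(domain_judgements, key=LEVELS.index)

# Hypothetical per-domain judgements for one outcome of one study.
domains = {
    "randomisation process": "low",
    "deviations from intended interventions": "some concerns",
    "missing outcome data": "low",
    "measurement of the outcome": "low",
    "selection of the reported result": "low",
}
print(overall_judgement(domains.values()))  # → some concerns
```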



References

1.
Marshall IJ, Kuiper J, Wallace BC. RobotReviewer: Evaluation of a system for automatically assessing bias in clinical trials. Journal of the American Medical Informatics Association. 2015;23(1):193-201. doi:10.1093/jamia/ocv044
2.
Bai A, Shukla VK, Bak G. Quality assessment tools project report. Canadian Agency for Drugs and Technologies in Health; 2012.
3.
Boutron I, Page MJ, Higgins JP, Altman DG, Lundh A, Hróbjartsson A. Considering bias and conflicts of interest among the included studies. Cochrane Handbook for Systematic Reviews of Interventions. Published online September 2019:177-204. doi:10.1002/9781119536604.ch7
4.
Robson RC, Pham B, Hwee J, et al. Few studies exist examining methods for selecting studies, abstracting data, and appraising quality in a systematic review. Journal of Clinical Epidemiology. 2019;106:121-135. doi:10.1016/j.jclinepi.2018.10.003
5.
Büttner F, Winters M, Delahunt E, et al. Identifying the “incredible”! Part 1: Assessing the risk of bias in outcomes included in systematic reviews. British Journal of Sports Medicine. 2019;54(13):798-800. doi:10.1136/bjsports-2019-100806
6.
Moher D, Jadad AR, Nichol G, Penman M, Tugwell P, Walsh S. Assessing the quality of randomized controlled trials: An annotated bibliography of scales and checklists. Controlled Clinical Trials. 1995;16(1):62-73. doi:10.1016/0197-2456(94)00031-w
7.
Olivo SA, Macedo LG, Gadotti IC, Fuentes J, Stanton T, Magee DJ. Scales to assess the quality of randomized controlled trials: A systematic review. Physical Therapy. 2008;88(2):156-175. doi:10.2522/ptj.20070147
8.
Page MJ, Higgins JPT, Clayton G, Sterne JAC, Hróbjartsson A, Savović J. Empirical evidence of study design biases in randomized trials: Systematic review of meta-epidemiological studies. Scherer RW, ed. PLOS ONE. 2016;11(7):e0159267. doi:10.1371/journal.pone.0159267
9.
Page MJ, Higgins JP, Sterne JA. Assessing risk of bias due to missing results in a synthesis. Cochrane Handbook for Systematic Reviews of Interventions. Published online September 2019:349-374. doi:10.1002/9781119536604.ch13
10.
Chan AW, Song F, Vickers A, et al. Increasing value and reducing waste: Addressing inaccessible research. The Lancet. 2014;383(9913):257-266. doi:10.1016/s0140-6736(13)62296-5
11.
Dwan K, Gamble C, Williamson PR, Kirkham JJ. Systematic review of the empirical evidence of study publication bias and outcome reporting bias — an updated review. Boutron I, ed. PLoS ONE. 2013;8(7):e66844. doi:10.1371/journal.pone.0066844
12.
Page MJ, McKenzie JE, Higgins JPT. Tools for assessing risk of reporting biases in studies and syntheses of studies: A systematic review. BMJ Open. 2018;8(3):e019703. doi:10.1136/bmjopen-2017-019703
13.
Whiting P, Wolff R, Mallett S, Simera I, Savović J. A proposed framework for developing quality assessment tools. Systematic Reviews. 2017;6(1). doi:10.1186/s13643-017-0604-6
14.
Sterne JAC, Savović J, Page MJ, et al. RoB 2: A revised tool for assessing risk of bias in randomised trials. BMJ. Published online August 2019:l4898. doi:10.1136/bmj.l4898
15.
Sterne JA, Hernán MA, Reeves BC, et al. ROBINS-i: A tool for assessing risk of bias in non-randomised studies of interventions. BMJ. Published online October 2016:i4919. doi:10.1136/bmj.i4919
16.
Hollands GJ, Carter P, Anwer S, et al. Altering the availability or proximity of food, alcohol, and tobacco products to change their selection and consumption. Cochrane Database of Systematic Reviews. Published online September 2019. doi:10.1002/14651858.cd012573.pub3

Citation

For attribution, please cite this work as:
Page MJ, Moher D, Bossuyt PM, et al. PRISMA 2020 explanation and elaboration: updated guidance and exemplars for reporting systematic reviews. BMJ. 372:n160. doi:10.1136/bmj.n160

Reporting Guidelines are recommendations to help describe your work clearly

Your research will be used by people from different disciplines and backgrounds for decades to come. Reporting guidelines list the information you should describe so that everyone can understand, replicate, and synthesise your work.

Reporting guidelines do not prescribe how research should be designed or conducted. Rather, they help authors transparently describe what they did, why they did it, and what they found.

Reporting guidelines make writing research easier, and transparent research leads to better patient outcomes.

Easier writing

Following guidance makes writing easier and quicker.

Smoother publishing

Many journals require completed reporting checklists at submission.

Maximum impact

From Nobel Prizes to null results, articles have more impact when everyone can use them.

Who reads research?

Your work will be read by different people, for different reasons, around the world, and for decades to come. Reporting guidelines help you consider all of your potential audiences. For example, your research may be read by researchers from different fields, by clinicians, patients, evidence synthesisers, peer reviewers, or editors. Your readers will need information to understand, replicate, apply, appraise, synthesise, and use your work.

Cohort studies

A cohort study is an observational study in which a group of people with a particular exposure (e.g. a putative risk factor or protective factor) and a group of people without this exposure are followed over time. The outcomes of the people in the exposed group are compared to the outcomes of the people in the unexposed group to see if the exposure is associated with particular outcomes (e.g. getting cancer or length of life).


Case-control studies

A case-control study is a research method used in healthcare to investigate potential risk factors for a specific disease. It involves comparing individuals who have been diagnosed with the disease (cases) to those who have not (controls). By analysing the differences between the two groups, researchers can identify factors that may contribute to the development of the disease.

An example would be when researchers conducted a case-control study examining whether exposure to diesel exhaust particles increases the risk of respiratory disease in underground miners. Cases included miners diagnosed with respiratory disease, while controls were miners without respiratory disease. Participants' past occupational exposures to diesel exhaust particles were evaluated to compare exposure rates between cases and controls.


Cross-sectional studies

A cross-sectional study (also sometimes called a "cross-sectional survey") is an observational design in which researchers capture data from a group of participants at a single point in time. This approach provides a 'snapshot': a brief glimpse into the characteristics or outcomes prevalent within a designated population at that precise moment. The primary aim is not to track changes or developments over an extended period but to assess and quantify the current situation regarding specific variables or conditions. Such a methodology is instrumental in identifying patterns or correlations among various factors within the population, providing a basis for further, more detailed investigation.


Systematic reviews

A systematic review is a comprehensive approach designed to identify, evaluate, and synthesise all available evidence relevant to a specific research question. In essence, it collects all possible studies related to a given topic and design, and reviews and analyses their results.

The process involves a highly sensitive search strategy to ensure that as much pertinent information as possible is gathered. Once collected, this evidence is often critically appraised to assess its quality and relevance, ensuring that conclusions drawn are based on robust data. Systematic reviews often involve defining inclusion and exclusion criteria, which help to focus the analysis on the most relevant studies, ultimately synthesising the findings into a coherent narrative or statistical synthesis. Some systematic reviews will include a meta-analysis.


Systematic review protocols

TODO

Meta-analyses of Observational Studies

TODO

Randomised Trials

A randomised controlled trial (RCT) is a trial in which participants are randomly assigned to one of two or more groups: the experimental group or groups receive the intervention or interventions being tested; the comparison group (control group) receive usual care or no treatment or a placebo. The groups are then followed up to see if there are any differences between the results. This helps in assessing the effectiveness of the intervention.
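As a minimal sketch, simple (unrestricted) random allocation can be implemented as an independent coin flip per participant; the participant labels, group names, and seed below are purely illustrative, and real trials typically use more sophisticated schemes (blocking, stratification) to guarantee balance:

```python
import random

def randomise(participants, seed=None):
    """Assign each participant to 'intervention' or 'control' with equal
    probability (simple randomisation; no blocking or stratification)."""
    rng = random.Random(seed)
    return {p: rng.choice(["intervention", "control"]) for p in participants}

# Hypothetical allocation of 200 participants; a fixed seed makes the
# allocation sequence reproducible for audit purposes.
allocation = randomise([f"P{i:03d}" for i in range(200)], seed=42)
print(sum(1 for g in allocation.values() if g == "intervention"), "of 200 in intervention arm")
```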


Randomised Trial Protocols

TODO

Qualitative research

Research that aims to gather and analyse non-numerical (descriptive) data in order to gain an understanding of individuals' social reality, including understanding their attitudes, beliefs, and motivation. This type of research typically involves in-depth interviews, focus groups, or field observations in order to collect data that is rich in detail and context. Qualitative research is often used to explore complex phenomena or to gain insight into people's experiences and perspectives on a particular topic. It is particularly useful when researchers want to understand the meaning that people attach to their experiences or when they want to uncover the underlying reasons for people's behaviour. Qualitative methods include ethnography, grounded theory, discourse analysis, and interpretative phenomenological analysis.


Case Reports

TODO

Diagnostic Test Accuracy Studies

Diagnostic accuracy studies focus on estimating the ability of the test(s) to correctly identify subjects with a predefined target condition, or the condition of interest (sensitivity) as well as to clearly identify those without the condition (specificity).
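The two definitions translate directly into arithmetic on a 2×2 table of test results against the target condition; the counts below are hypothetical:

```python
def sensitivity(true_positives, false_negatives):
    """Proportion of subjects WITH the target condition that the test detects."""
    return true_positives / (true_positives + false_negatives)

def specificity(true_negatives, false_positives):
    """Proportion of subjects WITHOUT the condition that the test rules out."""
    return true_negatives / (true_negatives + false_positives)

# Hypothetical 2x2 table for 200 subjects:
#                 condition present   condition absent
# test positive          90                  20
# test negative          10                  80
print(sensitivity(90, 10))  # → 0.9
print(specificity(80, 20))  # → 0.8
```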

Prediction Models

Prediction model research is used to test the accuracy of a model or test in estimating an outcome value or risk. Most models estimate the probability of the presence of a particular health condition (diagnostic) or whether a particular outcome will occur in the future (prognostic). Prediction models are used to support clinical decision making, such as whether to refer patients for further testing, monitor disease deterioration or treatment effects, or initiate treatment or lifestyle changes. Examples of well known prediction models include EuroSCORE II for cardiac surgery, the Gail model for breast cancer, the Framingham risk score for cardiovascular disease, IMPACT for traumatic brain injury, and FRAX for osteoporotic and hip fractures.


Animal Research

TODO

Quality Improvement in Healthcare

Quality improvement research is about finding out how to improve and make changes in the most effective way. It is about systematically and rigorously exploring "what works" to improve quality in healthcare and the best ways to measure and disseminate this to ensure positive change. Most quality improvement effectiveness research is conducted in hospital settings, is focused on multiple quality improvement interventions, and uses process measures as outcomes. There is a great deal of variation in the research designs used to examine quality improvement effectiveness.


Economic Evaluations in Healthcare

TODO

Meta-analyses

A meta-analysis is a statistical technique that amalgamates data from multiple studies to yield a single estimate of the effect size. This approach enhances precision and offers a more comprehensive understanding by integrating quantitative findings. Central to a meta-analysis is the evaluation of heterogeneity, which examines variations in study outcomes to ensure that differences in populations, interventions, or methodologies do not skew results. Techniques such as meta-regression or subgroup analysis are frequently employed to explore how various factors might influence the outcomes. This method is particularly effective when aiming to quantify the effect size, odds ratio, or risk ratio, providing a clearer numerical estimate that can significantly inform clinical or policy decisions.
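The core calculation can be sketched under a fixed-effect, inverse-variance model, with Cochran's Q as one standard heterogeneity statistic; the effect estimates and variances below are hypothetical, and real meta-analyses would typically also consider random-effects models:

```python
import math

def fixed_effect_meta(estimates, variances):
    """Inverse-variance (fixed-effect) pooled estimate and its variance:
    each study is weighted by the reciprocal of its variance."""
    weights = [1.0 / v for v in variances]
    pooled = sum(w * e for w, e in zip(weights, estimates)) / sum(weights)
    return pooled, 1.0 / sum(weights)

def cochran_q(estimates, variances):
    """Cochran's Q statistic, a standard test of between-study heterogeneity."""
    pooled, _ = fixed_effect_meta(estimates, variances)
    return sum((e - pooled) ** 2 / v for e, v in zip(estimates, variances))

# Hypothetical log risk ratios and their variances from three studies.
estimates, variances = [0.10, 0.30, 0.20], [0.04, 0.09, 0.05]
pooled, var = fixed_effect_meta(estimates, variances)
print(round(pooled, 3), round(math.sqrt(var), 3))  # pooled effect and its SE
```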

How Meta-analyses and Systematic Reviews Work Together

Systematic reviews and meta-analyses function together, each complementing the other to provide a more robust understanding of research evidence. A systematic review meticulously gathers and evaluates all pertinent studies, establishing a solid foundation of qualitative and quantitative data. Within this framework, if the collected data exhibit sufficient homogeneity, a meta-analysis can be performed. This statistical synthesis allows for the integration of quantitative results from individual studies, producing a unified estimate of effect size. Techniques such as meta-regression or subgroup analysis may further refine these findings, elucidating how different variables impact the overall outcome. By combining these methodologies, researchers can achieve both a comprehensive narrative synthesis and a precise quantitative measure, enhancing the reliability and applicability of their conclusions. This integrated approach ensures that the findings are not only well-rounded but also statistically robust, providing greater confidence in the evidence base.

Why Don't All Systematic Reviews Use a Meta-Analysis?

Systematic reviews do not always include a meta-analysis, owing to variations in the data. For a meta-analysis to be viable, the data from different studies must be sufficiently similar, or homogeneous, in terms of design, population, and interventions. When the data show significant heterogeneity, meaning there are considerable differences among the studies, combining them could lead to skewed or misleading conclusions. Furthermore, the quality of the included studies is critical; if the studies are of low methodological quality, merging their results could obscure true effects rather than clarify them.

Protocol

A plan or set of steps that defines how something will be done. Before carrying out a research study, for example, the research protocol sets out what question is to be answered and how information will be collected and analysed.


Systematic review

A review that uses explicit, systematic methods to collate and synthesize findings of studies that address a clearly formulated question.


Statistical synthesis

The combination of quantitative results of two or more studies. This encompasses meta-analysis of effect estimates (described below) and other methods, such as combining P values, calculating the range and distribution of observed effects, and vote counting based on the direction of effect (see McKenzie and Brennan for a description of each method).
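Of the methods listed, vote counting based on the direction of effect is simple enough to sketch; the effect values below are hypothetical, and the method is deliberately crude because it ignores effect sizes and their precision:

```python
def vote_count(effects):
    """Vote counting by direction of effect: tally studies favouring the
    intervention (positive), favouring the comparator (negative), or null."""
    favour = sum(1 for e in effects if e > 0)
    against = sum(1 for e in effects if e < 0)
    return {"favour": favour, "against": against,
            "null": len(effects) - favour - against}

# Hypothetical mean differences from five studies.
print(vote_count([0.4, 0.1, -0.2, 0.3, 0.0]))
# → {'favour': 3, 'against': 1, 'null': 1}
```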

Meta-analysis of effect estimates

A statistical technique used to synthesize results when study effect estimates and their variances are available, yielding a quantitative summary of results.


Outcome

An event or measurement collected for participants in a study (such as quality of life, mortality).

Result

The combination of a point estimate (such as a mean difference, risk ratio or proportion) and a measure of its precision (such as a confidence/credible interval) for a particular outcome.

Reports

Documents (paper or electronic) supplying information about a particular study. A report could be a journal article, preprint, conference abstract, study register entry, clinical study report, dissertation, unpublished manuscript, government report, or any other document providing relevant information.

Record

The title or abstract (or both) of a report indexed in a database or website (such as a title or abstract for an article indexed in Medline). Records that refer to the same report (such as the same journal article) are “duplicates”; however, records that refer to reports that are merely similar (such as a similar abstract submitted to two different conferences) should be considered unique.

Study

An investigation, such as a clinical trial, that includes a defined group of participants and one or more interventions and outcomes. A “study” might have multiple reports. For example, reports could include the protocol, statistical analysis plan, baseline characteristics, results for the primary outcome, results for harms, results for secondary outcomes, and results for additional mediator and moderator analyses.