10a. Data items – Outcomes
List and define all outcomes for which data were sought. Specify whether all results that were compatible with each outcome domain in each study were sought (for example, for all measures, time points, analyses), and, if not, the methods used to decide which results to collect.
Essential elements
- List and define the outcome domains and time frame of measurement for which data were sought.
- Specify whether all results that were compatible with each outcome domain in each study were sought, and, if not, what process was used to select results within eligible domains.
- If any changes were made to the inclusion or definition of the outcome domains or to the importance given to them in the review, specify the changes, along with a rationale.
- If any changes were made to the processes used to select results within eligible outcome domains, specify the changes, along with a rationale.
Additional elements
- Consider specifying which outcome domains were considered the most important for interpreting the review’s conclusions (such as “critical” versus “important” outcomes) and provide a rationale for the labelling (such as “a recent core outcome set identified the outcomes labelled ‘critical’ as being the most important to patients”).
Explanation
Defining outcomes in systematic reviews generally involves specifying outcome domains (such as pain, quality of life, adverse events such as nausea) and the time frame of measurement (such as less than six months).1 Included studies may report multiple results that are eligible for inclusion within the review outcome definition.23 For example, a study may report results for two measures of pain (such as the McGill Pain Questionnaire and the Brief Pain Inventory), at two time points (such as four weeks and eight weeks), all of which are compatible with a review outcome defined as “pain <6 months.” Multiple results compatible with an outcome domain in a study might also arise when study investigators report results based on multiple analysis populations (such as all participants randomised, all participants receiving a specific amount of treatment), methods for handling missing data (such as multiple imputation, last-observation-carried-forward), or methods for handling confounding (such as adjustment for different covariates).34 5
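As a rough illustration of the pain example above, the following sketch filters one study's reported results down to those compatible with a review outcome defined as “pain <6 months.” The `Result` record, its field names, the 26-week cut-off used to approximate six months, and the sample rows are all hypothetical and are not part of the guideline.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Result:
    domain: str   # outcome domain, e.g. "pain"
    measure: str  # instrument used to measure the domain
    weeks: int    # time point of measurement, in weeks


# Hypothetical results reported by a single study.
study_results = [
    Result("pain", "McGill Pain Questionnaire", 4),
    Result("pain", "McGill Pain Questionnaire", 8),
    Result("pain", "Brief Pain Inventory", 4),
    Result("pain", "Brief Pain Inventory", 8),
    Result("pain", "Brief Pain Inventory", 52),
    Result("quality of life", "SF-36", 8),
]


def compatible(results, domain, max_weeks):
    """Return the results in `domain` measured earlier than `max_weeks`."""
    return [r for r in results if r.domain == domain and r.weeks < max_weeks]


# Assumption: "<6 months" is approximated as fewer than 26 weeks.
pain_under_6_months = compatible(study_results, "pain", max_weeks=26)
```

Here two measures at two time points leave four results that all satisfy the same review outcome definition, which is the situation that a selection process needs to address.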
Reviewers might seek all results that were compatible with each outcome definition from each study or use a process to select a subset of the results.65 Examples of processes to select results include selecting the outcome definition that (a) was most common across studies, (b) the review authors considered “best” according to a prespecified hierarchy (for example, which prioritises measures included in a core outcome measurement set), or (c) the study investigators considered most important (such as the study’s primary outcome). It is important to specify the methods that were used to select the results when multiple results were available so that users are able to judge the appropriateness of those methods and whether there is potential for bias in the selection of results.
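One way process (a) might be made explicit is sketched below: measures are ranked by how many studies report them, and each study contributes its highest-ranked measure. The study names, the measures listed, and the alphabetical tie-break are assumptions made for illustration only; a real review would prespecify its own rule.

```python
from collections import Counter

# Hypothetical: measures each study reported for one outcome domain.
measures_by_study = {
    "Study A": ["McGill Pain Questionnaire", "Brief Pain Inventory"],
    "Study B": ["Brief Pain Inventory"],
    "Study C": ["Brief Pain Inventory", "Visual analogue scale"],
}


def rank_measures(by_study):
    """Rank measures by number of studies reporting them (ties: alphabetical)."""
    counts = Counter(m for measures in by_study.values() for m in set(measures))
    return sorted(counts, key=lambda m: (-counts[m], m))


def select_per_study(by_study):
    """For each study, keep the measure that is most common across studies."""
    ranking = rank_measures(by_study)
    return {
        study: next((m for m in ranking if m in measures), None)
        for study, measures in by_study.items()
    }


selected = select_per_study(measures_by_study)
```

Writing the rule down in this form makes it easier for readers to see exactly which result was taken from each study and why.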
Reviewers may make changes to the inclusion or definition of the outcome domains or to the importance given to them in the review (for example, an outcome listed as “important” in the protocol is considered “critical” in the review). Providing a rationale for the change allows readers to assess the legitimacy of the change and whether it has potential to introduce bias in the review process.7
Example
Note: the following is an abridged version of an example presented in full in supplementary table S1 on bmj.com.
“Eligible outcomes were broadly categorised as follows:
- Cognitive function
  - Global cognitive function
  - Domain-specific cognitive function (especially domains that reflect specific alcohol-related neuropathologies, such as psychomotor speed and working memory)
- Clinical diagnoses of cognitive impairment
  - Mild cognitive impairment (also referred to as mild neurocognitive disorders)
Any measure of cognitive function was eligible for inclusion. The tests or diagnostic criteria used in each study should have had evidence of validity and reliability for the assessment of mild cognitive impairment, but studies were not excluded on this basis…Results could be reported as an overall test score that provides a composite measure across multiple areas of cognitive ability (i.e. global cognitive function), sub-scales that provide a measure of domain-specific cognitive function or cognitive abilities (such as processing speed, memory), or both…Studies with a minimum follow-up of 6 months were eligible, a time frame chosen to ensure that studies were designed to examine more persistent effects of alcohol consumption…No restrictions were placed on the number of points at which the outcome was measured, but the length of follow-up and number of measurement points (including a baseline measure of cognition) was considered when interpreting study findings and in deciding which outcomes were similar enough to combine for synthesis.
We anticipated that individual studies would report data for multiple cognitive outcomes. Specifically, a single study may report results:
- For multiple constructs related to cognitive function, for example, global cognitive function and cognitive ability on specific domains (e.g. memory, attention, problem-solving, language);
- Using multiple methods or tools to measure the same or similar outcome, for example reporting measures of global cognitive function using both the Mini-Mental State Examination and the Montreal Cognitive Assessment;
- At multiple time points, for example, at 1, 5, and 10 years.
Where multiple cognition outcomes were reported, we selected one outcome for inclusion in analyses and for reporting the main outcomes (e.g. for GRADEing), choosing the result that provided the most complete information for analysis. Where multiple results remained, we listed all available outcomes (without results) and asked our content expert to independently rank these based on relevance to the review question, and the validity and reliability of the measures used. Measures of global cognitive function were prioritised, followed by measures of memory, then executive function. In the circumstance where results from multiple multivariable models were presented, we extracted associations from the most fully adjusted model, except in the case where an analysis adjusted for a possible intermediary along the causal pathway (i.e. post-baseline measures of prognostic factors (e.g. smoking, drug use, hypertension)).”8
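Two of the selection rules described in this example (the domain priority order and the preference for the most fully adjusted model that does not adjust for an intermediary) can be sketched in code. This is a simplified illustration, not the review authors' actual procedure: the `Estimate` record, the use of a covariate count as a proxy for "most fully adjusted", and the sample data are all assumptions, and the "most complete information" and expert-ranking steps are not modelled.

```python
from dataclasses import dataclass

# Priority order stated in the example; a lower index means higher priority.
DOMAIN_PRIORITY = ["global cognitive function", "memory", "executive function"]


@dataclass(frozen=True)
class Estimate:
    domain: str
    n_covariates: int               # hypothetical proxy for degree of adjustment
    adjusts_for_intermediary: bool  # True if the model adjusts for a post-baseline factor


def select_estimate(estimates):
    """Pick one estimate: best-priority domain, then most fully adjusted eligible model."""
    eligible = [
        e for e in estimates
        if e.domain in DOMAIN_PRIORITY and not e.adjusts_for_intermediary
    ]
    if not eligible:
        return None
    return min(
        eligible,
        key=lambda e: (DOMAIN_PRIORITY.index(e.domain), -e.n_covariates),
    )


# Hypothetical estimates reported by one study.
reported = [
    Estimate("memory", 6, False),
    Estimate("global cognitive function", 2, False),
    Estimate("global cognitive function", 5, False),
    Estimate("global cognitive function", 8, True),
]
chosen = select_estimate(reported)
```

In this sketch the model with eight covariates is passed over because it adjusts for a possible intermediary, so the five-covariate model for global cognitive function is selected.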