Meta-analyses of Biomarker Associations With Cancer Risk
Abstract and Introduction
Abstract
Background Numerous biomarkers have been associated with cancer risk. We assessed whether there is evidence for excess statistical significance in results of cancer biomarker studies, suggesting biases.
Methods We systematically searched PubMed for meta-analyses of nongenetic biomarkers and cancer risk. The number of observed studies with statistically significant results was compared with the expected number, based on the statistical power of each study under different assumptions for the plausible effect size. We also evaluated small-study effects using asymmetry tests. All statistical tests were two-sided.
Results We included 98 meta-analyses with 847 studies. Forty-three meta-analyses (44%) found nominally statistically significant summary effects (random effects). The proportion of meta-analyses with statistically significant effects was highest for infectious agents (86%), inflammatory (67%), and insulin-like growth factor (IGF)/insulin system (52%) biomarkers. Overall, 269 (32%) individual studies reported nominally statistically significant results. A statistically significant excess of the observed over the expected number of studies with statistically significant results was seen in 20 meta-analyses. An excess of observed vs expected significant studies was found in studies of the IGF/insulin (P ≤ .04) and inflammation (P ≤ .02) systems. Only 12 meta-analyses (12%) had a statistically significant summary effect size, more than 1000 case patients, and no hints of small-study effects or excess statistical significance; only four of them had large effect sizes, three of which pertained to infectious agents (Helicobacter pylori, hepatitis viruses, and human papillomaviruses).
Conclusions Most well-documented biomarkers of cancer risk without evidence of bias pertain to infectious agents. Conversely, an excess of statistically significant findings was observed in studies of IGF/insulin and inflammation systems, suggesting reporting biases.
Introduction
Since the 1980s, biomarkers have frequently been used to refine exposure assessment and to evaluate potential associations with cancer incidence more accurately. Biomarkers may also further our understanding of the mechanisms of carcinogenesis. However, biomarker-based epidemiologic studies are not free from biases. Empirical evidence from diverse fields suggests that the literature of biomarker studies may sometimes highlight strong effects that prove irreproducible or turn out to be smaller when larger studies are performed. One particular concern is that the biomarker literature seems to suffer from selective reporting biases favoring statistically significant ("positive") results. Studies of biomarkers of cancer prognosis almost always report some statistically significant result. However, it is unclear whether this applies equally to studies that try to identify associations of biomarkers with cancer risk rather than prognosis.
Bias in favor of positive results may be generated by several different mechanisms. First, there may be bias against the publication of negative results, or such results may be published only after considerable delay. Second, selective analysis and outcome reporting bias may emerge when many analyses can be performed (using, for example, different outcome definitions, different adjustments for confounders, or models with different statistical terms for exposures and confounders) but only the analysis with the "best" results is presented. Third, in theory, positive results may be entirely fabricated, although fraud and data fabrication are unlikely to be nearly as common as the other mechanisms. All of these mechanisms ultimately produce a literature in which the proportion of published studies with positive results is inflated.
Detecting these biases is not straightforward. Several statistical methods try to probe for publication bias in studies included in meta-analyses, the most popular of which are asymmetry tests that evaluate whether small studies give different results than larger ones. However, these methods may not be very sensitive or specific for detecting such biases, especially when a meta-analysis includes only a limited number of studies. An alternative approach is to examine whether there are more reported statistically significant results in single studies than would be expected under different assumptions about the plausible effect size of each association. An added advantage of this excess statistical significance test is that it can be applied not only to a single meta-analysis but also to many meta-analyses across a given field; power is thus optimized to detect biases that pertain to larger fields and disciplines rather than to single associations. This test has been applied before and has found an excess of statistically significant findings in randomized trials of neuroleptic drugs, genetic association studies of Alzheimer's disease, and studies of brain volume abnormalities.
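To make the observed-vs-expected comparison and the asymmetry check concrete, the following is a minimal computational sketch under simplifying assumptions: it uses made-up study effects and standard errors, and it substitutes a simple binomial comparison for the chi-square statistic of the original excess significance test. It is not the analysis code used in this article; all function names and inputs are illustrative.

```python
# Illustrative sketch (not the authors' code) of the two bias checks described
# above: an excess statistical significance test and an Egger-type small-study
# (asymmetry) regression. Effects are log relative risks with standard errors.

import numpy as np
from scipy import stats

ALPHA = 0.05
Z_CRIT = stats.norm.ppf(1 - ALPHA / 2)  # ~1.96 for a two-sided test at .05


def expected_significant(plausible_log_rr, std_errors):
    """Expected number of nominally significant studies: the sum of each
    study's power to detect the plausible effect size at alpha = .05."""
    se = np.asarray(std_errors, dtype=float)
    mu = abs(plausible_log_rr) / se  # true effect expressed in SE units
    power = stats.norm.cdf(mu - Z_CRIT) + stats.norm.cdf(-mu - Z_CRIT)
    return power.sum()


def excess_significance_p(observed, plausible_log_rr, std_errors):
    """One-sided binomial test of observed vs expected significant studies
    (shown for simplicity in place of the usual chi-square comparison)."""
    n = len(std_errors)
    expected = expected_significant(plausible_log_rr, std_errors)
    result = stats.binomtest(observed, n, expected / n, alternative="greater")
    return expected, result.pvalue


def egger_asymmetry_p(log_rrs, std_errors):
    """Egger regression: regress each study's z score on its precision;
    an intercept far from zero suggests small-study effects."""
    log_rrs, se = np.asarray(log_rrs), np.asarray(std_errors)
    fit = stats.linregress(1 / se, log_rrs / se)
    t = fit.intercept / fit.intercept_stderr
    return 2 * stats.t.sf(abs(t), len(se) - 2)


# Toy meta-analysis: 10 studies, 6 nominally significant, plausible RR = 1.3
ses = [0.08, 0.10, 0.12, 0.15, 0.18, 0.20, 0.22, 0.25, 0.30, 0.35]
rrs = np.log([1.35, 1.28, 1.40, 1.55, 1.20, 1.60, 1.10, 1.70, 1.80, 1.90])

E, p_excess = excess_significance_p(observed=6, plausible_log_rr=np.log(1.3),
                                     std_errors=ses)
print(f"Expected significant: {E:.2f}, excess-significance P = {p_excess:.3f}")
print(f"Egger asymmetry P = {egger_asymmetry_p(rrs, ses):.3f}")
```

In practice, the plausible effect size in such a calculation is typically taken to be the effect observed in the largest study or a meta-analysis summary estimate, which corresponds to the "different assumptions for the plausible effect size" mentioned in the Methods.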
The literature on biomarkers and risk of cancer is rapidly expanding and is expected to grow even further with the incorporation of transcriptomics, proteomics, and metabolomics. It is important to understand the extent of potential biases in this field as multiple associations accumulate in its literature. Therefore, in this paper, we probed whether there is evidence for excess statistical significance in studies of biomarkers and risk of cancer, and we evaluated how many of the previously studied associations that have been synthesized in meta-analyses are supported by robust evidence.