TY - JOUR
T1 - A pluralistic framework for measuring, interpreting and decomposing heterogeneity in meta-analysis
AU - Yang, Yefeng
AU - Noble, Daniel W.A.
AU - Spake, Rebecca
AU - Senior, Alistair M.
AU - Lagisz, Malgorzata
AU - Nakagawa, Shinichi
N1 - © 2025 The Author(s).
PY - 2025
Y1 - 2025
AB - Measuring heterogeneity, or inconsistency, among effect sizes is a crucial step for interpreting meta-analytic evidence across diverse taxonomic groups and spatiotemporal contexts. However, ecologists and evolutionary biologists often interpret overall mean effects (mean population effects) as consistent across contexts, either explicitly or implicitly, without properly quantifying and interpreting heterogeneity. Here, we present a pluralistic approach that aims to quantify heterogeneity by introducing complementary metrics, each of which decomposes heterogeneity into within-study, between-study and between-species (species and phylogenetic) variances. These metrics include the traditional I2 (variance-standardized metric), the newly derived coefficient of variation for heterogeneity (CVH family; mean-standardized metric), the second-order coefficient of variation (M family; variance–mean-standardized metric) and their stratified variants. To demonstrate the benefits of the combined use of these measures, we synthesize heterogeneity estimates from 512 ecological and evolutionary meta-analyses. We show that total heterogeneity (variance of true effects) is, on average, 10 times larger than statistical noise (sampling error variance), accounting for 91% of the observed variance (median I2 = 91%). This amount of heterogeneity is nearly twice the size of the mean population effect (median CVH = 1.8 and M = 0.6), indicating substantial variation among studies within a meta-analysis. Moreover, different effect size types yield different values of heterogeneity metrics because they are inherently influenced by statistical properties of their effect size estimators. As such, comparisons of heterogeneity across effect size types should be made with caution, even though the proposed heterogeneity metrics are unit-free. Our large-scale synthesis also provides new benchmarks for the interpretation of heterogeneity and recommendations on how to quantify and report heterogeneity. New extensions for stratifying heterogeneity metrics will clarify our understanding of the generalisability of meta-analytic effects in ecology and evolution, and of the level at which those effects generalise.
KW - context dependence
KW - effect size
KW - heterogeneity
KW - linear models
KW - meta-analysis
KW - mixed effects model
UR - https://www.scopus.com/pages/publications/105016254775
U2 - 10.1111/2041-210x.70155
DO - 10.1111/2041-210x.70155
M3 - Article
AN - SCOPUS:105016254775
SN - 2041-210X
VL - 16
JO - Methods in Ecology and Evolution
JF - Methods in Ecology and Evolution
IS - 11
ER -