Lag length selection and p-hacking in Granger causality testing: prevalence and performance of meta-regression models

Stephan B. Bruns*, David I. Stern

*Corresponding author for this work

    Research output: Contribution to journal › Article › peer-review

    22 Citations (Scopus)

    Abstract

    The academic system incentivizes p-hacking, where researchers select estimates and statistics with statistically significant p-values for publication. We analyze the complete process of Granger causality testing including p-hacking using Monte Carlo simulations. If the degrees of freedom of the underlying vector autoregressive model are small to moderate, information criteria tend to overfit the lag length and overfitted vector autoregressive models tend to result in false-positive findings of Granger causality. Researchers may p-hack Granger causality tests by estimating multiple vector autoregressive models with different lag lengths and then selecting only those models that reject the null of Granger non-causality for presentation in the final publication. We show that overfitted lag lengths and the corresponding false-positive findings of Granger causality can frequently occur in research designs that are prevalent in empirical macroeconomics. We demonstrate that meta-regression models can control for spuriously significant Granger causality tests due to overfitted lag lengths. Finally, we find evidence that false-positive findings of Granger causality may be prevalent in the large literature that tests for Granger causality between energy use and economic output, while we do not find evidence for a genuine relation between these variables as tested in the literature.
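The lag-length p-hacking mechanism described in the abstract can be illustrated with a small Monte Carlo sketch. This is not the authors' simulation design. It is a minimal illustration under assumptions chosen here: two independent AR(1) series with coefficient 0.5 (so the null of Granger non-causality is true by construction), 40 observations, candidate lag lengths 1 to 4, and a chi-square approximation to the F-test so that only the standard library is needed. All function names are hypothetical. The sketch covers only selective reporting across lag lengths, not lag selection by information criteria.

```python
import math
import random

def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variable with integer df."""
    if x <= 0:
        return 1.0
    h = x / 2.0
    # Closed-form start: df = 2 (even case) or df = 1 (odd case),
    # then step df up by 2 using the incomplete-gamma recurrence.
    if df % 2 == 0:
        total, k = math.exp(-h), 2
    else:
        total, k = math.erfc(math.sqrt(h)), 1
    while k < df:
        total += h ** (k / 2) * math.exp(-h) / math.gamma(k / 2 + 1)
        k += 2
    return total

def ols_rss(rows, y):
    """Residual sum of squares from OLS, solved via the normal equations."""
    k = len(rows[0])
    # Augmented matrix [X'X | X'y]
    a = [[sum(r[i] * r[j] for r in rows) for j in range(k)]
         + [sum(r[i] * yi for r, yi in zip(rows, y))] for i in range(k)]
    for c in range(k):  # Gauss-Jordan elimination with partial pivoting
        piv = max(range(c, k), key=lambda r: abs(a[r][c]))
        a[c], a[piv] = a[piv], a[c]
        for r in range(k):
            if r != c:
                f = a[r][c] / a[c][c]
                a[r] = [u - f * v for u, v in zip(a[r], a[c])]
    beta = [a[i][k] / a[i][i] for i in range(k)]
    return sum((yi - sum(b * v for b, v in zip(beta, r))) ** 2
               for r, yi in zip(rows, y))

def granger_pvalue(y, x, p):
    """Approximate p-value for H0: lags 1..p of x do not help predict y."""
    times = range(p, len(y))
    target = [y[t] for t in times]
    restricted = [[1.0] + [y[t - j] for j in range(1, p + 1)] for t in times]
    unrestricted = [row + [x[t - j] for j in range(1, p + 1)]
                    for row, t in zip(restricted, times)]
    rss_r = ols_rss(restricted, target)
    rss_u = ols_rss(unrestricted, target)
    dof = len(target) - (2 * p + 1)
    f_stat = ((rss_r - rss_u) / p) / (rss_u / dof)
    # Assumption: p * F is referred to chi-square(p), an asymptotic approximation.
    return chi2_sf(p * f_stat, p)

def simulate_ar1(n, rho, rng):
    """Generate n observations of an AR(1) process with standard normal shocks."""
    series = [rng.gauss(0.0, 1.0)]
    for _ in range(n - 1):
        series.append(rho * series[-1] + rng.gauss(0.0, 1.0))
    return series

def false_positive_rates(reps=200, n=40, max_lag=4, alpha=0.05, seed=0):
    """Rejection rates under a true null: fixed lag 1 vs. best lag reported."""
    rng = random.Random(seed)
    fixed = hacked = 0
    for _ in range(reps):
        y = simulate_ar1(n, 0.5, rng)
        x = simulate_ar1(n, 0.5, rng)  # independent of y by construction
        pvals = [granger_pvalue(y, x, p) for p in range(1, max_lag + 1)]
        fixed += pvals[0] < alpha       # researcher commits to lag 1
        hacked += min(pvals) < alpha    # researcher reports any rejecting lag
    return fixed / reps, hacked / reps

if __name__ == "__main__":
    fixed_rate, hacked_rate = false_positive_rates()
    print(f"fixed lag: {fixed_rate:.3f}  any rejecting lag: {hacked_rate:.3f}")
```

Because the "any rejecting lag" rule rejects whenever the fixed-lag rule does, its false-positive rate can never be lower. How much higher it is depends on the assumed sample size, persistence, and range of candidate lags.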

    Original language: English
    Pages (from-to): 797-830
    Number of pages: 34
    Journal: Empirical Economics
    Volume: 56
    Issue number: 3
    Publication status: Published - 15 Mar 2019
