Hypothesis testing for topological data analysis

Andrew Robinson, Katharine Turner*

*Corresponding author for this work

    Research output: Contribution to journal › Article › peer-review

    37 Citations (Scopus)

    Abstract

    Persistent homology is a vital tool for topological data analysis. Previous work has developed some statistical estimators for characteristics of collections of persistence diagrams. However, tools that provide statistical inference for observations that are persistence diagrams are limited. Specifically, there is a need for tests that can assess the strength of evidence against a claim that two samples arise from the same population or process. This expository paper provides an introduction to randomization-style null hypothesis significance tests (NHST) and shows how they can be used with sets of persistence diagrams. The hypothesis test is based on a loss function that comprises pairwise distances between the elements of each sample and all the elements in the other sample. We use this method to analyze a range of simulated and experimental data. Through these examples we experimentally explore the power of the p-values. Our results show that the randomization-style NHST based on pairwise distances can distinguish between samples from different processes, which suggests that its use for hypothesis tests upon persistence diagrams is reasonable. We demonstrate its application on a real dataset of fMRI data from patients with ADHD.
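The randomization-style test summarized in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: it assumes the pairwise distances between diagrams have already been computed into a square matrix, it uses the mean cross-sample distance as a stand-in for the paper's loss function, and it substitutes points on a line for real persistence diagrams. The names `cross_sample_loss` and `permutation_p_value` are hypothetical.

```python
import random


def cross_sample_loss(dist, labels):
    """Mean distance between items labelled 0 and items labelled 1.

    `dist` is a square matrix (list of lists) of pairwise distances.
    Assumption: this statistic stands in for the paper's loss function.
    """
    group_a = [i for i, g in enumerate(labels) if g == 0]
    group_b = [i for i, g in enumerate(labels) if g == 1]
    total = sum(dist[i][j] for i in group_a for j in group_b)
    return total / (len(group_a) * len(group_b))


def permutation_p_value(dist, labels, n_permutations=999, seed=0):
    """Estimate a p-value by repeatedly shuffling the sample labels.

    Under the null hypothesis the labels are exchangeable, so the observed
    loss should look like a typical draw from the shuffled losses.
    """
    rng = random.Random(seed)
    observed = cross_sample_loss(dist, labels)
    shuffled = list(labels)
    at_least_as_extreme = 0
    for _ in range(n_permutations):
        rng.shuffle(shuffled)
        if cross_sample_loss(dist, shuffled) >= observed:
            at_least_as_extreme += 1
    # Count the observed labelling itself so the estimate is never zero.
    return (at_least_as_extreme + 1) / (n_permutations + 1)


# Toy stand-in for two samples of diagrams: points on a line, with the
# absolute difference playing the role of a distance between diagrams.
points = [0.0, 0.1, 0.2, 0.3, 5.0, 5.1, 5.2, 5.3]
labels = [0, 0, 0, 0, 1, 1, 1, 1]
dist = [[abs(p - q) for q in points] for p in points]

p_value = permutation_p_value(dist, labels)
print(f"estimated p-value: {p_value:.3f}")
```

In practice the entries of `dist` would come from a metric on persistence diagrams, such as a Wasserstein or bottleneck distance, computed with a TDA library of the reader's choice. A small p-value indicates that the observed separation between the two samples is rarely matched when the labels are shuffled.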

    Original language: English
    Pages (from-to): 241-261
    Number of pages: 21
    Journal: Journal of Applied and Computational Topology
    Volume: 1
    Issue number: 2
    Publication status: Published - Dec 2017
