Comparison and visualisation of agreement for paired lists of rankings

Margaret R. Donald*, Susan R. Wilson

*Corresponding author for this work

    Research output: Contribution to journal › Article › peer-review

    Abstract

    Output from the analysis of a high-throughput 'omics' experiment is very often a ranked list. One commonly encountered example is a ranked list of differentially expressed genes from a gene expression experiment, with a length of many hundreds of genes. There are numerous situations where interest lies in comparing the outputs of two (or more) different experiments, or of different approaches to the analysis that produce different ranked lists. Rather than considering exact agreement between the rankings, following others, we consider two ranked lists to be in agreement for an item if its rankings differ by no more than some fixed distance. Generally only a relatively small subset of the k top-ranked items will be in agreement, so the aim is to find the point k at which the probability of agreement in rankings changes from being greater than 0.5 to being less than 0.5. We use penalized splines and a Bayesian logit model to give a nonparametric smooth of the sequence of agreements, as well as pointwise credible intervals for the probability of agreement. Our approach produces a point estimate and a credible interval for k. R code is provided. The method is applied to rankings of genes from breast cancer microarray experiments.
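
The agreement idea in the abstract can be illustrated with a small sketch. This is not the authors' R code: the function names, the toy rankings, and the distance `delta` are hypothetical, and a plain moving average is swapped in for the penalized-spline Bayesian logit smooth, so no credible interval is produced. It assumes 1-based integer ranks and only shows how a 0/1 agreement sequence yields a crude estimate of k.

```python
from typing import List, Sequence


def agreement_indicators(rank_a: Sequence[int], rank_b: Sequence[int], delta: int) -> List[int]:
    """Return 1/0 agreement flags, ordered by each item's rank in list A.

    rank_a[i] and rank_b[i] are the (1-based) ranks of item i in the two lists.
    An item counts as "in agreement" when its two ranks differ by at most delta.
    """
    order = sorted(range(len(rank_a)), key=lambda i: rank_a[i])
    return [1 if abs(rank_a[i] - rank_b[i]) <= delta else 0 for i in order]


def moving_average(x: Sequence[int], window: int) -> List[float]:
    """Centered moving average, truncated at the ends (a stand-in smoother)."""
    half = window // 2
    out = []
    for j in range(len(x)):
        lo, hi = max(0, j - half), min(len(x), j + half + 1)
        out.append(sum(x[lo:hi]) / (hi - lo))
    return out


def estimate_k(p: Sequence[float], threshold: float = 0.5) -> int:
    """Number of leading positions before the smoothed probability first drops below threshold."""
    for j, pj in enumerate(p):
        if pj < threshold:
            return j
    return len(p)


# Toy example (made-up ranks): ten items ranked by two hypothetical analyses.
rank_a = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
rank_b = [1, 2, 4, 3, 5, 10, 9, 6, 8, 7]

flags = agreement_indicators(rank_a, rank_b, delta=1)
smooth = moving_average(flags, window=3)
k_hat = estimate_k(smooth)
print(flags, k_hat)
```

In this toy case the first five items agree to within one rank position and the smoothed agreement falls below 0.5 at the sixth, so the sketch reports k = 5. A model-based smooth, as described in the abstract, would additionally quantify the uncertainty in k.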

    Original language: English
    Pages (from-to): 31-45
    Number of pages: 15
    Journal: Statistical Applications in Genetics and Molecular Biology
    Volume: 16
    Issue number: 1
    Publication status: Published - 1 Mar 2017
