TY - JOUR
T1 - Comparison and visualisation of agreement for paired lists of rankings
AU - Donald, Margaret R.
AU - Wilson, Susan R.
N1 - Publisher Copyright:
© 2017 Walter de Gruyter GmbH, Berlin/Boston.
PY - 2017/3/1
Y1 - 2017/3/1
N2 - Output from analysis of a high-throughput 'omics' experiment very often is a ranked list. One commonly encountered example is a ranked list of differentially expressed genes from a gene expression experiment, with a length of many hundreds of genes. There are numerous situations where interest is in the comparison of outputs following, say, two (or more) different experiments, or of different approaches to the analysis that produce different ranked lists. Rather than considering exact agreement between the rankings, following others, we consider two ranked lists to be in agreement if the rankings differ by some fixed distance. Generally only a relatively small subset of the k top-ranked items will be in agreement. So the aim is to find the point k at which the probability of agreement in rankings changes from being greater than 0.5 to being less than 0.5. We use penalized splines and a Bayesian logit model, to give a nonparametric smooth to the sequence of agreements, as well as pointwise credible intervals for the probability of agreement. Our approach produces a point estimate and a credible interval for k. R code is provided. The method is applied to rankings of genes from breast cancer microarray experiments.
AB - Output from analysis of a high-throughput 'omics' experiment very often is a ranked list. One commonly encountered example is a ranked list of differentially expressed genes from a gene expression experiment, with a length of many hundreds of genes. There are numerous situations where interest is in the comparison of outputs following, say, two (or more) different experiments, or of different approaches to the analysis that produce different ranked lists. Rather than considering exact agreement between the rankings, following others, we consider two ranked lists to be in agreement if the rankings differ by some fixed distance. Generally only a relatively small subset of the k top-ranked items will be in agreement. So the aim is to find the point k at which the probability of agreement in rankings changes from being greater than 0.5 to being less than 0.5. We use penalized splines and a Bayesian logit model, to give a nonparametric smooth to the sequence of agreements, as well as pointwise credible intervals for the probability of agreement. Our approach produces a point estimate and a credible interval for k. R code is provided. The method is applied to rankings of genes from breast cancer microarray experiments.
UR - http://www.scopus.com/inward/record.url?scp=85016051785&partnerID=8YFLogxK
U2 - 10.1515/sagmb-2016-0036
DO - 10.1515/sagmb-2016-0036
M3 - Article
SN - 1544-6115
VL - 16
SP - 31
EP - 45
JO - Statistical Applications in Genetics and Molecular Biology
JF - Statistical Applications in Genetics and Molecular Biology
IS - 1
ER -