TY - JOUR
T1 - Visualisation of "high p, small n" data
AU - Pittelkow, Y. E.
AU - Wilson, S. R.
PY - 2007/12
Y1 - 2007/12
N2 - Development of methods for visualisation of high-dimensional data where the number of observations, n, is small compared to the number of variables, p, is of increasing importance. One major application is the burgeoning field of microarray (gene expression) experiments. Because of their high cost, the number of chips (n) is O(10 - 10^2) while the number (p) of genes (including expressed sequence tags) on each chip is O(10^3 - 10^4). Based on synthetic data simulated in accord with current biological interpretation of microarray data, we have adapted the biplot that simultaneously plots the genes and the chips to display relevant experimental information. Other ordination techniques are also useful for visually exploring microarray data. The biological information that can be revealed by applying these exploratory, visual techniques is illustrated using data from gene expression experiments. When ordination methods, or dimension reduction methods such as PCA and its many variants, are used, in association with gene selection methods, it is well known that "selection bias" can result. We show an application of bootstrap methodology to ordination methods that can be used to account for this bias. Such methods are invaluable when visualization methods are used for pattern recognition, such as when identifying previously unknown sub-classes of tumours in molecular classification.
AB - Development of methods for visualisation of high-dimensional data where the number of observations, n, is small compared to the number of variables, p, is of increasing importance. One major application is the burgeoning field of microarray (gene expression) experiments. Because of their high cost, the number of chips (n) is O(10 - 10^2) while the number (p) of genes (including expressed sequence tags) on each chip is O(10^3 - 10^4). Based on synthetic data simulated in accord with current biological interpretation of microarray data, we have adapted the biplot that simultaneously plots the genes and the chips to display relevant experimental information. Other ordination techniques are also useful for visually exploring microarray data. The biological information that can be revealed by applying these exploratory, visual techniques is illustrated using data from gene expression experiments. When ordination methods, or dimension reduction methods such as PCA and its many variants, are used, in association with gene selection methods, it is well known that "selection bias" can result. We show an application of bootstrap methodology to ordination methods that can be used to account for this bias. Such methods are invaluable when visualization methods are used for pattern recognition, such as when identifying previously unknown sub-classes of tumours in molecular classification.
UR - http://www.scopus.com/inward/record.url?scp=36849077206&partnerID=8YFLogxK
U2 - 10.1007/s00180-007-0060-1
DO - 10.1007/s00180-007-0060-1
M3 - Article
SN - 0943-4062
VL - 22
SP - 533
EP - 541
JO - Computational Statistics
JF - Computational Statistics
IS - 4
ER -