Abstract
Motivation: A number of algorithms have been proposed for the processing of feature-level data from high-density oligonucleotide microarrays to give estimates of transcript abundance. Performance in the common task of detecting differential expression between samples can be quantified by the statistical concepts of sensitivity and specificity, and represented by the use of receiver operating characteristic curves. These have been previously presented for small numbers of genes known to be differentially present in spiked-in samples. We present here a study of performance over a large number (thousands) of transcripts for which there is strong evidence of differential expression, with corresponding false positive rates controlled by comparisons between replicates.

Results: The straight-line regression analysis of a mixture series with replicates by five estimation algorithms produces a consensus set of 4462 transcripts with differential expression of agreed direction and high significance (p < 0.01) according to all algorithms. The more difficult task of two-sample tests between adjacent mixture levels produces performance curves of fraction true positive detected against significance level. Performance varies significantly between algorithms: at the p < 0.01 level, the detection rate varies between 41 and 66%. A control using comparisons between replicates at the same levels indicates that the tests produce empirical false positive rates closely matching the nominal p-values.
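The performance curve described in the Results can be illustrated with a minimal sketch. It assumes the two-sample tests have already produced one p-value per transcript, and it only shows how a detection rate and an empirical false positive rate could be read off at each significance level. The function names (`fraction_below`, `performance_curve`) and the p-values are hypothetical, made up for illustration; they are not the authors' code or data.

```python
from bisect import bisect_left


def fraction_below(p_values, alpha):
    """Return the fraction of p-values strictly below the level alpha."""
    ordered = sorted(p_values)
    return bisect_left(ordered, alpha) / len(ordered)


def performance_curve(consensus_p, replicate_p, alphas):
    """For each significance level, pair the detection rate with the empirical FPR.

    consensus_p: p-values from tests between adjacent mixture levels, restricted
                 to transcripts in the consensus set (treated as true positives).
    replicate_p: p-values from tests between replicates at the same level,
                 where no differential expression is expected (nulls).
    """
    return [
        (alpha, fraction_below(consensus_p, alpha), fraction_below(replicate_p, alpha))
        for alpha in alphas
    ]


# Made-up p-values, for illustration only (assumption, not data from the study).
consensus_p = [0.001, 0.004, 0.02, 0.2, 0.008]
replicate_p = [0.5, 0.03, 0.9, 0.005, 0.6, 0.7, 0.3, 0.12, 0.45, 0.8]

curve = performance_curve(consensus_p, replicate_p, [0.01, 0.05])
for alpha, detected, false_positive in curve:
    print(f"alpha={alpha}: detected={detected:.2f}, empirical FPR={false_positive:.2f}")
```

If the tests are well calibrated, the empirical false positive rate from the replicate comparisons should track the nominal significance level, while the detection rate is what separates one estimation algorithm from another.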
Original language | English |
---|---|
Pages (from-to) | 1060-1065 |
Number of pages | 6 |
Journal | Bioinformatics |
Volume | 20 |
Issue number | 7 |
Publication status | Published - 1 May 2004 |