Robust principal component analysis for power transformed compositional data

J. L. Scealy, Patrice de Caritat, Eric C. Grunsky, Michail T. Tsagris, A. H. Welsh

    Research output: Contribution to journal › Article › peer-review

    36 Citations (Scopus)

    Abstract

    Geochemical surveys collect sediment or rock samples, measure the concentrations of chemical elements, and typically report these either in weight percent or in parts per million (ppm). A large number of elements are usually measured, and the distributions are often skewed and contain many potential outliers. We present a new robust principal component analysis (PCA) method for geochemical survey data that involves first transforming the compositional data onto a manifold using a relative power transformation. A flexible set of moment assumptions is made that takes the special geometry of the manifold into account. The Kent distribution moment structure arises as a special case when the chosen manifold is the hypersphere. We derive simple moment and robust estimators (RO) of the parameters, which are also applicable in high-dimensional settings. The resulting PCA based on these estimators is carried out in the tangent space and is related to the power transformation method used in correspondence analysis. To illustrate, we analyze major oxide data from the National Geochemical Survey of Australia. When compared with the traditional approach in the literature based on the centered log-ratio transformation, the new PCA method is shown to be more successful at dimension reduction and gives interpretable results.
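
    As a rough illustration of the two kinds of transformation the abstract contrasts, the sketch below applies the centered log-ratio transformation and a square-root transformation (one power transformation, which places each composition on the unit hypersphere) to made-up compositional data, then runs ordinary PCA on each. It is only a sketch under stated assumptions: the function names, the simulated data, and the use of classical SVD-based PCA are illustrative choices, and it implements neither the robust estimators nor the tangent-space construction of the paper.

```python
import numpy as np


def closure(x):
    """Rescale each row so its parts sum to 1 (a composition)."""
    x = np.asarray(x, dtype=float)
    return x / x.sum(axis=1, keepdims=True)


def clr(comp):
    """Centered log-ratio transform: log parts minus their row mean."""
    logx = np.log(comp)
    return logx - logx.mean(axis=1, keepdims=True)


def sqrt_to_sphere(comp):
    """Square-root transform: each row then has unit Euclidean norm."""
    return np.sqrt(comp)


def pca_scores(z, k=2):
    """Classical (non-robust) PCA via SVD of the column-centered data.

    Returns the first k score columns and their shares of total variance.
    """
    zc = z - z.mean(axis=0, keepdims=True)
    _, s, vt = np.linalg.svd(zc, full_matrices=False)
    explained = s**2 / np.sum(s**2)
    return zc @ vt[:k].T, explained[:k]


# Hypothetical data: 50 samples of 5 strictly positive "concentrations".
rng = np.random.default_rng(0)
raw = rng.gamma(shape=2.0, scale=1.0, size=(50, 5))
comp = closure(raw)

clr_data = clr(comp)
sph_data = sqrt_to_sphere(comp)

clr_scores, clr_var = pca_scores(clr_data)
sph_scores, sph_var = pca_scores(sph_data)
```

    Comparing `clr_var` and `sph_var` shows how much variance the leading components capture under each transformation for this simulated data; it says nothing about the results reported for the National Geochemical Survey of Australia.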

    Original language: English
    Pages (from-to): 136-148
    Number of pages: 13
    Journal: Journal of the American Statistical Association
    Volume: 110
    Issue number: 509
    Publication status: Published - 2 Jan 2015
