Model-based simultaneous clustering and ordination of multivariate abundance data in ecology

    Research output: Contribution to journal › Article › peer-review

    11 Citations (Scopus)

    Abstract

    When studying multivariate abundance data, one of the main patterns ecologists are often interested in is whether the sites exhibit clustering in the low-dimensional ordination space representing species composition. A new model-based approach called CORAL (Clustering and Ordination Regression AnaLysis) is developed for tackling this question, based on performing simultaneous clustering and ordination using latent variable regression. By drawing the latent variables from a finite mixture density, CORAL probabilistically classifies sites based on their positions in an underlying signal space. This is similar to mixtures of factor analyzers, except that CORAL is designed for non-normal responses and uses species-specific rather than cluster-specific factor loadings (regression coefficients). Estimation is performed via Bayesian MCMC sampling, with code provided in the Supplementary Material. Simulations demonstrate that, by utilizing the joint information available in the data for both classification and dimension reduction, CORAL outperforms several popular algorithm-based methods for clustering and ordination in ecology. CORAL is applied to a dataset of presence–absence records collected at sites along the Doubs River near the France–Switzerland border, with results revealing two clusters or ecological regions partly resembling the spatial separation of upstream and downstream sites.
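
    The generative structure described in the abstract can be illustrated with a small simulation. The following is only a minimal sketch under stated assumptions, not the authors' implementation (their MCMC code is in the Supplementary Material): it assumes Gaussian mixture components with identity covariance for the latent variables and a logit link for presence-absence responses. The function name `simulate_coral_like` and all of its parameters are hypothetical and chosen for illustration.

```python
import math
import random


def inv_logit(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def simulate_coral_like(n_sites=30, n_species=8, n_clusters=2, n_latent=2, seed=0):
    """Simulate presence-absence data from a mixture-based latent variable model.

    Assumptions (not taken from the paper): equal mixing proportions,
    Gaussian components with identity covariance, and a logit link.
    """
    rng = random.Random(seed)

    # Mixing proportions and cluster means in the latent (ordination) space.
    weights = [1.0 / n_clusters] * n_clusters
    means = [[rng.gauss(0.0, 2.0) for _ in range(n_latent)]
             for _ in range(n_clusters)]

    # Species-specific intercepts and loadings, shared across all clusters.
    intercepts = [rng.gauss(0.0, 1.0) for _ in range(n_species)]
    loadings = [[rng.gauss(0.0, 1.0) for _ in range(n_latent)]
                for _ in range(n_species)]

    labels, latent, y = [], [], []
    for _ in range(n_sites):
        # Draw the site's cluster, then its latent position from that component.
        k = rng.choices(range(n_clusters), weights=weights)[0]
        u = [rng.gauss(means[k][d], 1.0) for d in range(n_latent)]

        # Latent variable regression: one linear predictor per species.
        row = []
        for j in range(n_species):
            eta = intercepts[j] + sum(u[d] * loadings[j][d] for d in range(n_latent))
            row.append(1 if rng.random() < inv_logit(eta) else 0)

        labels.append(k)
        latent.append(u)
        y.append(row)
    return labels, latent, y


if __name__ == "__main__":
    labels, latent, y = simulate_coral_like(n_sites=5, n_species=4, seed=1)
    for k, row in zip(labels, y):
        print(k, row)
```

    In this sketch the cluster label affects the responses only through the site's latent position, while the loadings stay species-specific, which is the contrast with mixtures of factor analyzers that the abstract draws. Fitting such a model (the Bayesian MCMC step) is not shown.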

    Original language: English
    Pages (from-to): 1-10
    Number of pages: 10
    Journal: Computational Statistics and Data Analysis
    Volume: 105
    Publication status: Published - 1 Jan 2017
