Analysing spectroscopy data using two-step group penalized partial least squares regression

Le Chang, Jiali Wang*, William Woodgate

*Corresponding author for this work

    Research output: Contribution to journal › Article › peer-review

    3 Citations (Scopus)

    Abstract

    A statistical challenge in analysing hyperspectral data is the multicollinearity between spectral bands. Partial least squares (PLS) has been used extensively as a dimensionality reduction technique that constructs lower-dimensional latent variables from the spectral bands that correlate with the response variables. However, it does not take into account the grouping structure of the full spectrum, where spectral subsets may exhibit distinct relationships with the response variables. We propose a two-step group penalized PLS regression approach. In the first step, a PLS regression is performed on each group of predictors identified by a clustering approach. In the second step, a group penalty is imposed on the latent components to select the group with the highest predictive power. When applied to simulations and a real-world observational data set, the proposed method demonstrated superior prediction performance, higher R-squared values and faster computation times than other PLS variations. Interpretations of the model performance are illustrated using the real-world data example of leaf spectra to indirectly quantify leaf traits. The method is implemented in an R package called “groupPLS”, which is accessible from github.com/jialiwang1211/groupPLS.
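
The two-step idea described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation and not the API of the groupPLS R package. It assumes the band groups are already known (the clustering step is skipped), a single response, a standard NIPALS-style PLS1 for the per-group latent components, and a group lasso solved by proximal gradient descent as the group penalty. The function names `pls1_scores` and `group_lasso`, the simulated data, and the penalty value `lam` are all hypothetical choices made for this example.

```python
import numpy as np

def pls1_scores(X, y, n_components):
    """PLS1 score vectors for one group of bands (NIPALS-style, single response)."""
    X = X - X.mean(axis=0)          # centred copy; the caller's array is untouched
    y = y - y.mean()
    scores = []
    for _ in range(n_components):
        w = X.T @ y                 # weight vector: direction of maximal covariance with y
        w /= np.linalg.norm(w)
        t = X @ w                   # latent score
        p = X.T @ t / (t @ t)       # X loading
        X = X - np.outer(t, p)      # deflate X; y needs no deflation since X.T @ t is now 0
        scores.append(t)
    return np.column_stack(scores)

def group_lasso(T, y, groups, lam, n_iter=500):
    """Minimise (1/2n)||y - T b||^2 + lam * sum_g sqrt(p_g) ||b_g||_2 by proximal gradient."""
    n, p = T.shape
    beta = np.zeros(p)
    step = n / np.linalg.norm(T, 2) ** 2       # 1 / Lipschitz constant of the gradient
    for _ in range(n_iter):
        z = beta - step * (T.T @ (T @ beta - y) / n)
        for idx in groups:                     # block soft-thresholding, one group at a time
            norm = np.linalg.norm(z[idx])
            thresh = step * lam * np.sqrt(len(idx))
            z[idx] = 0.0 if norm <= thresh else (1.0 - thresh / norm) * z[idx]
        beta = z
    return beta

# Simulated spectra (assumption): 3 groups of 10 highly collinear bands each,
# with the response driven only by the first group.
rng = np.random.default_rng(0)
n, bands_per_group, n_groups, n_comp = 200, 10, 3, 2
factors = rng.normal(size=(n, n_groups))
X_groups = [factors[:, [g]] + 0.1 * rng.normal(size=(n, bands_per_group))
            for g in range(n_groups)]
y = 2.0 * factors[:, 0] + 0.1 * rng.normal(size=n)

# Step 1: a separate PLS fit per group yields that group's latent components.
T = np.column_stack([pls1_scores(Xg, y, n_comp) for Xg in X_groups])
groups = [np.arange(g * n_comp, (g + 1) * n_comp) for g in range(n_groups)]

# Step 2: the group penalty on the stacked latent components selects groups.
beta = group_lasso(T, y - y.mean(), groups, lam=2.0)
selected = [g for g, idx in enumerate(groups) if np.any(beta[idx] != 0.0)]
print("selected groups:", selected)
```

In this sketch the penalty acts on whole blocks of latent components, so a group is either kept or dropped as a unit. In practice the number of components per group and the penalty level would be tuned, for example by cross-validation, rather than fixed as they are here.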

    Original language: English
    Pages (from-to): 445-467
    Number of pages: 23
    Journal: Environmental and Ecological Statistics
    Volume: 28
    Issue number: 2
    Publication status: Published - Jun 2021
