TY - JOUR
T1 - Do feature selection methods for selecting environmental covariables enhance genomic prediction accuracy?
AU - Montesinos-López, Osval A.
AU - Crespo-Herrera, Leonardo
AU - Saint Pierre, Carolina
AU - Bentley, Alison R.
AU - de la Rosa-Santamaria, Roberto
AU - Ascencio-Laguna, José Alejandro
AU - Agbona, Afolabi
AU - Gerard, Guillermo S.
AU - Montesinos-López, Abelardo
AU - Crossa, José
N1 - Publisher Copyright: Copyright © 2023 Montesinos-López, Crespo-Herrera, Saint Pierre, Bentley, de la Rosa-Santamaria, Ascencio-Laguna, Agbona, Gerard, Montesinos-López and Crossa.
PY - 2023
Y1 - 2023
N2 - Genomic selection (GS) is transforming plant and animal breeding, but its practical implementation for complex traits and multi-environmental trials remains challenging. To address this issue, this study investigates the integration of environmental information with genotypic information in GS. The study proposes the use of two feature selection methods (Pearson’s correlation and Boruta) for the integration of environmental information. Results indicate that the simple incorporation of environmental covariates may increase or decrease prediction accuracy depending on the case. However, optimal incorporation of environmental covariates using feature selection significantly improves prediction accuracy in four out of six datasets, by between 14.25% and 218.71%, under a leave-one-environment-out cross-validation scenario in terms of Normalized Root Mean Squared Error, but no relevant gain was observed in terms of Pearson’s correlation. In two datasets where environmental covariates are unrelated to the response variable, feature selection is unable to enhance prediction accuracy. Therefore, the study provides empirical evidence supporting the use of feature selection to improve the prediction power of GS.
AB - Genomic selection (GS) is transforming plant and animal breeding, but its practical implementation for complex traits and multi-environmental trials remains challenging. To address this issue, this study investigates the integration of environmental information with genotypic information in GS. The study proposes the use of two feature selection methods (Pearson’s correlation and Boruta) for the integration of environmental information. Results indicate that the simple incorporation of environmental covariates may increase or decrease prediction accuracy depending on the case. However, optimal incorporation of environmental covariates using feature selection significantly improves prediction accuracy in four out of six datasets, by between 14.25% and 218.71%, under a leave-one-environment-out cross-validation scenario in terms of Normalized Root Mean Squared Error, but no relevant gain was observed in terms of Pearson’s correlation. In two datasets where environmental covariates are unrelated to the response variable, feature selection is unable to enhance prediction accuracy. Therefore, the study provides empirical evidence supporting the use of feature selection to improve the prediction power of GS.
KW - environmental covariables
KW - feature selection
KW - genomic prediction
KW - genomic selection
KW - genotype x environment interaction
UR - http://www.scopus.com/inward/record.url?scp=85167329824&partnerID=8YFLogxK
U2 - 10.3389/fgene.2023.1209275
DO - 10.3389/fgene.2023.1209275
M3 - Article
AN - SCOPUS:85167329824
SN - 1664-8021
VL - 14
JO - Frontiers in Genetics
JF - Frontiers in Genetics
M1 - 1209275
ER -