TY - JOUR
T1 - Explainable machine learning models of major crop traits from satellite-monitored continent-wide field trial data
AU - Newman, Saul Justin
AU - Furbank, Robert T.
N1 - Publisher Copyright: © 2021, The Author(s), under exclusive licence to Springer Nature Limited.
PY - 2021/10
Y1 - 2021/10
N2 - Four species of grass generate half of all human-consumed calories. However, abundant biological data on species that produce our food remain largely inaccessible, imposing direct barriers to understanding crop yield and fitness traits. Here, we assemble and analyse a continent-wide database of field experiments spanning 10 years and hundreds of thousands of machine-phenotyped populations of ten major crop species. Training an ensemble of machine learning models, using thousands of variables capturing weather, ground sensor, soil, chemical and fertilizer dosage, management and satellite data, produces robust cross-continent yield models exceeding R² = 0.8 prediction accuracy. In contrast to ‘black box’ analytics, detailed interrogation of these models reveals drivers of crop behaviour and complex interactions predicting yield and agronomic traits. These results demonstrate the capacity of machine learning models to interrogate large datasets, generate new and testable outputs and predict crop behaviour, highlighting the powerful role of data in the future of food.
AB - Four species of grass generate half of all human-consumed calories. However, abundant biological data on species that produce our food remain largely inaccessible, imposing direct barriers to understanding crop yield and fitness traits. Here, we assemble and analyse a continent-wide database of field experiments spanning 10 years and hundreds of thousands of machine-phenotyped populations of ten major crop species. Training an ensemble of machine learning models, using thousands of variables capturing weather, ground sensor, soil, chemical and fertilizer dosage, management and satellite data, produces robust cross-continent yield models exceeding R² = 0.8 prediction accuracy. In contrast to ‘black box’ analytics, detailed interrogation of these models reveals drivers of crop behaviour and complex interactions predicting yield and agronomic traits. These results demonstrate the capacity of machine learning models to interrogate large datasets, generate new and testable outputs and predict crop behaviour, highlighting the powerful role of data in the future of food.
UR - http://www.scopus.com/inward/record.url?scp=85116312182&partnerID=8YFLogxK
U2 - 10.1038/s41477-021-01001-0
DO - 10.1038/s41477-021-01001-0
M3 - Article
SN - 2055-026X
VL - 7
SP - 1354
EP - 1363
JO - Nature Plants
JF - Nature Plants
IS - 10
ER -