The impact of domain-driven and data-driven feature selection on the inverse design of nanoparticle catalysts

Sichao Li, Ting Jonathan Y.C, Amanda S. Barnard*

*Corresponding author for this work

    Research output: Contribution to journal, Article, peer-review

    4 Citations (Scopus)

    Abstract

    Incorporating practical considerations into machine learning can make predictions more actionable. However, researcher interventions in the learning process may have negative impacts on model performance, leading to a trade-off between accuracy and utility. In this paper we use multi-target machine learning to predict the structure of platinum nanocatalysts based on property indicators and develop intervention scenarios using ratios of data-driven (optimal) and domain-driven (preferable) variables during feature selection. We show that minor interventions to data-driven feature selection can be tolerated, and even improve model performance, but aggressive domain-driven feature selection degrades performance, even if the mapping function is perfectly balanced.
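The intervention scenarios described in the abstract mix data-driven and domain-driven variables at a chosen ratio during feature selection. The following is a minimal sketch of that idea and not the authors' implementation. The function name `mix_features`, the parameter names, the truncating rounding rule, and the placeholder feature names are all assumptions made for illustration.

```python
def mix_features(data_ranked, domain_preferred, k, domain_ratio):
    """Build a feature subset of size k (hypothetical helper).

    data_ranked      -- feature names ordered best-first by a data-driven criterion
    domain_preferred -- feature names a researcher would prefer to use
    k                -- total number of features to keep
    domain_ratio     -- fraction of the k slots reserved for domain-preferred features
    """
    if not 0.0 <= domain_ratio <= 1.0:
        raise ValueError("domain_ratio must lie in [0, 1]")

    # Assumption: reserve int(k * ratio) slots, capped by the domain list length.
    n_domain = min(int(k * domain_ratio), len(domain_preferred))
    chosen = list(domain_preferred[:n_domain])

    # Fill the remaining slots from the data-driven ranking, skipping duplicates.
    for name in data_ranked:
        if len(chosen) >= k:
            break
        if name not in chosen:
            chosen.append(name)
    return chosen


# Placeholder feature names, not taken from the paper.
data_ranked = ["f1", "f2", "f3", "f4", "f5"]
domain_preferred = ["f5", "f2", "f9"]

no_intervention = mix_features(data_ranked, domain_preferred, 4, 0.0)
balanced = mix_features(data_ranked, domain_preferred, 4, 0.5)
fully_domain = mix_features(data_ranked, domain_preferred, 4, 1.0)
```

Sweeping `domain_ratio` from 0 to 1 and retraining a multi-target model on each subset would give one way to trace how performance changes as the intervention becomes more aggressive.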

    Original language: English
    Article number: 101896
    Pages (from-to): 1-14
    Number of pages: 14
    Journal: Journal of Computational Science
    Volume: 65
    Publication status: Published - Nov 2022
