Abstract
Incorporating practical considerations into machine learning can make predictions more actionable. However, researcher interventions in the learning process may degrade model performance, leading to a trade-off between accuracy and utility. In this paper, we use multi-target machine learning to predict the structure of platinum nanocatalysts from property indicators, and we develop intervention scenarios that vary the ratio of data-driven (optimal) to domain-driven (preferable) variables during feature selection. We show that minor interventions in data-driven feature selection can be tolerated, and can even improve model performance, but that aggressive domain-driven feature selection degrades performance, even if the mapping function is perfectly balanced.
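As an illustration of what an intervention scenario could look like, the sketch below mixes a data-driven feature ranking with a domain-preferred feature list at a chosen ratio. It is a minimal example under stated assumptions, not the procedure used in the paper: the function `mix_features`, its parameters, and the placeholder feature names are all hypothetical.

```python
from typing import List, Sequence


def mix_features(
    data_ranked: Sequence[str],
    domain_preferred: Sequence[str],
    k: int,
    domain_ratio: float,
) -> List[str]:
    """Select k features, reserving a share for domain-preferred ones.

    Hypothetical helper (not from the paper). `data_ranked` is assumed to be
    ordered from most to least important by some data-driven method, and
    `domain_preferred` is ordered by researcher preference.
    """
    if not 0.0 <= domain_ratio <= 1.0:
        raise ValueError("domain_ratio must lie in [0, 1]")

    # Number of slots reserved for domain-driven features.
    n_domain = min(round(k * domain_ratio), len(domain_preferred))
    selected = list(domain_preferred[:n_domain])

    # Fill the remaining slots from the data-driven ranking, skipping duplicates.
    for name in data_ranked:
        if len(selected) >= k:
            break
        if name not in selected:
            selected.append(name)
    return selected


# Placeholder feature names, purely illustrative.
data_ranked = ["a", "b", "c", "d", "e"]
domain_preferred = ["d", "x", "y", "z"]

no_intervention = mix_features(data_ranked, domain_preferred, k=4, domain_ratio=0.0)
balanced = mix_features(data_ranked, domain_preferred, k=4, domain_ratio=0.5)
fully_domain = mix_features(data_ranked, domain_preferred, k=4, domain_ratio=1.0)
```

Sweeping `domain_ratio` from 0 to 1 and retraining a model on each selected subset is one way to trace how performance changes as the intervention becomes more aggressive.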
Original language | English |
---|---|
Article number | 101896 |
Pages (from-to) | 1-14 |
Number of pages | 14 |
Journal | Journal of Computational Science |
Volume | 65 |
DOIs | |
Publication status | Published - Nov 2022 |