TY - JOUR
T1 - Property Prediction for Complex Compounds Using Structure-Free Mendeleev Encoding and Machine Learning
AU - Zhuang, Zixin
AU - Barnard, Amanda S.
N1 - Publisher Copyright: © 2024 American Chemical Society.
PY - 2024/12/12
Y1 - 2024/12/12
AB - Predicting the properties of unseen materials exclusively on the basis of the chemical formula, before synthesis and characterization, has advantages for research and resource planning. This can be achieved using suitable structure-free encoding and machine learning methods, but additional processing decisions are required. In this study, we compare a variety of structure-free materials encodings and machine learning algorithms to predict the structure/property relationships of battery materials. It was found that the physical units used to measure the property labels have an important impact on the predictive ability of the models, regardless of the computational approach. Property labels with respect to weight give excellent performance, but property labels with respect to volume cannot be predicted with confidence using only chemical information, even when the underlying physical characteristics are the same. These results contrast with previous studies of unsupervised learning and classification, where structure-free encoding excelled, and highlight that the way in which the structural features or property labels of materials are represented plays an important role in the predictive ability of machine learning models.
UR - http://www.scopus.com/inward/record.url?scp=85212043843&partnerID=8YFLogxK
U2 - 10.1021/acs.jcim.4c01343
DO - 10.1021/acs.jcim.4c01343
M3 - Article
AN - SCOPUS:85212043843
SN - 1549-9596
VL - 64
SP - 9205
EP - 9214
JO - Journal of Chemical Information and Modeling
JF - Journal of Chemical Information and Modeling
IS - 24
ER -