TY - JOUR
T1 - Peering into the black box of artificial intelligence: Evaluation metrics of machine learning methods
AU - Handelman, Guy S.
AU - Kok, Hong Kuan
AU - Chandra, Ronil V.
AU - Razavi, Amir H.
AU - Huang, Shiwei
AU - Brooks, Mark
AU - Lee, Michael J.
AU - Asadi, Hamed
N1 - Publisher Copyright: © American Roentgen Ray Society.
PY - 2019/1
Y1 - 2019/1
AB - OBJECTIVE. Machine learning (ML) and artificial intelligence (AI) are rapidly becoming the most talked about and controversial topics in radiology and medicine. Over the past few years, the numbers of ML- or AI-focused studies in the literature have increased almost exponentially, and ML has become a hot topic at academic and industry conferences. However, despite the increased awareness of ML as a tool, many medical professionals have a poor understanding of how ML works and how to critically appraise studies and tools that are presented to us. Thus, we present a brief overview of ML, explain the metrics used in ML and how to interpret them, and explain some of the technical jargon associated with the field so that readers with a medical background and basic knowledge of statistics can feel more comfortable when examining ML applications. CONCLUSION. Attention to sample size, overfitting, underfitting, cross validation, as well as a broad knowledge of the metrics of machine learning, can help those with little or no technical knowledge begin to assess machine learning studies. However, transparency in methods and sharing of algorithms is vital to allow clinicians to assess these tools themselves.
KW - Artificial intelligence
KW - Machine learning
KW - Medicine
KW - Supervised machine learning
KW - Unsupervised machine learning
UR - http://www.scopus.com/inward/record.url?scp=85058871811&partnerID=8YFLogxK
U2 - 10.2214/AJR.18.20224
DO - 10.2214/AJR.18.20224
M3 - Review article
SN - 0361-803X
VL - 212
SP - 38
EP - 43
JO - American Journal of Roentgenology
JF - American Journal of Roentgenology
IS - 1
ER -