Nir Nissim, Mary Regina Boland, Robert Moskovitch, Nicholas Tatonetti, Yuval Elovici, Yuval Shahar, George Hripcsak
Conference on Artificial Intelligence in Medicine in Europe AIME 2015: Artificial Intelligence in Medicine pp 13-24
Understanding condition severity, as extracted from Electronic Health Records (EHRs), is important for many public health purposes. Methods requiring physicians to annotate condition severity are time-consuming and costly. Previously, a passive learning algorithm called CAESAR was developed to capture severity in EHRs. This approach required physicians to label conditions manually, an exhaustive process. We developed a framework that uses two Active Learning (AL) methods (Exploitation and Combination_XA) to decrease manual labeling efforts by selecting only the most informative conditions for training. We call our approach CAESAR-Active Learning Enhancement (CAESAR-ALE). As compared to passive methods,CAESAR-ALE’s first AL method, Exploitation, reduced labeling efforts by 64% and achieved an equivalent true positive rate, while CAESAR-ALE’s second AL method, Combination_XA, reduced labeling efforts by 48% and achieved equivalent accuracy. In addition, both these AL methods outperformed the traditional AL method (SVM-Margin). These results demonstrate the potential of AL methods for decreasing the labeling efforts of medical experts, while achieving greater accuracy and lower costs.