Evaluation of Various Open-Set Medical Imaging Tasks with Deep Neural Networks
The current generation of deep neural networks has achieved close-to-human results on "closed-set" image recognition; that is, settings where the classes encountered at test time are drawn from the training classes. Many recent methods address the problem of unknown classes: these "open-set" recognition algorithms try to reject unknown classes while maintaining high recognition accuracy on known classes. However, it is still unclear how open-set methods developed on general-domain datasets such as ImageNet would perform in a different, more specific domain such as medical imaging. Without principled and formal evaluations to measure the effectiveness of those general open-set methods, artificial intelligence (AI)-based medical diagnostics would experience ineffective adoption and increased risks of poor decision-making. In this paper, we conduct rigorous evaluations of state-of-the-art open-set methods, exploring open-set scenarios ranging from "similar-domain" to "different-domain" and comparing the methods on various general and medical domain datasets. We summarise the results and core ideas and explain how the models react to various degrees of openness and different distributions of open classes. We show the main differences between general domain-trained and medical domain-trained open-set models through quantitative and qualitative analysis of the results. We also identify aspects of model robustness relevant to real clinical workflows, namely confidence calibration and inference efficiency.
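To make the reject-or-classify behaviour described above concrete, the sketch below illustrates the common maximum-softmax-probability baseline for open-set rejection. This is not the paper's method; the function name, label convention (`-1` for unknown), and threshold value are illustrative assumptions.

```python
import numpy as np

def predict_open_set(logits: np.ndarray, threshold: float = 0.9) -> np.ndarray:
    """Classify each sample over the known classes, rejecting it as
    'unknown' (-1) when the maximum softmax probability is below threshold.

    Note: the threshold (0.9 here) is a hypothetical value; in practice it
    would be tuned on a validation set of known classes."""
    # Numerically stable softmax over the known-class logits.
    shifted = logits - logits.max(axis=1, keepdims=True)
    probs = np.exp(shifted) / np.exp(shifted).sum(axis=1, keepdims=True)
    confidence = probs.max(axis=1)
    labels = probs.argmax(axis=1)
    # Low-confidence samples are treated as open-set (unknown) inputs.
    labels[confidence < threshold] = -1
    return labels

# Example: two confident "known" predictions and one ambiguous sample.
logits = np.array([[8.0, 0.5, 0.2],
                   [0.1, 7.5, 0.3],
                   [1.1, 1.0, 0.9]])
print(predict_open_set(logits))  # -> [ 0  1 -1]
```

Thresholding confidence in this way is exactly where calibration matters: a poorly calibrated model assigns high confidence to unknown inputs, so they are never rejected.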