A multi-site study of a breast density deep learning model for full-field digital mammography and digital breast tomosynthesis exams
Purpose: To develop a Breast Imaging Reporting and Data System (BI-RADS) breast density deep learning (DL) model in a multi-site setting for synthetic 2D mammography (SM) images derived from 3D digital breast tomosynthesis (DBT) exams, using full-field digital mammography (FFDM) images and limited SM data. Materials and Methods: In this retrospective study, a DL model was trained to predict BI-RADS breast density using FFDM images acquired from 2008 to 2017 (Site 1: 57492 patients, 187627 exams, 750752 images). The FFDM model was evaluated using SM datasets from two institutions (Site 1: 3842 patients, 3866 exams, 14472 images, acquired from 2016 to 2017; Site 2: 7557 patients, 16283 exams, 63973 images, acquired from 2015 to 2019). Adaptation methods were investigated to improve performance on the SM datasets, and the effect of dataset size on each adaptation method was considered. Statistical significance was assessed using confidence intervals (CI), estimated by bootstrapping. Results: Without adaptation, the model demonstrated close agreement with the original reporting radiologists for all three datasets (Site 1 FFDM: linearly-weighted κ_w = 0.75, 95% CI: [0.74, 0.76]; Site 1 SM: κ_w = 0.71, CI: [0.64, 0.78]; Site 2 SM: κ_w = 0.72, CI: [0.70, 0.75]). With adaptation using only 500 SM images from each site, performance improved for Site 2 (Site 1: κ_w = 0.72, CI: [0.66, 0.79]; Site 2: κ_w = 0.79, CI: [0.76, 0.81]). Conclusion: A BI-RADS breast density DL model demonstrated strong performance on FFDM and SM images from two institutions without training on SM images, and improved with adaptation on only a few SM images.
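The agreement metric and its confidence interval can be illustrated with a minimal sketch. It computes a linearly-weighted Cohen's kappa between two sets of four-category density labels and estimates a 95% CI with a percentile bootstrap. The function names, the exam-level resampling, the percentile method, and the synthetic labels are all assumptions made for illustration; the abstract does not specify how the bootstrap was carried out.

```python
import numpy as np


def linear_weighted_kappa(a, b, n_classes=4):
    """Linearly-weighted Cohen's kappa for integer labels in 0..n_classes-1."""
    a = np.asarray(a, dtype=int)
    b = np.asarray(b, dtype=int)
    # Joint distribution of the two raters' labels.
    observed = np.zeros((n_classes, n_classes))
    np.add.at(observed, (a, b), 1)
    observed /= observed.sum()
    # Distribution expected under chance agreement (product of marginals).
    expected = np.outer(observed.sum(axis=1), observed.sum(axis=0))
    # Linear disagreement weights: |i - j| / (n_classes - 1).
    idx = np.arange(n_classes)
    weights = np.abs(idx[:, None] - idx[None, :]) / (n_classes - 1)
    denom = (weights * expected).sum()
    if denom == 0:
        return float("nan")  # undefined when both raters use a single class
    return 1.0 - (weights * observed).sum() / denom


def bootstrap_ci(a, b, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap CI (assumed method), resampling exams with replacement."""
    a = np.asarray(a, dtype=int)
    b = np.asarray(b, dtype=int)
    rng = np.random.default_rng(seed)
    n = len(a)
    stats = np.empty(n_boot)
    for i in range(n_boot):
        sample = rng.integers(0, n, size=n)
        stats[i] = linear_weighted_kappa(a[sample], b[sample])
    lo, hi = np.nanquantile(stats, [alpha / 2, 1 - alpha / 2])
    return float(lo), float(hi)


# Synthetic labels only (not study data): the "model" is off by one
# category on roughly 30% of exams.
rng = np.random.default_rng(42)
radiologist = rng.integers(0, 4, size=500)
shift = rng.choice([-1, 0, 1], size=500, p=[0.15, 0.7, 0.15])
model = np.clip(radiologist + shift, 0, 3)

kappa = linear_weighted_kappa(radiologist, model)
ci_low, ci_high = bootstrap_ci(radiologist, model, n_boot=500)
print(f"kappa_w = {kappa:.2f}, 95% CI: [{ci_low:.2f}, {ci_high:.2f}]")
```

If several exams belong to the same patient, resampling at the patient level rather than the exam level may give more realistic interval widths; that choice is likewise an assumption here.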