USD: Unknown Sensitive Detector Empowered by Decoupled Objectness and Segment Anything Model
Open World Object Detection (OWOD) is a novel and challenging computer vision task that enables object detection with the ability to detect unknown objects. Existing methods typically estimate the object likelihood with an additional objectness branch, but ignore the conflict in learning objectness and classification boundaries, which oppose each other on the semantic manifold and training objective. To address this issue, we propose a simple yet effective learning strategy, namely Decoupled Objectness Learning (DOL), which divides the learning of these two boundaries into suitable decoder layers. Moreover, detecting unknown objects comprehensively requires a large amount of annotations, but labeling all unknown objects is both difficult and expensive. Therefore, we propose to take advantage of the recent Large Vision Model (LVM), specifically the Segment Anything Model (SAM), to enhance the detection of unknown objects. Nevertheless, the output results of SAM contain noise, including backgrounds and fragments, so we introduce an Auxiliary Supervision Framework (ASF) that uses a pseudo-labeling and a soft-weighting strategies to alleviate the negative impact of noise. Extensive experiments on popular benchmarks, including Pascal VOC and MS COCO, demonstrate the effectiveness of our approach. Our proposed Unknown Sensitive Detector (USD) outperforms the recent state-of-the-art methods in terms of Unknown Recall, achieving significant improvements of 14.3%, 15.5%, and 8.9% on the M-OWODB, and 27.1%, 29.1%, and 25.1% on the S-OWODB.
READ FULL TEXT