Overlap-aware diarization: resegmentation using neural end-to-end overlapped speech detection
We address the problem of effectively handling overlapping speech in a diarization system. First, we detail a neural architecture for overlap detection based on Long Short-Term Memory (LSTM). Second, the detected overlap regions are used together with a frame-level speaker posterior matrix to assign two speakers to each overlapped frame in the resegmentation step. The overlap detection module achieves state-of-the-art performance on the AMI, DIHARD, and ETAPE corpora. We apply overlap-aware resegmentation on AMI, resulting in a 20[…] approach is by no means an end-all solution to overlap-aware diarization, it reveals promising directions for handling overlap.
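The two-speaker assignment described above can be illustrated with a minimal sketch. It assumes the simplest reading of the abstract: each frame has a vector of per-speaker posteriors, an overlap detector supplies one boolean per frame, and overlapped frames receive the two most probable speakers while all other frames receive one. The function name `assign_speakers` and the toy values are hypothetical, and the full paper may differ in how the posteriors are obtained and smoothed.

```python
def assign_speakers(posteriors, overlap_mask):
    """Pick speakers per frame from a frame-level posterior matrix.

    posteriors: list of frames, each a list of per-speaker posteriors.
    overlap_mask: list of booleans, True where overlap was detected.
    Returns one tuple of speaker indices per frame (hypothetical helper,
    not the authors' implementation).
    """
    assignments = []
    for frame_post, is_overlap in zip(posteriors, overlap_mask):
        # Rank speaker indices by descending posterior for this frame.
        ranked = sorted(range(len(frame_post)),
                        key=lambda k: frame_post[k], reverse=True)
        # Overlapped frames keep the top two speakers, others the top one.
        n_speakers = 2 if is_overlap else 1
        assignments.append(tuple(ranked[:n_speakers]))
    return assignments


# Toy example: 3 frames, 3 speakers, overlap detected on the middle frame.
posteriors = [
    [0.70, 0.20, 0.10],
    [0.45, 0.50, 0.05],
    [0.10, 0.30, 0.60],
]
overlap_mask = [False, True, False]
result = assign_speakers(posteriors, overlap_mask)
print(result)
```

In this sketch the overlap detector acts purely as a gate: it decides how many speakers a frame gets, while the posterior matrix decides which ones.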