Comparing spoken segments is a central operation in speech processing.
T...
Many self-supervised speech models (S3Ms) have been introduced over the ...
Query-by-example (QbE) speech search is the task of matching spoken quer...
Segmental models are sequence prediction models in which scores of hypot...
Acoustic word embeddings (AWEs) are vector representations of spoken wor...
Direct acoustics-to-word (A2W) systems for end-to-end automatic speech
r...
Query-by-example search often uses dynamic time warping (DTW) for compar...
During language acquisition, infants have the benefit of visual cues to
...
Acoustic word embeddings --- fixed-dimensional vector representations of...