Automated Audio Captioning (AAC) is the task of generating natural langu...
End-to-end speech summarization has been shown to improve performance ov...
In the domain of audio processing, Transfer Learning has facilitated the...
This paper presents a novel approach for detecting ChatGPT-generated vs....
Emotions lie on a broad continuum and treating emotions as a discrete nu...
Traditionally, in paralinguistic analysis for emotion detection from spe...
This work presents a multitask approach to the simultaneous estimation o...
We present a conditional estimation (CEST) framework to learn 3D facial...
This paper addresses the deep face recognition problem under an open-set...
State-of-the-art deep face recognition methods are mostly trained with a...
Multiple studies in the past have shown that there is a strong correlati...
While multitask and transfer learning have been shown to improve the performan...
Open-set speaker recognition can be regarded as a metric learning proble...
In the pathogenesis of COVID-19, impairment of respiratory functions is ...
Phonation, or the vibration of the vocal folds, is the primary source of...
Weakly Labelled learning has garnered a lot of attention in recent years d...
Can vocal emotions be emulated? This question has been a recurrent conce...
Do men and women perceive emotions differently? Popular convictions plac...
So far, several physical models have been proposed for the study of voca...
Recent breakthroughs in the field of deep learning have led to advanceme...
Voice profiling aims at inferring various human parameters from their sp...
In regression tasks the distribution of the data is often too complex to...
Steganography is the science of hiding a secret message within an ordina...
Regression-via-Classification (RvC) is the process of converting a regre...
We propose a novel framework, called Disjoint Mapping Network (DIMNet), ...
In many retrieval problems, where we must retrieve one or more entries f...
Voice impersonation is not the same as voice transformation, although th...
This paper examines the speaker identification potential of breath sound...
Existing video indexing and retrieval methods on popular web-based multi...
Given the large number of new musical tracks released each year, automat...