Rita Singh

research

∙ 09/14/2023

Training Audio Captioning Models without Audio

Automated Audio Captioning (AAC) is the task of generating natural langu...

0 Soham Deshmukh, et al. ∙

research

∙ 07/17/2023

BASS: Block-wise Adaptation for Speech Summarization

End-to-end speech summarization has been shown to improve performance ov...

0 Roshan Sharma, et al. ∙

research

∙ 05/19/2023

Pengi: An Audio Language Model for Audio Tasks

In the domain of audio processing, Transfer Learning has facilitated the...

0 Soham Deshmukh, et al. ∙

research

∙ 05/13/2023

GPT-Sentinel: Distinguishing Human and ChatGPT Generated Content

This paper presents a novel approach for detecting ChatGPT-generated vs....

0 Yutian Chen, et al. ∙

research

∙ 11/14/2022

Describing emotions with acoustic property prompts for speech emotion recognition

Emotions lie on a broad continuum and treating emotions as a discrete nu...

0 Hira Dhamyal, et al. ∙

research

∙ 10/29/2022

Unifying the Discrete and Continuous Emotion labels for Speech Emotion Recognition

Traditionally, in paralinguistic analysis for emotion detection from spe...

0 Roshan Sharma, et al. ∙

research

∙ 06/25/2022

Self-supervision and Learnable STRFs for Age, Emotion, and Country Prediction

This work presents a multitask approach to the simultaneous estimation o...

0 Roshan Sharma, et al. ∙

research

∙ 10/10/2021

Self-Supervised 3D Face Reconstruction via Conditional Estimation

We present a conditional estimation (CEST) framework to learn 3D facial ...

0 Yandong Wen, et al. ∙

research

∙ 09/12/2021

SphereFace Revived: Unifying Hyperspherical Face Recognition

This paper addresses the deep face recognition problem under an open-set...

0 Weiyang Liu, et al. ∙

research

∙ 08/03/2021

SphereFace2: Binary Classification is All You Need for Deep Face Recognition

State-of-the-art deep face recognition methods are mostly trained with a...

10 Yandong Wen, et al. ∙

research

∙ 07/16/2021

Controlled AutoEncoders to Generate Faces from Voices

Multiple studies in the past have shown that there is a strong correlati...

0 Hao Liang, et al. ∙

research

∙ 06/12/2021

Improving weakly supervised sound event detection with self-supervised auxiliary tasks

While multitask and transfer learning have been shown to improve the performan...

0 Soham Deshmukh, et al. ∙

research

∙ 11/09/2020

Mask Proxy Loss for Text-Independent Speaker Recognition

Open-set speaker recognition can be regarded as a metric learning proble...

0 Jiachen Lian, et al. ∙

research

∙ 10/29/2020

Interpreting glottal flow dynamics for detecting COVID-19 from voice

In the pathogenesis of COVID-19, impairment of respiratory functions is ...

0 Soham Deshmukh, et al. ∙

research

∙ 10/21/2020

Detection of COVID-19 through the analysis of vocal fold oscillations

Phonation, or the vibration of the vocal folds, is the primary source of...

0 Mahmoud Al Ismail, et al. ∙

research

∙ 08/17/2020

Multi-Task Learning for Interpretable Weakly Labelled Sound Event Detection

Weakly Labelled learning has garnered a lot of attention in recent years d...

5 Soham Deshmukh, et al. ∙

research

∙ 11/13/2019

The phonetic bases of vocal expressed emotion: natural versus acted

Can vocal emotions be emulated? This question has been a recurrent conce...

0 Hira Dhamyal, et al. ∙

research

∙ 10/24/2019

Detecting gender differences in perception of emotion in crowdsourced data

Do men and women perceive emotions differently? Popular convictions plac...

0 Shahan Ali Memon, et al. ∙

research

∙ 10/20/2019

Speech-Based Parameter Estimation of an Asymmetric Vocal Fold Oscillation Model and Its Application in Discriminating Vocal Fold Pathologies

So far, several physical models have been proposed for the study of voca...

0 Wenbo Zhao, et al. ∙

research

∙ 05/26/2019

Non-Determinism in Neural Networks for Adversarial Robustness

Recent breakthroughs in the field of deep learning have led to advanceme...

0 Daanish Ali Khan, et al. ∙

research

∙ 05/25/2019

Reconstructing faces from voices

Voice profiling aims at inferring various human parameters from their sp...

0 Yandong Wen, et al. ∙

research

∙ 03/18/2019

Hierarchical Routing Mixture of Experts

In regression tasks the distribution of the data is often too complex to...

0 Wenbo Zhao, et al. ∙

research

∙ 02/07/2019

Hide and Speak: Deep Neural Networks for Speech Steganography

Steganography is the science of hiding a secret message within an ordina...

0 Felix Kreuk, et al. ∙

research

∙ 10/01/2018

Neural Regression Trees

Regression-via-Classification (RvC) is the process of converting a regre...

0 Shahan Ali Memon, et al. ∙

research

∙ 07/12/2018

Disjoint Mapping Network for Cross-modal Matching of Voices and Faces

We propose a novel framework, called Disjoint Mapping Network (DIMNet), ...

0 Yandong Wen, et al. ∙

research

∙ 07/12/2018

Optimal Strategies for Matching and Retrieval Problems by Comparing Covariates

In many retrieval problems, where we must retrieve one or more entries f...

0 Yandong Wen, et al. ∙

research

∙ 02/19/2018

Voice Impersonation using Generative Adversarial Networks

Voice impersonation is not the same as voice transformation, although th...

0 Yang Gao, et al. ∙

research

∙ 12/01/2017

Speaker identification from the sound of the human breath

This paper examines the speaker identification potential of breath sound...

0 Wenbo Zhao, et al. ∙

research

∙ 02/27/2016

Content-based Video Indexing and Retrieval Using Corr-LDA

Existing video indexing and retrieval methods on popular web-based multi...

0 Rahul Radhakrishnan Iyer, et al. ∙

research

∙ 02/27/2015

Plagiarism Detection in Polyphonic Music using Monaural Signal Separation

Given the large number of new musical tracks released each year, automat...

0 Soham De, et al. ∙
