Tatsuya Kawahara

research

∙ 09/17/2023

Zero- and Few-shot Sound Event Localization and Detection

Sound event localization and detection (SELD) systems estimate direction...

0 Kazuki Shimada, et al. ∙

research

∙ 08/21/2023

Towards Objective Evaluation of Socially-Situated Conversational Robots: Assessing Human-Likeness through Multimodal User Behaviors

This paper tackles the challenging task of evaluating socially situated ...

0 Koji Inoue, et al. ∙

research

∙ 07/28/2023

Reasoning before Responding: Integrating Commonsense-based Causality Explanation for Empathetic Response Generation

Recent approaches to empathetic response generation try to incorporate c...

0 Yahui Fu, et al. ∙

research

∙ 05/18/2023

Diffusion-Based Speech Enhancement with Joint Generative and Predictive Decoders

Diffusion-based speech enhancement (SE) has been investigated recently, ...

0 Hao Shi, et al. ∙

research

∙ 03/26/2023

Time-domain Speech Enhancement Assisted by Multi-resolution Frequency Encoder and Decoder

Time-domain speech enhancement (SE) has recently been intensively invest...

0 Hao Shi, et al. ∙

research

∙ 03/01/2023

I Know Your Feelings Before You Do: Predicting Future Affective Reactions in Human-Computer Dialogue

Current Spoken Dialogue Systems (SDSs) often serve as passive listeners ...

0 Yuanchao Li, et al. ∙

research

∙ 11/15/2022

Alzheimer's Dementia Detection through Spontaneous Dialogue with Proactive Robotic Listeners

As the aging of society continues to accelerate, Alzheimer's Disease (AD...

0 Yuanchao Li, et al. ∙

research

∙ 09/08/2022

Non-autoregressive Error Correction for CTC-based ASR with Phone-conditioned Masked LM

Connectionist temporal classification (CTC) -based models are attractive...

0 Hayato Futami, et al. ∙

research

∙ 09/05/2022

Distilling the Knowledge of BERT for CTC-based ASR

Connectionist temporal classification (CTC) -based models are attractive...

0 Hayato Futami, et al. ∙

research

∙ 07/07/2022

End-to-end Speech-to-Punctuated-Text Recognition

Conventional automatic speech recognition systems do not produce punctua...

0 Jumon Nozaki, et al. ∙

research

∙ 10/05/2021

ASR Rescoring and Confidence Estimation with ELECTRA

In automatic speech recognition (ASR) rescoring, the hypothesis with the...

0 Hayato Futami, et al. ∙

research

∙ 09/09/2021

Non-autoregressive End-to-end Speech Translation with Parallel Autoregressive Rescoring

This article describes an efficient end-to-end speech translation (E2E-S...

0 Hirofumi Inaguma, et al. ∙

research

∙ 07/15/2021

VAD-free Streaming Hybrid CTC/Attention ASR for Unsegmented Recording

In this work, we propose novel decoding algorithms to enable streaming a...

0 Hirofumi Inaguma, et al. ∙

research

∙ 07/01/2021

StableEmit: Selection Probability Discount for Reducing Emission Latency of Streaming Monotonic Attention ASR

While attention-based encoder-decoder (AED) models have been successfull...

0 Hirofumi Inaguma, et al. ∙

research

∙ 06/04/2021

ERICA: An Empathetic Android Companion for Covid-19 Quarantine

Over the past year, research in various domains, including Natural Langu...

0 Etsuko Ishii, et al. ∙

research

∙ 05/02/2021

Intelligent Conversational Android ERICA Applied to Attentive Listening and Job Interview

Following the success of spoken dialogue systems (SDS) in smartphone ass...

0 Tatsuya Kawahara, et al. ∙

research

∙ 04/13/2021

Source and Target Bidirectional Knowledge Distillation for End-to-end Speech Translation

A conventional approach to improving the performance of end-to-end speec...

0 Hirofumi Inaguma, et al. ∙

research

∙ 02/28/2021

Alignment Knowledge Distillation for Online Streaming Attention-based Speech Recognition

This article describes an efficient training method for online streaming...

0 Hirofumi Inaguma, et al. ∙

research

∙ 10/25/2020

Orthros: Non-autoregressive End-to-end Speech Translation with Dual-decoder

Fast inference speed is an important goal towards real-world deployment ...

0 Hirofumi Inaguma, et al. ∙

research

∙ 09/15/2020

Multi-Referenced Training for Dialogue Response Generation

In open-domain dialogue response generation, a dialogue context can be c...

0 Tianyu Zhao, et al. ∙

research

∙ 08/09/2020

Distilling the Knowledge of BERT for Sequence-to-Sequence ASR

Attention-based sequence-to-sequence (seq2seq) models have achieved prom...

0 Hayato Futami, et al. ∙

research

∙ 05/19/2020

Enhancing Monotonic Multihead Attention for Streaming ASR

We investigate a monotonic multihead attention (MMA) by extending hard m...

0 Hirofumi Inaguma, et al. ∙

research

∙ 05/19/2020

Generative Adversarial Training Data Adaptation for Very Low-resource Automatic Speech Recognition

It is important to transcribe and archive speech data of endangered lang...

0 Kohei Matsuura, et al. ∙

research

∙ 05/10/2020

CTC-synchronous Training for Monotonic Attention Model

Monotonic chunkwise attention (MoChA) has been studied for the online st...

0 Hirofumi Inaguma, et al. ∙

research

∙ 04/23/2020

End-to-end speech-to-dialog-act recognition

Spoken language understanding, which extracts intents and/or semantic co...

0 Viet-Trung Dang, et al. ∙

research

∙ 04/10/2020

Designing Precise and Robust Dialogue Response Evaluators

Automatic dialogue response evaluator has been proposed as an alternativ...

0 Tianyu Zhao, et al. ∙

research

∙ 02/16/2020

Speech Corpus of Ainu Folklore and End-to-end Speech Recognition for Ainu Language

Ainu is an unwritten language that has been spoken by Ainu people who ar...

0 Kohei Matsuura, et al. ∙

research

∙ 10/01/2019

Multilingual End-to-End Speech Translation

In this paper, we propose a simple yet effective framework for multiling...

0 Hirofumi Inaguma, et al. ∙

research

∙ 09/22/2019

Improving OOV Detection and Resolution with External Language Models in Acoustic-to-Word ASR

Acoustic-to-word (A2W) end-to-end automatic speech recognition (ASR) sys...

0 Hirofumi Inaguma, et al. ∙

research

∙ 07/12/2019

Effective Incorporation of Speaker Information in Utterance Encoding in Dialog

In dialog studies, we often encode a dialog using a hierarchical encoder...

0 Tianyu Zhao, et al. ∙

research

∙ 05/31/2019

Content Word-based Sentence Decoding and Evaluating for Open-domain Neural Response Generation

Various encoder-decoder models have been applied to response generation ...

0 Tianyu Zhao, et al. ∙

research

∙ 03/22/2019

Unsupervised Speech Enhancement Based on Multichannel NMF-Informed Beamforming for Noise-Robust Automatic Speech Recognition

This paper describes multichannel speech enhancement for improving autom...

10 Kazuki Shimada, et al. ∙

research

∙ 11/06/2018

Transfer learning of language-independent end-to-end ASR with language model fusion

This work explores better adaptation methods to low-resource languages u...

0 Hirofumi Inaguma, et al. ∙

research

∙ 10/31/2017

Statistical Speech Enhancement Based on Probabilistic Integration of Variational Autoencoder and Non-Negative Matrix Factorization

This paper presents a statistical method of single-channel speech enhanc...

0 Yoshiaki Bando, et al. ∙

research

∙ 09/29/2017

Detection of social signals for recognizing engagement in human-robot interaction

Detection of engagement during a conversation is an important function o...

0 Divesh Lala, et al. ∙

