Takafumi Moriya

research

∙ 06/14/2023

SpeechGLUE: How Well Can Self-Supervised Speech Models Capture Linguistic Knowledge?

Self-supervised learning (SSL) for speech representation has been succes...

0 Takanori Ashihara, et al. ∙

research

∙ 06/07/2023

Transfer Learning from Pre-trained Language Models Improves End-to-End Speech Summarization

End-to-end speech summarization (E2E SSum) directly summarizes input spe...

0 Kohei Matsuura, et al. ∙

research

∙ 06/04/2023

End-to-End Joint Target and Non-Target Speakers ASR

This paper proposes a novel automatic speech recognition (ASR) system th...

0 Ryo Masumura, et al. ∙

research

∙ 05/24/2023

Downstream Task Agnostic Speech Enhancement with Self-Supervised Representation Loss

Self-supervised learning (SSL) is the latest breakthrough in speech proc...

0 Hiroshi Sato, et al. ∙

research

∙ 05/09/2023

Exploration of Language Dependency for Japanese Self-Supervised Speech Representation Models

Self-supervised learning (SSL) has been dramatically successful not only...

0 Takanori Ashihara, et al. ∙

research

∙ 04/24/2023

Zero-shot text-to-speech synthesis conditioned using self-supervised speech representation model

This paper proposes a zero-shot text-to-speech (TTS) conditioned by a se...

0 Kenichi Fujita, et al. ∙

research

∙ 03/02/2023

Leveraging Large Text Corpora for End-to-End Speech Summarization

End-to-end speech summarization (E2E SSum) is a technique to directly ge...

0 Kohei Matsuura, et al. ∙

research

∙ 10/28/2022

On the Use of Modality-Specific Large-Scale Pre-Trained Encoders for Multimodal Sentiment Analysis

This paper investigates the effectiveness and implementation of modality...

0 Atsushi Ando, et al. ∙

research

∙ 09/09/2022

Streaming Target-Speaker ASR with Neural Transducer

Although recent advances in deep learning technology have boosted automa...

0 Takafumi Moriya, et al. ∙

research

∙ 07/14/2022

Deep versus Wide: An Analysis of Student Architectures for Task-Agnostic Knowledge Distillation of Self-Supervised Speech Models

Self-supervised learning (SSL) is seen as a very promising approach with...

0 Takanori Ashihara, et al. ∙

research

∙ 06/16/2022

Strategies to Improve Robustness of Target Speech Extraction to Enrollment Variations

Target speech extraction is a technique to extract the target speaker's ...

0 Hiroshi Sato, et al. ∙

research

∙ 01/11/2022

Learning to Enhance or Not: Neural Network-Based Switching of Enhanced and Observed Signals for Overlapping Speech Recognition

The combination of a deep neural network (DNN)-based speech enhancement...

0 Hiroshi Sato, et al. ∙

research

∙ 07/04/2021

Cross-Modal Transformer-Based Neural Correction Models for Automatic Speech Recognition

We propose cross-modal transformer-based neural correction models that...

0 Tomohiro Tanaka, et al. ∙

research

∙ 06/02/2021

Should We Always Separate?: Switching Between Enhanced and Observed Signals for Overlapping Speech Recognition

Although recent advances in deep learning technology improved automatic ...

0 Hiroshi Sato, et al. ∙
