Search Results for author: Kenji Nagamatsu

We propose a streaming diarization method based on an end-to-end neural diarization (EEND) model, which handles flexible numbers of speakers and overlapping speech.

Speaker Diarization Sound Audio and Speech Processing

End-to-End Speaker Diarization as Post-Processing

no code implementations • 18 Dec 2020 • Shota Horiguchi, Paola Garcia, Yusuke Fujita, Shinji Watanabe, Kenji Nagamatsu

Clustering-based diarization methods partition frames into as many clusters as there are speakers; because each frame is assigned to exactly one speaker, they typically cannot handle overlapping speech.

Clustering Multi-Label Classification +2

Block-Online Guided Source Separation

no code implementations • 16 Nov 2020 • Shota Horiguchi, Yusuke Fujita, Kenji Nagamatsu

Another problem is that offline guided source separation (GSS) is an utterance-wise algorithm, so its latency grows with the length of the utterance.

Speech Separation

Utterance-Wise Meeting Transcription System Using Asynchronous Distributed Microphones

no code implementations • 31 Jul 2020 • Shota Horiguchi, Yusuke Fujita, Kenji Nagamatsu

We also showed that our framework achieved a CER of 21.8%, which is only 2.1 percentage points higher than the CER of headset microphone-based transcription.

Automatic Speech Recognition (ASR) +4

Online End-to-End Neural Diarization with Speaker-Tracing Buffer

no code implementations • 4 Jun 2020 • Yawen Xue, Shota Horiguchi, Yusuke Fujita, Shinji Watanabe, Kenji Nagamatsu

This paper proposes a novel online speaker diarization algorithm based on a fully supervised self-attention mechanism (SA-EEND).

Speaker Diarization

Neural Speaker Diarization with Speaker-Wise Chain Rule

1 code implementation • 2 Jun 2020 • Yusuke Fujita, Shinji Watanabe, Shota Horiguchi, Yawen Xue, Jing Shi, Kenji Nagamatsu

Speaker diarization is an essential step for processing multi-speaker audio.

Speaker Diarization

End-to-End Speaker Diarization for an Unknown Number of Speakers with Encoder-Decoder Based Attractors

3 code implementations • 20 May 2020 • Shota Horiguchi, Yusuke Fujita, Shinji Watanabe, Yawen Xue, Kenji Nagamatsu

End-to-end speaker diarization for an unknown number of speakers is addressed in this paper.

Clustering Speaker Diarization +1

End-to-End Neural Diarization: Reformulating Speaker Diarization as Simple Multi-label Classification

1 code implementation • 24 Feb 2020 • Yusuke Fujita, Shinji Watanabe, Shota Horiguchi, Yawen Xue, Kenji Nagamatsu

However, the clustering-based approach has a number of problems: (i) it is not optimized to minimize diarization errors directly, (ii) it cannot handle speaker overlaps correctly, and (iii) it has trouble adapting its speaker embedding models to real audio recordings with speaker overlaps.

Clustering General Classification +3

Addressing Ambiguity of Emotion Labels Through Meta-Learning

no code implementations • 6 Nov 2019 • Takuya Fujioka, Dario Bertero, Takeshi Homma, Kenji Nagamatsu

We therefore propose a dynamic label correction and sample contribution weight estimation model.

Emotion Recognition Meta-Learning

Simultaneous Speech Recognition and Speaker Diarization for Monaural Dialogue Recordings with Target-Speaker Acoustic Models

no code implementations • 17 Sep 2019 • Naoyuki Kanda, Shota Horiguchi, Yusuke Fujita, Yawen Xue, Kenji Nagamatsu, Shinji Watanabe

Our proposed method combined with i-vector speaker embeddings ultimately achieved a WER that differed by only 2.1% from that of TS-ASR given oracle speaker embeddings.

Automatic Speech Recognition (ASR) +4

End-to-End Neural Speaker Diarization with Self-attention

2 code implementations • 13 Sep 2019 • Yusuke Fujita, Naoyuki Kanda, Shota Horiguchi, Yawen Xue, Kenji Nagamatsu, Shinji Watanabe

Our method even outperformed the state-of-the-art x-vector clustering-based method.

Ranked #2 on Speaker Diarization on CALLHOME

Clustering Speaker Diarization +1

End-to-End Neural Speaker Diarization with Permutation-Free Objectives

1 code implementation • 12 Sep 2019 • Yusuke Fujita, Naoyuki Kanda, Shota Horiguchi, Kenji Nagamatsu, Shinji Watanabe

To realize such a model, we formulate the speaker diarization problem as a multi-label classification problem and introduce a permutation-free objective function that directly minimizes diarization errors without suffering from the speaker-label permutation problem.

Ranked #6 on Speaker Diarization on CALLHOME

Clustering Domain Adaptation +3
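
The permutation-free objective mentioned in the entry above can be illustrated with a small sketch. This is not the authors' implementation: it assumes the model outputs per-frame, per-speaker activity probabilities, uses binary cross-entropy as the per-label loss, and searches all speaker permutations by brute force. The function names `bce` and `permutation_free_loss` are hypothetical.

```python
import math
from itertools import permutations


def bce(probs, labels, eps=1e-7):
    """Mean binary cross-entropy over all (frame, speaker) entries."""
    total, count = 0.0, 0
    for p_row, y_row in zip(probs, labels):
        for p, y in zip(p_row, y_row):
            p = min(max(p, eps), 1.0 - eps)  # clip to avoid log(0)
            total -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
            count += 1
    return total / count


def permutation_free_loss(probs, labels):
    """Return the minimum BCE over all orderings of the reference speakers.

    probs:  frames x speakers, predicted speech-activity probabilities.
    labels: frames x speakers, 0/1 reference activity (overlap is allowed,
            so several speakers may be 1 in the same frame).
    """
    n_speakers = len(labels[0])
    best_loss, best_perm = float("inf"), None
    for perm in permutations(range(n_speakers)):
        # Reorder the reference speaker columns according to this permutation.
        permuted = [[row[s] for s in perm] for row in labels]
        loss = bce(probs, permuted)
        if loss < best_loss:
            best_loss, best_perm = loss, perm
    return best_loss, best_perm


# Hypothetical toy input: the model's output columns are swapped relative
# to the reference, which the permutation search should recover.
labels = [[1, 0], [1, 1], [0, 1]]
probs = [[0.05, 0.95], [0.95, 0.95], [0.95, 0.05]]
loss, perm = permutation_free_loss(probs, labels)
```

Because every ordering is tried, the cost grows factorially with the number of speakers, so this brute-force form is only practical for small speaker counts.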