Search Results for author: Tanel Alumäe

Found 10 papers, 2 papers with code

PixIT: Joint Training of Speaker Diarization and Speech Separation from Real-world Multi-speaker Recordings

no code implementations • 4 Mar 2024 • Joonas Kalda, Clément Pagés, Ricard Marxer, Tanel Alumäe, Hervé Bredin

A major drawback of supervised speech separation (SSep) systems is their reliance on synthetic data, leading to poor real-world generalization.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +4

Dialect Adaptation and Data Augmentation for Low-Resource ASR: TalTech Systems for the MADASR 2023 Challenge

no code implementations • 26 Oct 2023 • Tanel Alumäe, Jiaming Kong, Daniil Robnikov

This paper describes Tallinn University of Technology (TalTech) systems developed for the ASRU MADASR 2023 Challenge.

Automatic Speech Recognition Data Augmentation +2

Collar-aware Training for Streaming Speaker Change Detection in Broadcast Speech

no code implementations • 14 May 2022 • Joonas Kalda, Tanel Alumäe

Instead, the proposed method uses an objective function which encourages the model to predict a single positive label within a specified collar.

Change Detection

Pretraining Approaches for Spoken Language Recognition: TalTech Submission to the OLR 2021 Challenge

no code implementations • 14 May 2022 • Tanel Alumäe, Kunnar Kukk

For the unconstrained task, we relied on both externally available pretrained models as well as external data: the multilingual XLSR-53 wav2vec2.0 model was finetuned on the VoxLingua107 corpus for the language recognition task, and finally finetuned on the provided target language training data, augmented with CommonVoice data.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +4

Improving Language Identification of Accented Speech

no code implementations • 31 Mar 2022 • Kunnar Kukk, Tanel Alumäe

Language identification from speech is a common preprocessing step in many spoken language processing systems.

Language Identification speech-recognition +2

VoxLingua107: a Dataset for Spoken Language Recognition

2 code implementations • 25 Nov 2020 • Jörgen Valk, Tanel Alumäe

Speech activity detection and speaker diarization are used to extract segments from the videos that contain speech.

Action Detection Activity Detection +4

Robust Training of Vector Quantized Bottleneck Models

1 code implementation • 18 May 2020 • Adrian Łańcucki, Jan Chorowski, Guillaume Sanchez, Ricard Marxer, Nanxin Chen, Hans J. G. A. Dolfing, Sameer Khurana, Tanel Alumäe, Antoine Laurent

We show that the codebook learning can suffer from poor initialization and non-stationarity of clustered encoder outputs.

Clustering Disentanglement +1

Advanced Rich Transcription System for Estonian Speech

no code implementations • 11 Jan 2019 • Tanel Alumäe, Ottokar Tilk, Asadullah

Out-of-vocabulary words are recovered using a phoneme n-gram based decoding subgraph and a FST-based phoneme-to-grapheme model.

Speaker Identification

Weakly Supervised Training of Speaker Identification Models

no code implementations • 22 Jun 2018 • Martin Karu, Tanel Alumäe

The method uses speaker diarization to find unique speakers in each recording, and i-vectors to project the speech of each speaker to a fixed-dimensional vector.

speaker-diarization Speaker Diarization +1

Low-Resource Neural Headline Generation

no code implementations • WS 2017 • Ottokar Tilk, Tanel Alumäe

Recent neural headline generation models have shown great results, but are generally trained on very large datasets.

Headline Generation
