Search Results for author: Tae Jin Park

Found 9 papers, 1 paper with code

The CHiME-7 Challenge: System Description and Performance of NeMo Team's DASR System

no code implementations • 18 Oct 2023 • Tae Jin Park, He Huang, Ante Jukic, Kunal Dhawan, Krishna C. Puvvada, Nithin Koluguri, Nikolay Karpov, Aleksandr Laptev, Jagadeesh Balam, Boris Ginsburg

We present the NVIDIA NeMo team's multi-channel speech recognition system for the 7th CHiME Challenge Distant Automatic Speech Recognition (DASR) Task, focusing on the development of a multi-channel, multi-speaker speech recognition system tailored to transcribe speech from distributed microphones and microphone arrays.

Automatic Speech Recognition speaker-diarization +3

Property-Aware Multi-Speaker Data Simulation: A Probabilistic Modelling Technique for Synthetic Data Generation

no code implementations • 18 Oct 2023 • Tae Jin Park, He Huang, Coleman Hooper, Nithin Koluguri, Kunal Dhawan, Ante Jukic, Jagadeesh Balam, Boris Ginsburg

This capability offers a tailored training environment for developing neural models suited for speaker diarization and voice activity detection.

Action Detection Activity Detection +3

Enhancing Speaker Diarization with Large Language Models: A Contextual Beam Search Approach

no code implementations • 11 Sep 2023 • Tae Jin Park, Kunal Dhawan, Nithin Koluguri, Jagadeesh Balam

In addition, these findings point to the potential of using LLMs to improve speaker diarization and other speech processing tasks by capturing semantic and contextual cues.

speaker-diarization Speaker Diarization

Multi-scale Speaker Diarization with Dynamic Scale Weighting

no code implementations • 30 Mar 2022 • Tae Jin Park, Nithin Rao Koluguri, Jagadeesh Balam, Boris Ginsburg

First, we use multi-scale clustering as an initialization to estimate the number of speakers and obtain the average speaker representation vector for each speaker and each scale.

speaker-diarization Speaker Diarization

Tackling Dynamics in Federated Incremental Learning with Variational Embedding Rehearsal

no code implementations • 19 Oct 2021 • Tae Jin Park, Kenichi Kumatani, Dimitrios Dimitriadis

Federated Learning is a fast growing area of ML where the training datasets are extremely distributed, all while dynamically changing over time.

Federated Learning Incremental Learning

A Review of Speaker Diarization: Recent Advances with Deep Learning

no code implementations • 24 Jan 2021 • Tae Jin Park, Naoyuki Kanda, Dimitrios Dimitriadis, Kyu J. Han, Shinji Watanabe, Shrikanth Narayanan

Speaker diarization is a task to label audio or video recordings with classes that correspond to speaker identity, or in short, a task to identify "who spoke when".

Retrieval speaker-diarization +3

Speaker Diarization with Lexical Information

no code implementations • 13 Apr 2020 • Tae Jin Park, Kyu J. Han, Jing Huang, Xiaodong He, Bo-Wen Zhou, Panayiotis Georgiou, Shrikanth Narayanan

This work presents a novel approach for speaker diarization to leverage lexical information provided by automatic speech recognition.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +4

Auto-Tuning Spectral Clustering for Speaker Diarization Using Normalized Maximum Eigengap

1 code implementation • 5 Mar 2020 • Tae Jin Park, Kyu J. Han, Manoj Kumar, Shrikanth Narayanan

In this study, we propose a new spectral clustering framework that can auto-tune the parameters of the clustering algorithm in the context of speaker diarization.

Ranked #1 on Speaker Diarization on CALLHOME (DER(ig olp) metric)

Clustering speaker-diarization +1

Speaker Diarization With Lexical Information

no code implementations • 27 Nov 2018 • Tae Jin Park, Kyu Han, Ian Lane, Panayiotis Georgiou

This work presents a novel approach to leverage lexical information for speaker diarization.

Clustering speaker-diarization +1
