Search Results for author: Ha-Jin Yu

Found 28 papers, 14 papers with code

NM-FlowGAN: Modeling sRGB Noise with a Hybrid Approach based on Normalizing Flows and Generative Adversarial Networks

1 code implementation • 15 Dec 2023 • Young Joo Han, Ha-Jin Yu

In our experiments, our NM-FlowGAN outperforms other baselines on the sRGB noise synthesis task.

HM-Conformer: A Conformer-based audio deepfake detection system with hierarchical pooling and multi-level classification token aggregation methods

no code implementations • 15 Sep 2023 • Hyun-seo Shin, Jungwoo Heo, Ju-ho Kim, Chan-yeong Lim, Wonbin Kim, Ha-Jin Yu

Audio deepfake detection (ADD) is the task of detecting spoofing attacks generated by text-to-speech or voice conversion systems.

DeepFake Detection • Face Swapping • +1

Diff-SV: A Unified Hierarchical Framework for Noise-Robust Speaker Verification Using Score-Based Diffusion Probabilistic Models

1 code implementation • 14 Sep 2023 • Ju-ho Kim, Jungwoo Heo, Hyun-seo Shin, Chan-yeong Lim, Ha-Jin Yu

Diff-SV unifies a DPM-based speech enhancement system with a speaker embedding extractor, and yields a discriminative and noise-tolerable speaker representation through a hierarchical structure.

Speaker Verification • Speech Enhancement

PAS: Partial Additive Speech Data Augmentation Method for Noise Robust Speaker Verification

1 code implementation • 20 Jul 2023 • Wonbin Kim, Hyun-seo Shin, Ju-ho Kim, Jungwoo Heo, Chan-yeong Lim, Ha-Jin Yu

In this paper, we propose a new additive noise method, partial additive speech (PAS), which aims to train SV systems to be less affected by noisy environments.

Data Augmentation • Speaker Verification

One-Step Knowledge Distillation and Fine-Tuning in Using Large Pre-Trained Self-Supervised Learning Models for Speaker Verification

1 code implementation • 27 May 2023 • Jungwoo Heo, Chan-yeong Lim, Ju-ho Kim, Hyun-seo Shin, Ha-Jin Yu

This paper suggests One-Step Knowledge Distillation and Fine-Tuning (OS-KDFT), which incorporates KD and fine-tuning (FT).

Knowledge Distillation • Self-Supervised Learning • +1

SS-BSN: Attentive Blind-Spot Network for Self-Supervised Denoising with Nonlocal Self-Similarity

1 code implementation • 17 May 2023 • Young-Joo Han, Ha-Jin Yu

However, these methods rely on large-scale noisy-clean image pairs, which are difficult to obtain in practice.

Image Denoising • Image Restoration

Integrated Parameter-Efficient Tuning for General-Purpose Audio Models

1 code implementation • 4 Nov 2022 • Ju-ho Kim, Jungwoo Heo, Hyun-seo Shin, Chan-yeong Lim, Ha-Jin Yu

To overcome these limitations, the present study explores and applies efficient transfer learning methods in the audio domain.

Genre classification • Keyword Spotting • +3

Convolution channel separation and frequency sub-bands aggregation for music genre classification

no code implementations • 3 Nov 2022 • Jungwoo Heo, Hyun-seo Shin, Ju-ho Kim, Chan-yeong Lim, Ha-Jin Yu

In music, short-term features such as pitch and tempo constitute long-term semantic features such as melody and narrative.

Genre classification • Music Genre Classification

Extended U-Net for Speaker Verification in Noisy Environments

1 code implementation • 27 Jun 2022 • Ju-ho Kim, Jungwoo Heo, Hye-jin Shim, Ha-Jin Yu

Background noise is a well-known factor that deteriorates the accuracy and reliability of speaker verification (SV) systems by blurring speech intelligibility.

Denoising • Speaker Identification • +1

SASV 2022: The First Spoofing-Aware Speaker Verification Challenge

no code implementations • 28 Mar 2022 • Jee-weon Jung, Hemlata Tak, Hye-jin Shim, Hee-Soo Heo, Bong-Jin Lee, Soo-Whan Chung, Ha-Jin Yu, Nicholas Evans, Tomi Kinnunen

Pre-trained spoofing detection and speaker verification models are provided as open source and are used in two baseline SASV solutions.

Speaker Verification

RawNeXt: Speaker verification system for variable-duration utterances with deep layer aggregation and extended dynamic scaling policies

1 code implementation • 15 Dec 2021 • Ju-ho Kim, Hye-jin Shim, Jungwoo Heo, Ha-Jin Yu

Despite achieving satisfactory performance in speaker verification using deep neural networks, variable-duration utterances remain a challenge that threatens the robustness of systems.

Speaker Verification

AASIST: Audio Anti-Spoofing using Integrated Spectro-Temporal Graph Attention Networks

1 code implementation • 4 Oct 2021 • Jee-weon Jung, Hee-Soo Heo, Hemlata Tak, Hye-jin Shim, Joon Son Chung, Bong-Jin Lee, Ha-Jin Yu, Nicholas Evans

Artefacts that differentiate spoofed from bona-fide utterances can reside in spectral or temporal domains.

Ranked #1 on Voice Anti-spoofing on ASVspoof 2019 - LA

Graph Attention • Voice Anti-spoofing

Attentive max feature map and joint training for acoustic scene classification

no code implementations • 15 Apr 2021 • Hye-jin Shim, Jee-weon Jung, Ju-ho Kim, Ha-Jin Yu

Furthermore, adopting the proposed attentive max feature map, our team placed fourth in the recent DCASE 2021 challenge.

Acoustic Scene Classification • Multi-Task Learning • +1

Learning Metrics from Mean Teacher: A Supervised Learning Method for Improving the Generalization of Speaker Verification System

no code implementations • 14 Apr 2021 • Ju-ho Kim, Hye-jin Shim, Jee-weon Jung, Ha-Jin Yu

By learning the reliable intermediate representation of the mean teacher network, we expect that the proposed method can explore more discriminatory embedding spaces and improve the generalization performance of the speaker verification system.

Speaker Verification

Graph Attention Networks for Speaker Verification

no code implementations • 22 Oct 2020 • Jee-weon Jung, Hee-Soo Heo, Ha-Jin Yu, Joon Son Chung

The proposed framework inputs segment-wise speaker embeddings from an enrollment and a test utterance and directly outputs a similarity score.

Graph Attention • Speaker Verification

DCASENET: A joint pre-trained deep neural network for detecting and classifying acoustic scenes and events

1 code implementation • 21 Sep 2020 • Jee-weon Jung, Hye-jin Shim, Ju-ho Kim, Ha-Jin Yu

Single-task deep neural networks that perform a target task among diverse cross-related tasks in the acoustic scene and event literature are being developed.

Acoustic Scene Classification • Audio Tagging • +3

Capturing scattered discriminative information using a deep architecture in acoustic scene classification

no code implementations • 9 Jul 2020 • Hye-jin Shim, Jee-weon Jung, Ju-ho Kim, Ha-Jin Yu

Various experiments are conducted using the Detection and Classification of Acoustic Scenes and Events (DCASE) 2020 Task 1-A dataset to validate the proposed methods.

Acoustic Scene Classification • General Classification • +1

Integrated Replay Spoofing-aware Text-independent Speaker Verification

no code implementations • 10 Jun 2020 • Hye-jin Shim, Jee-weon Jung, Ju-ho Kim, Seung-bin Kim, Ha-Jin Yu

In this paper, we propose two approaches for building an integrated system of speaker verification and presentation attack detection: an end-to-end monolithic approach and a back-end modular approach.

Multi-Task Learning • Speaker Identification • +1

Segment Aggregation for short utterances speaker verification using raw waveforms

1 code implementation • 7 May 2020 • Seung-bin Kim, Jee-weon Jung, Hye-jin Shim, Ju-ho Kim, Ha-Jin Yu

The proposed method segments an input utterance into several short utterances and then aggregates the segment embeddings extracted from the segmented inputs to compose a speaker embedding.

Speaker Verification

Improved RawNet with Feature Map Scaling for Text-independent Speaker Verification using Raw Waveforms

2 code implementations • 1 Apr 2020 • Jee-weon Jung, Seung-bin Kim, Hye-jin Shim, Ju-ho Kim, Ha-Jin Yu

Recent advances in deep learning have facilitated the design of speaker verification systems that directly input raw waveforms.

Text-Independent Speaker Verification

A study on the role of subsidiary information in replay attack spoofing detection

no code implementations • 31 Jan 2020 • Jee-weon Jung, Hye-jin Shim, Hee-Soo Heo, Ha-Jin Yu

In addition, we utilize the multi-task learning framework to include subsidiary information in the code.

Binary Classification • Multi-Task Learning

Self-supervised pre-training with acoustic configurations for replay spoofing detection

no code implementations • 22 Oct 2019 • Hye-jin Shim, Hee-Soo Heo, Jee-weon Jung, Ha-Jin Yu

Constructing a dataset for replay spoofing detection requires a physical process of playing an utterance and re-recording it, presenting a challenge to the collection of large-scale datasets.

Speaker Verification

Cosine similarity-based adversarial process

no code implementations • 1 Jul 2019 • Hee-Soo Heo, Jee-weon Jung, Hye-jin Shim, IL-Ho Yang, Ha-Jin Yu

In particular, the adversarial process degrades the performance of the subsidiary model by eliminating the subsidiary information in the input, which is assumed to degrade the performance of the primary model.

Speaker Identification

Replay attack detection with complementary high-resolution information using end-to-end DNN for the ASVspoof 2019 Challenge

1 code implementation • 23 Apr 2019 • Jee-weon Jung, Hye-jin Shim, Hee-Soo Heo, Ha-Jin Yu

To detect unrevealed characteristics that reside in a replayed speech, we directly input spectrograms into an end-to-end DNN without knowledge-based intervention.

RawNet: Advanced end-to-end deep neural network using raw waveforms for text-independent speaker verification

4 code implementations • 17 Apr 2019 • Jee-weon Jung, Hee-Soo Heo, Ju-ho Kim, Hye-jin Shim, Ha-Jin Yu

In this study, we explore end-to-end deep neural networks that input raw waveforms to improve various aspects: front-end speaker embedding extraction including model architecture, pre-training scheme, additional objective functions, and back-end classification.

Classification • Data Augmentation • +2

End-to-end losses based on speaker basis vectors and all-speaker hard negative mining for speaker verification

no code implementations • 7 Feb 2019 • Hee-Soo Heo, Jee-weon Jung, IL-Ho Yang, Sung-Hyun Yoon, Hye-jin Shim, Ha-Jin Yu

Each speaker basis is designed to represent the corresponding speaker in the process of training deep neural networks.

Metric Learning • Speaker Verification

Short utterance compensation in speaker verification via cosine-based teacher-student learning of speaker embeddings

no code implementations • 25 Oct 2018 • Jee-weon Jung, Hee-Soo Heo, Hye-jin Shim, Ha-Jin Yu

The short duration of an input utterance is one of the most critical threats that degrade the performance of speaker verification systems.

Text-Independent Speaker Verification

Replay spoofing detection system for automatic speaker verification using multi-task learning of noise classes

no code implementations • 29 Aug 2018 • Hye-jin Shim, Jee-weon Jung, Hee-Soo Heo, Sung-Hyun Yoon, Ha-Jin Yu

We explore the effectiveness of training a deep neural network simultaneously for replay attack spoofing detection and replay noise classification.

General Classification • Multi-Task Learning • +1
