no code implementations • 1 Apr 2024 • Injune Hwang, Kyogu Lee
Recently, there have been efforts to encode the linguistic information of speech using a self-supervised framework for speech synthesis.
no code implementations • 2 Feb 2024 • Jaeyeon Kim, Injune Hwang, Kyogu Lee
We propose a framework to learn semantics from raw audio signals using two types of representations, encoding contextual and phonetic information respectively.
no code implementations • 27 Jan 2024 • Haesun Joung, Kyogu Lee
Music auto-tagging is crucial for enhancing music discovery and recommendation.
1 code implementation • 8 Jan 2024 • Jayeon Yi, Junghyun Koo, Kyogu Lee
Clipping is a common nonlinear distortion that occurs whenever the input or output of an audio system exceeds the supported range.
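Clipping as described above can be sketched minimally as a hard clamp; the threshold and signal below are hypothetical illustrations, not the paper's actual declipping method.

```python
import numpy as np

def hard_clip(x: np.ndarray, limit: float = 1.0) -> np.ndarray:
    """Clamp samples that exceed the supported range [-limit, +limit]."""
    return np.clip(x, -limit, limit)

# A sine wave driven past the limit gets flattened at its peaks,
# which is the nonlinear distortion described above.
t = np.linspace(0, 1, 8000, endpoint=False)
clean = 1.5 * np.sin(2 * np.pi * 440 * t)   # peaks at roughly +/-1.5
clipped = hard_clip(clean, limit=1.0)
```

Declipping is the inverse problem: recovering the flattened peaks from the clipped signal.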
no code implementations • 8 Jan 2024 • Jin Woo Lee, Gwang Seok An, Jeong-Yun Sun, Kyogu Lee
This paper delves into the analysis of nonlinear deformation induced by dielectric actuation in pre-stressed ideal dielectric elastomers.
1 code implementation • 24 Dec 2023 • SeongHyeon Go, Kyogu Lee
In this work, we propose a symbolic music generation model with the song structure graph analysis network.
no code implementations • 22 Nov 2023 • Jayeon Yi, Sungho Lee, Kyogu Lee
In the heart of "rhythm games" - games where players must perform actions in sync with a piece of music - are "charts", the directives to be given to players.
no code implementations • 24 Aug 2023 • Yunkee Chae, Junghyun Koo, Sungho Lee, Kyogu Lee
With the proliferation of video platforms on the internet, recording musical performances by mobile devices has become commonplace.
no code implementations • 24 Jul 2023 • Junghyun Koo, Yunkee Chae, Chang-Bin Jeon, Kyogu Lee
Music source separation (MSS) faces challenges due to the limited availability of correctly-labeled individual instrument tracks.
no code implementations • 22 May 2023 • Eungbeom Kim, Yunkee Chae, Jaeheon Sim, Kyogu Lee
Since empirical risk minimization (ERM) averages performance over all data samples regardless of group membership (e.g., healthy vs. dysarthric speakers), ASR systems trained with it are unaware of performance disparities across groups.
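The contrast between ERM and a group-aware objective can be sketched as follows; the per-sample losses and group labels are hypothetical toy values, and the worst-group objective shown is one common group-robust alternative, not necessarily the paper's exact formulation.

```python
import numpy as np

def erm_loss(losses: np.ndarray, groups: np.ndarray) -> float:
    """ERM: average over all samples, ignoring group membership."""
    return float(losses.mean())

def worst_group_loss(losses: np.ndarray, groups: np.ndarray) -> float:
    """Group-aware objective: the mean loss of the worst-performing group."""
    return max(float(losses[groups == g].mean()) for g in np.unique(groups))

# Hypothetical per-sample losses: group 0 = majority, group 1 = minority.
losses = np.array([0.1, 0.1, 0.1, 0.1, 0.9, 1.1])
groups = np.array([0, 0, 0, 0, 1, 1])
```

Here ERM reports 0.4 while the minority group's mean loss is 1.0, illustrating the disparity ERM hides.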
1 code implementation • 15 Nov 2022 • KyungSu Kim, Minju Park, Haesun Joung, Yunkee Chae, Yeongbeom Hong, SeongHyeon Go, Kyogu Lee
The Single-Instrument Encoder is trained to classify the instruments used in single-track audio, and we take its penultimate layer's activation as the instrument embedding.
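Taking a penultimate-layer activation as an embedding can be sketched with a toy classifier; the layer sizes, class count, and random weights below are hypothetical stand-ins, not the Single-Instrument Encoder's actual architecture.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy single-instrument classifier: features -> hidden -> class logits.
W1, b1 = rng.normal(size=(128, 64)), np.zeros(64)
W2, b2 = rng.normal(size=(64, 10)), np.zeros(10)   # 10 hypothetical instrument classes

def instrument_embedding(features: np.ndarray) -> np.ndarray:
    """Return the penultimate-layer (hidden) activation as the embedding,
    discarding the final classification logits."""
    hidden = np.maximum(features @ W1 + b1, 0.0)   # ReLU hidden layer
    return hidden

x = rng.normal(size=128)
emb = instrument_embedding(x)
```

The classifier head (`W2`, `b2`) is only used during training; at embedding time the hidden activation is the output.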
1 code implementation • 14 Nov 2022 • Chang-Bin Jeon, Hyeongi Moon, Keunwoo Choi, Ben Sangbae Chon, Kyogu Lee
Second, to overcome the absence of existing multi-singing datasets for training purposes, we present a strategy for constructing multiple-singing mixtures from various single-singing datasets.
no code implementations • 11 Nov 2022 • Yoori Oh, Juheon Lee, Yoseob Han, Kyogu Lee
However, with the emotional latent space produced by existing models it is difficult to control continuous emotional intensity, because features such as emotion and speaker identity are entangled.
1 code implementation • 4 Nov 2022 • Junghyun Koo, Marco A. Martínez-Ramírez, Wei-Hsiang Liao, Stefan Uhlich, Kyogu Lee, Yuki Mitsufuji
We propose an end-to-end music mixing style transfer system that converts the mixing style of an input multitrack to that of a reference song.
1 code implementation • 2 Nov 2022 • Jongho Choi, Kyogu Lee
Piano covers of pop music are enjoyed by many people.
no code implementations • 2 Nov 2022 • Jin Woo Lee, Kyogu Lee
We present a neural network for rendering binaural speech from given monaural audio, position, and orientation of the source.
no code implementations • 31 Oct 2022 • Eungbeom Kim, Jinhee Kim, Yoori Oh, KyungSu Kim, Minju Park, Jaeheon Sim, Jinwoo Lee, Kyogu Lee
In this paper, we aim to unveil the impact of data augmentation in audio-language multi-modal learning, which has not been explored despite its importance.
Ranked #2 on Audio to Text Retrieval on AudioCaps
no code implementations • 28 Jul 2022 • Minju Park, Kyogu Lee
Advanced music recommendation systems are emerging alongside developments in machine learning.
no code implementations • 6 Apr 2022 • Jin Woo Lee, Eungbeom Kim, Junghyun Koo, Kyogu Lee
Our study allows us to analyze which attribute of speech signals is advantageous for the CM systems.
1 code implementation • 6 Apr 2022 • Jin Woo Lee, Sungho Lee, Kyogu Lee
Especially for the data-driven approaches, existing HRTF datasets differ in spatial sampling distributions of source positions, posing a major problem when generalizing the method across multiple datasets.
1 code implementation • 17 Feb 2022 • Junghyun Koo, Seungryeol Paik, Kyogu Lee
Mastering is an essential step in music production, but it is also a challenging task typically left to experienced audio engineers, who adjust the tone, space, and volume of a song.
2 code implementations • NeurIPS 2021 • Hyeong-Seok Choi, Juheon Lee, Wansoo Kim, Jie Hwan Lee, Hoon Heo, Kyogu Lee
We present a neural analysis and synthesis (NANSY) framework that can manipulate voice, pitch, and speed of an arbitrary speech signal.
no code implementations • 3 Mar 2021 • Junghyun Koo, Seungryeol Paik, Kyogu Lee
This method enables us to apply the reverb of the reference track to the source track to which the effect is desired.
1 code implementation • 5 Feb 2021 • Hyeong-Seok Choi, Sungjin Park, Jie Hwan Lee, Hoon Heo, Dongsuk Jeon, Kyogu Lee
Modern deep learning-based models have achieved outstanding performance improvements on speech enhancement tasks.
3 code implementations • 22 Oct 2020 • Sungkyun Chang, Donmoon Lee, Jeongsoo Park, Hyungui Lim, Kyogu Lee, Karam Ko, Yoonchang Han
Most existing audio fingerprinting systems have limitations when used for highly specific audio retrieval at scale.
no code implementations • 9 Sep 2020 • Junghyun Koo, Jie Hwan Lee, Jaewoo Pyo, Yujin Jo, Kyogu Lee
In this work, we exploit various multi-modal features extracted from pre-trained networks to recognize Alzheimer's Dementia using a neural network, with a small dataset provided by the ADReSS Challenge at INTERSPEECH 2020.
no code implementations • ICLR 2020 • Hyeong-Seok Choi, Changdae Park, Kyogu Lee
We analyze the extent to which the network can naturally disentangle two latent factors that contribute to the generation of a face image - one that comes directly from a speech signal and the other that is not related to it - and explore whether the network can learn to generate natural human face image distribution by modeling these factors.
1 code implementation • ISMIR 2019 • Dasaem Jeong, Taegyun Kwon, Yoojin Kim, Kyogu Lee, Juhan Nam
In this paper, we present our application of a deep neural network to modeling piano performance, imitating pianists' expressive control of tempo, dynamics, articulation, and pedaling.
no code implementations • 29 Oct 2019 • Juheon Lee, Hyeong-Seok Choi, Junghyun Koo, Kyogu Lee
In this study, we define the identity of the singer with two independent concepts - timbre and singing style - and propose a multi-singer singing synthesis system that can model them separately.
Sound · Audio and Speech Processing
no code implementations • 6 Aug 2019 • Juheon Lee, Hyeong-Seok Choi, Chang-Bin Jeon, Junghyun Koo, Kyogu Lee
In this paper, we propose an end-to-end Korean singing voice synthesis system from lyrics and a symbolic melody using the following three novel approaches: 1) phonetic enhancement masking, 2) local conditioning of text and pitch to the super-resolution network, and 3) conditional adversarial training.
Sound · Audio and Speech Processing
7 code implementations • ICLR 2019 • Hyeong-Seok Choi, Jang-Hyun Kim, Jaesung Huh, Adrian Kim, Jung-Woo Ha, Kyogu Lee
Most deep learning-based models for speech enhancement have focused mainly on estimating the magnitude spectrogram while reusing the phase of the noisy speech for reconstruction.
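The magnitude-plus-noisy-phase reconstruction described above can be sketched on a single spectrum; the random signal and the 0.5x "enhanced" magnitude below are hypothetical stand-ins for a network's output.

```python
import numpy as np

def reconstruct_with_noisy_phase(enhanced_mag: np.ndarray,
                                 noisy_spec: np.ndarray) -> np.ndarray:
    """Combine an estimated magnitude with the phase of the noisy spectrum."""
    phase = np.angle(noisy_spec)
    return enhanced_mag * np.exp(1j * phase)

# Hypothetical one-frame example: a noisy spectrum and a stand-in
# for a network's magnitude estimate.
noisy = np.fft.rfft(np.random.default_rng(0).normal(size=512))
mag_hat = np.abs(noisy) * 0.5
rebuilt = reconstruct_with_noisy_phase(mag_hat, noisy)
```

Phase-aware models instead estimate (or refine) the phase itself rather than reusing the noisy one.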
1 code implementation • 24 Jan 2019 • Sungkyun Chang, Seungjin Lee, Kyogu Lee
This paper provides an outline of the algorithms submitted for the WSDM Cup 2019 Spotify Sequential Skip Prediction Challenge (team name: mimbres).
Ranked #1 on Sequential skip prediction on MSSD
4 code implementations • 22 May 2018 • Sungheon Park, Tae-hoon Kim, Kyogu Lee, Nojun Kwak
In this paper, we propose a simple yet effective method for multiple music source separation using convolutional neural networks.
Sound · Audio and Speech Processing
2 code implementations • 4 Dec 2017 • Hyungui Lim, Seungyeon Rhyu, Kyogu Lee
Generating a chord progression from a monophonic melody is a challenging problem, because a chord progression consists of layered notes played simultaneously.
no code implementations • 1 Dec 2017 • Sungkyun Chang, Juheon Lee, Sang Keun Choe, Kyogu Lee
To do this, we first build the CNN, using as input a cross-similarity matrix generated from a pair of songs.
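A cross-similarity matrix between two songs can be sketched as a cosine similarity over per-frame features; the random "chroma" features below are hypothetical placeholders for whatever features the model actually uses.

```python
import numpy as np

def cross_similarity(a: np.ndarray, b: np.ndarray) -> np.ndarray:
    """Cosine cross-similarity between two per-frame feature sequences.
    Rows index frames of song A, columns index frames of song B."""
    a = a / np.linalg.norm(a, axis=1, keepdims=True)
    b = b / np.linalg.norm(b, axis=1, keepdims=True)
    return a @ b.T

rng = np.random.default_rng(0)
song_a = rng.normal(size=(100, 12))   # e.g., 12-dim chroma-like features
song_b = rng.normal(size=(80, 12))
sim = cross_similarity(song_a, song_b)
```

Diagonal stripes in such a matrix indicate matching passages, which is what the CNN then learns to detect.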
no code implementations • 21 Jan 2017 • Sungkyun Chang, Kyogu Lee
Most previous approaches to lyrics-to-audio alignment used a pre-developed automatic speech recognition (ASR) system, which inherently had difficulty adapting its speech model to individual singers.
Automatic Speech Recognition (ASR) +1
1 code implementation • 31 May 2016 • Yoonchang Han, Jaehun Kim, Kyogu Lee
We train our network from fixed-length music excerpts with a single-labeled predominant instrument and estimate an arbitrary number of predominant instruments from an audio signal with a variable length.
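Estimating an arbitrary number of predominant instruments from a variable-length signal can be sketched as aggregating per-window class probabilities and thresholding; the window outputs, class count, and threshold below are hypothetical, not the paper's reported values.

```python
import numpy as np

def aggregate_predictions(window_probs: np.ndarray,
                          threshold: float = 0.5) -> list:
    """Average per-window class probabilities over a variable-length clip,
    then report every instrument whose mean probability clears the threshold."""
    mean_probs = window_probs.mean(axis=0)
    return [i for i, p in enumerate(mean_probs) if p >= threshold]

# Hypothetical sliding-window outputs: 3 fixed-length windows, 4 instrument classes.
probs = np.array([[0.9, 0.2, 0.6, 0.1],
                  [0.8, 0.1, 0.7, 0.2],
                  [0.7, 0.3, 0.5, 0.1]])
detected = aggregate_predictions(probs)
```

Because the aggregation is over however many windows the clip yields, the same network trained on fixed-length excerpts handles variable-length audio.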
1 code implementation • 20 Aug 2015 • Juhan Nam, Jorge Herrera, Kyogu Lee
Feature learning and deep learning have drawn great attention in recent years as a way of transforming input data into more effective representations using learning algorithms.