Search Results for author: Sang-Hoon Lee

Found 21 papers, 9 papers with code

TranSentence: Speech-to-speech Translation via Language-agnostic Sentence-level Speech Encoding without Language-parallel Data

no code implementations 17 Jan 2024 Seung-bin Kim, Sang-Hoon Lee, Seong-Whan Lee

With this method, despite training exclusively on the target language's monolingual data, we can generate target language speech in the inference stage using language-agnostic speech embedding from the source language speech.

Sentence, Speech-to-Speech Translation +1

DurFlex-EVC: Duration-Flexible Emotional Voice Conversion with Parallel Generation

1 code implementation 16 Jan 2024 Hyung-Seok Oh, Sang-Hoon Lee, Deok-Hyeon Cho, Seong-Whan Lee

Emotional voice conversion (EVC) seeks to modify the emotional tone of a speaker's voice while preserving the original linguistic content and the speaker's unique vocal characteristics.

Disentanglement, Self-Supervised Learning +1

DiffProsody: Diffusion-based Latent Prosody Generation for Expressive Speech Synthesis with Prosody Conditional Adversarial Training

1 code implementation 31 Jul 2023 Hyung-Seok Oh, Sang-Hoon Lee, Seong-Whan Lee

Expressive text-to-speech systems have undergone significant advancements owing to prosody modeling, but conventional methods can still be improved.

Denoising, Expressive Speech Synthesis

HierVST: Hierarchical Adaptive Zero-shot Voice Style Transfer

no code implementations 30 Jul 2023 Sang-Hoon Lee, Ha-Yeong Choi, Hyung-Seok Oh, Seong-Whan Lee

With a hierarchical adaptive structure, the model can adapt to a novel voice style and convert speech progressively.

Style Transfer, Variational Inference

HiddenSinger: High-Quality Singing Voice Synthesis via Neural Audio Codec and Latent Diffusion Models

no code implementations 12 Jun 2023 Ji-Sang Hwang, Sang-Hoon Lee, Seong-Whan Lee

To alleviate the challenges posed by model complexity in singing voice synthesis, we propose HiddenSinger, a high-quality singing voice synthesis system using a neural audio codec and latent diffusion models.

Denoising, Singing Voice Synthesis +1

DDDM-VC: Decoupled Denoising Diffusion Models with Disentangled Representation and Prior Mixup for Verified Robust Voice Conversion

no code implementations 25 May 2023 Ha-Yeong Choi, Sang-Hoon Lee, Seong-Whan Lee

To address the above problem, this paper presents decoupled denoising diffusion models (DDDMs) with disentangled representations, which can control the style for each attribute in generative models.

Attribute, Denoising +2

VoiceMixer: Adversarial Voice Style Mixup

no code implementations NeurIPS 2021 Sang-Hoon Lee, Ji-Hoon Kim, Hyunseung Chung, Seong-Whan Lee

This insufficiency leads to the converted speech containing source speech style or losing source speech content.

Disentanglement, Voice Conversion

GC-TTS: Few-shot Speaker Adaptation with Geometric Constraints

no code implementations 16 Aug 2021 Ji-Hoon Kim, Sang-Hoon Lee, Ji-Hyun Lee, Hong-Gyu Jung, Seong-Whan Lee

While numerous attempts have been made at few-shot speaker adaptation, there is still a gap in speaker similarity to the target speaker, depending on the amount of data.

Reinforce-Aligner: Reinforcement Alignment Search for Robust End-to-End Text-to-Speech

no code implementations 5 Jun 2021 Hyunseung Chung, Sang-Hoon Lee, Seong-Whan Lee

Experimental results also show the superiority of our proposed model compared to other state-of-the-art TTS models with internal and external aligners.

Fre-GAN: Adversarial Frequency-consistent Audio Synthesis

2 code implementations 4 Jun 2021 Ji-Hoon Kim, Sang-Hoon Lee, Ji-Hyun Lee, Seong-Whan Lee

Although recent work on neural vocoders has improved the quality of synthesized audio, there still exists a gap between generated and ground-truth audio in frequency space.

Audio Synthesis

GraphX-Convolution for Point Cloud Deformation in 2D-to-3D Conversion

1 code implementation ICCV 2019 Anh-Duc Nguyen, Seonghwa Choi, Woojae Kim, Sang-Hoon Lee

In this paper, we present a novel deep method to reconstruct a point cloud of an object from a single still image.

3D Reconstruction, Object

HATS: A Hierarchical Graph Attention Network for Stock Movement Prediction

3 code implementations 7 Aug 2019 Raehyun Kim, Chan Ho So, Minbyul Jeong, Sang-Hoon Lee, Jinkyu Kim, Jaewoo Kang

Methods that use relational data for stock market prediction have been recently proposed, but they are still in their infancy.

Graph Attention, Graph Classification +2

Propagating LSTM: 3D Pose Estimation based on Joint Interdependency

no code implementations ECCV 2018 Kyoungoh Lee, Inwoong Lee, Sang-Hoon Lee

We present a novel 3D pose estimation method based on joint interdependency (JI) for acquiring 3D joints from the human pose of an RGB image.

3D Human Pose Estimation, 3D Pose Estimation

Deep Learning of Human Visual Sensitivity in Image Quality Assessment Framework

1 code implementation CVPR 2017 Jongyoo Kim, Sang-Hoon Lee

Since human observers are the ultimate receivers of digital images, image quality metrics should be designed from a human-oriented perspective.

Image Quality Assessment
