Search Results for author: Chieh-Hsin Lai

Found 19 papers, 8 papers with code

Understanding Multimodal Contrastive Learning Through Pointwise Mutual Information

no code implementations • 30 Apr 2024 • Toshimitsu Uesaka, Taiji Suzuki, Yuhta Takida, Chieh-Hsin Lai, Naoki Murata, Yuki Mitsufuji

Multimodal representation learning to integrate different modalities, such as text, vision, and audio is important for real-world applications.

Classification Contrastive Learning +2

Paper
Add Code

HQ-VAE: Hierarchical Discrete Representation Learning with Variational Bayes

no code implementations • 31 Dec 2023 • Yuhta Takida, Yukara Ikemiya, Takashi Shibuya, Kazuki Shimada, Woosung Choi, Chieh-Hsin Lai, Naoki Murata, Toshimitsu Uesaka, Kengo Uchida, Wei-Hsiang Liao, Yuki Mitsufuji

Vector quantization (VQ) is a technique to deterministically learn features with discrete codebook representations.

Quantization Representation Learning

Paper
Add Code

Manifold Preserving Guided Diffusion

no code implementations • 28 Nov 2023 • Yutong He, Naoki Murata, Chieh-Hsin Lai, Yuhta Takida, Toshimitsu Uesaka, Dongjun Kim, Wei-Hsiang Liao, Yuki Mitsufuji, J. Zico Kolter, Ruslan Salakhutdinov, Stefano Ermon

Despite the recent advancements, conditional image generation still faces challenges of cost, generalizability, and the need for task-specific training.

Conditional Image Generation

Paper
Add Code

On the Language Encoder of Contrastive Cross-modal Models

no code implementations • 20 Oct 2023 • Mengjie Zhao, Junya Ono, Zhi Zhong, Chieh-Hsin Lai, Yuhta Takida, Naoki Murata, Wei-Hsiang Liao, Takashi Shibuya, Hiromi Wakaki, Yuki Mitsufuji

Contrastive cross-modal models such as CLIP and CLAP aid various vision-language (VL) and audio-language (AL) tasks.

Sentence Sentence Embedding +1

Paper
Add Code

Consistency Trajectory Models: Learning Probability Flow ODE Trajectory of Diffusion

1 code implementation • 1 Oct 2023 • Dongjun Kim, Chieh-Hsin Lai, Wei-Hsiang Liao, Naoki Murata, Yuhta Takida, Toshimitsu Uesaka, Yutong He, Yuki Mitsufuji, Stefano Ermon

Consistency Models (CM) (Song et al., 2023) accelerate score-based diffusion model sampling at the cost of sample quality but lack a natural way to trade-off quality for speed.

Ranked #2 on Image Generation on CIFAR-10

Denoising Image Generation

180

Paper
Code

VRDMG: Vocal Restoration via Diffusion Posterior Sampling with Multiple Guidance

no code implementations • 13 Sep 2023 • Carlos Hernandez-Olivan, Koichi Saito, Naoki Murata, Chieh-Hsin Lai, Marco A. Martínez-Ramirez, Wei-Hsiang Liao, Yuki Mitsufuji

Restoring degraded music signals is essential to enhance audio quality for downstream music manipulation.

Bandwidth Extension

Paper
Add Code

The Sound Demixing Challenge 2023 $\unicode{x2013}$ Music Demixing Track

2 code implementations • 14 Aug 2023 • Giorgio Fabbro, Stefan Uhlich, Chieh-Hsin Lai, Woosung Choi, Marco Martínez-Ramírez, WeiHsiang Liao, Igor Gadelha, Geraldo Ramos, Eddie Hsu, Hugo Rodrigues, Fabian-Robert Stöter, Alexandre Défossez, Yi Luo, Jianwei Yu, Dipam Chakraborty, Sharada Mohanty, Roman Solovyev, Alexander Stempkovskiy, Tatiana Habruseva, Nabarun Goswami, Tatsuya Harada, Minseok Kim, Jun Hyung Lee, Yuanliang Dong, Xinran Zhang, Jiafeng Liu, Yuki Mitsufuji

We propose a formalization of the errors that can occur in the design of a training dataset for MSS systems and introduce two new datasets that simulate such errors: SDXDB23_LabelNoise and SDXDB23_Bleeding.

Music Source Separation

503

Paper
Code

On the Equivalence of Consistency-Type Models: Consistency Models, Consistent Diffusion Models, and Fokker-Planck Regularization

no code implementations • 1 Jun 2023 • Chieh-Hsin Lai, Yuhta Takida, Toshimitsu Uesaka, Naoki Murata, Yuki Mitsufuji, Stefano Ermon

The emergence of various notions of ``consistency'' in diffusion models has garnered considerable attention and helped achieve improved sample quality, likelihood estimation, and accelerated sampling.

Paper
Add Code

GibbsDDRM: A Partially Collapsed Gibbs Sampler for Solving Blind Inverse Problems with Denoising Diffusion Restoration

1 code implementation • 30 Jan 2023 • Naoki Murata, Koichi Saito, Chieh-Hsin Lai, Yuhta Takida, Toshimitsu Uesaka, Yuki Mitsufuji, Stefano Ermon

Pre-trained diffusion models have been successfully used as priors in a variety of linear inverse problems, where the goal is to reconstruct a signal from noisy linear measurements.

Blind Image Deblurring Denoising +1

Paper
Code

SAN: Inducing Metrizability of GAN with Discriminative Normalized Linear Layer

1 code implementation • 30 Jan 2023 • Yuhta Takida, Masaaki Imaizumi, Takashi Shibuya, Chieh-Hsin Lai, Toshimitsu Uesaka, Naoki Murata, Yuki Mitsufuji

Generative adversarial networks (GANs) learn a target probability distribution by optimizing a generator and a discriminator with minimax objectives.

Ranked #1 on Image Generation on FFHQ 1024 x 1024

Image Generation

Paper
Code

Unsupervised vocal dereverberation with diffusion-based generative models

no code implementations • 8 Nov 2022 • Koichi Saito, Naoki Murata, Toshimitsu Uesaka, Chieh-Hsin Lai, Yuhta Takida, Takao Fukui, Yuki Mitsufuji

Removing reverb from reverberant music is a necessary technique to clean up audio for downstream music manipulations.

Paper
Add Code

FP-Diffusion: Improving Score-based Diffusion Models by Enforcing the Underlying Score Fokker-Planck Equation

1 code implementation • 9 Oct 2022 • Chieh-Hsin Lai, Yuhta Takida, Naoki Murata, Toshimitsu Uesaka, Yuki Mitsufuji, Stefano Ermon

Score-based generative models (SGMs) learn a family of noise-conditional score functions corresponding to the data density perturbed with increasingly large amounts of noise.

Denoising

Paper
Code

SQ-VAE: Variational Bayes on Discrete Representation with Self-annealed Stochastic Quantization

1 code implementation • 16 May 2022 • Yuhta Takida, Takashi Shibuya, WeiHsiang Liao, Chieh-Hsin Lai, Junki Ohmura, Toshimitsu Uesaka, Naoki Murata, Shusuke Takahashi, Toshiyuki Kumakura, Yuki Mitsufuji

In this paper, we propose a new training scheme that extends the standard VAE via novel stochastic dequantization and quantization, called stochastically quantized variational autoencoder (SQ-VAE).

Quantization

165

Paper
Code

Robust Vector Quantized-Variational Autoencoder

no code implementations • 4 Feb 2022 • Chieh-Hsin Lai, Dongmian Zou, Gilad Lerman

We experimentally demonstrate that RVQ-VAE is able to generate examples from inliers even if a large portion of the training data points are corrupted.

Decoder Quantization

Paper
Add Code

Preventing Oversmoothing in VAE via Generalized Variance Parameterization

no code implementations • 17 Feb 2021 • Yuhta Takida, Wei-Hsiang Liao, Chieh-Hsin Lai, Toshimitsu Uesaka, Shusuke Takahashi, Yuki Mitsufuji

Variational autoencoders (VAEs) often suffer from posterior collapse, which is a phenomenon in which the learned latent space becomes uninformative.

Decoder

Paper
Add Code

Unlocking Inverse Problems Using Deep Learning: Breaking Symmetries in Phase Retrieval

no code implementations • 23 Oct 2020 • Kshitij Tayal, Chieh-Hsin Lai, Raunak Manekar, Zhong Zhuang, Vipin Kumar, Ju Sun

In many physical systems, inputs related by intrinsic system symmetries generate the same output.

Retrieval

Paper
Add Code

Novelty Detection via Robust Variational Autoencoding

1 code implementation • 9 Jun 2020 • Chieh-Hsin Lai, Dongmian Zou, Gilad Lerman

We establish both robustness to outliers and suitability to low-rank modeling of the Wasserstein metric as opposed to the KL divergence.

Dimensionality Reduction Novelty Detection

Paper
Code

Inverse Problems, Deep Learning, and Symmetry Breaking

no code implementations • 20 Mar 2020 • Kshitij Tayal, Chieh-Hsin Lai, Vipin Kumar, Ju Sun

In many physical systems, inputs related by intrinsic system symmetries are mapped to the same output.

Retrieval

Paper
Add Code

Robust Subspace Recovery Layer for Unsupervised Anomaly Detection

2 code implementations • ICLR 2020 • Chieh-Hsin Lai, Dongmian Zou, Gilad Lerman

The encoder maps the data into a latent space, from which the RSR layer extracts the subspace.

Ranked #1 on Unsupervised Anomaly Detection with Specified Settings -- 1% anomaly on MNIST

Decoder Unsupervised Anomaly Detection with Specified Settings -- 0.1% anomaly +4

Paper
Code

Cannot find the paper you are looking for? You can Submit a new open access paper.