1 code implementation • 5 Feb 2024 • Junyoung Seo, Susung Hong, Wooseok Jang, Inès Hyeonsu Kim, Minseop Kwak, Doyup Lee, Seungryong Kim
We leverage the retrieved asset to incorporate its geometric prior in the variational objective and adapt the diffusion model's 2D prior toward view consistency, achieving drastic improvements in both geometry and fidelity of generated scenes.
no code implementations • 12 Dec 2023 • Yoonwoo Jeong, Jinwoo Lee, Chiheon Kim, Minsu Cho, Doyup Lee
Transfer learning of large-scale Text-to-Image (T2I) models has recently shown impressive potential for Novel View Synthesis (NVS) of diverse objects from a single image.
no code implementations • CVPR 2023 • Minsoo Kang, Doyup Lee, Jiseob Kim, Saehoon Kim, Bohyung Han
We propose a text-to-image generation algorithm based on deep neural networks when text captions for images are unavailable during training.
1 code implementation • CVPR 2023 • Jaehoon Yoo, Semin Kim, Doyup Lee, Chiheon Kim, Seunghoon Hong
However, transformers are prevented from directly learning long-term dependencies in videos by the quadratic complexity of self-attention, and they inherently suffer from slow inference and error propagation due to the autoregressive process.
Ranked #24 on Video Generation on UCF-101
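The quadratic cost mentioned above can be made concrete with a small sketch (the frame and patch counts are illustrative, not from the paper): a video of T frames, each tokenized into P patches, gives a sequence of n = T * P tokens, and full self-attention forms an n × n score matrix.

```python
def attention_score_entries(num_frames, patches_per_frame):
    """Number of entries in the self-attention score matrix for a
    video tokenized into num_frames * patches_per_frame tokens."""
    n = num_frames * patches_per_frame
    return n * n

short = attention_score_entries(16, 64)   # 1024 tokens
long = attention_score_entries(64, 64)    # 4096 tokens

# 4x more frames -> 16x more attention entries: quadratic growth.
print(short, long, long // short)
```

This is why sequence length, not just model size, dominates the cost of autoregressive video transformers.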
1 code implementation • CVPR 2023 • Chiheon Kim, Doyup Lee, Saehoon Kim, Minsu Cho, Wook-Shin Han
Despite recent advances in implicit neural representations (INRs), it remains challenging for a coordinate-based multi-layer perceptron (MLP) of INRs to learn a common representation across data instances and generalize it for unseen instances.
no code implementations • 9 Jun 2022 • Doyup Lee, Chiheon Kim, Saehoon Kim, Minsu Cho, Wook-Shin Han
After code stacks in the sequence are randomly masked, the Contextual RQ-Transformer is trained to infill the masked code stacks from the unmasked context of the image.
Ranked #1 on Text-to-Image Generation on Conceptual Captions
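The masking step described above can be sketched as follows; the mask symbol, code values, and mask ratio are all made up for illustration and are not the paper's actual tokenization.

```python
import random

MASK = "<m>"  # hypothetical mask symbol

def mask_code_stacks(code_stacks, mask_ratio, rng):
    """Randomly replace whole code stacks (columns of residual codes)
    with a mask token; the model is then trained to infill the masked
    positions from the unmasked context."""
    masked, targets = [], {}
    for i, stack in enumerate(code_stacks):
        if rng.random() < mask_ratio:
            targets[i] = stack      # ground truth for the infilling loss
            masked.append(MASK)
        else:
            masked.append(stack)
    return masked, targets

rng = random.Random(0)
codes = [(3, 7), (1, 4), (9, 2), (5, 5)]   # four stacks of depth-2 codes
masked_seq, targets = mask_code_stacks(codes, 0.5, rng)
```

Because whole stacks (not individual codes) are masked, the infilling target at each masked position is the full column of residual codes.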
3 code implementations • CVPR 2022 • Doyup Lee, Chiheon Kim, Saehoon Kim, Minsu Cho, Wook-Shin Han
However, we postulate that previous VQ cannot simultaneously shorten the code sequence and generate high-fidelity images, owing to the rate-distortion trade-off.
Ranked #2 on Text-to-Image Generation on Conceptual Captions
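Residual quantization, the core idea behind this line of work, can be sketched in a few lines: instead of one code from a huge codebook, each value is represented by a short stack of codes, where each depth quantizes the residual left by the previous one. The scalar values and codebook below are invented for illustration.

```python
def residual_quantize(x, codebook, depth):
    """Approximate x by a stack of `depth` codes from a shared codebook,
    quantizing the remaining residual at each step."""
    residual = x
    codes = []
    for _ in range(depth):
        # nearest codebook entry to the current residual
        idx = min(range(len(codebook)), key=lambda i: abs(residual - codebook[i]))
        codes.append(idx)
        residual -= codebook[idx]
    return codes, x - residual   # code stack and its reconstruction

codebook = [-0.4, -0.1, 0.0, 0.1, 0.4]
codes, recon = residual_quantize(0.57, codebook, depth=3)
```

Each additional depth shrinks the residual (here 0.57 → 0.17 → 0.07 → 0.03), so a small codebook reused across depths can reach the precision that plain VQ would need an exponentially larger codebook for.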
no code implementations • 17 Jan 2022 • Doyup Lee, Sungwoong Kim, Ildoo Kim, Yeongjae Cheon, Minsu Cho, Wook-Shin Han
Consistency regularization on label predictions has become a fundamental technique in semi-supervised learning, but it still requires a large number of training iterations to reach high performance.
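In its simplest (Pi-model-style) form, the consistency term penalizes disagreement between the model's predictions on two perturbed views of the same unlabeled input; the probability vectors below are made up for illustration.

```python
def consistency_loss(p_weak, p_strong):
    """Mean squared difference between predicted class distributions for
    a weakly and a strongly augmented view of the same unlabeled input."""
    return sum((a - b) ** 2 for a, b in zip(p_weak, p_strong)) / len(p_weak)

# Hypothetical predictions for two augmented views of one image.
p_weak = [0.7, 0.2, 0.1]
p_strong = [0.5, 0.3, 0.2]
loss = consistency_loss(p_weak, p_strong)
```

Minimizing this term pushes the model toward predictions that are invariant to the perturbations, which is the regularization signal the unlabeled data provides.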
no code implementations • 21 Sep 2020 • Doyup Lee, Yeongjae Cheon, Wook-Shin Han
The results imply that cross-modal attention in VQA is important to improve not only VQA accuracy, but also the robustness to various anomalies.
no code implementations • 7 Jul 2020 • Doyup Lee, Yeongjae Cheon
Soft labeling has become a common output-regularization technique for the generalization and compression of deep neural networks.
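Label smoothing is the most common form of the soft labeling mentioned above; a minimal sketch (the epsilon value is illustrative):

```python
def smooth_labels(one_hot, epsilon):
    """Label smoothing: move epsilon of the probability mass from the
    one-hot target to a uniform distribution over all classes."""
    k = len(one_hot)
    return [(1.0 - epsilon) * p + epsilon / k for p in one_hot]

soft = smooth_labels([0.0, 1.0, 0.0], epsilon=0.1)
# true class keeps 0.9 + 0.1/3 of the mass; the rest is spread uniformly
```

The smoothed targets discourage over-confident logits, which is why the same trick also serves as a soft target in model compression.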
3 code implementations • 26 May 2019 • Doyup Lee, Suehun Jung, Yeongjae Cheon, Dongil Kim, Seungil You
TGNet learns an autoregressive model conditioned on the temporal contexts of the forecasting targets via a temporal-guided embedding.
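A toy sketch of the two ingredients named above, autoregression plus conditioning on the target's temporal context; the linear update rule and the per-hour bias table are invented for illustration and are not TGNet's actual architecture.

```python
def autoregressive_forecast(history, horizon, temporal_bias):
    """Each step predicts the next value from the previous one plus a
    bias looked up from the temporal context (e.g. hour of day) of the
    forecasting target, then feeds the prediction back in."""
    preds = []
    last = history[-1]
    t = len(history)                      # time index of the next target
    for _ in range(horizon):
        nxt = 0.9 * last + temporal_bias[t % len(temporal_bias)]
        preds.append(nxt)
        last = nxt                        # autoregressive feedback
        t += 1
    return preds

bias = [0.0, 1.0, 2.0, 1.0]               # hypothetical per-period demand bias
preds = autoregressive_forecast([10.0, 9.0, 8.0], horizon=2, temporal_bias=bias)
```

The key point is that the bias is indexed by the *target's* timestamp, not the history's, so forecasts at different horizons receive different temporal context.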
1 code implementation • 8 Aug 2017 • Doyup Lee
In this paper, I propose an automatic DBMS diagnosis system that detects anomalous periods from abnormal DB status metrics and identifies the causal events within those periods.