1 code implementation • CVPR 2023 • Chiheon Kim, Doyup Lee, Saehoon Kim, Minsu Cho, Wook-Shin Han
Despite recent advances in implicit neural representations (INRs), it remains challenging for a coordinate-based multi-layer perceptron (MLP) of INRs to learn a common representation across data instances and generalize it to unseen instances.
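The core object here, a coordinate-based MLP, maps input coordinates directly to signal values. A minimal sketch of evaluating such a network is below; the layer sizes, random weights, and ReLU activation are illustrative assumptions, not the paper's architecture.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical layer sizes: an INR maps a coordinate to a signal value,
# e.g. (x, y) -> (r, g, b) for an image. Weights here are random stand-ins.
sizes = [2, 64, 64, 3]
params = [(rng.standard_normal((m, n)) * np.sqrt(2.0 / m), np.zeros(n))
          for m, n in zip(sizes[:-1], sizes[1:])]

def inr_mlp(coords, params):
    """Evaluate a coordinate-based MLP at an array of (x, y) coordinates."""
    h = coords
    for W, b in params[:-1]:
        h = np.maximum(h @ W + b, 0.0)  # ReLU hidden layers
    W, b = params[-1]
    return h @ W + b                    # linear output: one RGB value per coord

# Query the network on a dense grid to "render" the represented image.
xs = np.linspace(-1.0, 1.0, 32)
grid = np.stack(np.meshgrid(xs, xs), axis=-1).reshape(-1, 2)
rgb = inr_mlp(grid, params)
print(rgb.shape)  # (1024, 3)
```

Generalizing across instances then amounts to sharing (parts of) `params` between signals, which is what makes the problem hard for a single MLP.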
no code implementations • 9 Jun 2022 • Doyup Lee, Chiheon Kim, Saehoon Kim, Minsu Cho, Wook-Shin Han
After code stacks in the sequence are randomly masked, the Contextual RQ-Transformer is trained to infill the masked code stacks based on the unmasked contexts of the image.
Ranked #1 on Text-to-Image Generation on Conceptual Captions
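The masking step described in the abstract can be sketched as follows; the sequence length, stack depth, codebook size, and mask token are assumed values for illustration, and the transformer that performs the infilling is omitted.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical shapes: a sequence of T spatial positions, each holding a
# stack of D residual codes drawn from a codebook of size K.
T, D, K = 16, 4, 256
MASK = K  # a reserved mask token outside the codebook range

codes = rng.integers(0, K, size=(T, D))

# Randomly mask whole code stacks (all D codes at a position at once).
mask_ratio = 0.5
masked_pos = rng.random(T) < mask_ratio
inputs = codes.copy()
inputs[masked_pos] = MASK

# The (omitted) model is trained to predict codes[masked_pos]
# from the unmasked context inputs[~masked_pos].
print(int(masked_pos.sum()), "of", T, "stacks masked")
```

Masking at the stack level, rather than per individual code, keeps each spatial position either fully observed or fully hidden.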
3 code implementations • CVPR 2022 • Doyup Lee, Chiheon Kim, Saehoon Kim, Minsu Cho, Wook-Shin Han
However, we postulate that, due to the rate-distortion trade-off, previous VQ methods cannot simultaneously shorten the code sequence and generate high-fidelity images.
Ranked #2 on Text-to-Image Generation on Conceptual Captions
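Residual quantization, the mechanism behind this line of work, represents each feature vector as a stack of codes so that a short spatial sequence can still carry low distortion. A minimal greedy sketch, with assumed codebook sizes and random codebooks standing in for learned ones:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical residual quantizer: D stacked codebooks of K entries each.
# Increasing depth D reduces distortion without lengthening the spatial
# code sequence, which is the rate-distortion angle in the abstract.
D, K, dim = 4, 32, 8
codebooks = rng.standard_normal((D, K, dim))

def residual_quantize(x, codebooks):
    """Greedily encode x as a stack of D codes; return codes and final residual."""
    residual, codes = x.copy(), []
    for cb in codebooks:
        idx = int(np.argmin(np.linalg.norm(cb - residual, axis=1)))  # nearest entry
        codes.append(idx)
        residual = residual - cb[idx]  # quantize whatever error remains
    return codes, residual

x = rng.standard_normal(dim)
codes, residual = residual_quantize(x, codebooks)
print(codes, float(np.linalg.norm(residual)))
```

Decoding sums the selected entries back up, so one position costs `D` codes instead of one code from an exponentially larger codebook.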
no code implementations • 17 Jan 2022 • Doyup Lee, Sungwoong Kim, Ildoo Kim, Yeongjae Cheon, Minsu Cho, Wook-Shin Han
Consistency regularization on label predictions has become a fundamental technique in semi-supervised learning, but it still requires a large number of training iterations to reach high performance.
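Consistency regularization penalizes disagreement between a model's predictions on two augmented views of the same unlabeled input. A minimal sketch, with random logits standing in for a real model and augmentations, and a simple MSE consistency term (one common choice among several):

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(z):
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

# Hypothetical stand-ins: logits for an unlabeled batch of 4 examples over
# 10 classes, under a weak and a strong augmentation of the same inputs.
logits_weak = rng.standard_normal((4, 10))
logits_strong = logits_weak + 0.1 * rng.standard_normal((4, 10))

p_weak = softmax(logits_weak)    # typically treated as a fixed (stop-gradient) target
p_strong = softmax(logits_strong)

# Consistency loss: mean squared disagreement between class probabilities.
consistency_loss = float(np.mean((p_weak - p_strong) ** 2))
print(consistency_loss)
```

This unsupervised term is added to the usual supervised loss on the labeled subset; the paper's contribution concerns how many iterations such training needs.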
no code implementations • 21 Sep 2020 • Doyup Lee, Yeongjae Cheon, Wook-Shin Han
The results imply that cross-modal attention in VQA is important for improving not only VQA accuracy but also robustness to various anomalies.
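Cross-modal attention in VQA lets question tokens gather information from image-region features. A minimal scaled dot-product sketch, with assumed feature counts and random features in place of a real encoder:

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(z):
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

# Hypothetical VQA features: Tq question tokens attend over R image regions.
Tq, R, d = 5, 36, 64
q = rng.standard_normal((Tq, d))  # question-token queries
k = rng.standard_normal((R, d))   # image-region keys
v = rng.standard_normal((R, d))   # image-region values

# Scaled dot-product cross-modal attention: each question token receives
# a weighted mixture of visual features.
attn = softmax(q @ k.T / np.sqrt(d))  # (Tq, R) weights, rows sum to 1
fused = attn @ v                      # (Tq, d) vision-conditioned features
print(fused.shape)  # (5, 64)
```

The attention weights make explicit which regions each question word relies on, which is one route by which such models gain robustness to input anomalies.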