no code implementations • 2 Apr 2024 • Seokha Moon, Hongbeen Park, Jungphil Kwon, Jaekoo Lee, Jinkyu Kim
To this end, we propose a model called DAP (Detection After Prediction), consisting of a two-branch network: (i) a branch responsible for forecasting the current objects' poses given past observations and (ii) another branch that detects objects based on the current and past observations.
no code implementations • 18 Mar 2024 • Mincheol Chang, Siyeong Lee, Jinkyu Kim, Namil Kim
Typical LiDAR-based 3D object detection models are trained in a supervised manner with real-world data collection, which is often imbalanced over classes (or long-tailed).
no code implementations • 6 Mar 2024 • Gyusam Chang, Wonseok Roh, Sujin Jang, Dongwook Lee, Daehyun Ji, Gyeongrok Oh, Jinsun Park, Jinkyu Kim, Sangpil Kim
Recent LiDAR-based 3D Object Detection (3DOD) methods show promising results, but they often do not generalize well to target domains outside the source (or training) data distribution.
no code implementations • 22 Feb 2024 • Haeji Jung, Changdae Oh, Jooeon Kang, Jimin Sohn, Kyungwoo Song, Jinkyu Kim, David R. Mortensen
Approaches to improving multilingual language understanding often require multiple languages during the training phase, rely on complicated training techniques, and, importantly, struggle with significant performance gaps between high-resource and low-resource languages.
no code implementations • 10 Jan 2024 • Seonguk Seo, Jinkyu Kim, Geeho Kim, Bohyung Han
We propose a novel contrastive learning framework to effectively address the challenges of data heterogeneity in federated learning.
no code implementations • 7 Dec 2023 • Gyeongrok Oh, Jaehwan Jeong, Sieun Kim, Wonmin Byeon, Jinkyu Kim, Sungwoong Kim, Hyeokmin Kwon, Sangpil Kim
Given the sequential nature of video, conditioning on multiple text prompts that describe successive events is necessary for next-step video generation.
no code implementations • 4 Dec 2023 • Daewon Chae, Nokyung Park, Jinkyu Kim, Kimin Lee
In this work, we introduce InstructBooth, a novel method designed to enhance image-text alignment in personalized text-to-image models without sacrificing the personalization ability.
no code implementations • 4 Oct 2023 • Nokyung Park, Daewon Chae, Jeongyong Shim, Sangpil Kim, Eun-Sol Kim, Jinkyu Kim
However, they use the pivot embedding in a global manner (i.e., aligning an image embedding with a sentence-level text embedding), not fully utilizing the semantic cues of the given text description.
no code implementations • ICCV 2023 • Yujin Jeong, Wonjeong Ryoo, SeungHyun Lee, Dabin Seo, Wonmin Byeon, Sangpil Kim, Jinkyu Kim
Hence, we propose The Power of Sound (TPoS) model to incorporate audio input that includes both changeable temporal semantics and magnitude.
1 code implementation • 13 Apr 2023 • Seung Hyun Lee, Sieun Kim, Innfarn Yoo, Feng Yang, Donghyeon Cho, Youngseo Kim, Huiwen Chang, Jinkyu Kim, Sangpil Kim
We propose a method for adding sound-guided visual effects to specific regions of videos in a zero-shot setting.
no code implementations • 18 Jan 2023 • Gyeongrok Oh, Heon Gu, Jinkyu Kim, Sangpil Kim
Interference between overlapping grid patterns creates moiré patterns, degrading the visual quality of an image when an ordinary digital camera captures the screen of a digital display device.
no code implementations • 7 Dec 2022 • Seongbeom Park, Suhong Moon, Jinkyu Kim
Text-to-image generation methods produce high-resolution and high-quality images, but these methods should not produce immoral images that may contain inappropriate content from the perspective of commonsense morality.
no code implementations • 21 Nov 2022 • Seung Hyun Lee, Chanyoung Kim, Wonmin Byeon, Sang Ho Yoon, Jinkyu Kim, Sangpil Kim
We present a novel framework, Localized Image Stylization with Audio (LISA), which performs audio-driven localized image stylization.
1 code implementation • 10 Nov 2022 • Yujin Jeong, Seongbeom Park, Suhong Moon, Jinkyu Kim
Here, we propose a model that predicts visual commonsense immorality in a zero-shot manner.
no code implementations • 7 Oct 2022 • Daeun Lee, Jongwon Park, Jinkyu Kim
An autonomous driving system requires a 3D object detector, which must perceive all present road agents reliably to navigate an environment safely.
no code implementations • 30 Aug 2022 • Seung Hyun Lee, Gyeongrok Oh, Wonmin Byeon, Sang Ho Yoon, Jinkyu Kim, Sangpil Kim
Our extensive experiments show that our sound-guided image manipulation approach produces semantically and visually more plausible results than state-of-the-art text- and sound-guided image manipulation methods, a finding further confirmed by our human evaluations.
1 code implementation • 21 Jul 2022 • Seonwoo Min, Nokyung Park, Siwon Kim, Seunghyun Park, Jinkyu Kim
In this work, we advocate for leveraging natural language supervision for the domain generalization task.
2 code implementations • ICML 2022 • Jinkyu Kim, Geeho Kim, Bohyung Han
A critical challenge of federated learning is data heterogeneity and imbalance across clients, which leads to inconsistency between local networks and unstable convergence of global models.
no code implementations • 7 Jul 2022 • Suhong Moon, Domas Buracas, Seunghyun Park, Jinkyu Kim, John Canny
It also uses a purely dynamic local dispersive force (Brownian motion) that shows improved performance over other methods and does not require knowledge of other particle coordinates.
no code implementations • 2 Jul 2022 • Wonseok Roh, Gyusam Chang, Seokha Moon, Giljoo Nam, Chanyoung Kim, Younghyun Kim, Jinkyu Kim, Sangpil Kim
Current multi-view 3D object detection methods often fail to detect objects in the overlap region properly, and the networks' understanding of the scene is often limited to that of a monocular detection network.
Ranked #6 on Robust Camera Only 3D Object Detection on nuScenes-C
no code implementations • 2 Jun 2022 • Jinkyu Kim, Reza Mahjourian, Scott Ettinger, Mayank Bansal, Brandyn White, Ben Sapp, Dragomir Anguelov
A whole-scene sparse input representation allows StopNet to scale to predicting trajectories for hundreds of road agents with reliable latency.
no code implementations • 20 Apr 2022 • Seung Hyun Lee, Gyeongrok Oh, Wonmin Byeon, Chanyoung Kim, Won Jeong Ryoo, Sang Ho Yoon, Hyunjun Cho, Jihyun Bae, Jinkyu Kim, Sangpil Kim
The recent success in StyleGAN demonstrates that pre-trained StyleGAN latent space is useful for realistic video generation.
no code implementations • 8 Mar 2022 • Reza Mahjourian, Jinkyu Kim, Yuning Chai, Mingxing Tan, Ben Sapp, Dragomir Anguelov
We propose Occupancy Flow Fields, a new representation for motion forecasting of multiple agents, an important task in autonomous driving.
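To give a feel for the representation, the following is a toy sketch: a bird's-eye-view grid where each cell carries an occupancy probability and a 2D flow vector, and occupancy is advanced by moving probability mass along the flow. This is an illustration only; the `warp_occupancy` helper, integer-valued flow, and forward warping are simplifying assumptions, not the paper's actual formulation.

```python
import numpy as np

def warp_occupancy(occupancy, flow):
    """Advance an occupancy grid one step by moving each cell's probability
    mass along its (integer) flow vector.
    occupancy: (H, W) probabilities; flow: (H, W, 2) per-cell (dy, dx)
    displacements measured in cells."""
    H, W = occupancy.shape
    out = np.zeros_like(occupancy)
    for y in range(H):
        for x in range(W):
            ny, nx = y + int(flow[y, x, 0]), x + int(flow[y, x, 1])
            if 0 <= ny < H and 0 <= nx < W:   # mass leaving the grid is dropped
                out[ny, nx] += occupancy[y, x]
    return out

occ = np.zeros((4, 4)); occ[1, 1] = 1.0        # one agent occupying cell (1, 1)
flw = np.zeros((4, 4, 2)); flw[1, 1] = (1, 2)  # moving down 1 cell, right 2 cells
nxt = warp_occupancy(occ, flw)
# nxt[2, 3] == 1.0
```

Coupling occupancy with flow in this way lets a single grid answer both "where will agents be?" and "how are they moving?", which is the appeal of the representation.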
1 code implementation • 10 Jan 2022 • Geeho Kim, Jinkyu Kim, Bohyung Han
To address this challenge, we propose a simple but effective federated learning framework, which improves the consistency across clients and facilitates the convergence of the server model.
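For context, the standard baseline the server-side aggregation builds on can be sketched as FedAvg, where the server model is a data-size-weighted average of client models. The `federated_average` helper below is an assumed illustrative baseline, not the consistency mechanism proposed in the paper.

```python
import numpy as np

def federated_average(client_weights, client_sizes):
    """Baseline FedAvg: the server model is the data-size-weighted average
    of the client model parameters."""
    sizes = np.asarray(client_sizes, dtype=float)
    coeffs = sizes / sizes.sum()              # per-client mixing weights
    stacked = np.stack(client_weights)        # (num_clients, num_params)
    return (coeffs[:, None] * stacked).sum(axis=0)

# Two clients: the one holding 3x the data pulls the server model 3x harder,
# which is exactly why heterogeneous clients can drag the global model around.
w = federated_average([np.array([0.0, 0.0]), np.array([4.0, 8.0])], [1, 3])
# w == [3.0, 6.0]
```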
1 code implementation • CVPR 2022 • Seung Hyun Lee, Wonseok Roh, Wonmin Byeon, Sang Ho Yoon, Chan Young Kim, Jinkyu Kim, Sangpil Kim
Our audio encoder is trained to produce a latent representation from an audio input, which is forced to be aligned with image and text representations in the multi-modal embedding space.
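The alignment objective described here is typically realized with a symmetric InfoNCE-style contrastive loss: matched pairs are pulled together and mismatched pairs pushed apart in the shared embedding space. The numpy sketch below illustrates that general recipe under assumed defaults (cosine similarity, temperature 0.07); it is not the paper's exact training code.

```python
import numpy as np

def l2_normalize(x, axis=-1):
    # Project embeddings onto the unit sphere so dot products are cosine similarities.
    return x / np.linalg.norm(x, axis=axis, keepdims=True)

def infonce_alignment_loss(audio_emb, text_emb, temperature=0.07):
    """Symmetric InfoNCE loss: the i-th audio clip and i-th text caption are a
    positive pair; every other pairing in the batch serves as a negative."""
    a = l2_normalize(audio_emb)
    t = l2_normalize(text_emb)
    logits = a @ t.T / temperature            # (N, N) pairwise similarities
    labels = np.arange(len(a))                # positives sit on the diagonal

    def xent(lg):
        lg = lg - lg.max(axis=1, keepdims=True)          # numerical stability
        logp = lg - np.log(np.exp(lg).sum(axis=1, keepdims=True))
        return -logp[np.arange(len(lg)), labels].mean()

    # Average the audio->text and text->audio directions.
    return 0.5 * (xent(logits) + xent(logits.T))

rng = np.random.default_rng(0)
emb = rng.normal(size=(4, 16))
aligned = infonce_alignment_loss(emb, emb)          # perfectly matched pairs: low loss
shuffled = infonce_alignment_loss(emb, emb[::-1])   # mismatched pairs: higher loss
```

The same loss shape is what makes a new modality (here audio) "slot into" an existing image-text embedding space.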
no code implementations • 28 Oct 2021 • Francis Indaheng, Edward Kim, Kesav Viswanadha, Jay Shenoy, Jinkyu Kim, Daniel J. Fremont, Sanjit A. Seshia
Hence, it is important that these prediction models are extensively tested in various test scenarios involving interactive behaviors prior to deployment.
2 code implementations • ICCV 2021 • Daehee Kim, Seunghyun Park, Jinkyu Kim, Jaekoo Lee
However, the performance of contrastive learning fundamentally depends on quality and quantity of negative data pairs.
Ranked #58 on Domain Generalization on PACS
no code implementations • CVPR 2020 • Jinkyu Kim, Suhong Moon, Anna Rohrbach, Trevor Darrell, John Canny
Humans learn to drive through both practice and theory, e.g., by studying the rules, while most self-driving systems are limited to the former.
no code implementations • 8 May 2020 • Jinkyu Kim, Mayank Bansal
Deep neural networks are a key component of behavior prediction and motion generation for self-driving cars.
no code implementations • CVPR 2019 • Jinkyu Kim, Teruhisa Misu, Yi-Ting Chen, Ashish Tawari, John Canny
We show that taking advice improves the performance of the end-to-end network, while the network cues on a variety of visual features that are provided by advice.
3 code implementations • 7 Aug 2019 • Raehyun Kim, Chan Ho So, Minbyul Jeong, Sang-Hoon Lee, Jinkyu Kim, Jaewoo Kang
Methods that use relational data for stock market prediction have been recently proposed, but they are still in their infancy.
1 code implementation • 24 Mar 2019 • Ye Xia, Jinkyu Kim, John Canny, Karl Zipser, David Whitney
Inspired by human vision, we propose a new periphery-fovea multi-resolution driving model that predicts vehicle speed from dash camera videos.
2 code implementations • ECCV 2018 • Jinkyu Kim, Anna Rohrbach, Trevor Darrell, John Canny, Zeynep Akata
Finally, we explore a version of our model that generates rationalizations, and compare with introspective explanations on the same video segments.
2 code implementations • 17 Nov 2017 • Ye Xia, Danqing Zhang, Jinkyu Kim, Ken Nakayama, Karl Zipser, David Whitney
Because critical driving moments are so rare, collecting enough data for these situations is difficult with the conventional in-car data collection protocol: tracking eye movements during driving.
no code implementations • ICCV 2017 • Jinkyu Kim, John Canny
The attention model highlights image regions that potentially influence the network's output.