Search Results for author: Jinkyu Kim

Found 35 papers, 11 papers with code

Learning Temporal Cues by Predicting Objects Move for Multi-camera 3D Object Detection

no code implementations 2 Apr 2024 Seokha Moon, Hongbeen Park, Jungphil Kwon, Jaekoo Lee, Jinkyu Kim

To this end, we propose a model called DAP (Detection After Prediction), consisting of a two-branch network: (i) a branch responsible for forecasting the current objects' poses given past observations and (ii) another branch that detects objects based on the current and past observations.

3D Object Detection Autonomous Driving +1
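The snippet above names a two-branch layout: one branch forecasts the current objects' poses from past observations, and the other detects objects from current and past observations. The PyTorch fragment below is a minimal, hypothetical sketch of that layout only; module names, feature shapes, and heads are illustrative assumptions, not the authors' DAP implementation.

```python
# Hypothetical two-branch "detect after predicting" sketch (not DAP's code).
import torch
import torch.nn as nn

class PredictionBranch(nn.Module):
    """Forecasts current object poses from a sequence of past features."""
    def __init__(self, feat_dim=256):
        super().__init__()
        self.temporal = nn.GRU(feat_dim, feat_dim, batch_first=True)
        self.pose_head = nn.Linear(feat_dim, 7)   # x, y, z, w, l, h, yaw

    def forward(self, past_feats):                # (B, T, C)
        hidden, _ = self.temporal(past_feats)
        return self.pose_head(hidden[:, -1])      # predicted current poses (B, 7)

class DetectionBranch(nn.Module):
    """Detects objects from current + past observations."""
    def __init__(self, feat_dim=256, num_classes=10):
        super().__init__()
        self.fuse = nn.Linear(2 * feat_dim, feat_dim)
        self.cls_head = nn.Linear(feat_dim, num_classes)
        self.box_head = nn.Linear(feat_dim, 7)

    def forward(self, current_feat, past_summary):
        x = torch.relu(self.fuse(torch.cat([current_feat, past_summary], dim=-1)))
        return self.cls_head(x), self.box_head(x)

class DetectAfterPredict(nn.Module):
    def __init__(self):
        super().__init__()
        self.predict = PredictionBranch()
        self.detect = DetectionBranch()

    def forward(self, past_feats, current_feat):
        forecast = self.predict(past_feats)                        # branch (i)
        cls, box = self.detect(current_feat, past_feats[:, -1])    # branch (ii)
        return forecast, cls, box
```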

Just Add $100 More: Augmenting NeRF-based Pseudo-LiDAR Point Cloud for Resolving Class-imbalance Problem

no code implementations 18 Mar 2024 Mincheol Chang, Siyeong Lee, Jinkyu Kim, Namil Kim

Typical LiDAR-based 3D object detection models are trained in a supervised manner on collected real-world data, which is often imbalanced across classes (i.e., long-tailed).

3D Object Detection object-detection

CMDA: Cross-Modal and Domain Adversarial Adaptation for LiDAR-Based 3D Object Detection

no code implementations 6 Mar 2024 Gyusam Chang, Wonseok Roh, Sujin Jang, Dongwook Lee, Daehyun Ji, Gyeongrok Oh, Jinsun Park, Jinkyu Kim, Sangpil Kim

Recent LiDAR-based 3D Object Detection (3DOD) methods show promising results, but they often do not generalize well to target domains outside the source (or training) data distribution.

3D Object Detection object-detection +1

Mitigating the Linguistic Gap with Phonemic Representations for Robust Multilingual Language Understanding

no code implementations 22 Feb 2024 Haeji Jung, Changdae Oh, Jooeon Kang, Jimin Sohn, Kyungwoo Song, Jinkyu Kim, David R. Mortensen

Approaches to improving multilingual language understanding often require multiple languages during the training phase, rely on complicated training techniques, and -- importantly -- struggle with significant performance gaps between high-resource and low-resource languages.

Language Modelling

Relaxed Contrastive Learning for Federated Learning

no code implementations 10 Jan 2024 Seonguk Seo, Jinkyu Kim, Geeho Kim, Bohyung Han

We propose a novel contrastive learning framework to effectively address the challenges of data heterogeneity in federated learning.

Contrastive Learning Federated Learning
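The snippet only says that a contrastive framework is used to counter data heterogeneity across clients. As a generic illustration of how such a term can enter a client's local objective, the sketch below adds an InfoNCE-style regularizer that pulls a client's features toward the global model's features for the same samples; this is an assumption-laden sketch, not the paper's "relaxed" formulation.

```python
# Generic InfoNCE-style feature regularizer for a federated client's local
# update (illustrative only; not the paper's exact relaxed contrastive loss).
import torch
import torch.nn.functional as F

def contrastive_regularizer(local_feats, global_feats, temperature=0.5):
    """Pull each local feature toward the global model's feature for the same
    sample and push it away from features of other samples in the batch."""
    local_feats = F.normalize(local_feats, dim=1)     # (B, D)
    global_feats = F.normalize(global_feats, dim=1)   # (B, D)
    logits = local_feats @ global_feats.t() / temperature   # (B, B)
    targets = torch.arange(local_feats.size(0), device=local_feats.device)
    return F.cross_entropy(logits, targets)

# Usage inside a client's training step (task_loss is the usual objective):
# loss = task_loss + 0.1 * contrastive_regularizer(local_model_feats,
#                                                  global_model_feats.detach())
```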

MTVG: Multi-text Video Generation with Text-to-Video Models

no code implementations 7 Dec 2023 Gyeongrok Oh, Jaehwan Jeong, Sieun Kim, Wonmin Byeon, Jinkyu Kim, Sungwoong Kim, Hyeokmin Kwon, Sangpil Kim

Given the characteristics of video, multi-text conditioning that incorporates sequential events is necessary for next-step video generation.

Video Generation

InstructBooth: Instruction-following Personalized Text-to-Image Generation

no code implementations 4 Dec 2023 Daewon Chae, Nokyung Park, Jinkyu Kim, Kimin Lee

In this work, we introduce InstructBooth, a novel method designed to enhance image-text alignment in personalized text-to-image models without sacrificing the personalization ability.

Instruction Following Text-to-Image Generation

Clustering-based Image-Text Graph Matching for Domain Generalization

no code implementations 4 Oct 2023 Nokyung Park, Daewon Chae, Jeongyong Shim, Sangpil Kim, Eun-Sol Kim, Jinkyu Kim

However, they use the pivot embedding in a global manner (i.e., aligning an image embedding with a sentence-level text embedding), not fully utilizing the semantic cues of the given text description.

Clustering Domain Generalization +2

The Power of Sound (TPoS): Audio Reactive Video Generation with Stable Diffusion

no code implementations ICCV 2023 Yujin Jeong, Wonjeong Ryoo, SeungHyun Lee, Dabin Seo, Wonmin Byeon, Sangpil Kim, Jinkyu Kim

Hence, we propose The Power of Sound (TPoS) model to incorporate audio input that includes both changeable temporal semantics and magnitude.

Video Generation

FPANet: Frequency-based Video Demoireing using Frame-level Post Alignment

no code implementations 18 Jan 2023 Gyeongrok Oh, Heon Gu, Jinkyu Kim, Sangpil Kim

Interference between overlapping grid patterns creates moiré patterns, degrading the visual quality of an image captured of a digital display screen by an ordinary digital camera.

SSIM

Ensuring Visual Commonsense Morality for Text-to-Image Generation

no code implementations 7 Dec 2022 Seongbeom Park, Suhong Moon, Jinkyu Kim

Text-to-image generation methods produce high-resolution and high-quality images, but these methods should not produce immoral images that may contain inappropriate content from the perspective of commonsense morality.

Image Manipulation Text-to-Image Generation

LISA: Localized Image Stylization with Audio via Implicit Neural Representation

no code implementations 21 Nov 2022 Seung Hyun Lee, Chanyoung Kim, Wonmin Byeon, Sang Ho Yoon, Jinkyu Kim, Sangpil Kim

We present a novel framework, Localized Image Stylization with Audio (LISA) which performs audio-driven localized image stylization.

Image Stylization Object +1

Zero-shot Visual Commonsense Immorality Prediction

1 code implementation 10 Nov 2022 Yujin Jeong, Seongbeom Park, Suhong Moon, Jinkyu Kim

Here, we propose a model that predicts visual commonsense immorality in a zero-shot manner.

Ethics

Resolving Class Imbalance for LiDAR-based Object Detector by Dynamic Weight Average and Contextual Ground Truth Sampling

no code implementations 7 Oct 2022 Daeun Lee, Jongwon Park, Jinkyu Kim

An autonomous driving system requires a 3D object detector, which must perceive all present road agents reliably to navigate an environment safely.

Autonomous Driving Navigate
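The title names Dynamic Weight Average (DWA) as one ingredient for rebalancing per-class losses. The sketch below follows the commonly used DWA recipe (weights driven by the ratio of each class's loss across recent epochs); it is a hedged illustration and may differ in detail from the paper's variant.

```python
# Common Dynamic Weight Average (DWA) recipe applied to per-class losses
# (illustrative; details may differ from the paper).
import numpy as np

def dwa_weights(loss_history, temperature=2.0):
    """loss_history: list of per-class loss vectors from previous epochs,
    e.g. [losses_epoch_t-2, losses_epoch_t-1]; returns per-class weights."""
    num_classes = len(loss_history[-1])
    if len(loss_history) < 2:
        return np.ones(num_classes)             # uniform until enough history
    ratio = np.asarray(loss_history[-1]) / (np.asarray(loss_history[-2]) + 1e-8)
    exp = np.exp(ratio / temperature)
    return num_classes * exp / exp.sum()        # weights sum to num_classes

# Example: a class whose loss decays more slowly receives a larger weight.
history = [[1.0, 1.0, 1.0], [0.5, 0.9, 0.7]]
print(dwa_weights(history))
```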

Robust Sound-Guided Image Manipulation

no code implementations 30 Aug 2022 Seung Hyun Lee, Gyeongrok Oh, Wonmin Byeon, Sang Ho Yoon, Jinkyu Kim, Sangpil Kim

Our extensive experiments show that our sound-guided image manipulation approach produces semantically and visually more plausible manipulation results than the state-of-the-art text and sound-guided image manipulation methods, which are further confirmed by our human evaluations.

Image Manipulation

Grounding Visual Representations with Texts for Domain Generalization

1 code implementation 21 Jul 2022 Seonwoo Min, Nokyung Park, Siwon Kim, Seunghyun Park, Jinkyu Kim

In this work, we advocate for leveraging natural language supervision for the domain generalization task.

Domain Generalization

Multi-Level Branched Regularization for Federated Learning

2 code implementations ICML 2022 Jinkyu Kim, Geeho Kim, Bohyung Han

A critical challenge of federated learning is data heterogeneity and imbalance across clients, which leads to inconsistency between local networks and unstable convergence of global models.

Federated Learning Knowledge Distillation
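Given the snippet's emphasis on inconsistency between local networks and the global model, and the Knowledge Distillation tag, one natural reading is a distillation-style regularizer applied at several network depths during local updates. The sketch below is a generic multi-level feature-distillation term under that assumption, not the paper's exact branched regularization scheme.

```python
# Illustrative multi-level feature distillation from the frozen global model
# during a client's local update (a generic sketch, not the paper's scheme).
import torch
import torch.nn.functional as F

def multilevel_kd_loss(local_feats, global_feats, weights=None):
    """local_feats / global_feats: lists of intermediate feature maps taken at
    several depths of the local and (frozen) global networks."""
    if weights is None:
        weights = [1.0] * len(local_feats)
    loss = 0.0
    for w, f_local, f_global in zip(weights, local_feats, global_feats):
        loss = loss + w * F.mse_loss(f_local, f_global.detach())
    return loss

# Usage during local training:
# kd = multilevel_kd_loss(local_intermediate_feats, global_intermediate_feats)
# loss = task_loss + lambda_kd * kd
```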

An Embedding-Dynamic Approach to Self-supervised Learning

no code implementations 7 Jul 2022 Suhong Moon, Domas Buracas, Seunghyun Park, Jinkyu Kim, John Canny

It also uses a purely-dynamic local dispersive force (Brownian motion) that shows improved performance over other methods and does not require knowledge of other particle coordinates.

Classification Image Classification +7

ORA3D: Overlap Region Aware Multi-view 3D Object Detection

no code implementations 2 Jul 2022 Wonseok Roh, Gyusam Chang, Seokha Moon, Giljoo Nam, Chanyoung Kim, Younghyun Kim, Jinkyu Kim, Sangpil Kim

Current multi-view 3D object detection methods often fail to detect objects in the overlap region properly, and the networks' understanding of the scene is often limited to that of a monocular detection network.

3D Object Detection Disparity Estimation +4

StopNet: Scalable Trajectory and Occupancy Prediction for Urban Autonomous Driving

no code implementations 2 Jun 2022 Jinkyu Kim, Reza Mahjourian, Scott Ettinger, Mayank Bansal, Brandyn White, Ben Sapp, Dragomir Anguelov

A whole-scene sparse input representation allows StopNet to scale to predicting trajectories for hundreds of road agents with reliable latency.

Motion Forecasting

Sound-Guided Semantic Video Generation

no code implementations 20 Apr 2022 Seung Hyun Lee, Gyeongrok Oh, Wonmin Byeon, Chanyoung Kim, Won Jeong Ryoo, Sang Ho Yoon, Hyunjun Cho, Jihyun Bae, Jinkyu Kim, Sangpil Kim

The recent success of StyleGAN demonstrates that a pre-trained StyleGAN latent space is useful for realistic video generation.

Video Editing Video Generation

Occupancy Flow Fields for Motion Forecasting in Autonomous Driving

no code implementations 8 Mar 2022 Reza Mahjourian, Jinkyu Kim, Yuning Chai, Mingxing Tan, Ben Sapp, Dragomir Anguelov

We propose Occupancy Flow Fields, a new representation for motion forecasting of multiple agents, an important task in autonomous driving.

Motion Estimation Motion Forecasting
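The snippet describes Occupancy Flow Fields as a representation rather than a per-agent trajectory set. A minimal way to picture it, assuming a bird's-eye-view grid with an occupancy probability and a 2D flow vector per cell and per future timestep (shapes here are illustrative, not the paper's configuration), is sketched below.

```python
# Minimal sketch of an occupancy-flow-field output: for each future timestep
# and each BEV grid cell, an occupancy probability plus a 2D flow vector.
import torch

num_timesteps, grid_h, grid_w = 8, 256, 256
occupancy = torch.sigmoid(torch.randn(num_timesteps, grid_h, grid_w))  # in [0, 1]
flow = torch.randn(num_timesteps, grid_h, grid_w, 2)                   # (dx, dy) per cell

# A dense motion forecast for the whole scene is then just these two tensors;
# individual agent trajectories are not enumerated explicitly.
occupancy_flow_field = {"occupancy": occupancy, "flow": flow}
print(occupancy_flow_field["occupancy"].shape, occupancy_flow_field["flow"].shape)
```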

Communication-Efficient Federated Learning with Accelerated Client Gradient

1 code implementation 10 Jan 2022 Geeho Kim, Jinkyu Kim, Bohyung Han

To address this challenge, we propose a simple but effective federated learning framework, which improves the consistency across clients and facilitates the convergence of the server model.

Federated Learning

Sound-Guided Semantic Image Manipulation

1 code implementation CVPR 2022 Seung Hyun Lee, Wonseok Roh, Wonmin Byeon, Sang Ho Yoon, Chan Young Kim, Jinkyu Kim, Sangpil Kim

Our audio encoder is trained to produce a latent representation from an audio input, which is forced to be aligned with image and text representations in the multi-modal embedding space.

Audio Classification Image Classification +2
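The snippet says the audio encoder's output is forced to align with image and text representations in a shared multi-modal embedding space. A common way to impose such alignment is a CLIP-style symmetric contrastive loss; the sketch below illustrates that idea only, with loss details and weighting chosen as assumptions rather than taken from the paper's training code.

```python
# Sketch of aligning an audio embedding with image/text embeddings in a shared
# space via a symmetric contrastive loss (illustrative, CLIP-style).
import torch
import torch.nn.functional as F

def alignment_loss(audio_emb, image_emb, text_emb, temperature=0.07):
    audio_emb = F.normalize(audio_emb, dim=1)
    image_emb = F.normalize(image_emb, dim=1)
    text_emb = F.normalize(text_emb, dim=1)
    targets = torch.arange(audio_emb.size(0), device=audio_emb.device)
    loss = 0.0
    for other in (image_emb, text_emb):          # align audio with both modalities
        logits = audio_emb @ other.t() / temperature
        loss = loss + 0.5 * (F.cross_entropy(logits, targets) +
                             F.cross_entropy(logits.t(), targets))
    return loss / 2
```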

A Scenario-Based Platform for Testing Autonomous Vehicle Behavior Prediction Models in Simulation

no code implementations 28 Oct 2021 Francis Indaheng, Edward Kim, Kesav Viswanadha, Jay Shenoy, Jinkyu Kim, Daniel J. Fremont, Sanjit A. Seshia

Hence, it is important that these prediction models are extensively tested in various test scenarios involving interactive behaviors prior to deployment.

Probabilistic Programming

Attentional Bottleneck: Towards an Interpretable Deep Driving Network

no code implementations 8 May 2020 Jinkyu Kim, Mayank Bansal

Deep neural networks are a key component of behavior prediction and motion generation for self-driving cars.

Self-Driving Cars

Grounding Human-to-Vehicle Advice for Self-driving Vehicles

no code implementations CVPR 2019 Jinkyu Kim, Teruhisa Misu, Yi-Ting Chen, Ashish Tawari, John Canny

We show that taking advice improves the performance of the end-to-end network, while the network cues on a variety of visual features that are provided by advice.

HATS: A Hierarchical Graph Attention Network for Stock Movement Prediction

3 code implementations 7 Aug 2019 Raehyun Kim, Chan Ho So, Minbyul Jeong, Sang-Hoon Lee, Jinkyu Kim, Jaewoo Kang

Methods that use relational data for stock market prediction have been recently proposed, but they are still in their infancy.

Graph Attention Graph Classification +2

Periphery-Fovea Multi-Resolution Driving Model guided by Human Attention

1 code implementation 24 Mar 2019 Ye Xia, Jinkyu Kim, John Canny, Karl Zipser, David Whitney

Inspired by human vision, we propose a new periphery-fovea multi-resolution driving model that predicts vehicle speed from dash camera videos.
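The snippet describes a two-resolution design: a peripheral view of the whole frame plus a foveal, high-resolution region, combined to predict vehicle speed. The sketch below is a hypothetical two-stream layout under that reading; module names, input sizes, and the fusion head are illustrative assumptions, not the released implementation.

```python
# Hypothetical periphery-fovea two-stream sketch: low-resolution full frame plus
# a high-resolution crop, fused to regress vehicle speed (not the paper's code).
import torch
import torch.nn as nn
import torch.nn.functional as F

class PeripheryFoveaNet(nn.Module):
    def __init__(self, feat_dim=128):
        super().__init__()
        make_backbone = lambda: nn.Sequential(
            nn.Conv2d(3, 32, 5, stride=2, padding=2), nn.ReLU(),
            nn.Conv2d(32, feat_dim, 5, stride=2, padding=2), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten())
        self.periphery = make_backbone()   # sees the whole, downsampled frame
        self.fovea = make_backbone()       # sees a high-resolution crop
        self.speed_head = nn.Linear(2 * feat_dim, 1)

    def forward(self, frame, crop):
        low_res = F.interpolate(frame, size=(72, 128))    # peripheral view
        feats = torch.cat([self.periphery(low_res), self.fovea(crop)], dim=-1)
        return self.speed_head(feats)                     # predicted speed
```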

Textual Explanations for Self-Driving Vehicles

2 code implementations ECCV 2018 Jinkyu Kim, Anna Rohrbach, Trevor Darrell, John Canny, Zeynep Akata

Finally, we explore a version of our model that generates rationalizations, and compare with introspective explanations on the same video segments.

Predicting Driver Attention in Critical Situations

2 code implementations 17 Nov 2017 Ye Xia, Danqing Zhang, Jinkyu Kim, Ken Nakayama, Karl Zipser, David Whitney

Because critical driving moments are so rare, collecting enough data for these situations is difficult with the conventional in-car data collection protocol: tracking eye movements during driving.

Autonomous Driving Driver Attention Monitoring
