no code implementations • 18 Mar 2024 • Abhay Zala, Jaemin Cho, Han Lin, Jaehong Yoon, Mohit Bansal
Instead of directly employing LLMs as agents, can we use LLMs' reasoning capabilities to adaptively create training environments to help smaller embodied RL agents learn useful skills that they are weak at?
no code implementations • 11 Mar 2024 • Jialu Li, Jaemin Cho, Yi-Lin Sung, Jaehong Yoon, Mohit Bansal
In this paper, we introduce SELMA: Skill-Specific Expert Learning and Merging with Auto-Generated Data, a novel paradigm to improve the faithfulness of T2I models by fine-tuning models on automatically generated, multi-skill image-text datasets, with skill-specific expert learning and merging.
no code implementations • 13 Feb 2024 • Daeun Lee, Jaehong Yoon, Sung Ju Hwang
We validate our method outperforms multiple CTTA scenarios including disjoint and gradual domain shits, while only requiring ~98% fewer trainable parameters.
1 code implementation • 8 Feb 2024 • Shoubin Yu, Jaehong Yoon, Mohit Bansal
Furthermore, we propose a fusion module designed to compress multimodal queries, maintaining computational efficiency in the LLM while combining additional modalities.
Ranked #1 on Question Answering on SQA3D
1 code implementation • 19 Jan 2024 • Xiyao Wang, YuHang Zhou, Xiaoyu Liu, Hongjin Lu, Yuancheng Xu, Feihong He, Jaehong Yoon, Taixi Lu, Gedas Bertasius, Mohit Bansal, Huaxiu Yao, Furong Huang
However, current MLLM benchmarks are predominantly designed to evaluate reasoning based on static information about a single image, and the ability of modern MLLMs to extrapolate from image sequences, which is essential for understanding our ever-changing world, has been less investigated.
2 code implementations • 19 Dec 2023 • Haeyong Kang, Jaehong Yoon, Sung Ju Hwang, Chang D. Yoo
Inspired by the Lottery Ticket Hypothesis (LTH), which highlights the existence of efficient subnetworks within larger, dense networks, a high-performing Winning Subnetwork (WSN) in terms of task performance under appropriate sparsity conditions is considered for various continual learning tasks.
1 code implementation • 17 Nov 2023 • Xiaohui Zhang, Jaehong Yoon, Mohit Bansal, Huaxiu Yao
This optimization process is controlled by a gradient modification mechanism to prevent the shared head from losing previously acquired information.
no code implementations • 14 Nov 2023 • Yujin Kim, Jaehong Yoon, Seonghyeon Ye, Sangmin Bae, Namgyu Ho, Sung Ju Hwang, Se-Young Yun
The dynamic nature of knowledge in an ever-changing world presents challenges for language models trained on static data; the model in the real world often requires not only acquiring new knowledge but also overwriting outdated information into updated ones.
no code implementations • 12 Oct 2023 • Jaewoo Lee, Jaehong Yoon, Wonjae Kim, Yunji Kim, Sung Ju Hwang
Continuously learning a variety of audio-video semantics over time is crucial for audio-related reasoning tasks in our ever-evolving world.
no code implementations • 4 Oct 2023 • Yi-Lin Sung, Jaehong Yoon, Mohit Bansal
We first determine the sparsity ratios of different layers or blocks by leveraging the global importance score, which is efficiently computed based on the zeroth-order approximation of the global model gradients.
1 code implementation • 1 Oct 2023 • Yiyang Zhou, Chenhang Cui, Jaehong Yoon, Linjun Zhang, Zhun Deng, Chelsea Finn, Mohit Bansal, Huaxiu Yao
Large vision-language models (LVLMs) have shown remarkable abilities in understanding visual information with human languages.
no code implementations • 21 Jun 2023 • Jaehong Yoon, Sung Ju Hwang, Yue Cao
We believe this paper breaks the barriers between pre-training and fine-tuning steps and leads to a sustainable learning framework in which the continual learner incrementally improves model generalization, yielding better transfer to unseen tasks.
2 code implementations • 20 Jun 2023 • Haeyong Kang, Jaehong Yoon, Dahyun Kim, Sung Ju Hwang, Chang D Yoo
Motivated by continual learning, this work investigates how to accumulate and transfer neural implicit representations for multiple complex video data over sequential encoding sessions.
no code implementations • ICCV 2023 • Jaewoong Lee, Sangwon Jang, Jaehyeong Jo, Jaehong Yoon, Yunji Kim, Jin-Hwa Kim, Jung-Woo Ha, Sung Ju Hwang
Token-based masked generative models are gaining popularity for their fast inference time with parallel decoding.
1 code implementation • 27 Mar 2023 • Haeyong Kang, Jaehong Yoon, Sultan Rizky Madjid, Sung Ju Hwang, Chang D. Yoo
Inspired by Regularized Lottery Ticket Hypothesis (RLTH), which states that competitive smooth (non-binary) subnetworks exist within a dense network in continual learning tasks, we investigate two proposed architecture-based continual learning methods which sequentially learn and select adaptive binary- (WSN) and non-binary Soft-Subnetworks (SoftNet) for each task.
1 code implementation • 19 Nov 2022 • Sunil Hwang, Jaehong Yoon, Youngwan Lee, Sung Ju Hwang
Masked Video Autoencoder (MVA) approaches have demonstrated their potential by significantly outperforming previous video representation learning methods.
Ranked #1 on Object State Change Classification on Ego4D
Object State Change Classification Object State Change Classification on Ego4D +4
2 code implementations • 15 Sep 2022 • Haeyong Kang, Jaehong Yoon, Sultan Rizky Hikmawan Madjid, Sung Ju Hwang, Chang D. Yoo
Inspired by Regularized Lottery Ticket Hypothesis (RLTH), which hypothesizes that there exist smooth (non-binary) subnetworks within a dense network that achieve the competitive performance of the dense network, we propose a few-shot class incremental learning (FSCIL) method referred to as \emph{Soft-SubNetworks (SoftNet)}.
no code implementations • 4 Jul 2022 • Geon Park, Jaehong Yoon, Haiyang Zhang, Xing Zhang, Sung Ju Hwang, Yonina C. Eldar
Neural network quantization aims to transform high-precision weights and activations of a given neural network into low-precision weights/activations for reduced memory usage and computation, while preserving the performance of the original model.
1 code implementation • International Conference on Machine Learning 2022 • Haeyong Kang, Rusty John Lloyd Mina, Sultan Rizky Hikmawan Madjid, Jaehong Yoon, Mark Hasegawa-Johnson, Sung Ju Hwang, Chang D. Yoo
Inspired by Lottery Ticket Hypothesis that competitive subnetworks exist within a dense network, we propose a continual learning method referred to as Winning SubNetworks (WSN), which sequentially learns and selects an optimal subnetwork for each task.
1 code implementation • 21 Jun 2022 • Jinheon Baek, Wonyong Jeong, Jiongdao Jin, Jaehong Yoon, Sung Ju Hwang
To this end, we introduce a new subgraph FL problem, personalized subgraph FL, which focuses on the joint improvement of the interrelated local GNNs rather than learning a single global model, and propose a novel framework, FEDerated Personalized sUBgraph learning (FED-PUB), to tackle it.
no code implementations • 23 Feb 2022 • Jaehong Yoon, Geon Park, Wonyong Jeong, Sung Ju Hwang
We introduce a pragmatic FL scenario with bitwidth heterogeneity across the participating devices, dubbed as Bitwidth Heterogeneous Federated Learning (BHFL).
2 code implementations • ICLR 2022 • Divyam Madaan, Jaehong Yoon, Yuanchun Li, Yunxin Liu, Sung Ju Hwang
Continual learning (CL) aims to learn a sequence of tasks without forgetting the previously acquired knowledge.
no code implementations • ICLR 2022 • Jaehong Yoon, Divyam Madaan, Eunho Yang, Sung Ju Hwang
We validate the effectiveness of our coreset selection mechanism over various standard, imbalanced, and noisy datasets against strong continual learning baselines, demonstrating that it improves task adaptation and prevents catastrophic forgetting in a sample-efficient manner.
no code implementations • 1 Jan 2021 • Minyoung Song, Jaehong Yoon, Eunho Yang, Sung Ju Hwang
As deep neural networks are growing in size and being increasingly deployed to more resource-limited devices, there has been a recent surge of interest in network pruning methods, which aim to remove less important weights or activations of a given network.
1 code implementation • ICLR 2021 • Wonyong Jeong, Jaehong Yoon, Eunho Yang, Sung Ju Hwang
Through extensive experimental validation of our method in the two different scenarios, we show that our method outperforms both local semi-supervised learning and baselines which naively combine federated learning with semi-supervised learning.
no code implementations • 22 Jun 2020 • Minyoung Song, Jaehong Yoon, Eunho Yang, Sung Ju Hwang
As deep neural networks are growing in size and being increasingly deployed to more resource-limited devices, there has been a recent surge of interest in network pruning methods, which aim to remove less important weights or activations of a given network.
1 code implementation • 6 Mar 2020 • Jaehong Yoon, Wonyong Jeong, Giwoong Lee, Eunho Yang, Sung Ju Hwang
There has been a surge of interest in continual learning and federated learning, both of which are important in deep neural networks in real-world scenarios.
1 code implementation • ICLR 2020 • Jaehong Yoon, Saehoon Kim, Eunho Yang, Sung Ju Hwang
First, a continual learning model should effectively handle catastrophic forgetting and be efficient to train even with a large number of tasks.
no code implementations • 27 Sep 2018 • Juho Lee, Saehoon Kim, Jaehong Yoon, Hae Beom Lee, Eunho Yang, Sung Ju Hwang
With such input-independent dropout, each neuron is evolved to be generic across inputs, which makes it difficult to sparsify networks without accuracy loss.
1 code implementation • 28 May 2018 • Juho Lee, Saehoon Kim, Jaehong Yoon, Hae Beom Lee, Eunho Yang, Sung Ju Hwang
With such input-independent dropout, each neuron is evolved to be generic across inputs, which makes it difficult to sparsify networks without accuracy loss.
3 code implementations • ICLR 2018 • Jaehong Yoon, Eunho Yang, Jeongtae Lee, Sung Ju Hwang
We propose a novel deep network architecture for lifelong learning which we refer to as Dynamically Expandable Network (DEN), that can dynamically decide its network capacity as it trains on a sequence of tasks, to learn a compact overlapping knowledge sharing structure among tasks.
1 code implementation • ICML 2017 • Jaehong Yoon, Sung Ju Hwang
The number of parameters in a deep neural network is usually very large, which helps with its learning capacity but also hinders its scalability and practicality due to memory/time inefficiency and overfitting.