no code implementations • 2 Dec 2023 • Lingyu Zhang, Ting Hua, Yilin Shen, Hongxia Jin
In order to achieve this goal, a model has to be "smart" and "knowledgeable".
no code implementations • 12 Apr 2023 • James Seale Smith, Yen-Chang Hsu, Lingyu Zhang, Ting Hua, Zsolt Kira, Yilin Shen, Hongxia Jin
We show that C-LoRA not only outperforms several baselines in our proposed setting of text-to-image continual customization, which we refer to as Continual Diffusion, but also achieves a new state of the art in the well-established rehearsal-free continual learning setting for image classification.
no code implementations • 2 Nov 2022 • Ting Hua, Yen-Chang Hsu, Felicity Wang, Qian Lou, Yilin Shen, Hongxia Jin
However, standard SVD treats the parameters within the matrix with equal importance, which is a simple but unrealistic assumption.
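The contrast can be pictured with a small sketch: plain truncated SVD versus a weighted variant that scales rows by an importance score before factorizing and undoes the scaling afterwards. This is a minimal NumPy illustration under assumed per-row importance scores (the `importance` vector and the row-scaling trick are illustrative stand-ins, not the paper's exact algorithm).

```python
import numpy as np

def truncated_svd(W, rank):
    # Standard SVD: every entry of W is weighted equally.
    U, s, Vt = np.linalg.svd(W, full_matrices=False)
    return (U[:, :rank] * s[:rank]) @ Vt[:rank, :]

def weighted_truncated_svd(W, importance, rank):
    # Importance-weighted variant: scale each row by its importance before
    # factorizing, then undo the scaling, so reconstruction error on
    # important rows is penalized more heavily.
    d = np.sqrt(importance).reshape(-1, 1)      # hypothetical per-row scores
    return truncated_svd(d * W, rank) / d

# Toy usage: rows with higher importance are reconstructed more faithfully.
W = np.random.randn(64, 32)
importance = np.random.rand(64) + 1e-3
W_plain = truncated_svd(W, rank=8)
W_weighted = weighted_truncated_svd(W, importance, rank=8)
```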
no code implementations • ICLR 2022 • Yen-Chang Hsu, Ting Hua, SungEn Chang, Qian Lou, Yilin Shen, Hongxia Jin
In other words, the optimization objective of SVD is not aligned with the trained model's task accuracy.
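A small NumPy illustration of this misalignment, under assumed anisotropic activations: a data-aware truncation accepts a larger weight-reconstruction (Frobenius) error yet produces smaller error on the actual layer outputs. The column-scaling heuristic here is a stand-in for intuition, not the paper's weighted objective.

```python
import numpy as np

rng = np.random.default_rng(0)
W = rng.standard_normal((32, 32))

# Hypothetical activations: a few input directions carry most of the energy.
scales = np.logspace(0, -3, 32)
X = rng.standard_normal((2000, 32)) * scales

def truncate(M, rank):
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    return (U[:, :rank] * s[:rank]) @ Vt[:rank, :]

# Plain truncation minimizes weight reconstruction error (Frobenius norm).
W_plain = truncate(W, rank=8)

# Data-aware truncation: scale input columns by their typical magnitude,
# truncate, then undo the scaling. Worse in Frobenius norm, better on outputs.
col_scale = X.std(axis=0)
W_aware = truncate(W * col_scale, rank=8) / col_scale

for name, W_hat in [("plain SVD", W_plain), ("data-aware", W_aware)]:
    frob = np.linalg.norm(W - W_hat)
    out = np.linalg.norm(X @ (W - W_hat).T)
    print(f"{name:10s}  weight error {frob:6.2f}   output error {out:6.2f}")
```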
no code implementations • NAACL 2021 • Ting Hua, Yilin Shen, Changsheng Zhao, Yen-Chang Hsu, Hongxia Jin
Most existing continual learning approaches suffer from low accuracy and performance fluctuation, especially when the distributions of old and new data are significantly different.
no code implementations • CVPR 2022 • Qian Lou, Yen-Chang Hsu, Burak Uzkent, Ting Hua, Yilin Shen, Hongxia Jin
The key primitive is the Dictionary-Lookup-Transformation (DLT), proposed to replace the Linear Transformation (LT) in multi-modal detectors: each weight matrix of an LT is approximately factorized into a smaller dictionary, indices, and coefficients.
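A minimal sketch of how such a factorized layer could be applied, assuming hypothetical shapes and names (`dictionary`, `indices`, `coefficients`); this follows the description above rather than the paper's exact formulation.

```python
import numpy as np

# A linear layer W of size (out_dim, in_dim) is replaced by a small
# dictionary plus integer indices and scalar coefficients per output row.
out_dim, in_dim, dict_size, lookups = 256, 256, 32, 4

dictionary = np.random.randn(dict_size, in_dim)               # shared atoms
indices = np.random.randint(0, dict_size, (out_dim, lookups))  # which atoms
coefficients = np.random.randn(out_dim, lookups)               # how much of each

def dlt_forward(x):
    # Rebuild each output row as a small combination of dictionary atoms,
    # then apply it as an ordinary linear transformation.
    W_hat = np.einsum("ok,oki->oi", coefficients, dictionary[indices])
    return x @ W_hat.T

y = dlt_forward(np.random.randn(8, in_dim))   # shape (8, out_dim)
```

Storage drops from `out_dim * in_dim` floats to `dict_size * in_dim` floats plus small per-row index and coefficient tables.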
no code implementations • 30 Dec 2021 • Changsheng Zhao, Ting Hua, Yilin Shen, Qian Lou, Hongxia Jin
Knowledge distillation, weight pruning, and quantization are known to be the main directions in model compression.
no code implementations • ICLR 2022 • Qian Lou, Ting Hua, Yen-Chang Hsu, Yilin Shen, Hongxia Jin
DictFormer significantly reduces the redundancy in the transformer's parameters by replacing them with a compact shared dictionary, a few unshared coefficients, and indices.
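The parameter saving can be sketched as follows: one dictionary shared across all layers, while each layer keeps only small unshared indices and coefficients. Shapes and names here are illustrative assumptions, not DictFormer's actual configuration.

```python
import numpy as np

n_layers, d_model, dict_size, lookups = 12, 512, 64, 6
shared_dictionary = np.random.randn(dict_size, d_model)   # shared by all layers

def build_layer_projection(rng):
    # Each layer stores only its own indices and coefficients and
    # reconstructs a (d_model, d_model) projection from the shared atoms.
    indices = rng.integers(0, dict_size, (d_model, lookups))
    coefficients = rng.standard_normal((d_model, lookups))
    return np.einsum("ok,oki->oi", coefficients, shared_dictionary[indices])

rng = np.random.default_rng(0)
projections = [build_layer_projection(rng) for _ in range(n_layers)]

# Rough parameter count: full weights per layer vs. shared dictionary plus
# per-layer coefficients (indices are small integers, cheap to store).
dense = n_layers * d_model * d_model
compact = dict_size * d_model + n_layers * d_model * lookups
print(f"dense: {dense:,}  vs  dictionary-based: {compact:,}")
```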