1 code implementation • 22 Feb 2024 • Xuxi Chen, Zhendong Wang, Daouda Sow, Junjie Yang, Tianlong Chen, Yingbin Liang, Mingyuan Zhou, Zhangyang Wang
Our study starts from an empirical strategy for the light continual training of LLMs using their original pre-training data sets, with a specific focus on selective retention of samples that incur moderately high losses.
no code implementations • 19 Feb 2024 • Paul Krzakala, Junjie Yang, Rémi Flamary, Florence d'Alché-Buc, Charlotte Laclau, Matthieu Labeau
We present a novel end-to-end deep learning-based approach for Supervised Graph Prediction (SGP).
1 code implementation • 8 Jan 2024 • Junjie Yang, Jiajun Jiang, Zeyu Sun, Junjie Chen
Specifically, we target the widely-used application scenario of image classification, and utilized three different datasets and five commonly-used performance metrics to assess in total 13 methods from diverse categories.
1 code implementation • 3 Dec 2023 • Junjie Yang, Tianlong Chen, Xuxi Chen, Zhangyang Wang, Yingbin Liang
Based on that, we further propose a new raw gradient descent (RGD) algorithm that eliminates the use of sign.
1 code implementation • 3 Dec 2023 • Junjie Yang, Jinze Zhao, Peihao Wang, Zhangyang Wang, Yingbin Liang
However, vanilla ControlNet generally requires extensive training of around 5000 steps to achieve a desirable control for a single task.
no code implementations • 15 Nov 2023 • Junjie Yang, Zhihao Zhao, Siyuan Shen, Daniel Zapp, Mathias Maier, Kai Huang, Nassir Navab, M. Ali Nasseri
Robotic ophthalmic surgery is an emerging technology to facilitate high-precision interventions such as retina penetration in subretinal injection and removal of floating tissues in retinal detachment depending on the input imaging modalities such as microscopy and intraoperative OCT (iOCT).
no code implementations • 28 Sep 2023 • Junjie Yang, Matthieu Labeau, Florence d'Alché-Buc
Pairwise comparison of graphs is key to many applications in Machine learning ranging from clustering, kernel-based classification/regression and more recently supervised graph prediction.
1 code implementation • 28 Feb 2023 • Junjie Yang, Xuxi Chen, Tianlong Chen, Zhangyang Wang, Yingbin Liang
This data-driven procedure yields L2O that can efficiently solve problems similar to those seen in training, that is, drawn from the same ``task distribution".
1 code implementation • 22 Feb 2023 • Junjie Yang, Tianlong Chen, Mingkang Zhu, Fengxiang He, DaCheng Tao, Yingbin Liang, Zhangyang Wang
While the optimizer generalization has been recently studied, the optimizee generalization (or learning to generalize) has not been rigorously studied in the L2O context, which is the aim of this paper.
no code implementations • 18 Oct 2022 • Shengjie Zheng, Ling Liu, Junjie Yang, Jianwei Zhang, Tao Su, Bin Yue, Xiaojian Li
The development of artificial intelligence (AI) and robotics are both based on the tenet of "science and technology are people-oriented", and both need to achieve efficient communication with the human brain.
4 code implementations • 12 Jun 2022 • Yuxiang Yang, Junjie Yang, Yufei Xu, Jing Zhang, Long Lan, DaCheng Tao
Based on APT-36K, we benchmark several representative models on the following three tracks: (1) supervised animal pose estimation on a single frame under intra- and inter-domain transfer learning settings, (2) inter-species domain generalization test for unseen animals, and (3) animal pose estimation with animal tracking.
no code implementations • 30 Nov 2021 • Shervin Dehghani, Michael Sommersperger, Junjie Yang, Benjamin Busam, Kai Huang, Peter Gehlbach, Iulian Iordachita, Nassir Navab, M. Ali Nasseri
For this purpose, we present a platform for autonomous trocar docking that combines computer vision and a robotic setup.
no code implementations • 29 Sep 2021 • Junjie Yang, Tianlong Chen, Mingkang Zhu, Fengxiang He, DaCheng Tao, Yingbin Liang, Zhangyang Wang
Learning to optimize (L2O) has gained increasing popularity in various optimization tasks, since classical optimizers usually require laborious, problem-specific design and hyperparameter tuning.
1 code implementation • NeurIPS 2021 • Junjie Yang, Kaiyi Ji, Yingbin Liang
Bilevel optimization has been widely applied in many important machine learning applications such as hyperparameter optimization and meta-learning.
no code implementations • 12 Apr 2021 • Dheevatsa Mudigere, Yuchen Hao, Jianyu Huang, Zhihao Jia, Andrew Tulloch, Srinivas Sridharan, Xing Liu, Mustafa Ozdal, Jade Nie, Jongsoo Park, Liang Luo, Jie Amy Yang, Leon Gao, Dmytro Ivchenko, Aarti Basant, Yuxi Hu, Jiyan Yang, Ehsan K. Ardestani, Xiaodong Wang, Rakesh Komuravelli, Ching-Hsiang Chu, Serhat Yilmaz, Huayu Li, Jiyuan Qian, Zhuobo Feng, Yinbin Ma, Junjie Yang, Ellie Wen, Hong Li, Lin Yang, Chonglin Sun, Whitney Zhao, Dimitry Melts, Krishna Dhulipala, KR Kishore, Tyler Graf, Assaf Eisenman, Kiran Kumar Matam, Adi Gangidi, Guoqiang Jerry Chen, Manoj Krishnan, Avinash Nayak, Krishnakumar Nair, Bharath Muthiah, Mahmoud khorashadi, Pallab Bhattacharya, Petr Lapukhov, Maxim Naumov, Ajit Mathews, Lin Qiao, Mikhail Smelyanskiy, Bill Jia, Vijay Rao
Deep learning recommendation models (DLRMs) are used across many business-critical services at Facebook and are the single largest AI application in terms of infrastructure demand in its data-centers.
no code implementations • 13 Nov 2020 • Cheng Chen, Junjie Yang, Yi Zhou
Specifically, we find that the optimization trajectories of successful DNN trainings consistently obey a certain regularity principle that regularizes the model update direction to be aligned with the trajectory direction.
2 code implementations • 15 Oct 2020 • Kaiyi Ji, Junjie Yang, Yingbin Liang
For the AID-based method, we orderwisely improve the previous convergence rate analysis due to a more practical parameter selection as well as a warm start strategy, and for the ITD-based method we establish the first theoretical convergence rate.
no code implementations • 28 Sep 2020 • Kaiyi Ji, Junjie Yang, Yingbin Liang
For the AID-based method, we orderwisely improve the previous finite-time convergence analysis due to a more practical parameter selection as well as a warm start strategy, and for the ITD-based method we establish the first theoretical convergence rate.
no code implementations • 15 Sep 2020 • Junjie Yang, Zhuosheng Zhang, Hai Zhao
Generative machine reading comprehension (MRC) requires a model to generate well-formed answers.
2 code implementations • 18 Feb 2020 • Kaiyi Ji, Junjie Yang, Yingbin Liang
As a popular meta-learning approach, the model-agnostic meta-learning (MAML) algorithm has been widely used due to its simplicity and effectiveness.
2 code implementations • 27 Jan 2020 • Zhuosheng Zhang, Junjie Yang, Hai Zhao
Inspired by how humans solve reading comprehension questions, we proposed a retrospective reader (Retro-Reader) that integrates two stages of reading and verification strategies: 1) sketchy reading that briefly investigates the overall interactions of passage and question, and yield an initial judgment; 2) intensive reading that verifies the answer and gives the final prediction.
Ranked #7 on Question Answering on SQuAD2.0
no code implementations • 5 Nov 2019 • Junjie Yang, Hai Zhao
Transformer-based pre-trained language models have proven to be effective for learning contextualized language representation.
no code implementations • 25 Sep 2019 • Cheng Chen, Junjie Yang, Yi Zhou
In particular, we observe that the trainings that apply the training techniques achieve accelerated convergence and obey the principle with a large $\gamma$, which is consistent with the $\mathcal{O}(1/\gamma K)$ convergence rate result under the optimization principle.
no code implementations • ICLR 2019 • Yi Zhou, Junjie Yang, Huishuai Zhang, Yingbin Liang, Vahid Tarokh
Stochastic gradient descent (SGD) has been found to be surprisingly effective in training a variety of deep neural networks.