Search Results for author: Junjie Yang

Found 24 papers, 11 papers with code

Take the Bull by the Horns: Hard Sample-Reweighted Continual Training Improves LLM Generalization

1 code implementation • 22 Feb 2024 • Xuxi Chen, Zhendong Wang, Daouda Sow, Junjie Yang, Tianlong Chen, Yingbin Liang, Mingyuan Zhou, Zhangyang Wang

Our study starts from an empirical strategy for the light continual training of LLMs using their original pre-training data sets, with a specific focus on selective retention of samples that incur moderately high losses.


End-to-end Supervised Prediction of Arbitrary-size Graphs with Partially-Masked Fused Gromov-Wasserstein Matching

no code implementations • 19 Feb 2024 • Paul Krzakala, Junjie Yang, Rémi Flamary, Florence d'Alché-Buc, Charlotte Laclau, Matthieu Labeau

We present a novel end-to-end deep learning-based approach for Supervised Graph Prediction (SGP).


A Large-Scale Empirical Study on Improving the Fairness of Image Classification Models

1 code implementation • 8 Jan 2024 • Junjie Yang, Jiajun Jiang, Zeyu Sun, Junjie Chen

Specifically, we target the widely used application scenario of image classification and utilize three different datasets and five commonly used performance metrics to assess a total of 13 methods from diverse categories.

Fairness • Image Classification


Rethinking PGD Attack: Is Sign Function Necessary?

1 code implementation • 3 Dec 2023 • Junjie Yang, Tianlong Chen, Xuxi Chen, Zhangyang Wang, Yingbin Liang

Based on that, we further propose a new raw gradient descent (RGD) algorithm that eliminates the use of sign.


Meta ControlNet: Enhancing Task Adaptation via Meta Learning

1 code implementation • 3 Dec 2023 • Junjie Yang, Jinze Zhao, Peihao Wang, Zhangyang Wang, Yingbin Liang

However, vanilla ControlNet generally requires extensive training of around 5000 steps to achieve a desirable control for a single task.

Edge Detection • Image Generation • +1


EyeLS: Shadow-Guided Instrument Landing System for Intraocular Target Approaching in Robotic Eye Surgery

no code implementations • 15 Nov 2023 • Junjie Yang, Zhihao Zhao, Siyuan Shen, Daniel Zapp, Mathias Maier, Kai Huang, Nassir Navab, M. Ali Nasseri

Robotic ophthalmic surgery is an emerging technology to facilitate high-precision interventions such as retina penetration in subretinal injection and removal of floating tissues in retinal detachment depending on the input imaging modalities such as microscopy and intraoperative OCT (iOCT).


Exploiting Edge Features in Graphs with Fused Network Gromov-Wasserstein Distance

no code implementations • 28 Sep 2023 • Junjie Yang, Matthieu Labeau, Florence d'Alché-Buc

Pairwise comparison of graphs is key to many applications in machine learning, ranging from clustering and kernel-based classification/regression to, more recently, supervised graph prediction.


M-L2O: Towards Generalizable Learning-to-Optimize by Test-Time Fast Self-Adaptation

1 code implementation • 28 Feb 2023 • Junjie Yang, Xuxi Chen, Tianlong Chen, Zhangyang Wang, Yingbin Liang

This data-driven procedure yields L2O that can efficiently solve problems similar to those seen in training, that is, drawn from the same "task distribution".


Learning to Generalize Provably in Learning to Optimize

1 code implementation • 22 Feb 2023 • Junjie Yang, Tianlong Chen, Mingkang Zhu, Fengxiang He, DaCheng Tao, Yingbin Liang, Zhangyang Wang

While the optimizer generalization has been recently studied, the optimizee generalization (or learning to generalize) has not been rigorously studied in the L2O context, which is the aim of this paper.


Embedded Silicon-Organic Integrated Neuromorphic System

no code implementations • 18 Oct 2022 • Shengjie Zheng, Ling Liu, Junjie Yang, Jianwei Zhang, Tao Su, Bin Yue, Xiaojian Li

The development of artificial intelligence (AI) and that of robotics are both based on the tenet that "science and technology are people-oriented", and both need to achieve efficient communication with the human brain.


APT-36K: A Large-scale Benchmark for Animal Pose Estimation and Tracking

4 code implementations • 12 Jun 2022 • Yuxiang Yang, Junjie Yang, Yufei Xu, Jing Zhang, Long Lan, DaCheng Tao

Based on APT-36K, we benchmark several representative models on the following three tracks: (1) supervised animal pose estimation on a single frame under intra- and inter-domain transfer learning settings, (2) inter-species domain generalization test for unseen animals, and (3) animal pose estimation with animal tracking.

Animal Pose Estimation • Domain Generalization • +1


ColibriDoc: An Eye-in-Hand Autonomous Trocar Docking System

no code implementations • 30 Nov 2021 • Shervin Dehghani, Michael Sommersperger, Junjie Yang, Benjamin Busam, Kai Huang, Peter Gehlbach, Iulian Iordachita, Nassir Navab, M. Ali Nasseri

For this purpose, we present a platform for autonomous trocar docking that combines computer vision and a robotic setup.

Medical Procedure • Navigate • +1


Generalizable Learning to Optimize into Wide Valleys

no code implementations • 29 Sep 2021 • Junjie Yang, Tianlong Chen, Mingkang Zhu, Fengxiang He, DaCheng Tao, Yingbin Liang, Zhangyang Wang

Learning to optimize (L2O) has gained increasing popularity in various optimization tasks, since classical optimizers usually require laborious, problem-specific design and hyperparameter tuning.


Provably Faster Algorithms for Bilevel Optimization

1 code implementation • NeurIPS 2021 • Junjie Yang, Kaiyi Ji, Yingbin Liang

Bilevel optimization has been widely applied in many important machine learning applications such as hyperparameter optimization and meta-learning.

Bilevel Optimization • Hyperparameter Optimization • +1


Software-Hardware Co-design for Fast and Scalable Training of Deep Learning Recommendation Models

no code implementations • 12 Apr 2021 • Dheevatsa Mudigere, Yuchen Hao, Jianyu Huang, Zhihao Jia, Andrew Tulloch, Srinivas Sridharan, Xing Liu, Mustafa Ozdal, Jade Nie, Jongsoo Park, Liang Luo, Jie Amy Yang, Leon Gao, Dmytro Ivchenko, Aarti Basant, Yuxi Hu, Jiyan Yang, Ehsan K. Ardestani, Xiaodong Wang, Rakesh Komuravelli, Ching-Hsiang Chu, Serhat Yilmaz, Huayu Li, Jiyuan Qian, Zhuobo Feng, Yinbin Ma, Junjie Yang, Ellie Wen, Hong Li, Lin Yang, Chonglin Sun, Whitney Zhao, Dimitry Melts, Krishna Dhulipala, KR Kishore, Tyler Graf, Assaf Eisenman, Kiran Kumar Matam, Adi Gangidi, Guoqiang Jerry Chen, Manoj Krishnan, Avinash Nayak, Krishnakumar Nair, Bharath Muthiah, Mahmoud khorashadi, Pallab Bhattacharya, Petr Lapukhov, Maxim Naumov, Ajit Mathews, Lin Qiao, Mikhail Smelyanskiy, Bill Jia, Vijay Rao

Deep learning recommendation models (DLRMs) are used across many business-critical services at Facebook and are the single largest AI application in terms of infrastructure demand in its data-centers.


Neural Network Training Techniques Regularize Optimization Trajectory: An Empirical Study

no code implementations • 13 Nov 2020 • Cheng Chen, Junjie Yang, Yi Zhou

Specifically, we find that the optimization trajectories of successful DNN trainings consistently obey a certain regularity principle that regularizes the model update direction to be aligned with the trajectory direction.


Bilevel Optimization: Convergence Analysis and Enhanced Design

2 code implementations • 15 Oct 2020 • Kaiyi Ji, Junjie Yang, Yingbin Liang

For the AID-based method, we orderwisely improve the previous convergence rate analysis due to a more practical parameter selection as well as a warm start strategy, and for the ITD-based method we establish the first theoretical convergence rate.

Bilevel Optimization • Hyperparameter Optimization • +1


Provably Faster Algorithms for Bilevel Optimization and Applications to Meta-Learning

no code implementations • 28 Sep 2020 • Kaiyi Ji, Junjie Yang, Yingbin Liang

For the AID-based method, we orderwisely improve the previous finite-time convergence analysis due to a more practical parameter selection as well as a warm start strategy, and for the ITD-based method we establish the first theoretical convergence rate.

Bilevel Optimization • Hyperparameter Optimization • +1


Multi-span Style Extraction for Generative Reading Comprehension

no code implementations • 15 Sep 2020 • Junjie Yang, Zhuosheng Zhang, Hai Zhao

Generative machine reading comprehension (MRC) requires a model to generate well-formed answers.

Answer Generation • Machine Reading Comprehension


Theoretical Convergence of Multi-Step Model-Agnostic Meta-Learning

2 code implementations • 18 Feb 2020 • Kaiyi Ji, Junjie Yang, Yingbin Liang

As a popular meta-learning approach, the model-agnostic meta-learning (MAML) algorithm has been widely used due to its simplicity and effectiveness.

Meta-Learning


Retrospective Reader for Machine Reading Comprehension

2 code implementations • 27 Jan 2020 • Zhuosheng Zhang, Junjie Yang, Hai Zhao

Inspired by how humans solve reading comprehension questions, we propose a retrospective reader (Retro-Reader) that integrates two stages of reading and verification strategies: 1) sketchy reading that briefly investigates the overall interactions of passage and question and yields an initial judgment; 2) intensive reading that verifies the answer and gives the final prediction.

Ranked #7 on Question Answering on SQuAD2.0

Machine Reading Comprehension • Question Answering


Deepening Hidden Representations from Pre-trained Language Models

no code implementations • 5 Nov 2019 • Junjie Yang, Hai Zhao

Transformer-based pre-trained language models have proven to be effective for learning contextualized language representation.

Natural Language Understanding


An Optimization Principle Of Deep Learning?

no code implementations • 25 Sep 2019 • Cheng Chen, Junjie Yang, Yi Zhou

In particular, we observe that the trainings that apply the training techniques achieve accelerated convergence and obey the principle with a large $\gamma$, which is consistent with the $\mathcal{O}(1/\gamma K)$ convergence rate result under the optimization principle.


SGD Converges to Global Minimum in Deep Learning via Star-convex Path

no code implementations • ICLR 2019 • Yi Zhou, Junjie Yang, Huishuai Zhang, Yingbin Liang, Vahid Tarokh

Stochastic gradient descent (SGD) has been found to be surprisingly effective in training a variety of deep neural networks.

