Search Results for author: Yimin Hu

Found 6 papers, 2 papers with code

Prior Constraints-based Reward Model Training for Aligning Large Language Models

1 code implementation • 1 Apr 2024 • Hang Zhou, Chenglong Wang, Yimin Hu, Tong Xiao, Chunliang Zhang, Jingbo Zhu

Reinforcement learning from human feedback for aligning large language models (LLMs) typically trains a reward model using a ranking loss over comparison pairs. However, the training procedure suffers from an inherent problem: reward scores scale uncontrollably during reinforcement learning because no constraints are imposed while training the reward model. This paper proposes a Prior Constraints-based Reward Model (namely PCRM) training method to mitigate this problem.
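The pairwise ranking loss mentioned in the abstract can be sketched in a few lines. The margin-based variant below is a hypothetical illustration of the kind of constraint PCRM motivates, not the paper's actual formulation; the function names and the `margin` parameter are my own.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def ranking_loss(r_chosen, r_rejected):
    # Standard pairwise (Bradley-Terry style) reward-model loss:
    # -log sigmoid(r_chosen - r_rejected). The loss depends only on the
    # score gap, so absolute score magnitudes are left unconstrained.
    return -math.log(sigmoid(r_chosen - r_rejected))

def constrained_ranking_loss(r_chosen, r_rejected, margin=1.0):
    # Hypothetical margin-style constraint: the pair is only "satisfied"
    # once the score gap exceeds `margin`, which illustrates how an extra
    # prior term can discipline the raw scores during training.
    return -math.log(sigmoid(r_chosen - r_rejected - margin))
```

The unconstrained loss prefers larger gaps without bound, which is one way scores can drift during downstream RL.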

reinforcement-learning

BioDrone: A Bionic Drone-based Single Object Tracking Benchmark for Robust Vision

no code implementations • 7 Feb 2024 • Xin Zhao, Shiyu Hu, Yipei Wang, Jing Zhang, Yimin Hu, Rongshuai Liu, Haibin Ling, Yin Li, Renshu Li, Kun Liu, Jiadong Li

These challenges are especially pronounced in videos captured by unmanned aerial vehicles (UAVs), where the target is usually far from the camera and often exhibits significant motion relative to it.

Autonomous Driving • Object Tracking +1

Anomaly Detection of Particle Orbit in Accelerator using LSTM Deep Learning Technology

no code implementations • 28 Jan 2024 • ZhiYuan Chen, Wei Lu, Radhika Bhong, Yimin Hu, Brian Freeman, Adam Carpenter

A stable, reliable, and controllable orbit lock system is crucial to an electron (or ion) accelerator because the beam orbit and beam energy instability strongly affect the quality of the beam delivered to experimental halls.
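The general recipe behind this kind of system (forecast the next orbit reading, flag points whose forecast error is abnormally large) can be sketched without the LSTM itself. In the sketch below a moving-average predictor stands in for the LSTM forecaster, and the threshold rule, function name, and parameters are assumptions for illustration only.

```python
import statistics

def detect_anomalies(series, window=5, k=3.0):
    """Flag indices whose value deviates sharply from a short-horizon forecast.

    A moving-average forecaster stands in for the LSTM; the anomaly rule
    (error > k * std of recent errors) applies the same way in either case.
    """
    errors, flagged = [], []
    for i in range(window, len(series)):
        # Forecast the next reading from the previous `window` readings.
        pred = sum(series[i - window:i]) / window
        err = abs(series[i] - pred)
        # After a warm-up period, flag errors far outside the usual range.
        if len(errors) >= window and err > k * (statistics.pstdev(errors) + 1e-9):
            flagged.append(i)
        errors.append(err)
    return flagged
```

On a flat signal with a single injected spike, only the spike index is flagged; an LSTM replaces the moving average when orbit dynamics are too complex for a linear forecast.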

Anomaly Detection • Fault Detection

ESRL: Efficient Sampling-based Reinforcement Learning for Sequence Generation

3 code implementations • 4 Aug 2023 • Chenglong Wang, Hang Zhou, Yimin Hu, Yifu Huo, Bei Li, Tongran Liu, Tong Xiao, Jingbo Zhu

Applying Reinforcement Learning (RL) to sequence generation models enables the direct optimization of long-term rewards (e.g., BLEU and human feedback), but typically requires large-scale sampling over a space of action sequences.
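The sampling-based RL estimator this refers to is the score-function (REINFORCE) gradient. As a toy sketch, a single categorical choice stands in for a generated sequence, with a mean-reward baseline for variance reduction; this is my simplification of the general technique, not ESRL's efficient sampling scheme.

```python
import math
import random

def softmax(logits):
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    z = sum(exps)
    return [e / z for e in exps]

def reinforce_step(logits, reward_fn, lr=0.5, n_samples=16, rng=None):
    """One REINFORCE update on a toy categorical policy.

    Samples n_samples actions, scores them with reward_fn, and pushes up
    the log-probability of samples whose reward beats the batch mean.
    """
    rng = rng or random.Random(0)
    probs = softmax(logits)
    samples = [rng.choices(range(len(logits)), weights=probs)[0]
               for _ in range(n_samples)]
    rewards = [reward_fn(a) for a in samples]
    baseline = sum(rewards) / n_samples  # mean-reward baseline
    grad = [0.0] * len(logits)
    for a, r in zip(samples, rewards):
        adv = r - baseline
        for i, p in enumerate(probs):
            # d/d logit_i of log p(a) = 1[i == a] - p_i
            grad[i] += adv * ((1.0 if i == a else 0.0) - p)
    return [l + lr * g / n_samples for l, g in zip(logits, grad)]
```

The per-step sampling cost is exactly what makes naive RL expensive for sequence models, since each "sample" is a full decoded sequence rather than one categorical draw.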

Abstractive Text Summarization • Language Modelling +5

Improved Knowledge Distillation for Pre-trained Language Models via Knowledge Selection

no code implementations • 1 Feb 2023 • Chenglong Wang, Yi Lu, Yongyu Mu, Yimin Hu, Tong Xiao, Jingbo Zhu

Knowledge distillation addresses the problem of transferring knowledge from a teacher model to a student model.
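The baseline this line describes is soft-target distillation in the style of Hinton et al.: the student is trained to match the teacher's temperature-softened output distribution. A minimal sketch follows; the paper's knowledge-selection mechanism is not reproduced here, and the function names and temperature `T` are my own.

```python
import math

def softmax_t(logits, T):
    # Temperature-softened softmax: higher T flattens the distribution,
    # exposing the teacher's relative preferences among wrong classes.
    m = max(logits)
    exps = [math.exp((x - m) / T) for x in logits]
    z = sum(exps)
    return [e / z for e in exps]

def distillation_loss(student_logits, teacher_logits, T=2.0):
    # Cross-entropy between the teacher's softened distribution p and the
    # student's softened distribution q; minimized when q matches p.
    p = softmax_t(teacher_logits, T)
    q = softmax_t(student_logits, T)
    return -sum(pi * math.log(qi) for pi, qi in zip(p, q))
```

In practice this term is mixed with the ordinary hard-label loss; knowledge selection then decides which teacher signals are worth distilling at all.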

Knowledge Distillation
