no code implementations • 16 Apr 2024 • Shusheng Xu, Wei Fu, Jiaxuan Gao, Wenjie Ye, Weilin Liu, Zhiyu Mei, Guangju Wang, Chao Yu, Yi Wu
However, in academic benchmarks, state-of-the-art results are often achieved via reward-free methods, such as Direct Preference Optimization (DPO).
no code implementations • 13 Dec 2021 • Shusheng Xu, Yancheng Liang, Yunfei Li, Simon Shaolei Du, Yi Wu
A ubiquitous requirement in many practical reinforcement learning (RL) applications, including medical treatment, recommendation system, education and robotics, is that the deployed policy that actually interacts with the environment cannot change frequently.
no code implementations • 13 Dec 2021 • Shusheng Xu, Yichen Liu, Xiaoyu Yi, Siyuan Zhou, Huizi Li, Yi Wu
We present Native Chinese Reader (NCR), a new machine reading comprehension (MRC) dataset with particularly long articles in both modern and classical Chinese.
no code implementations • 3 Nov 2021 • Yingying Wu, Shusheng Xu, Shing-Tung Yau, Yi Wu
Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) has caused an ongoing pandemic infecting 219 million people as of 10/19/21, with a 3. 6% mortality rate.
no code implementations • 8 Sep 2021 • Shusheng Xu, Xingxing Zhang, Yi Wu, Furu Wei
In this paper, we propose a contrastive learning model for supervised abstractive text summarization, where we view a document, its gold summary and its model generated summaries as different views of the same mean representation and maximize the similarities between them during training.
no code implementations • 1 Jan 2021 • Shusheng Xu, Simon Shaolei Du, Yi Wu
We initiate the study on deep reinforcement learning problems that require low switching cost, i. e., small number of policy switches during training.
1 code implementation • Findings of the Association for Computational Linguistics 2020 • Shusheng Xu, Xingxing Zhang, Yi Wu, Furu Wei, Ming Zhou
We also find in experiments that our model is less dependent on sentence positions.