no code implementations • 1 Apr 2024 • Yu Zhou, Haoran Yin, Nanhao Zhou, Yanqun Tang, Xiaoying Zhang, Weijie Yuan
The recently developed affine frequency division multiplexing (AFDM) can achieve full diversity in doubly selective channels, providing a comprehensive sparse representation of the delay-Doppler domain channel.
no code implementations • 12 Mar 2024 • Wei Shen, Xiaoying Zhang, Yuanshun Yao, Rui Zheng, Hongyi Guo, Yang Liu
Reinforcement learning from human feedback (RLHF) is the mainstream paradigm used to align large language models (LLMs) with human preferences.
no code implementations • 8 Mar 2024 • Xiaoying Zhang, Jean-Francois Ton, Wei Shen, Hongning Wang, Yang Liu
We introduce Adversarial Policy Optimization (AdvPO), a novel solution to the pervasive issue of reward over-optimization in Reinforcement Learning from Human Feedback (RLHF) for Large Language Models (LLMs).
no code implementations • 14 Feb 2024 • Xiaoying Zhang, Baolin Peng, Ye Tian, Jingyan Zhou, Lifeng Jin, Linfeng Song, Haitao Mi, Helen Meng
Despite showing increasingly human-like abilities, large language models (LLMs) often struggle with factual inaccuracies, i.e., "hallucinations", even when they hold relevant knowledge.
no code implementations • 6 Jan 2024 • Hongyi Guo, Yuanshun Yao, Wei Shen, Jiaheng Wei, Xiaoying Zhang, Zhaoran Wang, Yang Liu
The key idea is to first retrieve high-quality samples related to the target domain and use them as In-context Learning examples to generate more samples.
no code implementations • 29 Aug 2023 • Jingyan Zhou, Minda Hu, Junan Li, Xiaoying Zhang, Xixin Wu, Irwin King, Helen Meng
Our analysis reveals the potential and the flaws of existing resources (models and datasets) for developing explainable moral judgment-making systems.
1 code implementation • 10 Aug 2023 • Yang Liu, Yuanshun Yao, Jean-Francois Ton, Xiaoying Zhang, Ruocheng Guo, Hao Cheng, Yegor Klochkov, Muhammad Faaiz Taufiq, Hang Li
However, a major challenge faced by practitioners is the lack of clear guidance on evaluating whether LLM outputs align with social norms, values, and regulations.
no code implementations • 15 May 2023 • Xiaoying Zhang, Baolin Peng, Kun Li, Jingyan Zhou, Helen Meng
Building end-to-end task bots and maintaining their integration with new functionalities with minimal human effort is a long-standing challenge in dialog research.
1 code implementation • 10 Feb 2023 • Qing Zhang, Xiaoying Zhang, Yang Liu, Hongning Wang, Min Gao, Jiheng Zhang, Ruocheng Guo
Confounding bias arises due to the presence of unmeasured variables (e.g., the socio-economic status of a user) that can affect both a user's exposure and feedback.
1 code implementation • 13 Jan 2023 • Xiaoying Zhang, Hongning Wang, Hang Li
This calls for a fine-grained understanding of a user's preferences over items, where one needs to recognize whether the user's choice is driven by the quality of the item itself or by the pre-selected attributes of the item.
no code implementations • 20 Jan 2022 • Haidong Xie, Jia Tan, Xiaoying Zhang, Nan Ji, Haihua Liao, Zuguo Yu, Xueshuang Xiang, Naijin Liu
This leads to the problem of a malicious third party using a deep learning model to easily recognize the modulation format of the transmitted waveform.
no code implementations • SIGDIAL (ACL) 2022 • Xiaoying Zhang, Baolin Peng, Jianfeng Gao, Helen Meng
In this paper, we study the problem of automatically adapting task bots to changing environments by learning from human-bot interactions with minimum or zero human annotations.
no code implementations • 14 Jul 2021 • Kai Mei, Jun Liu, Xiaoying Zhang, Kuo Cao, Nandana Rajatheva, Jibo Wei
In addition, a training data construction approach that uses least squares (LS) estimation results is proposed so that the training data can be collected during data transmission.
1 code implementation • 15 Jan 2021 • Mudit Chaudhary, Borislav Dzodzo, Sida Huang, Chun Hei Lo, Mingzhi Lyu, Lun Yiu Nie, Jinbo Xing, Tianhua Zhang, Xiaoying Zhang, Jingyan Zhou, Hong Cheng, Wai Lam, Helen Meng
Dialog systems enriched with external knowledge can handle user queries that are outside the scope of the supporting databases/APIs.
no code implementations • 4 Jun 2019 • Xiaoying Zhang, Hong Xie, Hang Li, John C. S. Lui
Here, a key-term can relate to a subset of arms, for example, a category of articles in news recommendation.