Search Results for author: Yuanshun Yao

Found 24 papers, 6 papers with code

Learning to Watermark LLM-generated Text via Reinforcement Learning

1 code implementation • 13 Mar 2024 • Xiaojun Xu, Yuanshun Yao, Yang Liu

While prior works focus on token-level watermarks that embed signals into the output text, we design a model-level watermark that embeds signals into the LLM weights; such signals can be detected by a paired detector.

reinforcement-learning
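
As a rough illustration of the paired-detector side of this setup, the sketch below trains a binary classifier to distinguish watermarked outputs from other text. The TF-IDF features, classifier choice, and placeholder data are assumptions for illustration only; the paper learns the watermark and detector jointly with reinforcement learning.

    # Minimal sketch (not the paper's method): a detector that classifies whether a
    # piece of text came from the watermarked LLM. Feature extraction and classifier
    # are illustrative choices; the paper co-trains the watermark and detector with RL.
    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.linear_model import LogisticRegression
    from sklearn.pipeline import make_pipeline

    # Placeholder data: outputs from the watermarked model (label 1) vs. other text (label 0).
    watermarked_texts = ["sample output from the watermarked model", "another watermarked sample"]
    other_texts = ["human-written text", "output from an unwatermarked model"]

    texts = watermarked_texts + other_texts
    labels = [1] * len(watermarked_texts) + [0] * len(other_texts)

    detector = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)),
                             LogisticRegression(max_iter=1000))
    detector.fit(texts, labels)

    def is_watermarked(text: str, threshold: float = 0.5) -> bool:
        """Return True if the detector believes `text` carries the watermark."""
        return detector.predict_proba([text])[0, 1] >= threshold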

Improving Reinforcement Learning from Human Feedback Using Contrastive Rewards

no code implementations • 12 Mar 2024 • Wei Shen, Xiaoying Zhang, Yuanshun Yao, Rui Zheng, Hongyi Guo, Yang Liu

Reinforcement learning from human feedback (RLHF) is the mainstream paradigm used to align large language models (LLMs) with human preferences.

reinforcement-learning
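
One plausible reading of "contrastive rewards" is to offset the reward model's score for a policy response by a baseline computed from offline-sampled responses to the same prompt. The sketch below implements that reading with assumed function names; it is not necessarily the paper's exact formulation.

    # Sketch of a contrastive reward: the policy response's reward is offset by a
    # baseline computed from pre-collected ("offline") responses to the same prompt.
    # Function names and the mean baseline are assumptions, not the paper's exact recipe.
    from typing import Callable, List

    def contrastive_reward(
        reward_model: Callable[[str, str], float],  # (prompt, response) -> scalar reward
        prompt: str,
        policy_response: str,
        baseline_responses: List[str],              # offline responses for this prompt
    ) -> float:
        baseline = sum(reward_model(prompt, r) for r in baseline_responses) / len(baseline_responses)
        return reward_model(prompt, policy_response) - baseline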

Fair Classifiers Without Fair Training: An Influence-Guided Data Sampling Approach

no code implementations • 20 Feb 2024 • Jinlong Pang, Jialu Wang, Zhaowei Zhu, Yuanshun Yao, Chen Qian, Yang Liu

A fair classifier should ensure that people from different groups benefit equitably, yet group information is often sensitive and unsuitable for use in model training.

Attribute • Fairness

Measuring and Reducing LLM Hallucination without Gold-Standard Answers via Expertise-Weighting

no code implementations • 16 Feb 2024 • Jiaheng Wei, Yuanshun Yao, Jean-Francois Ton, Hongyi Guo, Andrew Estornell, Yang Liu

In this work, we propose Factualness Evaluations via Weighting LLMs (FEWL), the first hallucination metric that is specifically designed for the scenario when gold-standard answers are absent.

Hallucination • In-Context Learning
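
To make the weighting idea concrete, here is a small sketch that scores an answer by its expertise-weighted agreement with several reference LLMs. The agreement function and the expertise weights are placeholders, not FEWL's exact construction.

    # Sketch: score an answer by weighted agreement with several reference LLMs, where
    # each reference's weight reflects its estimated expertise. The expertise weights and
    # agreement function are illustrative assumptions, not FEWL's exact formula.
    from typing import Callable, Dict

    def hallucination_score(
        answer: str,
        reference_answers: Dict[str, str],       # reference LLM name -> its answer to the question
        expertise: Dict[str, float],             # reference LLM name -> expertise weight
        agreement: Callable[[str, str], float],  # semantic similarity in [0, 1]
    ) -> float:
        total_weight = sum(expertise.values())
        weighted_agreement = sum(
            expertise[name] * agreement(answer, ref) for name, ref in reference_answers.items()
        ) / total_weight
        # Higher agreement with expert references -> lower hallucination score.
        return 1.0 - weighted_agreement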

Human-Instruction-Free LLM Self-Alignment with Limited Samples

no code implementations • 6 Jan 2024 • Hongyi Guo, Yuanshun Yao, Wei Shen, Jiaheng Wei, Xiaoying Zhang, Zhaoran Wang, Yang Liu

The key idea is to first retrieve high-quality samples related to the target domain and use them as In-context Learning examples to generate more samples.

In-Context Learning • Instruction Following
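
A rough sketch of that retrieve-then-generate loop is below. The `embed` and `generate` callables are placeholders for an embedding model and an LLM call, and the prompt template is an assumption rather than the paper's.

    # Sketch of the retrieve-then-generate loop: pick the stored samples most related to the
    # target domain and use them as in-context examples when prompting the LLM to generate more.
    # `embed` and `generate` are placeholders, not a specific API.
    from typing import Callable, List
    import numpy as np

    def self_generate_samples(
        seed_pool: List[str],                 # candidate high-quality samples
        domain_description: str,              # description of the target domain
        embed: Callable[[str], np.ndarray],   # text -> embedding vector
        generate: Callable[[str], str],       # prompt -> LLM completion
        k: int = 4,
        n_new: int = 8,
    ) -> List[str]:
        query = embed(domain_description)
        scores = [float(np.dot(embed(s), query)) for s in seed_pool]
        top_k = [seed_pool[i] for i in np.argsort(scores)[::-1][:k]]

        prompt = "Here are examples from the target domain:\n"
        prompt += "\n".join(f"- {s}" for s in top_k)
        prompt += "\nWrite another example in the same style:\n"
        return [generate(prompt) for _ in range(n_new)]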

Large Language Model Unlearning

1 code implementation • 14 Oct 2023 • Yuanshun Yao, Xiaojun Xu, Yang Liu

To the best of our knowledge, our work is among the first to explore LLM unlearning.

Language Modelling • Large Language Model
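
For intuition, a common unlearning recipe in this setting is gradient ascent on the data to be forgotten, balanced against a term that preserves behavior on retained data. The sketch below assumes a Hugging Face-style causal LM whose forward pass returns a `.loss` when labels are supplied; the retain term and its weight are illustrative, not necessarily the paper's exact objective.

    # Sketch of one gradient-ascent unlearning step on a causal LM: increase the loss on
    # the "forget" batch while keeping the loss small on a "retain" batch. Assumes batches
    # contain input_ids / attention_mask only; the retain term is an assumption.
    import torch

    def unlearning_step(model, optimizer, forget_batch, retain_batch, retain_weight=1.0):
        model.train()
        forget_loss = model(**forget_batch, labels=forget_batch["input_ids"]).loss
        retain_loss = model(**retain_batch, labels=retain_batch["input_ids"]).loss

        # Ascend on the forget data (negative sign), descend on the retain data.
        loss = -forget_loss + retain_weight * retain_loss
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
        return forget_loss.item(), retain_loss.item()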

Fair Classifiers that Abstain without Harm

no code implementations • 9 Oct 2023 • Tongxin Yin, Jean-François Ton, Ruocheng Guo, Yuanshun Yao, Mingyan Liu, Yang Liu

To generalize the abstaining decisions to test samples, we then train a surrogate model to learn the abstaining decisions based on the integer programming (IP) solutions in an end-to-end manner.

Decision Making • Fairness

Trustworthy LLMs: a Survey and Guideline for Evaluating Large Language Models' Alignment

1 code implementation • 10 Aug 2023 • Yang Liu, Yuanshun Yao, Jean-Francois Ton, Xiaoying Zhang, Ruocheng Guo, Hao Cheng, Yegor Klochkov, Muhammad Faaiz Taufiq, Hang Li

However, a major challenge faced by practitioners is the lack of clear guidance on evaluating whether LLM outputs align with social norms, values, and regulations.

Fairness • Models Alignment

On the Cause of Unfairness: A Training Sample Perspective

no code implementations • 30 Jun 2023 • Yuanshun Yao, Yang Liu

Identifying the causes of a model's unfairness is an important yet relatively unexplored task.

counterfactual • Fairness

Label Inference Attack against Split Learning under Regression Setting

1 code implementation • 18 Jan 2023 • Shangyu Xie, Xin Yang, Yuanshun Yao, Tianyi Liu, Taiqing Wang, Jiankai Sun

In this work, we step further to study the leakage in the scenario of the regression model, where the private labels are continuous numbers (instead of discrete labels in classification).

Inference Attack • regression • +1

Learning to Counterfactually Explain Recommendations

no code implementations • 17 Nov 2022 • Yuanshun Yao, Chong Wang, Hang Li

The key idea is to train a surrogate model to learn the effect of removing a subset of user history on the recommendation.

counterfactual • Recommendation Systems • +1
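
As a sketch of how such a surrogate could be used, the code below greedily searches for a small subset of history items whose removal most reduces the surrogate's predicted score for the recommended item. The surrogate interface and the greedy search are illustrative assumptions, not the paper's exact procedure.

    # Sketch of the search step: given a surrogate that predicts the recommender's score for
    # the target item from a keep/remove mask over the user's history, greedily find a small
    # subset whose removal most reduces that score.
    from typing import Callable, List
    import numpy as np

    def counterfactual_explanation(
        surrogate_score: Callable[[np.ndarray], float],  # mask (1=keep, 0=removed) -> predicted score
        history_len: int,
        max_removed: int = 3,
    ) -> List[int]:
        mask = np.ones(history_len)
        removed: List[int] = []
        for _ in range(max_removed):
            candidates = [i for i in range(history_len) if mask[i] == 1]
            if not candidates:
                break
            # Try removing each remaining history item and keep the most impactful removal.
            drops = []
            for i in candidates:
                trial = mask.copy()
                trial[i] = 0
                drops.append((surrogate_score(trial), i))
            _, best = min(drops)          # lowest predicted score = biggest drop
            mask[best] = 0
            removed.append(best)
        return removed  # history items whose removal best explains the recommendation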

Weak Proxies are Sufficient and Preferable for Fairness with Missing Sensitive Attributes

1 code implementation • 6 Oct 2022 • Zhaowei Zhu, Yuanshun Yao, Jiankai Sun, Hang Li, Yang Liu

Our theoretical analyses show that directly using proxy models can give a false sense of (un)fairness.

Fairness

DPAUC: Differentially Private AUC Computation in Federated Learning

1 code implementation • 25 Aug 2022 • Jiankai Sun, Xin Yang, Yuanshun Yao, Junyuan Xie, Di Wu, Chong Wang

Federated learning (FL) has recently gained significant attention as a privacy-enhancing tool that lets multiple participants jointly train a machine learning model.

Federated Learning
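
To illustrate one way label-DP noise can enter an AUC computation, the sketch below buckets prediction scores, perturbs the per-bucket positive/negative label counts with Laplace noise, and computes AUC from the noisy histogram. The bucketing and noise placement are illustrative; the paper's federated protocol differs in detail.

    # Sketch: estimate AUC from per-score-bucket label counts that are noised with the
    # Laplace mechanism. Assumes scores lie in [0, 1]; the exact protocol in DPAUC differs.
    import numpy as np

    def dp_auc(scores, labels, n_buckets=100, epsilon=1.0, rng=None):
        rng = rng or np.random.default_rng()
        edges = np.linspace(0.0, 1.0, n_buckets + 1)
        bucket = np.clip(np.digitize(scores, edges) - 1, 0, n_buckets - 1)
        labels = np.asarray(labels)

        pos = np.bincount(bucket[labels == 1], minlength=n_buckets).astype(float)
        neg = np.bincount(bucket[labels == 0], minlength=n_buckets).astype(float)

        # Each example changes one count by 1, so sensitivity is 1 per histogram.
        pos += rng.laplace(scale=1.0 / epsilon, size=n_buckets)
        neg += rng.laplace(scale=1.0 / epsilon, size=n_buckets)
        pos, neg = np.maximum(pos, 0), np.maximum(neg, 0)

        # AUC = P(score_pos > score_neg) + 0.5 * P(tie), computed from bucket counts.
        neg_cum = np.concatenate(([0.0], np.cumsum(neg)[:-1]))  # negatives in strictly lower buckets
        numerator = np.sum(pos * (neg_cum + 0.5 * neg))
        return numerator / (pos.sum() * neg.sum() + 1e-12)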

Differentially Private Multi-Party Data Release for Linear Regression

no code implementations • 16 Jun 2022 • Ruihan Wu, Xin Yang, Yuanshun Yao, Jiankai Sun, Tianyi Liu, Kilian Q. Weinberger, Chong Wang

Differentially Private (DP) data release is a promising technique to disseminate data without compromising the privacy of data subjects.

regression
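
A classical single-party construction in this space is sufficient-statistics perturbation: add calibrated noise to X^T X and X^T y and solve the noisy normal equations. The sketch below shows that construction under an assumed per-row norm bound; it is not the paper's multi-party mechanism.

    # Sketch: DP linear regression via sufficient-statistics perturbation, i.e. adding Laplace
    # noise to X^T X and X^T y and solving the noisy normal equations. Single-party illustration
    # with conservative sensitivity bounds; the paper's multi-party release differs.
    import numpy as np

    def dp_linear_regression(X, y, epsilon, x_bound=1.0, y_bound=1.0, rng=None):
        rng = rng or np.random.default_rng()
        d = X.shape[1]

        # Clip each row and label so one record's contribution to the statistics is bounded.
        row_norms = np.linalg.norm(X, axis=1, keepdims=True)
        X = X * np.minimum(1.0, x_bound / np.maximum(row_norms, 1e-12))
        y = np.clip(y, -y_bound, y_bound)

        # Conservative L1 sensitivities when one record is replaced.
        sens_xtx = 2 * d * x_bound ** 2
        sens_xty = 2 * np.sqrt(d) * x_bound * y_bound

        # Split the privacy budget between the two statistics (Laplace mechanism).
        xtx_noisy = X.T @ X + rng.laplace(scale=sens_xtx / (epsilon / 2), size=(d, d))
        xty_noisy = X.T @ y + rng.laplace(scale=sens_xty / (epsilon / 2), size=d)

        # Symmetrize and regularize so the noisy system remains solvable.
        xtx_noisy = (xtx_noisy + xtx_noisy.T) / 2 + 1e-3 * np.eye(d)
        return np.linalg.solve(xtx_noisy, xty_noisy)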

Differentially Private AUC Computation in Vertical Federated Learning

no code implementations • 24 May 2022 • Jiankai Sun, Xin Yang, Yuanshun Yao, Junyuan Xie, Di Wu, Chong Wang

In this work, we propose two evaluation algorithms that can more accurately compute the widely used AUC (area under curve) metric when using label DP in vFL.

Vertical Federated Learning

Differentially Private Label Protection in Split Learning

no code implementations • 4 Mar 2022 • Xin Yang, Jiankai Sun, Yuanshun Yao, Junyuan Xie, Chong Wang

Split learning is a distributed training framework that allows multiple parties to jointly train a machine learning model over vertically partitioned data (partitioned by attributes).

Label Leakage and Protection from Forward Embedding in Vertical Federated Learning

no code implementations • 2 Mar 2022 • Jiankai Sun, Xin Yang, Yuanshun Yao, Chong Wang

As the raw labels often contain highly sensitive information, recent work has proposed methods to effectively prevent label leakage from the backpropagated gradients in vFL.

Vertical Federated Learning

Defending against Reconstruction Attack in Vertical Federated Learning

no code implementations • 21 Jul 2021 • Jiankai Sun, Yuanshun Yao, Weihao Gao, Junyuan Xie, Chong Wang

Recently, researchers have studied input leakage problems in Federated Learning (FL), where a malicious party can reconstruct sensitive training inputs provided by users from shared gradients.

Privacy Preserving • Reconstruction Attack • +1

Vertical Federated Learning without Revealing Intersection Membership

no code implementations • 10 Jun 2021 • Jiankai Sun, Xin Yang, Yuanshun Yao, Aonan Zhang, Weihao Gao, Junyuan Xie, Chong Wang

In this paper, we propose a vFL framework based on Private Set Union (PSU) that allows each party to keep sensitive membership information to itself.

Vertical Federated Learning

Backdoor Attacks Against Deep Learning Systems in the Physical World

no code implementations • CVPR 2021 • Emily Wenger, Josephine Passananti, Arjun Bhagoji, Yuanshun Yao, Haitao Zheng, Ben Y. Zhao

A critical question remains unanswered: can backdoor attacks succeed using physical objects as triggers, thus making them a credible threat against deep learning systems in the real world?

Transfer Learning

Regula Sub-rosa: Latent Backdoor Attacks on Deep Neural Networks

no code implementations • 24 May 2019 • Yuanshun Yao, Huiying Li, Hai-Tao Zheng, Ben Y. Zhao

Recent work has proposed the concept of backdoor attacks on deep neural networks (DNNs), where misbehaviors are hidden inside "normal" models, only to be triggered by very specific inputs.

Backdoor Attack • Traffic Sign Recognition • +1

Automated Crowdturfing Attacks and Defenses in Online Review Systems

no code implementations • 27 Aug 2017 • Yuanshun Yao, Bimal Viswanath, Jenna Cryan, Hai-Tao Zheng, Ben Y. Zhao

Malicious crowdsourcing forums are gaining traction as sources of spreading misinformation online, but are limited by the costs of hiring and managing human workers.

Cryptography and Security • Social and Information Networks
