Search Results for author: Hongyi Guo

Found 10 papers, 4 papers with code

Diverse Randomized Value Functions: A Provably Pessimistic Approach for Offline Reinforcement Learning

no code implementations • 9 Apr 2024 • Xudong Yu, Chenjia Bai, Hongyi Guo, Changhong Wang, Zhen Wang

Offline Reinforcement Learning (RL) faces distributional shift and unreliable value estimation, especially for out-of-distribution (OOD) actions.

Reinforcement Learning (RL) • Uncertainty Quantification
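
The snippet above states the problem rather than the method. For intuition: a standard way to get pessimistic value estimates from randomized value functions is to train an ensemble of independently initialized Q-networks and use a lower confidence bound of their predictions, so that actions the ensemble disagrees on (typically OOD actions) are penalized. A minimal PyTorch sketch of that generic idea, not necessarily the paper's exact objective; n_heads and beta are illustrative choices:

    import torch
    import torch.nn as nn

    class QEnsemble(nn.Module):
        """Ensemble of independently initialized Q-networks."""
        def __init__(self, obs_dim, act_dim, n_heads=10, hidden=256):
            super().__init__()
            self.heads = nn.ModuleList(
                nn.Sequential(
                    nn.Linear(obs_dim + act_dim, hidden), nn.ReLU(),
                    nn.Linear(hidden, hidden), nn.ReLU(),
                    nn.Linear(hidden, 1),
                )
                for _ in range(n_heads)
            )

        def forward(self, obs, act):
            x = torch.cat([obs, act], dim=-1)
            return torch.stack([h(x) for h in self.heads])  # (n_heads, B, 1)

    def pessimistic_value(q_ensemble, obs, act, beta=1.0):
        """Lower confidence bound: ensemble mean minus beta * ensemble std.

        Head disagreement is large for OOD actions, so the bound penalizes
        exactly the actions whose value estimates are unreliable.
        """
        qs = q_ensemble(obs, act)
        return qs.mean(dim=0) - beta * qs.std(dim=0)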

Improving Reinforcement Learning from Human Feedback Using Contrastive Rewards

no code implementations • 12 Mar 2024 • Wei Shen, Xiaoying Zhang, Yuanshun Yao, Rui Zheng, Hongyi Guo, Yang Liu

Reinforcement learning from human feedback (RLHF) is the mainstream paradigm used to align large language models (LLMs) with human preferences.

reinforcement-learning

Can Large Language Models Play Games? A Case Study of A Self-Play Approach

no code implementations • 8 Mar 2024 • Hongyi Guo, Zhihan Liu, Yufeng Zhang, Zhaoran Wang

Large Language Models (LLMs) harness extensive data from the Internet, storing a broad spectrum of prior knowledge.

Decision Making • Hallucination

Measuring and Reducing LLM Hallucination without Gold-Standard Answers via Expertise-Weighting

no code implementations • 16 Feb 2024 • Jiaheng Wei, Yuanshun Yao, Jean-Francois Ton, Hongyi Guo, Andrew Estornell, Yang Liu

In this work, we propose Factualness Evaluations via Weighting LLMs (FEWL), the first hallucination metric that is specifically designed for the scenario when gold-standard answers are absent.

Hallucination • In-Context Learning
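
The snippet names the key idea (weighting reference LLMs by expertise when no gold answer exists) without the formula. A hedged sketch of expertise-weighted scoring: the candidate answer is compared against answers from several reference LLMs, and each comparison is weighted by that reference's expertise. Here agree and the expertise weights are hypothetical inputs; estimating expertise without gold-standard answers is the paper's actual contribution:

    def fewl_style_score(answer, reference_answers, expertise_weights, agree):
        """Expertise-weighted factualness score in the spirit of FEWL.

        reference_answers: answers from several off-the-shelf reference LLMs.
        expertise_weights: per-reference expertise estimates (assumed given
            here; the paper derives them without gold-standard answers).
        agree: callable returning an agreement score in [0, 1].
        """
        total = sum(expertise_weights)
        return sum(
            w * agree(answer, ref)
            for w, ref in zip(expertise_weights, reference_answers)
        ) / total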

Human-Instruction-Free LLM Self-Alignment with Limited Samples

no code implementations • 6 Jan 2024 • Hongyi Guo, Yuanshun Yao, Wei Shen, Jiaheng Wei, Xiaoying Zhang, Zhaoran Wang, Yang Liu

The key idea is to first retrieve high-quality samples related to the target domain and use them as in-context learning examples to generate more samples.

In-Context Learning • Instruction Following
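
The snippet above outlines a concrete retrieve-then-generate loop. A minimal sketch of that loop, with llm_generate, retrieve_top_k, and quality_filter as hypothetical callables standing in for the paper's actual prompt templates and filtering criteria:

    def self_align_bootstrap(llm_generate, retrieve_top_k, quality_filter,
                             seed_pool, target_domain, n_rounds=3, k=8):
        """Bootstrap domain-specific samples without human instructions."""
        pool = list(seed_pool)
        for _ in range(n_rounds):
            # 1. Retrieve high-quality samples related to the target domain.
            examples = retrieve_top_k(pool, query=target_domain, k=k)
            # 2. Use them as in-context learning examples to generate more.
            few_shot_prompt = "\n\n".join(examples)
            candidates = [llm_generate(few_shot_prompt) for _ in range(4 * k)]
            # 3. Keep only generations that pass the quality filter.
            pool.extend(c for c in candidates if quality_filter(c))
        return pool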

Reason for Future, Act for Now: A Principled Framework for Autonomous LLM Agents with Provable Sample Efficiency

1 code implementation • 29 Sep 2023 • Zhihan Liu, Hao Hu, Shenao Zhang, Hongyi Guo, Shuqi Ke, Boyi Liu, Zhaoran Wang

Specifically, we design a prompt template for reasoning that learns from the memory buffer and plans a future trajectory over a long horizon ("reason for future").
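
The snippet describes the control loop: plan a long-horizon trajectory from the memory buffer ("reason for future"), but execute only the first planned action ("act for now"). A schematic sketch of that loop; plan_with_llm and the three-value env.step interface are hypothetical simplifications of the paper's prompt-template planner:

    def rafa_style_episode(env, plan_with_llm, horizon, max_steps):
        """Replan a full future trajectory at every step, act one step."""
        memory = []                    # buffer of (state, action, feedback)
        state = env.reset()
        for _ in range(max_steps):
            plan = plan_with_llm(memory, state, horizon)  # reason for future
            action = plan[0]                              # act for now
            next_state, feedback, done = env.step(action)
            memory.append((state, action, feedback))      # grow the buffer
            state = next_state
            if done:
                break
        return memory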

Behavior Contrastive Learning for Unsupervised Skill Discovery

1 code implementation • 8 May 2023 • Rushuai Yang, Chenjia Bai, Hongyi Guo, Siyuan Li, Bin Zhao, Zhen Wang, Peng Liu, Xuelong Li

Under mild assumptions, our objective maximizes the mutual information (MI) between different behaviors based on the same skill, which serves as an upper bound of the previous MI objective.

Continuous Control • Contrastive Learning
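
The MI objective in the snippet, between two behaviors generated by the same skill, is typically estimated with a contrastive bound. A standard InfoNCE-style estimator in PyTorch, shown for intuition only and not necessarily the paper's exact loss; behaviors produced by other skills in the batch serve as negatives:

    import torch
    import torch.nn.functional as F

    def same_skill_infonce(anchor, positive, temperature=0.5):
        """InfoNCE estimate of the MI between behaviors sharing a skill.

        anchor, positive: (B, D) embeddings of two behaviors generated by
        the same skill; off-diagonal rows act as negatives.
        """
        a = F.normalize(anchor, dim=-1)
        p = F.normalize(positive, dim=-1)
        logits = a @ p.t() / temperature     # (B, B) similarity matrix
        labels = torch.arange(a.size(0))     # positives on the diagonal
        return F.cross_entropy(logits, labels)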

Policy Learning Using Weak Supervision

1 code implementation • NeurIPS 2021 • Jingkang Wang, Hongyi Guo, Zhaowei Zhu, Yang Liu

Most existing policy learning solutions require the learning agents to receive high-quality supervision signals such as well-designed rewards in reinforcement learning (RL) or high-quality expert demonstrations in behavioral cloning (BC).

Reinforcement Learning (RL)

Peer Loss Functions: Learning from Noisy Labels without Knowing Noise Rates

2 code implementations • ICML 2020 • Yang Liu, Hongyi Guo

In this work, we introduce a new family of loss functions, which we call peer loss functions, that enables learning from noisy labels without requiring a priori specification of the noise rates.

Learning with noisy labels
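
The peer loss itself has a compact form: the base loss on a sample minus the base loss evaluated on an independently paired "peer" prediction and "peer" label, which discourages the classifier from blindly fitting the (possibly noisy) labels. A minimal PyTorch sketch with cross-entropy as the base loss; a tunable weight on the peer term is sometimes added and is omitted here:

    import torch
    import torch.nn.functional as F

    def peer_loss(logits, targets):
        """Peer loss with cross-entropy as the base loss:
        l_peer(f(x_n), y_n) = l(f(x_n), y_n) - l(f(x_n1), y_n2),
        where the peer prediction and peer label come from independently
        shuffled samples in the batch.
        """
        base = F.cross_entropy(logits, targets)
        # Independently permute predictions and labels to form peer pairs.
        peer_logits = logits[torch.randperm(logits.size(0))]
        peer_targets = targets[torch.randperm(targets.size(0))]
        return base - F.cross_entropy(peer_logits, peer_targets)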

Signal Instructed Coordination in Cooperative Multi-agent Reinforcement Learning

no code implementations • 10 Sep 2019 • Liheng Chen, Hongyi Guo, Yali Du, Fei Fang, Haifeng Zhang, Yaoming Zhu, Ming Zhou, Wei-Nan Zhang, Qing Wang, Yong Yu

Although existing works formulate this problem within a centralized-training-with-decentralized-execution framework, which avoids the non-stationarity problem during training, the decentralized execution paradigm limits the agents' capability to coordinate.

Multi-agent Reinforcement Learning • reinforcement-learning • +1
