Search Results for author: Weihan Shen

Found 2 papers, 2 papers with code

Using Human Feedback to Fine-tune Diffusion Models without Any Reward Model

1 code implementation • 22 Nov 2023 • Kai Yang, Jian Tao, Jiafei Lyu, Chunjiang Ge, Jiaxin Chen, Qimai Li, Weihan Shen, Xiaolong Zhu, Xiu Li

The direct preference optimization (DPO) method, effective in fine-tuning large language models, eliminates the necessity for a reward model.

Tasks: Denoising
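The abstract above notes that direct preference optimization (DPO) fine-tunes from preference pairs without a separate reward model. A minimal sketch of the standard DPO objective for a single preference pair is shown below (the helper name and plain-Python formulation are illustrative assumptions; the paper itself adapts this idea to diffusion models):

```python
import math

def dpo_loss(logp_w, logp_l, ref_logp_w, ref_logp_l, beta=0.1):
    """Hypothetical helper: standard DPO loss for one preference pair.

    logp_w / logp_l: log-probs of preferred / dispreferred samples
    under the policy being trained; ref_* are from a frozen reference.
    """
    ratio_w = logp_w - ref_logp_w  # implicit reward of preferred sample
    ratio_l = logp_l - ref_logp_l  # implicit reward of dispreferred sample
    margin = beta * (ratio_w - ratio_l)
    # -log sigmoid(margin): minimized as the policy favors the preferred sample
    return -math.log(1.0 / (1.0 + math.exp(-margin)))
```

When both samples are equally likely under policy and reference, the loss is -log(0.5); raising the preferred sample's relative log-probability drives the loss toward zero, which is how preferences are optimized directly, with no learned reward model in the loop.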

Emergent collective intelligence from massive-agent cooperation and competition

1 code implementation • 4 Jan 2023 • HanMo Chen, Stone Tao, Jiaxin Chen, Weihan Shen, Xihui Li, Chenghui Yu, Sikai Cheng, Xiaolong Zhu, Xiu Li

Since these learned group strategies arise from individual decisions without an explicit coordination mechanism, we claim that artificial collective intelligence emerges from massive-agent cooperation and competition.

Tasks: Reinforcement Learning (RL)
