Search Results for author: Weiyu Ma

Token-level Direct Preference Optimization

Fine-tuning pre-trained Large Language Models (LLMs) is essential to align them with human values and intentions.

StarCraft II is a challenging benchmark for AI agents because it requires both precise micro-level operations and strategic macro-level awareness.
