no code implementations • 29 Feb 2024 • Xumei Xi, Christina Lee Yu, Yudong Chen
Our bounds characterize the hardness of estimating each entry as a function of the localized sampling probabilities.
no code implementations • 26 Jul 2023 • Xumei Xi, Yuke Zhao, Quan Liu, Liwen Ouyang, Yang Wu
To this end, we train a farsighted recommender with an offline RL algorithm, initializing the policy network in our model architecture from a pre-trained transformer model.
no code implementations • 24 May 2023 • Xumei Xi, Christina Lee Yu, Yudong Chen
We consider offline Reinforcement Learning (RL), where the agent does not interact with the environment and must rely on offline data collected using a behavior policy.
no code implementations • 28 Sep 2020 • Yudong Chen, Dogyoon Song, Xumei Xi, Yuqian Zhang
As the objective function is non-convex, there can be multiple local minima that are not globally optimal, even for well-separated mixture models.