no code implementations • 22 Apr 2024 • Subhojyoti Mukherjee, Anusha Lalitha, Kousha Kalantari, Aniket Deshmukh, Ge Liu, Yifei Ma, Branislav Kveton
Learning of preference models from human feedback has been central to recent advances in artificial intelligence.
no code implementations • 12 Apr 2024 • Subhojyoti Mukherjee, Ge Liu, Aniket Deshmukh, Anusha Lalitha, Yifei Ma, Branislav Kveton
We design the LLM prompt by adaptively choosing few-shot examples for a given inference query.
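One common baseline for choosing few-shot examples per query is nearest-neighbor retrieval in an embedding space; the sketch below illustrates that idea only and is not the paper's adaptive method (function and variable names are hypothetical).

```python
import numpy as np

def select_few_shot(query_emb, pool_embs, k=4):
    """Pick the k pool examples most similar to the query (cosine similarity)."""
    q = query_emb / np.linalg.norm(query_emb)
    p = pool_embs / np.linalg.norm(pool_embs, axis=1, keepdims=True)
    scores = p @ q                       # cosine similarity of each pool example
    return np.argsort(-scores)[:k]       # indices of the k best examples

# toy pool: 6 candidate examples embedded in 3 dimensions
pool = np.array([[1.0, 0.0, 0.0],
                 [0.9, 0.1, 0.0],
                 [0.0, 1.0, 0.0],
                 [0.0, 0.9, 0.1],
                 [0.0, 0.0, 1.0],
                 [0.5, 0.5, 0.0]])
query = np.array([1.0, 0.05, 0.0])
idx = select_few_shot(query, pool, k=2)  # the two examples closest to the query
```

The selected examples would then be formatted into the prompt ahead of the inference query; an adaptive scheme like the paper's would additionally update the selection policy from feedback.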
no code implementations • 28 Oct 2023 • Shima Alizadeh, Aniruddha Bhargava, Karthick Gopalswamy, Lalit Jain, Branislav Kveton, Ge Liu
The pessimistic estimator can be optimized by policy gradients and performs well in all of our experiments.
no code implementations • ICLR 2022 • Yifei Ma, Ge Liu, Anoop Deoras
RIM allows us to rethink recommendation in a Matching (Mtch) scenario, where the benefits of the users (e.g., ItemRec relevance) and item providers (e.g., item-exposure guarantees) are considered at the same time.
1 code implementation • ICLR 2022 • Ge Liu, Alexander Dimitrakakis, Brandon Carter, David Gifford
We introduce the maximum $n$-times coverage problem that selects $k$ overlays to maximize the summed coverage of weighted elements, where each element must be covered at least $n$ times.
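The problem statement above can be made concrete with a simple greedy heuristic: repeatedly add the overlay with the largest marginal gain in $n$-times-covered weight. This is an illustrative sketch, not the paper's algorithm (for $n > 1$ the objective is not submodular, so greedy carries no approximation guarantee, and early picks may all tie at zero gain).

```python
from collections import Counter

def n_times_coverage_value(chosen, weights, n):
    """Summed weight of elements covered at least n times by the chosen overlays."""
    counts = Counter()
    for overlay in chosen:
        counts.update(overlay)
    return sum(w for e, w in weights.items() if counts[e] >= n)

def greedy_n_times_coverage(overlays, weights, k, n):
    """Greedy heuristic: add the overlay with the largest marginal gain, k times."""
    chosen = []
    for _ in range(k):
        best = max(
            (o for o in overlays if o not in chosen),
            key=lambda o: n_times_coverage_value(chosen + [o], weights, n),
        )
        chosen.append(best)
    return chosen

# toy instance: 3 weighted elements, 4 candidate overlays, pick k=2 with n=2
weights = {"a": 2.0, "b": 1.0, "c": 3.0}
overlays = [["a", "b"], ["a", "c"], ["b", "c"], ["a"]]
chosen = greedy_n_times_coverage(overlays, weights, k=2, n=2)
value = n_times_coverage_value(chosen, weights, n=2)  # weight covered twice or more
```

For $n = 1$ this reduces to classic weighted maximum coverage, where greedy does achieve the familiar $1 - 1/e$ guarantee.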
1 code implementation • 18 Feb 2020 • Siddhartha Jain, Ge Liu, David Gifford
We introduce Information Condensing Active Learning (ICAL), a model-agnostic, batch-mode Active Learning (AL) method for Deep Bayesian Active Learning that acquires labels for the points carrying as much information as possible about the still-unacquired points.
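The acquisition idea can be sketched with a crude proxy: score each unlabeled point by how strongly its predictions co-vary, across an ensemble of model samples, with the predictions on the rest of the pool, and build the batch greedily. This is only an illustration of the "informative about the unacquired points" principle; ICAL's actual criterion is a statistical dependency measure, and all names here are hypothetical.

```python
import numpy as np

def greedy_info_batch(preds, batch_size):
    """preds: (ensemble, pool) array of predictions from sampled models.
    Greedily pick points whose predictions correlate most, on average,
    with the remaining pool's predictions across the ensemble."""
    corr = np.corrcoef(preds.T)          # (pool, pool) correlation of points
    np.fill_diagonal(corr, 0.0)          # ignore self-correlation
    remaining = set(range(preds.shape[1]))
    batch = []
    for _ in range(batch_size):
        # sorted() makes tie-breaking deterministic (smallest index wins)
        best = max(
            sorted(remaining),
            key=lambda i: np.abs(corr[i, list(remaining - {i})]).mean(),
        )
        batch.append(best)
        remaining.remove(best)
    return batch

# toy pool: points 0 and 1 always predicted identically, point 2 uncorrelated
preds = np.array([[0.0, 0.0, 0.0],
                  [1.0, 1.0, 0.0],
                  [0.0, 0.0, 1.0],
                  [1.0, 1.0, 1.0]])
batch = greedy_info_batch(preds, batch_size=2)
```

A real implementation would also discount redundancy with points already in the batch; the sketch above keeps only the "informativeness about the pool" half of the trade-off.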
no code implementations • 12 Feb 2020 • Ge Liu, Rui Wu, Heng-Tze Cheng, Jing Wang, Jayden Ooi, Lihong Li, Ang Li, Wai Lok Sibon Li, Craig Boutilier, Ed Chi
Deep Reinforcement Learning (RL) has proven powerful for decision making in simulated environments.
no code implementations • 18 Jun 2019 • Siddhartha Jain, Ge Liu, Jonas Mueller, David Gifford
The inaccuracy of neural network models on inputs that do not stem from the training data distribution is both problematic and at times unrecognized.