no code implementations • 8 Feb 2024 • Abhishek Panigrahi, Nikunj Saunshi, Kaifeng Lyu, Sobhan Miryoosefi, Sashank Reddi, Satyen Kale, Sanjiv Kumar
RaPTr achieves better pre-training loss for BERT and UL2 language models while requiring 20-33% fewer FLOPs compared to standard training, and it matches or outperforms other efficient training methods.
no code implementations • 15 Dec 2023 • Renat Aksitov, Sobhan Miryoosefi, Zonglin Li, Daliang Li, Sheila Babayan, Kavya Kopparapu, Zachary Fisher, Ruiqi Guo, Sushant Prakash, Pranesh Srinivasan, Manzil Zaheer, Felix Yu, Sanjiv Kumar
Answering complex natural language questions often necessitates multi-step reasoning and integrating external information.
Ranked #1 on Question Answering on Bamboogle
no code implementations • 8 Feb 2022 • Yonathan Efroni, Chi Jin, Akshay Krishnamurthy, Sobhan Miryoosefi
Real-world sequential decision-making problems commonly involve partial observability, which requires the agent to maintain a memory of history in order to infer the latent states, plan, and make good decisions.
no code implementations • 12 Jul 2021 • Sobhan Miryoosefi, Chi Jin
In constrained reinforcement learning (RL), a learning agent seeks not only to optimize the overall reward but also to satisfy additional safety, diversity, or budget constraints.
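The constrained RL setting described above is typically formalized as a constrained Markov decision process. A standard formulation (a sketch using common notation, not taken verbatim from the paper) optimizes expected reward subject to bounds on expected costs:

```latex
% Constrained MDP objective: maximize expected return subject to m cost constraints,
% where c_i is the i-th cost signal and tau_i its budget (notation assumed here).
\max_{\pi} \; \mathbb{E}_{\pi}\!\left[\sum_{t=0}^{T} r(s_t, a_t)\right]
\quad \text{s.t.} \quad
\mathbb{E}_{\pi}\!\left[\sum_{t=0}^{T} c_i(s_t, a_t)\right] \le \tau_i,
\qquad i = 1, \dots, m
```

Safety constraints correspond to costs that must stay below a threshold, while budget constraints cap resource consumption over an episode.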
no code implementations • NeurIPS 2021 • Chi Jin, Qinghua Liu, Sobhan Miryoosefi
Finding the minimal structural assumptions that empower sample-efficient learning is one of the most important research directions in Reinforcement Learning (RL).
1 code implementation • NeurIPS 2020 • Kianté Brantley, Miroslav Dudik, Thodoris Lykouris, Sobhan Miryoosefi, Max Simchowitz, Aleksandrs Slivkins, Wen Sun
We propose an algorithm for tabular, episodic reinforcement learning with constraints.
1 code implementation • NeurIPS 2019 • Sobhan Miryoosefi, Kianté Brantley, Hal Daumé III, Miroslav Dudik, Robert Schapire
In standard reinforcement learning (RL), a learning agent seeks to optimize the overall reward.