no code implementations • 20 Mar 2024 • Shivam Ratnakant Mhaskar, Nirmesh J. Shah, Mohammadi Zaki, Ashishkumar P. Gudmalwar, Pankaj Wasnik, Rajiv Ratn Shah
In this paper, we present the development of an isometric neural machine translation (NMT) system using Reinforcement Learning (RL), with a focus on aligning the phoneme counts of the source and target sentences in each pair.
Automatic Speech Recognition (ASR)
no code implementations • 19 Jul 2022 • Mohammadi Zaki, Avinash Mohan, Aditya Gopalan, Shie Mannor
For the actor-critic (AC) based approach, we provide convergence rate guarantees to a stationary point in the basic AC case and to a global optimum in the natural actor-critic (NAC) case.
no code implementations • 16 Feb 2021 • Mohammadi Zaki, Avinash Mohan, Aditya Gopalan, Shie Mannor
We consider an improper reinforcement learning setting where a learner is given $M$ base controllers for an unknown Markov decision process, and wishes to combine them optimally to produce a potentially new controller that can outperform each of the base ones.
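One way to picture combining $M$ base controllers is a randomized mixture: at each step, draw one controller according to a weight vector and play its action. The sketch below assumes softmax weights over a parameter vector `theta`; the names `softmax`, `mixed_controller`, and the toy controllers are hypothetical and are not taken from the paper.

```python
import math
import random

def softmax(theta):
    """Turn real-valued parameters into mixture weights that sum to 1."""
    m = max(theta)  # subtract the max for numerical stability
    exps = [math.exp(t - m) for t in theta]
    total = sum(exps)
    return [e / total for e in exps]

def mixed_controller(controllers, theta, rng):
    """Return a controller that, at each call, samples one base controller
    with softmax(theta) probabilities and returns its action.
    (Assumed mixing scheme, for illustration only.)"""
    weights = softmax(theta)

    def act(state):
        i = rng.choices(range(len(controllers)), weights=weights)[0]
        return controllers[i](state)

    return act

# Two toy base controllers (hypothetical): one always goes left, one right.
base = [lambda s: "left", lambda s: "right"]
act = mixed_controller(base, [0.0, 0.0], random.Random(0))
actions = [act(None) for _ in range(10)]  # a mix of "left" and "right"
```

Learning then amounts to adjusting `theta` so that the mixture performs well, which is a search over mixtures rather than over raw policies.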
no code implementations • 13 Jun 2020 • Mohammadi Zaki, Avi Mohan, Aditya Gopalan
We study the problem of best arm identification in linearly parameterised multi-armed bandits.
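To make the setting concrete: in a linearly parameterised bandit, each arm has a feature vector and its expected reward is the inner product of that vector with a parameter $\theta$. The sketch below uses made-up arm features and a known `theta` purely to show which arm counts as "best"; in the actual problem $\theta$ is unknown and must be inferred from noisy rewards.

```python
def expected_reward(arm, theta):
    """Expected reward of an arm: inner product of its features with theta."""
    return sum(x * t for x, t in zip(arm, theta))

def best_arm(arms, theta):
    """Index of the arm with the largest expected reward."""
    return max(range(len(arms)), key=lambda a: expected_reward(arms[a], theta))

# Hypothetical feature vectors and parameter, chosen only for illustration.
arms = [(1.0, 0.0), (0.0, 1.0), (0.6, 0.6)]
theta = (0.5, 0.4)
winner = best_arm(arms, theta)  # the third arm (index 2) has the largest inner product
```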
no code implementations • 5 Nov 2019 • Mohammadi Zaki, Avinash Mohan, Aditya Gopalan
We give a new algorithm for best arm identification in linearly parameterised bandits in the fixed confidence setting.
no code implementations • 6 Sep 2016 • Aditya Gopalan, Odalric-Ambrym Maillard, Mohammadi Zaki
This induces a low-rank structure on the matrix of expected rewards $r_{a,b}$ from recommending item $a$ to user $b$.
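A minimal sketch of how such a low-rank structure can arise, under the assumption (not stated in the snippet above) that each user belongs to one of a few latent classes and that $r_{a,b}$ depends on user $b$ only through that class. The names `class_means` and `user_class` are hypothetical.

```python
import random

def expected_rewards(class_means, user_class):
    """Build r[a][b] = class_means[a][user_class[b]].

    class_means: one row per item, one column per latent user class.
    user_class: latent class index of each user.
    """
    return [[row[c] for c in user_class] for row in class_means]

def distinct_columns(matrix):
    """Count distinct columns; the rank of the matrix cannot exceed this."""
    return len(set(zip(*matrix)))

random.seed(0)
n_items, n_users, n_classes = 5, 12, 3
class_means = [[random.random() for _ in range(n_classes)] for _ in range(n_items)]
user_class = [random.randrange(n_classes) for _ in range(n_users)]

r = expected_rewards(class_means, user_class)
n_distinct = distinct_columns(r)  # at most n_classes, so rank(r) <= n_classes
```

Users in the same class share an identical column, so a 5 x 12 matrix here has rank at most 3 regardless of the number of users.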