Search Results for author: Mohammadi Zaki

Found 6 papers, 0 papers with code

Isometric Neural Machine Translation using Phoneme Count Ratio Reward-based Reinforcement Learning

no code implementations • 20 Mar 2024 • Shivam Ratnakant Mhaskar, Nirmesh J. Shah, Mohammadi Zaki, Ashishkumar P. Gudmalwar, Pankaj Wasnik, Rajiv Ratn Shah

In this paper, we present the development of an isometric NMT system using Reinforcement Learning (RL), with a focus on optimizing the alignment of phoneme counts in the source and target language sentence pairs.

Automatic Speech Recognition (ASR) +6

Actor-Critic based Improper Reinforcement Learning

no code implementations • 19 Jul 2022 • Mohammadi Zaki, Avinash Mohan, Aditya Gopalan, Shie Mannor

For the actor-critic (AC) based approach, we provide convergence rate guarantees to a stationary point in the basic AC case and to a global optimum in the natural actor-critic (NAC) case.

Reinforcement Learning (RL)

Improper Reinforcement Learning with Gradient-based Policy Optimization

no code implementations • 16 Feb 2021 • Mohammadi Zaki, Avinash Mohan, Aditya Gopalan, Shie Mannor

We consider an improper reinforcement learning setting where a learner is given $M$ base controllers for an unknown Markov decision process, and wishes to combine them optimally to produce a potentially new controller that can outperform each of the base ones.

Reinforcement Learning (RL)

Explicit Best Arm Identification in Linear Bandits Using No-Regret Learners

no code implementations • 13 Jun 2020 • Mohammadi Zaki, Avi Mohan, Aditya Gopalan

We study the problem of best arm identification in linearly parameterised multi-armed bandits.

Multi-Armed Bandits

Towards Optimal and Efficient Best Arm Identification in Linear Bandits

no code implementations • 5 Nov 2019 • Mohammadi Zaki, Avinash Mohan, Aditya Gopalan

We give a new algorithm for best arm identification in linearly parameterised bandits in the fixed confidence setting.

Low-rank Bandits with Latent Mixtures

no code implementations • 6 Sep 2016 • Aditya Gopalan, Odalric-Ambrym Maillard, Mohammadi Zaki

This induces a low-rank structure on the matrix of expected rewards $r_{a,b}$ from recommending item $a$ to user $b$.

Recommendation Systems
