1 code implementation • 12 Jan 2024 • Dmitry Ivanov, Omer Ben-Porat
In an r-MDP, we cater to a diverse user population, each with unique preferences, through interaction with a small set of representative policies.
1 code implementation • 14 Jun 2023 • Dmitry Ivanov, Ilya Zisman, Kirill Chernyshev
The majority of Multi-Agent Reinforcement Learning (MARL) literature equates the cooperation of self-interested agents in mixed environments to the problem of social welfare maximization, allowing agents to arbitrarily share rewards and private information.
no code implementations • 25 May 2022 • Dmitry Ivanov, Aleksandr Chezhegov, Andrey Grunin, Mikhail Kiselev, Denis Larionov
Modern AI systems, based on the von Neumann architecture and classical neural networks, have a number of fundamental limitations in comparison with the brain.
no code implementations • 21 Mar 2022 • Georgiy Pshikhachev, Dmitry Ivanov, Vladimir Egorov, Aleksei Shpilman
Modern Learning from Demonstrations (LfD) algorithms require meticulous tuning of the hyperparameters that control the influence of demonstrations and, as we show in the paper, struggle to learn from suboptimal demonstrations.
1 code implementation • 14 Mar 2022 • Farid Bagirov, Dmitry Ivanov, Aleksei Shpilman
The former only learns from labeled positive data, whereas the latter also utilizes unlabeled data to improve the overall performance.
1 code implementation • 26 Feb 2022 • Dmitry Ivanov, Iskander Safiulin, Igor Filippov, Ksenia Balabaeva
The second is a loss function that requires explicit specification of an acceptable IC violation, referred to as the regret budget.
no code implementations • 7 Jan 2022 • Dmitry Ivanov, Mikhail Kiselev, Denis Larionov
This article proposes a sparse computation-based method for optimizing neural networks for reinforcement learning (RL) tasks.
no code implementations • 30 Mar 2021 • Florian Laurent, Manuel Schneider, Christian Scheller, Jeremy Watson, Jiaoyang Li, Zhe Chen, Yi Zheng, Shao-Hung Chan, Konstantin Makhnev, Oleg Svidchenko, Vladimir Egorov, Dmitry Ivanov, Aleksei Shpilman, Evgenija Spirovska, Oliver Tanevski, Aleksandar Nikov, Ramon Grunder, David Galevski, Jakov Mitrovski, Guillaume Sartoretti, Zhiyao Luo, Mehul Damani, Nilabha Bhattacharya, Shivam Agarwal, Adrian Egli, Erik Nygren, Sharada Mohanty
However, the coordination of hundreds of agents in a real-life setting like a railway network remains challenging, and the Flatland environment used for the competition models these real-world properties in a simplified manner.
1 code implementation • 24 Feb 2021 • Dmitry Ivanov, Vladimir Egorov, Aleksei Shpilman
Recent reinforcement learning studies extensively explore the interplay between cooperative and competitive behaviour in mixed environments.
1 code implementation • 19 Feb 2019 • Dmitry Ivanov
The objectives are to classify the unlabeled sample and train an unbiased PN classifier, which generally requires first identifying the mixing proportions of positives and negatives.