Search Results for author: Denis Tarasov

Found 8 papers, 7 papers with code

Distilling LLMs' Decomposition Abilities into Compact Language Models

1 code implementation 2 Feb 2024 Denis Tarasov, Kumar Shridhar

Large Language Models (LLMs) have demonstrated strong reasoning abilities, yet their large size presents scalability challenges and limits further customization.

Katakomba: Tools and Benchmarks for Data-Driven NetHack

1 code implementation NeurIPS 2023 Vladislav Kurenkov, Alexander Nikulin, Denis Tarasov, Sergey Kolesnikov

NetHack is known as the frontier of reinforcement learning research where learning-based methods still need to catch up to rule-based solutions.

D4RL NetHack +2

Revisiting the Minimalist Approach to Offline Reinforcement Learning

1 code implementation NeurIPS 2023 Denis Tarasov, Vladislav Kurenkov, Alexander Nikulin, Sergey Kolesnikov

Recent years have witnessed significant advancements in offline reinforcement learning (RL), resulting in the development of numerous algorithms with varying degrees of complexity.

D4RL Offline RL +2

Anti-Exploration by Random Network Distillation

3 code implementations 31 Jan 2023 Alexander Nikulin, Vladislav Kurenkov, Denis Tarasov, Sergey Kolesnikov

Despite the success of Random Network Distillation (RND) in various domains, it has been shown to be insufficiently discriminative to serve as an uncertainty estimator for penalizing out-of-distribution actions in offline reinforcement learning.

D4RL

Let Offline RL Flow: Training Conservative Agents in the Latent Space of Normalizing Flows

2 code implementations 20 Nov 2022 Dmitriy Akimov, Vladislav Kurenkov, Alexander Nikulin, Denis Tarasov, Sergey Kolesnikov

A Normalizing Flows action encoder is pre-trained in a supervised manner on the offline dataset, and then an additional policy model, a controller in the latent space, is trained via reinforcement learning.

Offline RL reinforcement-learning +1

Q-Ensemble for Offline RL: Don't Scale the Ensemble, Scale the Batch Size

2 code implementations 20 Nov 2022 Alexander Nikulin, Vladislav Kurenkov, Denis Tarasov, Dmitry Akimov, Sergey Kolesnikov

Training large neural networks is known to be time-consuming, with the learning duration taking days or even weeks.

Offline RL

CORL: Research-oriented Deep Offline Reinforcement Learning Library

3 code implementations NeurIPS 2023 Denis Tarasov, Alexander Nikulin, Dmitry Akimov, Vladislav Kurenkov, Sergey Kolesnikov

CORL is an open-source library that provides thoroughly benchmarked single-file implementations of both deep offline and offline-to-online reinforcement learning algorithms.

Benchmarking D4RL +1

Inception Architecture and Residual Connections in Classification of Breast Cancer Histology Images

no code implementations 10 Dec 2019 Mohammad Ibrahim Sarker, Hyongsuk Kim, Denis Tarasov, Dinar Akhmetzanov

This paper presents the results of applying the Inception v4 deep convolutional neural network to the ICIAR-2018 Breast Cancer Classification Grand Challenge, part a.

Binary Classification Classification +3
