Search Results for author: Anssi Kanervisto

Found 28 papers, 18 papers with code

Toward Human-AI Alignment in Large-Scale Multi-Player Games

no code implementations • 5 Feb 2024 • Sugandha Sharma, Guy Davidson, Khimya Khetarpal, Anssi Kanervisto, Udit Arora, Katja Hofmann, Ida Momennejad

First, we analyze extensive human gameplay data from Xbox's Bleeding Edge (100K+ games), uncovering behavioral patterns in a complex task space.

BEDD: The MineRL BASALT Evaluation and Demonstrations Dataset for Training and Benchmarking Agents that Solve Fuzzy Tasks

1 code implementation • NeurIPS 2023 • Stephanie Milani, Anssi Kanervisto, Karolis Ramanauskas, Sander Schulhoff, Brandon Houghton, Rohin Shah

Given the completion of two years of BASALT competitions, we offer to the community a formalized benchmark through the BASALT Evaluation and Demonstrations Dataset (BEDD), which serves as a resource for algorithm development and performance assessment.

Benchmarking

Visual Encoders for Data-Efficient Imitation Learning in Modern Video Games

no code implementations • 4 Dec 2023 • Lukas Schäfer, Logan Jones, Anssi Kanervisto, Yuhan Cao, Tabish Rashid, Raluca Georgescu, Dave Bignell, Siddhartha Sen, Andrea Treviño Gavito, Sam Devlin

Video games have served as useful benchmarks for the decision making community, but going beyond Atari games towards training agents in modern games has been prohibitively expensive for the vast majority of the research community.

Atari Games Imitation Learning

Towards Solving Fuzzy Tasks with Human Feedback: A Retrospective of the MineRL BASALT 2022 Competition

no code implementations • 23 Mar 2023 • Stephanie Milani, Anssi Kanervisto, Karolis Ramanauskas, Sander Schulhoff, Brandon Houghton, Sharada Mohanty, Byron Galbraith, Ke Chen, Yan Song, Tianze Zhou, Bingquan Yu, He Liu, Kai Guan, Yujing Hu, Tangjie Lv, Federico Malato, Florian Leopold, Amogh Raut, Ville Hautamäki, Andrew Melnik, Shu Ishida, João F. Henriques, Robert Klassert, Walter Laurito, Ellen Novoseller, Vinicius G. Goecks, Nicholas Waytowich, David Watkins, Josh Miller, Rohin Shah

To facilitate research in the direction of fine-tuning foundation models from human feedback, we held the MineRL BASALT Competition on Fine-Tuning from Human Feedback at NeurIPS 2022.

Imitating Human Behaviour with Diffusion Models

1 code implementation • 25 Jan 2023 • Tim Pearce, Tabish Rashid, Anssi Kanervisto, Dave Bignell, Mingfei Sun, Raluca Georgescu, Sergio Valcarcel Macua, Shan Zheng Tan, Ida Momennejad, Katja Hofmann, Sam Devlin

This paper studies their application as observation-to-action models for imitating human behaviour in sequential environments.

A2C is a special case of PPO

1 code implementation • 18 May 2022 • Shengyi Huang, Anssi Kanervisto, Antonin Raffin, Weixun Wang, Santiago Ontañón, Rousslan Fernand Julien Dossa

Advantage Actor-critic (A2C) and Proximal Policy Optimization (PPO) are popular deep reinforcement learning algorithms used for game AI in recent years.

reinforcement-learning Reinforcement Learning (RL)

GAN-Aimbots: Using Machine Learning for Cheating in First Person Shooters

1 code implementation • 14 May 2022 • Anssi Kanervisto, Tomi Kinnunen, Ville Hautamäki

Playing games with cheaters is not fun, and in a multi-billion-dollar video game industry with hundreds of millions of players, game developers aim to improve the security and, consequently, the user experience of their games by preventing cheating.

BIG-bench Machine Learning

Retrospective on the 2021 BASALT Competition on Learning from Human Feedback

no code implementations • 14 Apr 2022 • Rohin Shah, Steven H. Wang, Cody Wild, Stephanie Milani, Anssi Kanervisto, Vinicius G. Goecks, Nicholas Waytowich, David Watkins-Valls, Bharat Prakash, Edmund Mills, Divyansh Garg, Alexander Fries, Alexandra Souly, Chan Jun Shern, Daniel del Castillo, Tom Lieberum

The goal of the competition was to promote research towards agents that use learning from human feedback (LfHF) techniques to solve open-world tasks.

Insights From the NeurIPS 2021 NetHack Challenge

1 code implementation • 22 Mar 2022 • Eric Hambro, Sharada Mohanty, Dmitrii Babaev, Minwoo Byeon, Dipam Chakraborty, Edward Grefenstette, Minqi Jiang, DaeJin Jo, Anssi Kanervisto, Jongmin Kim, Sungwoong Kim, Robert Kirk, Vitaly Kurin, Heinrich Küttler, Taehwon Kwon, Donghoon Lee, Vegard Mella, Nantas Nardelli, Ivan Nazarov, Nikita Ovsov, Jack Parker-Holder, Roberta Raileanu, Karolis Ramanauskas, Tim Rocktäschel, Danielle Rothermel, Mikayel Samvelyan, Dmitry Sorokin, Maciej Sypetkowski, Michał Sypetkowski

In this report, we summarize the takeaways from the first NeurIPS 2021 NetHack Challenge.

NetHack Reinforcement Learning (RL)

MineRL Diamond 2021 Competition: Overview, Results, and Lessons Learned

no code implementations • 17 Feb 2022 • Anssi Kanervisto, Stephanie Milani, Karolis Ramanauskas, Nicholay Topin, Zichuan Lin, Junyou Li, Jianing Shi, Deheng Ye, Qiang Fu, Wei Yang, Weijun Hong, Zhongyue Huang, Haicheng Chen, Guangjun Zeng, Yue Lin, Vincent Micheli, Eloi Alonso, François Fleuret, Alexander Nikulin, Yury Belousov, Oleg Svidchenko, Aleksei Shpilman

With this in mind, we hosted the third edition of the MineRL ObtainDiamond competition, MineRL Diamond 2021, with a separate track in which we permitted any solution to promote the participation of newcomers.

Optimizing Tandem Speaker Verification and Anti-Spoofing Systems

no code implementations • 24 Jan 2022 • Anssi Kanervisto, Ville Hautamäki, Tomi Kinnunen, Junichi Yamagishi

As automatic speaker verification (ASV) systems are vulnerable to spoofing attacks, they are typically used in conjunction with spoofing countermeasure (CM) systems to improve security.

Speaker Verification

Agents that Listen: High-Throughput Reinforcement Learning with Multiple Sensory Systems

1 code implementation • 5 Jul 2021 • Shashank Hegde, Anssi Kanervisto, Aleksei Petrenko

We are currently in the process of merging the augmented simulator with the main ViZDoom code repository.

Game of Doom reinforcement-learning +2

The MineRL BASALT Competition on Learning from Human Feedback

no code implementations • 5 Jul 2021 • Rohin Shah, Cody Wild, Steven H. Wang, Neel Alex, Brandon Houghton, William Guss, Sharada Mohanty, Anssi Kanervisto, Stephanie Milani, Nicholay Topin, Pieter Abbeel, Stuart Russell, Anca Dragan

Rather than training AI systems using a predefined reward function or using a labeled dataset with a predefined set of categories, we instead train the AI system using a learning signal derived from some form of human feedback, which can evolve over time as the understanding of the task changes, or as the capabilities of the AI system improve.

Imitation Learning

Distilling Reinforcement Learning Tricks for Video Games

1 code implementation • 1 Jul 2021 • Anssi Kanervisto, Christian Scheller, Yanick Schraner, Ville Hautamäki

Reinforcement learning (RL) research focuses on general solutions that can be applied across different domains.

Q-Learning reinforcement-learning +1

Towards robust and domain agnostic reinforcement learning competitions

no code implementations • 7 Jun 2021 • William Hebgen Guss, Stephanie Milani, Nicholay Topin, Brandon Houghton, Sharada Mohanty, Andrew Melnik, Augustin Harter, Benoit Buschmaas, Bjarne Jaster, Christoph Berganski, Dennis Heitkamp, Marko Henning, Helge Ritter, Chengjie Wu, Xiaotian Hao, Yiming Lu, Hangyu Mao, Yihuan Mao, Chao Wang, Michal Opanowicz, Anssi Kanervisto, Yanick Schraner, Christian Scheller, Xiren Zhou, Lu Liu, Daichi Nishio, Toi Tsuneda, Karolis Ramanauskas, Gabija Juceviciute

Reinforcement learning competitions have formed the basis for standard research benchmarks, galvanized advances in the state-of-the-art, and shaped the direction of the field.

reinforcement-learning Reinforcement Learning (RL)

Multi-task Learning with Attention for End-to-end Autonomous Driving

no code implementations • 21 Apr 2021 • Keishi Ishihara, Anssi Kanervisto, Jun Miura, Ville Hautamäki

This improves not only the success rate on standard benchmarks but also the ability to react to traffic lights, which we show with standard benchmarks.

Autonomous Driving Imitation Learning +1

Back to Square One: Superhuman Performance in Chutes and Ladders Through Deep Neural Networks and Tree Search

1 code implementation • 1 Apr 2021 • Dylan Ashley, Anssi Kanervisto, Brendan Bennett

We present AlphaChute: a state-of-the-art algorithm that achieves superhuman performance in the ancient game of Chutes and Ladders.

General Characterization of Agents by States they Visit

1 code implementation • 2 Dec 2020 • Anssi Kanervisto, Tomi Kinnunen, Ville Hautamäki

Behavioural characterizations (BCs) of decision-making agents, or their policies, are used to study outcomes of training algorithms and as part of the algorithms themselves to encourage unique policies, match expert policy or restrict changes to policy per update.

Imitation Learning

Playing Minecraft with Behavioural Cloning

1 code implementation • 7 May 2020 • Anssi Kanervisto, Janne Karttunen, Ville Hautamäki

The MineRL 2019 competition challenged participants to train sample-efficient agents to play Minecraft by using a dataset of human gameplay and a limited number of steps in the environment.

Behavioural cloning

Benchmarking End-to-End Behavioural Cloning on Video Games

1 code implementation • 2 Apr 2020 • Anssi Kanervisto, Joonas Pussinen, Ville Hautamäki

We take a step towards a general approach and study the general applicability of behavioural cloning on twelve video games, including six modern video games (published after 2010), by using human demonstrations as training data.

Behavioural cloning Benchmarking

Action Space Shaping in Deep Reinforcement Learning

1 code implementation • 2 Apr 2020 • Anssi Kanervisto, Christian Scheller, Ville Hautamäki

In this work, we aim to gain insight on these action space modifications by conducting extensive experiments in video-game environments.

reinforcement-learning Reinforcement Learning (RL)

An initial investigation on optimizing tandem speaker verification and countermeasure systems using reinforcement learning

1 code implementation • 6 Feb 2020 • Anssi Kanervisto, Ville Hautamäki, Tomi Kinnunen, Junichi Yamagishi

The spoofing countermeasure (CM) systems in automatic speaker verification (ASV) are not typically used in isolation from each other.

Speaker Verification

Towards Debugging Deep Neural Networks by Generating Speech Utterances

1 code implementation • 6 Jul 2019 • Bilal Soomro, Anssi Kanervisto, Trung Ngo Trong, Ville Hautamäki

One such debugging method used with image classification DNNs is activation maximization, which generates example-images that are classified as one of the classes.

General Classification Image Classification

Do Autonomous Agents Benefit from Hearing?

no code implementations • 10 May 2019 • Abraham Woubie, Anssi Kanervisto, Janne Karttunen, Ville Hautamäki

In this work, we propose using audio as information complementary to vision in the state representation.

reinforcement-learning Reinforcement Learning (RL)

From Video Game to Real Robot: The Transfer between Action Spaces

1 code implementation • 2 May 2019 • Janne Karttunen, Anssi Kanervisto, Ville Kyrki, Ville Hautamäki

Deep reinforcement learning has proven successful for learning tasks in simulated environments, but applying the same techniques to robots in the real-world domain is more challenging, as they require hours of training.

Transfer Learning

Who Do I Sound Like? Showcasing Speaker Recognition Technology by YouTube Voice Search

1 code implementation • 8 Nov 2018 • Ville Vestman, Bilal Soomro, Anssi Kanervisto, Ville Hautamäki, Tomi Kinnunen

The popularization of science can often be disregarded by scientists, as it may be challenging to put highly sophisticated research into words that the general public can understand.

Audio and Speech Processing Sound