Search Results for author: Tony T. Wang

Found 3 papers, 1 paper with code

Cliff-Learning

no code implementations • 14 Feb 2023 • Tony T. Wang, Igor Zablotchi, Nir Shavit, Jonathan S. Rosenfeld

We conduct an in-depth investigation of foundation-model cliff-learning and study toy models of the phenomenon.

Transfer Learning

Adversarial Policies Beat Superhuman Go AIs

2 code implementations • 1 Nov 2022 • Tony T. Wang, Adam Gleave, Tom Tseng, Kellin Pelrine, Nora Belrose, Joseph Miller, Michael D. Dennis, Yawen Duan, Viktor Pogrebniak, Sergey Levine, Stuart Russell

The core vulnerability uncovered by our attack persists even in KataGo agents adversarially trained to defend against our attack.
