Search Results for author: Ruoyao Wang

Found 6 papers, 5 papers with code

Self-Supervised Behavior Cloned Transformers are Path Crawlers for Text Games

no code implementations • 7 Dec 2023 • Ruoyao Wang, Peter Jansen

In this work, we introduce a self-supervised behavior cloning transformer for text games, which are challenging benchmarks for multi-step reasoning in virtual environments.

ByteSized32: A Corpus and Challenge Task for Generating Task-Specific World Models Expressed as Text Games

1 code implementation • 24 May 2023 • Ruoyao Wang, Graham Todd, Eric Yuan, Ziang Xiao, Marc-Alexandre Côté, Peter Jansen

In this work, we investigate the capacity of language models to generate explicit, interpretable, and interactive world models of scientific and common-sense reasoning tasks.

Code Generation, Common Sense Reasoning, +2

Behavior Cloned Transformers are Neurosymbolic Reasoners

1 code implementation • 13 Oct 2022 • Ruoyao Wang, Peter Jansen, Marc-Alexandre Côté, Prithviraj Ammanabrolu

In this work, we explore techniques for augmenting interactive agents with information from symbolic modules, much like humans use tools like calculators and GPS systems to assist with arithmetic and navigation.

Common Sense Reasoning

ScienceWorld: Is your Agent Smarter than a 5th Grader?

1 code implementation • 14 Mar 2022 • Ruoyao Wang, Peter Jansen, Marc-Alexandre Côté, Prithviraj Ammanabrolu

We present ScienceWorld, a benchmark to test agents' scientific reasoning abilities in a new interactive text environment at the level of a standard elementary school science curriculum.

Question Answering

FIBER: Fill-in-the-Blanks as a Challenging Video Understanding Evaluation Framework

1 code implementation • ACL 2022 • Santiago Castro, Ruoyao Wang, Pingxuan Huang, Ian Stewart, Oana Ignat, Nan Liu, Jonathan C. Stroud, Rada Mihalcea

We propose fill-in-the-blanks as a video understanding evaluation framework and introduce FIBER, a novel dataset consisting of 28,000 videos and descriptions in support of this evaluation framework.

Language Modelling, Multiple-choice, +4

LifeQA: A Real-life Dataset for Video Question Answering

1 code implementation • LREC 2020 • Santiago Castro, Mahmoud Azab, Jonathan Stroud, Cristina Noujaim, Ruoyao Wang, Jia Deng, Rada Mihalcea

We introduce LifeQA, a benchmark dataset for video question answering that focuses on day-to-day real-life situations.

Multiple-choice, Question Answering, +1
