PHYRE: A New Benchmark for Physical Reasoning

Understanding and reasoning about physics is an important ability of intelligent agents. We develop the PHYRE benchmark for physical reasoning that contains a set of simple classical mechanics puzzles in a 2D physical environment. The benchmark is designed to encourage the development of learning algorithms that are sample-efficient and generalize well across puzzles. We test several modern learning algorithms on PHYRE and find that these algorithms fall short in solving the puzzles efficiently. We expect that PHYRE will encourage the development of novel sample-efficient agents that learn efficient but useful models of physics. For code and to play PHYRE for yourself, please visit https://player.phyre.ai.

NeurIPS 2019

Datasets

Introduced in the Paper: PHYRE

Task              Dataset          Model  Metric Name  Metric Value  Global Rank
Visual Reasoning  PHYRE-1B-Cross   DQN    AUCCESS      36.8          #4
Visual Reasoning  PHYRE-1B-Within  DQN    AUCCESS      77.6          #4
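
The results above are reported in AUCCESS, which this page does not define. The sketch below assumes the definition given in the PHYRE paper: a weighted average of the percentage of tasks solved within k attempts, for k from 1 to 100, with weights w_k = log(k + 1) - log(k) so that early successes count more. The function name `auccess` and its input format are hypothetical and are not part of the PHYRE codebase.

```python
import math


def auccess(first_solved_at, max_attempts=100):
    """Illustrative AUCCESS computation (assumed definition, see lead-in).

    first_solved_at: one entry per task, giving the 1-based attempt index at
    which the task was first solved, or None if it was never solved.
    Returns a score between 0 and 100.
    """
    if not first_solved_at:
        raise ValueError("need at least one task")

    num_tasks = len(first_solved_at)
    weighted_sum = 0.0
    weight_total = 0.0
    for k in range(1, max_attempts + 1):
        # Weight decays with k, emphasising solutions found in few attempts.
        w_k = math.log(k + 1) - math.log(k)
        # Fraction of tasks solved within the first k attempts.
        s_k = sum(
            1 for a in first_solved_at if a is not None and a <= k
        ) / num_tasks
        weighted_sum += w_k * s_k
        weight_total += w_k
    return 100.0 * weighted_sum / weight_total
```

Under this definition an agent that solves every task on its first attempt scores 100, one that never solves anything scores 0, and solving a task on attempt 10 earns less than solving it on attempt 1.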
