The OpenEQA dataset is a significant contribution to the field of Embodied Question Answering (EQA). Let me provide you with some details:
- Definition:
- Embodied Question Answering (EQA) involves understanding an environment well enough to answer questions about it in natural language.
- EQA agents can achieve this understanding through either episodic memory (as seen in agents using smart glasses) or active exploration of the environment (as in the case of mobile robots).
- OpenEQA Dataset:
- OpenEQA is the first open-vocabulary benchmark dataset for EQA that supports both episodic memory and active exploration use cases.
- It contains over 1600 high-quality human-generated questions drawn from more than 180 real-world environments.
- The dataset consists of question-answer pairs $(Q, A^*)$ and corresponding episode histories ($H$).
- You can find the question-answer pairs in the file `data/open-eqa-v0.json`.
- To access the episode histories, follow the instructions provided here.
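As a quick illustration, the question-answer pairs can be loaded with a few lines of Python. This is a minimal sketch: the field names `question` and `answer` are assumptions about the JSON schema, not confirmed from the dataset itself, so a synthetic sample file is used here in place of the real `data/open-eqa-v0.json`.

```python
import json
from pathlib import Path

# Hypothetical sample mimicking the assumed schema of data/open-eqa-v0.json;
# the real field names in the released file may differ.
sample = [
    {"question": "What color is the sofa?", "answer": "gray"},
    {"question": "Is the kitchen door open?", "answer": "yes"},
]
path = Path("open-eqa-sample.json")
path.write_text(json.dumps(sample))

# Load the question-answer pairs and inspect them.
qa_pairs = json.loads(path.read_text())
for item in qa_pairs:
    print(f"Q: {item['question']} -> A: {item['answer']}")
```

To use the real dataset, point `path` at `data/open-eqa-v0.json` inside a checkout of the OpenEQA repository.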
- Evaluation Protocol:
- OpenEQA also provides an automatic evaluation protocol powered by large language model (LLM) based scoring.
- This evaluation protocol correlates well with human judgment.
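The general shape of such an LLM-based protocol can be sketched as follows. This is not the actual OpenEQA evaluation code: `mock_llm_judge` is a hypothetical stand-in that substitutes a crude token-overlap heuristic for a real LLM call, and the 1-5 grading scale mapped to a 0-100 aggregate score is an assumption about how per-question grades are combined.

```python
def mock_llm_judge(question, reference, candidate):
    """Stand-in for an LLM call (hypothetical): grade a candidate
    answer against the reference on a 1-5 scale."""
    # Crude heuristic: token overlap with the reference answer
    # serves as a proxy for the semantic match an LLM would judge.
    ref = set(reference.lower().split())
    cand = set(candidate.lower().split())
    overlap = len(ref & cand) / max(len(ref), 1)
    return 1 + round(4 * overlap)  # map overlap in [0, 1] to a 1-5 grade

def eqa_score(examples):
    """Aggregate per-question grades into a 0-100 benchmark score
    (assumed aggregation: mean of (grade - 1) / 4, scaled by 100)."""
    grades = [mock_llm_judge(q, ref, pred) for q, ref, pred in examples]
    return 100 * sum((g - 1) / 4 for g in grades) / len(grades)

# Two toy (question, reference answer, model prediction) triples.
examples = [
    ("What color is the sofa?", "gray", "it is gray"),
    ("Is the door open?", "yes", "no"),
]
print(f"Aggregate score: {eqa_score(examples):.1f}")
```

In an actual pipeline, `mock_llm_judge` would be replaced by a prompt to a strong LLM that returns a grade for each candidate answer.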
- Foundation Models Evaluation:
- Researchers evaluated several state-of-the-art foundation models, including GPT-4V, using the OpenEQA dataset.
- The findings revealed that these models significantly lag behind human-level performance in EQA tasks.
- Significance:
- OpenEQA serves as a straightforward, measurable, and practically relevant benchmark for current-generation foundation models.
- It poses a considerable challenge and inspires research at the intersection of Embodied AI, conversational agents, and world models.