no code implementations • 31 Oct 2023 • Daman Arora, Anush Kini, Sayak Ray Chowdhury, Nagarajan Natarajan, Gaurav Sinha, Amit Sharma
Given a query and a document corpus, the information retrieval (IR) task is to output a ranked list of relevant documents.
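To make the task definition concrete, here is a minimal sketch of the input/output contract: a query and a corpus go in, a ranked list of document ids comes out. The term-overlap scoring and the name `rank_documents` are illustrative assumptions, not the method of the listed paper.

```python
def rank_documents(query, corpus):
    """Return document ids ordered from most to least relevant.

    Relevance here is plain term overlap with the query, a toy
    stand-in (assumption) for whatever scoring a real IR system uses.
    """
    query_terms = set(query.lower().split())
    scored = []
    for doc_id, text in corpus.items():
        overlap = len(query_terms & set(text.lower().split()))
        scored.append((overlap, doc_id))
    # Highest overlap first; ties broken by document id for determinism.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [doc_id for _, doc_id in scored]


corpus = {
    "d1": "cats chase mice",
    "d2": "dogs chase cats and mice",
    "d3": "stock prices fell",
}
ranking = rank_documents("cats and mice", corpus)
```

Any scoring function can be swapped in without changing the interface, since only the ordering of the returned list matters.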
no code implementations • 26 May 2023 • Daman Arora, Subbarao Kambhampati
By randomly sampling actions from the same dataset, we generate examples of invalid actions, which are then used to train a verifier that checks action applicability.
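The sampling idea in this snippet can be sketched roughly as follows. The data layout (a list of `(state, action)` pairs), the name `build_verifier_examples`, and the rule that an action counts as invalid for a state only if it was never observed as valid there are all assumptions made for illustration; the paper's actual procedure may differ, and this rule can mislabel valid actions that simply do not appear in the dataset.

```python
import random


def build_verifier_examples(dataset, negatives_per_state=1, seed=0):
    """Build labelled (state, action, label) triples for a verifier.

    dataset: list of (state, action) pairs where the action is valid
    in that state. Label 1 marks a valid pair, label 0 an invalid one.
    """
    rng = random.Random(seed)
    all_actions = sorted({action for _, action in dataset})

    # Collect the actions observed to be valid in each state.
    valid = {}
    for state, action in dataset:
        valid.setdefault(state, set()).add(action)

    examples = [(state, action, 1) for state, action in dataset]
    for state, valid_actions in valid.items():
        # Assumption: actions from the dataset that were never seen
        # with this state are treated as invalid for it.
        candidates = [a for a in all_actions if a not in valid_actions]
        k = min(negatives_per_state, len(candidates))
        for action in rng.sample(candidates, k):
            examples.append((state, action, 0))
    return examples


dataset = [
    ("door_closed", "open_door"),
    ("door_open", "walk_through"),
    ("door_open", "close_door"),
]
examples = build_verifier_examples(dataset)
```

The resulting triples could then feed any binary classifier that predicts whether an action is applicable in a given state.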
1 code implementation • 24 May 2023 • Daman Arora, Himanshu Gaurav Singh, Mausam
In response, we present JEEBench, a considerably more challenging benchmark dataset for evaluating the problem-solving abilities of LLMs.
Ranked #1 on Overall - Test on JEEBench