We created SOEVAL by mining questions from StackOverflow. Our goal was a prompt dataset that reflects the real-life needs of software developers. To build it, we first collected 500 popular and recent questions for each of the Python and Java tags. We then applied a set of inclusion and exclusion criteria to these 1,000 questions. To be included, a question had to (1) explicitly ask “how to do X” in Python or Java; (2) include code in its body; and (3) have an accepted answer that includes code. We excluded questions that were (1) open-ended requests for best practices or guidelines for a specific problem in Python or Java; (2) about finding a specific API or module for a given task; (3) about errors caused by environment configuration (e.g., a missing dependency library); (4) about configuring libraries or APIs; or (5) syntax-specific. Applying these criteria to the 1,000 questions yielded 28 prompts for Java and 42 prompts for Python.
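The selection step can be illustrated with a minimal sketch. It is not the procedure used to build SOEVAL: the source does not say whether the criteria were applied manually or automatically. The `Question` record, the `<code>`-tag check, the `HOW_TO` title pattern, and the `manual_labels` field are all hypothetical names and heuristics introduced here for illustration. The exclusion criteria call for human judgment, so the sketch assumes a reviewer has already attached labels to each question.

```python
import re
from dataclasses import dataclass, field
from typing import List, Optional, Set

# Hypothetical labels a human reviewer might assign, one per exclusion criterion.
EXCLUSION_LABELS = {
    "best_practice",          # (1) open-ended best-practice/guideline question
    "api_lookup",             # (2) looking for a specific API/module
    "environment_error",      # (3) error caused by environment configuration
    "library_configuration",  # (4) configuring a library/API
    "syntax_specific",        # (5) syntax-specific question
}

# Rough heuristic (an assumption) for "explicitly asks how to do X".
HOW_TO = re.compile(r"\bhow\s+(to|do|can|should)\b", re.IGNORECASE)


@dataclass
class Question:
    """Hypothetical record for one mined question."""
    title: str
    body_html: str
    tags: Set[str]
    accepted_answer_html: Optional[str] = None
    manual_labels: Set[str] = field(default_factory=set)


def has_code(html: Optional[str]) -> bool:
    # Assumes code snippets are wrapped in <code> tags in the stored HTML.
    return html is not None and "<code>" in html


def meets_inclusion(q: Question) -> bool:
    targets_language = bool(q.tags & {"python", "java"})
    return (
        targets_language
        and HOW_TO.search(q.title) is not None    # inclusion (1)
        and has_code(q.body_html)                 # inclusion (2)
        and has_code(q.accepted_answer_html)      # inclusion (3)
    )


def is_excluded(q: Question) -> bool:
    return bool(q.manual_labels & EXCLUSION_LABELS)


def select_prompts(questions: List[Question]) -> List[Question]:
    """Keep questions that pass every inclusion check and no exclusion label."""
    return [q for q in questions if meets_inclusion(q) and not is_excluded(q)]
```

A question without an accepted answer, or one labeled `environment_error`, is dropped by `select_prompts`. In practice the title regex would miss many valid how-to questions, so a manual review pass would still be needed.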