Reasoning
127 benchmarks • 68 tasks • 180 datasets • 4154 papers with code
Classification
Classification
380 benchmarks
3231 papers with code
Text Classification
233 benchmarks
1101 papers with code
Graph Classification
69 benchmarks
379 papers with code
Audio Classification
26 benchmarks
131 papers with code
Medical Image Classification
8 benchmarks
122 papers with code
See all 18 tasks
Question Answering
Question Answering
238 benchmarks
2861 papers with code
Open-Ended Question Answering
209 papers with code
Open-Domain Question Answering
15 benchmarks
194 papers with code
Conversational Question Answering
1 benchmark
60 papers with code
Answer Selection
6 benchmarks
47 papers with code
See all 19 tasks
Decision Making
Decision Making
1 benchmark
2023 papers with code
Imitation Learning
519 papers with code
Natural Language Inference
Natural Language Inference
45 benchmarks
729 papers with code
Answer Generation
2 benchmarks
55 papers with code
Visual Entailment
3 benchmarks
27 papers with code
Cross-Lingual Natural Language Inference
4 benchmarks
16 papers with code
Logical Reasoning
Navigate
404 papers with code
Logical Reasoning
19 benchmarks
182 papers with code
Novel Concepts
51 papers with code
Temporal Sequences
51 papers with code
StrategyQA
13 papers with code
See all 23 tasks
Multi-Label Classification
Multi-Label Classification
33 benchmarks
374 papers with code
Missing Labels
39 papers with code
Extreme Multi-Label Classification
29 papers with code
Medical Code Prediction
7 benchmarks
15 papers with code
Hierarchical Multi-label Classification
16 benchmarks
13 papers with code
General Reinforcement Learning
Offline RL
2 benchmarks
224 papers with code
Model-based Reinforcement Learning
195 papers with code
Conformal Prediction
146 papers with code
Text Simplification
11 benchmarks
117 papers with code
Music Source Separation
3 benchmarks
53 papers with code
Audio Source Separation
8 benchmarks
44 papers with code
Decision Making Under Uncertainty
43 papers with code
See all 9 tasks
Common Sense Reasoning
Common Sense Reasoning
37 benchmarks
253 papers with code
Physical Commonsense Reasoning
1 benchmark
6 papers with code
Riddle Sense
2 benchmarks
5 papers with code
Winowhy
4 papers with code
Anachronisms
3 papers with code
See all 16 tasks
Visual Reasoning
Visual Reasoning
19 benchmarks
212 papers with code
Visual Commonsense Reasoning
7 benchmarks
29 papers with code
Program Synthesis
Program Synthesis
10 benchmarks
138 papers with code
Type prediction
3 benchmarks
42 papers with code
Program Repair
3 benchmarks
34 papers with code
Value prediction
1 benchmark
15 papers with code
Enumerative Search
5 papers with code
See all 6 tasks
Mathematical Reasoning
Mathematical Reasoning
21 benchmarks
111 papers with code
Math Word Problem Solving
11 benchmarks
62 papers with code
Formal Logic
1 benchmark
11 papers with code
Geometry Problem Solving
8 papers with code
Abstract Algebra
1 benchmark
3 papers with code
See all 8 tasks
Video Question Answering
Video Question Answering
32 benchmarks
150 papers with code
Zero-Shot Video Question Answer
12 benchmarks
33 papers with code
Few-shot Video Question Answering
1 paper with code
Multi-Label Learning
Multi-Label Learning
1 benchmark
81 papers with code
Missing Labels
39 papers with code
Mathematical Proofs
Automated Theorem Proving
10 benchmarks
68 papers with code
Mathematical Proofs
10 benchmarks
17 papers with code
Arithmetic Reasoning
Arithmetic Reasoning
2 benchmarks
69 papers with code
Math Word Problem Solving
Math Word Problem Solving
11 benchmarks
62 papers with code
Mathematical Question Answering
Math Word Problem Solving
11 benchmarks
62 papers with code
Program Repair
Program Repair
3 benchmarks
34 papers with code
Fault localization
15 papers with code
Variable misuse
9 papers with code
Exception type
2 papers with code
Function-docstring mismatch
1 paper with code
See all 7 tasks
Systematic Generalization
Systematic Generalization
61 papers with code
Decision Making Under Uncertainty
Decision Making Under Uncertainty
43 papers with code
Uncertainty Visualization
3 papers with code
Video-based Generative Performance Benchmarking
Video-based Generative Performance Benchmarking (Consistency)
1 benchmark
9 papers with code
Video-based Generative Performance Benchmarking (Contextual Understanding)
1 benchmark
9 papers with code
Video-based Generative Performance Benchmarking (Correctness of Information)
1 benchmark
9 papers with code
Video-based Generative Performance Benchmarking (Detail Orientation)
1 benchmark
9 papers with code
Video-based Generative Performance Benchmarking (Temporal Understanding)
1 benchmark
9 papers with code
Multimodal Reasoning
Multimodal Reasoning
3 benchmarks
34 papers with code
Natural Language Visual Grounding
Natural Language Visual Grounding
16 papers with code
Discrete Choice Models
Discrete Choice Models
14 papers with code
Generative Visual Question Answering
Video-based Generative Performance Benchmarking
6 benchmarks
13 papers with code
Causal Identification
Causal Identification
12 papers with code
Odd One Out
Odd One Out
1 benchmark
10 papers with code
Geometry Problem Solving
Geometry Problem Solving
8 papers with code
Autonomous Navigation
Sequential Place Recognition
5 papers with code
Autonomous Flight (Dense Forest)
1 benchmark
1 paper with code
Autonomous Web Navigation
Abstract Argumentation
Abstract Argumentation
4 papers with code
Analogical Similarity
Analogical Similarity
1 benchmark
4 papers with code
Theory of Mind Modeling
Theory of Mind Modeling
4 papers with code
Anachronisms
Anachronisms
3 papers with code
Human Judgment Correlation
Human Judgment Correlation
2 benchmarks
3 papers with code
Human Judgment Classification
Human Judgment Classification
1 benchmark
2 papers with code
Identify Odd Metaphor
Identify Odd Metaphor
1 benchmark
2 papers with code
Commonsense Reasoning for RL
Commonsense Reasoning for RL
1 benchmark
1 paper with code
Pre-election ratings estimation
Pre-election ratings estimation
1 paper with code