Dialogue
Natural Language Processing
0 benchmarks, 0 datasets
About
Dialogue is notoriously hard to evaluate. Past approaches have used human evaluation.
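One reason is that many different responses can be equally appropriate, so automatic word-overlap metrics that compare a response against a single reference tend to correlate poorly with human judgements. A minimal sketch of such an overlap metric is shown below (illustrative only; the example strings are assumptions, not drawn from any benchmark):

```python
# Illustrative only: token-overlap F1 between a model response and one reference reply.
# Overlap metrics like this (or BLEU) are cheap to compute but reward surface similarity
# to a single reference, which is why they are a poor proxy for dialogue quality.
from collections import Counter

def overlap_f1(hypothesis: str, reference: str) -> float:
    hyp = hypothesis.lower().split()
    ref = reference.lower().split()
    if not hyp or not ref:
        return 0.0
    # Multiset intersection counts tokens shared between response and reference.
    common = sum((Counter(hyp) & Counter(ref)).values())
    if common == 0:
        return 0.0
    precision = common / len(hyp)
    recall = common / len(ref)
    return 2 * precision * recall / (precision + recall)

print(overlap_f1("i am fine thanks", "i am doing well thank you"))  # low score despite an adequate reply
```

The perfectly adequate reply scores low because it shares few tokens with the single reference, which is part of why human evaluation has remained the dominant approach.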
Benchmarks
You can find evaluation results in the subtasks. You can also submit evaluation metrics for this task.
Subtasks
Dialogue Generation: 9 benchmarks, 74 papers with code
Dialogue State Tracking: 2 benchmarks, 49 papers with code
Task-Oriented Dialogue Systems: 1 benchmark, 41 papers with code
Visual Dialog: 7 benchmarks, 36 papers with code
Goal-Oriented Dialog: 1 benchmark, 20 papers with code
15 subtasks in total.
Greatest papers with code