Chatbot

168 papers with code • 0 benchmarks • 8 datasets

A chatbot, or conversational AI, is a language model designed to hold conversations with humans.

Source: Open Data Chatbot

Benchmarks

These leaderboards are used to track progress in Chatbot.

You can find evaluation results in the subtasks. You can also submit evaluation metrics for this task.

Libraries

Use these libraries to find Chatbot models and implementations

PaddlePaddle/Knover • 3 papers • 672 stars

facebookresearch/ParlAI • 2 papers • 10,426 stars

Datasets

Subtasks

Dialogue Generation

Latest papers

Length-Controlled AlpacaEval: A Simple Way to Debias Automatic Evaluators

tatsu-lab/alpaca_eval • 6 Apr 2024

Even simple, known confounders such as preference for longer outputs remain in existing automated evaluation metrics.

1,093 stars

Physics Event Classification Using Large Language Models

ai4eic/ai4eichackathon2023-streamlit • 5 Apr 2024

The 2023 AI4EIC hackathon was the culmination of the third annual AI4EIC workshop at The Catholic University of America.

Facilitating Pornographic Text Detection for Open-Domain Dialogue Systems via Knowledge Distillation of Large Language Models

qiuhuachuan/CensorChat • 20 Mar 2024

Pornographic content occurring in human-machine interaction dialogues can cause severe side effects for users in open-domain dialogue systems.

Characteristic AI Agents via Large Language Models

nuaa-nlp/character100 • 19 Mar 2024

In response to this research gap, we create a benchmark for the characteristic AI agents task, including dataset, techniques, and evaluation metrics.

DeepSeek-VL: Towards Real-World Vision-Language Understanding

deepseek-ai/deepseek-vl • 8 Mar 2024

The DeepSeek-VL family (both 1.3B and 7B models) showcases superior user experiences as a vision-language chatbot in real-world applications, achieving state-of-the-art or competitive performance across a wide range of visual-language benchmarks at the same model size while maintaining robust performance on language-centric benchmarks.

1,497 stars

Chatbot Arena: An Open Platform for Evaluating LLMs by Human Preference

lm-sys/fastchat • 7 Mar 2024

To address this issue, we introduce Chatbot Arena, an open platform for evaluating LLMs based on human preferences.

33,792 stars

Yi: Open Foundation Models by 01.AI

01-ai/yi • 7 Mar 2024

The Yi model family is based on 6B and 34B pretrained language models, then we extend them to chat models, 200K long context models, depth-upscaled models, and vision-language models.

7,112 stars

KoDialogBench: Evaluating Conversational Understanding of Language Models with Korean Dialogue Benchmark

sb-jang/kodialogbench • 27 Feb 2024

As language models are often deployed as chatbot assistants, it becomes a virtue for models to engage in conversations in a user's first language.

ASEM: Enhancing Empathy in Chatbot through Attention-based Sentiment and Emotion Modeling

mirah-official/empathetic-chatbot-asem • 25 Feb 2024

Effective feature representations play a critical role in enhancing the performance of text generation models that rely on deep neural networks.

HypoTermQA: Hypothetical Terms Dataset for Benchmarking Hallucination Tendency of LLMs

cemuluoglakci/hypotermqa • 25 Feb 2024

We leverage LLMs to generate challenging tasks related to hypothetical phenomena, subsequently employing them as agents for efficient hallucination detection.
