Datasets

9,772 machine learning datasets

12 dataset results for Data Augmentation AND Texts AND English