Datasets

9,773 machine learning datasets
Filter by Task
Language Modelling Question Answering 19 Reading Comprehension 13 Machine Translation 12 Text Generation 12 Machine Reading Comprehension 9 Text Summarization 9 Text Classification 8 Named Entity Recognition (NER) 7 Natural Language Inference 7 Task-Oriented Dialogue Systems 7 Dialogue Generation 6 Chinese Named Entity Recognition 5 Cross-Lingual Transfer 5 Natural Language Understanding 5 SSTOD 5 Translation 5 Chinese Reading Comprehension 4 Conversational Response Selection 4 Domain Adaptation 4 Open-Domain Dialog 4 Part-Of-Speech Tagging 4 Reading Comprehension (Few-Shot) 4 Reading Comprehension (One-Shot) 4 Reading Comprehension (Zero-Shot) 4 Sentiment Analysis 4 Slot Filling 4 Abstractive Text Summarization 3 Chinese Word Segmentation 3 Cloze (multi-choices) (Few-Shot) 3 Cloze (multi-choices) (One-Shot) 3 Cloze (multi-choices) (Zero-Shot) 3 Code Generation 3 Coreference Resolution 3 Cross-Lingual NER 3 Cross-Lingual Question Answering 3 Document Summarization 3 Fake News Detection 3 Grammatical Error Correction 3 Intent Detection 3 Math Word Problem Solving 3 Misinformation 3 Open-Domain Question Answering 3 Question Generation 3 Relation Extraction 3 Speech Recognition 3 Token Classification 3 Video Captioning 3 Visual Question Answering (VQA) 3 Visual Reasoning 3 Word Embeddings 3 Zero-Shot Cross-Lingual Transfer 3 Automatic Post-Editing 2 Chinese Sentence Pair Classification 2 Conversational Response Generation 2 Cross-Lingual Natural Language Inference 2 Cross-Lingual POS Tagging 2 Data Augmentation 2 Dialogue Understanding 2 Emotion Recognition in Conversation 2 Entity Alignment 2 Entity Linking 2 Event Detection 2 Event Extraction 2 FLUE 2 Gesture Generation 2 Hate Speech Detection 2 Intent Classification 2 Mathematical Reasoning 2 Max-Shot Cross-Lingual Visual Reasoning 2 Multi-Document Summarization 2 Multi-Label Classification 2 Multi-Task Learning 2 Multimodal Emotion Recognition 2 Multiple-choice 2 Optical Character Recognition (OCR) 2 
Recommendation Systems 2 Relation Classification 2 Scene Text Detection 2 Scene Text Recognition 2 Semantic Similarity 2 Speech-to-Text Translation 2 Spelling Correction 2 Spoken Language Understanding 2 Text Matching 2 Text-To-Speech Synthesis 2 Zero-Shot Cross-Lingual Visual Reasoning 2 Zero-Shot Learning 2 3D Face Animation 1 Abusive Language 1 Anchor link prediction 1 Answer Generation 1 Arithmetic Reasoning 1 Aspect-Based Sentiment Analysis (ABSA) 1 Aspect-Category-Opinion-Sentiment Quadruple Extraction 1 Bias Detection 1 Citation Recommendation 1 Code Search 1 Common Sense Reasoning 1 Common Sense Reasoning (Few-Shot) 1 Common Sense Reasoning (One-Shot) 1 Common Sense Reasoning (Zero-Shot) 1 Connective Detection 1 Conversational Sentiment Quadruple Extraction 1 Core Psychological Reasoning 1 Cross-Lingual Abstractive Summarization 1 Cross-Lingual Bitext Mining 1 Cross-Lingual Document Classification 1 Cross-Lingual Entity Linking 1 Cross-Lingual Paraphrase Identification 1 Cross-Lingual Sentiment Classification 1 Cross-lingual zero-shot dependency parsing 1 Curved Text Detection 1 Decipherment 1 Dependency Parsing 1 Dialog Act Classification 1 Dialog Relation Extraction 1 Dialogue Act Classification 1 Dialogue State Tracking 1 Discourse Parsing 1 Discourse Segmentation 1 Document Classification 1 Document Level Machine Translation 1 Document Translation 1 Document-level Event Extraction 1 Emotion Recognition 1 Emotion-Cause Pair Extraction 1 Emotional Dialogue Acts 1 End-To-End Dialogue Modelling 1 Entity Disambiguation 1 Entity Resolution 1 Entity Typing 1 Explanation Generation 1 FG-1-PG-1 1 Few-shot NER 1 Font Generation 1 Generalized Zero-Shot Learning 1 Genre classification 1 Gloss-free Sign Language Translation 1 Grammatical Error Detection 1 Hallucination Evaluation 1 Image Classification 1 Implicit Discourse Relation Classification 1 Instance Segmentation 1 Intent Recognition 1 Keyword Extraction 1 Knowledge Graphs 1 LABELED_DEPENDENCIES 1 LEMMA 1 
Language Identification 1 License Plate Detection 1 Link Prediction 1 Low-Resource Neural Machine Translation 1 MORPH 1 Max-Shot Cross-Lingual Image-to-Text Retrieval 1 Max-Shot Cross-Lingual Text-to-Image Retrieval 1 Max-Shot Cross-Lingual Visual Natural Language Inference 1 Max-Shot Cross-Lingual Visual Question Answering 1 Medical Concept Normalization 1 Medical Relation Extraction 1 Medical Visual Question Answering 1 Meta-Learning 1 Metric Learning 1 Morphological Analysis 1 Multi-Choice MRC 1 Multi-Label Text Classification 1 Multi-modal Dialogue Generation 1 Multi-modal Entity Alignment 1 Multi-task Language Understanding 1 Multilingual Machine Comprehension in English Hindi 1 Multilingual NLP 1 Multilingual Named Entity Recognition 1 Multilingual text classification 1 Multimodal Sentiment Analysis 1 Multiple Instance Learning 1 Music Source Separation 1 NER 1 Natural Language Inference (Few-Shot) 1 Natural Language Inference (One-Shot) 1 Natural Language Inference (Zero-Shot) 1 Natural Questions 1 Neural Architecture Search 1 Node Classification 1 Object Detection 1 Open Information Extraction 1 Open-set video tagging 1 Outlier Detection 1 POS 1 Paraphrase Generation 1 Paraphrase Identification 1 Passage Retrieval 1 Personality Recognition in Conversation 1 Personality Trait Recognition 1 Personalized and Emotional Conversation 1 Phrase Grounding 1 Poll Generation 1 Pretrained Multilingual Language Models 1 Propaganda detection 1 Propaganda technique identification 1 Question Similarity 1 Record linking 1 Retrieval 1 SENTS 1 Self-Supervised Learning 1 Semantic Frame Parsing 1 Semantic Parsing 1 Semantic Role Labeling 1 Semantic Segmentation 1 Semantic Textual Similarity 1 Sentence Classification 1 Sentence Embeddings 1 Sign Language Recognition 1 Sign Language Retrieval 1 Sign Language Translation 1 Span-Extraction MRC 1 Speaker Diarization 1 Speaker Recognition 1 Speech Emotion Recognition 1 Speech Separation 1 Speech-to-Speech Translation 1 Story 
Completion 1 TAG 1 Term Extraction 1 Text Simplification 1 Text Spotting 1 Text Style Transfer 1 Text to Video Retrieval 1 Text-To-SQL 1 Text-to-video search 1 UNLABELED_DEPENDENCIES 1 Unsupervised Domain Adaptation 1 Video Question Answering 1 Video Retrieval 1 Vietnamese Machine Reading Comprehension 1 Vision-Language Navigation 1 Visual Question Answering 1 Weakly-Supervised Named Entity Recognition 1 XLM-R 1 Zero-Shot Cross-Lingual Image-to-Text Retrieval 1 Zero-Shot Cross-Lingual Text-to-Image Retrieval 1 Zero-Shot Cross-Lingual Visual Natural Language Inference 1 Zero-Shot Cross-Lingual Visual Question Answering 1 Zero-Shot Video Retrieval 1 coreference-resolution 1 text annotation 1 text2text-generation 1
Filter by Language
Chinese English 54 German 7 Russian 6 Spanish 6 Malayalam 5 Turkish 5 Vietnamese 5 Czech 4 French 4 Gujarati 4 Hindi 4 Kannada 4 Marathi 4 Punjabi 4 Tamil 4 Telugu 4 Arabic 3 Bengali 3 Dutch 3 Estonian 3 Finnish 3 Indonesian 3 Japanese 3 Korean 3 Multilingual 3 Persian 3 Romanian 3 Sindhi 3 Sinhala 3 Swedish 3 Thai 3 Urdu 3 Afrikaans 2 Albanian 2 Amharic 2 Armenian 2 Assamese 2 Azerbaijani 2 Basque 2 Belarusian 2 Bosnian 2 Breton 2 Bulgarian 2 Burmese 2 Catalan 2 Croatian 2 Danish 2 Esperanto 2 Galician 2 Georgian 2 Greek 2 Guarani 2 Haitian 2 Hebrew 2 Hungarian 2 Icelandic 2 Irish 2 Italian 2 Javanese 2 Kazakh 2 Kurdish 2 Lao 2 Latin 2 Latvian 2 Lithuanian 2 Macedonian 2 Malagasy 2 Mongolian 2 Norwegian 2 Oriya (macrolanguage) 2 Polish 2 Portuguese 2 Quechua 2 Romansh 2 Sanskrit 2 Scottish Gaelic 2 Serbian 2 Slovak 2 Slovenian 2 Somali 2 Sundanese 2 Swahili 2 Tagalog 2 Tatar 2 Ukrainian 2 Uzbek 2 Welsh 2 Yiddish 2 Yoruba 2 Aragonese 1 Asturian 1 Avaric 1 Bashkir 1 Bavarian 1 Bishnupriya 1 Cebuano 1 Central Bikol 1 Central Khmer 1 Central Kurdish 1 Chavacano 1 Chechen 1 Chuvash 1 Cornish 1 Dhivehi 1 Dimli (individual language) 1 Eastern Mari 1 Egyptian Arabic 1 Erzya 1 Filipino 1 Fulah 1 Ganda 1 Goan Konkani 1 Hausa 1 Ido 1 Igbo 1 Iloko 1 Interlingue 1 Kabyle 1 Kalmyk 1 Karachay-Balkar 1 Kirghiz 1 Komi 1 Lezghian 1 Limburgan 1 Lingala 1 Lojban 1 Lombard 1 Low German 1 Lower Sorbian 1 Luxembourgish 1 Maithili 1 Malay (individual language) 1 Maltese 1 Mazanderani 1 Minangkabau 1 Mingrelian 1 Mirandese 1 Modern Greek 1 Neapolitan 1 Newari 1 Northern Frisian 1 Northern Luri 1 Norwegian Nynorsk 1 Occitan (post 1500) 1 Oromo 1 Ossetian 1 Pampanga 1 Piemontese 1 Pushto 1 Russia Buriat 1 Sardinian 1 Serbo-Croatian 1 Sicilian 1 South Azerbaijani 1 Swati 1 Tajik 1 Tibetan 1 Tswana 1 Turkmen 1 Tuvinian 1 Uighur 1 Upper Sorbian 1 Venetian 1 Volapük 1 Walloon 1 Waray (Philippines) 1 Western Frisian 1 Western Mari 1 Western Panjabi 1 Wolof 1 Wu Chinese 1 Xhosa 1 Yakut 1 Yue 
Chinese 1

14 dataset results for Language Modelling AND Texts AND Chinese