Datasets

9,501 machine learning datasets

112 dataset results for Texts AND French