Datasets

4,207 machine learning datasets
Filter by Language
Multilingual English 849 Chinese 123 German 86 French 63 Spanish 57 Japanese 49 Arabic 44 Russian 43 Italian 40 Portuguese 34 Korean 31 Turkish 30 Vietnamese 25 Hindi 24 Dutch 23 Finnish 23 Czech 20 Persian 19 Polish 18 Romanian 18 Tamil 17 Telugu 16 Thai 16 Indonesian 15 Urdu 15 Estonian 13 Malayalam 13 Swedish 13 Basque 12 Bengali 12 Bulgarian 11 Danish 11 Hungarian 11 Catalan 10 Hebrew 10 Kannada 10 Mandarin Chinese 10 Marathi 10 Greek 9 Norwegian 9 Ukrainian 9 Armenian 8 Breton 8 Gujarati 8 Kazakh 8 Slovak 8 Slovenian 8 Albanian 7 Amharic 7 Assamese 7 Croatian 7 Latvian 7 Lithuanian 7 Punjabi 7 Serbian 7 Swahili 7 Welsh 7 Esperanto 6 Georgian 6 Kurdish 6 Macedonian 6 Mongolian 6 Sinhala 6 Afrikaans 5 Galician 5 Icelandic 5 Irish 5 Maltese 5 Oriya (macrolanguage) 5 Sanskrit 5 Tagalog 5 Yoruba 5 Belarusian 4 Bosnian 4 Chechen 4 Haitian 4 Igbo 4 Latin 4 Malagasy 4 Scottish Gaelic 4 Serbo-Croatian 4 Sindhi 4 Standard Arabic 4 Tatar 4 Wolof 4 Aragonese 3 Azerbaijani 3 Bavarian 3 Bishnupriya 3 Burmese 3 Central Khmer 3 Chuvash 3 Dhivehi 3 Egyptian Arabic 3 Erzya 3 Filipino 3 Guarani 3 Hausa 3 Javanese 3 Kinyarwanda 3 Lao 3 Malay (individual language) 3 Norwegian Nynorsk 3 Quechua 3 Romansh 3 Russia Buriat 3 Somali 3 South Azerbaijani 3 Sundanese 3 Swiss German 3 Uighur 3 Upper Sorbian 3 Uzbek 3 Yiddish 3 American Sign Language 2 Asturian 2 Avaric 2 Bambara 2 Bashkir 2 Cebuano 2 Central Bikol 2 Central Kurdish 2 Cherokee 2 Church Slavic 2 Cornish 2 Dimli (individual language) 2 Eastern Mari 2 Faroese 2 Fon 2 Fulah 2 Ganda 2 Goan Konkani 2 Gothic 2 Ido 2 Iloko 2 Interlingue 2 Iranian Persian 2 Jejueo 2 Kabyle 2 Kalmyk 2 Karachay-Balkar 2 Kirghiz 2 Komi 2 Komi-Permyak 2 Lezghian 2 Limburgan 2 Lingala 2 Livvi 2 Lojban 2 Lombard 2 Low German 2 Lower Sorbian 2 Luxembourgish 2 Maithili 2 Manx 2 Mazanderani 2 Minangkabau 2 Mingrelian 2 Mirandese 2 Modern Greek 2 Moksha 2 Neapolitan 2 Nepali (macrolanguage) 2 Newari 2 Nigerian Pidgin 2 Northern Frisian 2 Northern Luri 2 
Northern Sami 2 Occitan (post 1500) 2 Oromo 2 Ossetian 2 Pampanga 2 Piemontese 2 Pushto 2 Sardinian 2 Sicilian 2 Swati 2 Tajik 2 Tibetan 2 Tswana 2 Turkish Sign Language 2 Turkmen 2 Tuvinian 2 Venetian 2 Volapük 2 Walloon 2 Waray (Philippines) 2 Western Frisian 2 Western Mari 2 Western Panjabi 2 Wu Chinese 2 Xhosa 2 Yakut 2 Yue Chinese 2 Abkhazian 1 Achinese 1 Adyghe 1 Afar 1 Akan 1 Akkadian 1 Akuntsu 1 Ancient Greek 1 Apurinã 1 Arpitan 1 Assyrian Neo-Aramaic 1 Aymara 1 Bangladeshi Sign Language 1 Banjar 1 Bhojpuri 1 Bislama 1 Buginese 1 Chamorro 1 Chavacano 1 Cheyenne 1 Choctaw 1 Chukot 1 Coptic 1 Corsican 1 Cree 1 Creek 1 Crimean Tatar 1 Dzongkha 1 Ewe 1 Extremaduran 1 Fiji Hindi 1 Fijian 1 Friulian 1 Gagauz 1 Gan Chinese 1 Geez 1 German Sign Language 1 Gilaki 1 Greek Sign Language 1 Gulf Arabic 1 Hakha Chin 1 Hakka Chinese 1 Hawaiian 1 Herero 1 Hiri Motu 1 Interlingua (International Auxiliary Language Association) 1 Inuktitut 1 Inupiaq 1 Jamaican Creole English 1 Kabardian 1 Kalaallisut 1 Kanuri 1 Kara-Kalpak 1 Karelian 1 Kashmiri 1 Kashubian 1 Khunsari 1 Kikuyu 1 Komi-Zyrian 1 Kongo 1 Kuanyama 1 Kölsch 1 Ladino 1 Lak 1 Latgalian 1 Ligurian 1 Literary Chinese 1 Luo (Cameroon) 1 Luo (Kenya and Tanzania) 1 Malay (macrolanguage) 1 Maori 1 Marshallese 1 Mbyá Guaraní 1 Min Dong Chinese 1 Modern Greek (1453-) 1 Moroccan Arabic 1 Mundurukú 1 Narom 1 Nauru 1 Navajo 1 Nayini 1 Ndonga 1 Nepali (individual language) 1 Northern Kurdish 1 Norwegian Bokmål 1 Novial 1 Nyanja 1 Odia 1 Official Aramaic (700-300 BCE) 1
Old English (ca. 450-1100) 1 Old French 1 Old Russian 1 Old Turkish 1 Pali 1 Pangasinan 1 Papiamento 1 Pedi 1 Pennsylvania German 1 Pfaelzisch 1 Picard 1 Pitcairn-Norfolk 1 Pontic 1 Portuguse 1 Rundi 1 Rusyn 1 Samoan 1 Sango 1 Saterfriesisch 1 Scots 1 Shona 1 Sichuan Yi 1 Silesian 1 Skolt Sami 1 Soi 1 South Levantine Arabic 1 Southern Sotho 1 Sranan Tongo 1 Swahili (macrolanguage) 1 Swedish Sign Language 1 Swiss-German Sign Language 1 Tahitian 1 Tetum 1 Tigrinya 1 Tok Pisin 1 Tonga (Tonga Islands) 1 Tosk Albanian 1 Tsonga 1 Tulu 1 Tumbuka 1 Tunisian Arabic 1 Tupinambá 1 Twi 1 Udmurt 1 Venda 1 Veps 1 Vlaams 1 Vlax Romani 1 Votic 1 Warlpiri 1 Zeeuws 1 Zhuang 1 Zulu 1 Santali 0

16 dataset results for Multilingual