Datasets

9,499 machine learning datasets

16 dataset results for Action Recognition In Videos AND Videos