Activity Detection
63 papers with code • 1 benchmark • 12 datasets
Detecting activities in extended videos.
Libraries
Use these libraries to find Activity Detection models and implementations.
Most implemented papers
A Convolutional Neural Network Smartphone App for Real-Time Voice Activity Detection
This paper presents a smartphone app that performs real-time voice activity detection based on a convolutional neural network.
Temporal Gaussian Mixture Layer for Videos
We introduce a new convolutional layer named the Temporal Gaussian Mixture (TGM) layer and present how it can be used to efficiently capture longer-term temporal information in continuous activity videos.
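The core idea of a Temporal Gaussian Mixture layer is a 1-D temporal kernel whose weights are not learned freely but generated from a small set of Gaussians mixed per output channel, so a few parameters can cover a long temporal extent. A minimal NumPy sketch of that construction (parameter names are illustrative, not the paper's API):

```python
import numpy as np

def tgm_kernel(length, centers, widths, mix_weights):
    """Build a temporal kernel as a mixture of Gaussians over time steps.

    centers, widths: per-Gaussian parameters (learned in the paper);
    mix_weights: per-output-kernel mixing weights over the Gaussians.
    """
    t = np.arange(length, dtype=float)  # discrete time axis
    # One Gaussian per row: shape (num_gaussians, length).
    gauss = np.exp(-0.5 * ((t[None, :] - centers[:, None]) / widths[:, None]) ** 2)
    gauss /= gauss.sum(axis=1, keepdims=True)     # normalize each Gaussian
    kernel = mix_weights @ gauss                  # mix into output kernels
    return kernel / kernel.sum(axis=1, keepdims=True)  # rows sum to 1

# A length-9 kernel mixing two Gaussians centered at t=2 and t=6.
k = tgm_kernel(
    length=9,
    centers=np.array([2.0, 6.0]),
    widths=np.array([1.0, 1.5]),
    mix_weights=np.array([[0.7, 0.3]]),
)
```

The resulting kernel is convolved with per-frame features along the time axis; because only centers, widths, and mixing weights are learned, the kernel can span hundreds of frames without a matching parameter count.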
S3D: Single Shot multi-Span Detector via Fully 3D Convolutional Networks
In this paper, we present a novel Single Shot multi-Span Detector for temporal activity detection in long, untrimmed videos using a simple end-to-end fully three-dimensional convolutional (Conv3D) network.
Structure-Aware Convolutional Neural Networks
Convolutional neural networks (CNNs) are inherently subject to invariable filters that can only aggregate local inputs with the same topological structures.
The Second DIHARD Diarization Challenge: Dataset, task, and baselines
This paper introduces the second DIHARD challenge, part of a series of speaker diarization challenges intended to improve the robustness of diarization systems to variation in recording equipment, noise conditions, and conversational domain.
Personalized Activity Recognition with Deep Triplet Embeddings
The novel subject triplet loss provides the best performance overall, and all personalized deep embeddings outperform our baseline personalized engineered feature embedding and an impersonal fully convolutional neural network classifier.
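Triplet losses of this kind train an embedding so that samples from the same subject sit closer together than samples from different subjects by at least a margin. A minimal sketch of the standard formulation (the paper's subject-specific variant builds on this; names and the margin value here are illustrative):

```python
import numpy as np

def triplet_loss(anchor, positive, negative, margin=0.2):
    """Standard triplet loss: pull the anchor toward the positive,
    push it away from the negative by at least `margin`."""
    d_pos = np.linalg.norm(anchor - positive)  # same-subject distance
    d_neg = np.linalg.norm(anchor - negative)  # different-subject distance
    return max(d_pos - d_neg + margin, 0.0)

a = np.array([0.0, 0.0])
p = np.array([0.1, 0.0])  # same subject: close in embedding space
n = np.array([1.0, 0.0])  # different subject: far away

loss = triplet_loss(a, p, n)  # margin satisfied -> 0.0
```

When the margin is already satisfied the loss is zero and the triplet contributes no gradient; only violating triplets drive the embedding updates.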
Argus: Efficient Activity Detection System for Extended Video Analysis
We propose Argus, an efficient activity detection system for extended video analysis in surveillance scenarios.
Dual Attention in Time and Frequency Domain for Voice Activity Detection
The results show that the focal loss can improve the performance in various imbalance situations compared to the cross entropy loss, a commonly used loss function in VAD.
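Focal loss addresses class imbalance by scaling the cross-entropy term with a factor that shrinks as the predicted probability of the true class grows, so abundant easy examples (e.g. long non-speech stretches in VAD) contribute little to the gradient. A minimal sketch of the two losses for a single prediction (gamma value is the commonly used default, not necessarily the paper's setting):

```python
import math

def cross_entropy(p_true):
    """Cross entropy for one example: -log of the true-class probability."""
    return -math.log(p_true)

def focal_loss(p_true, gamma=2.0):
    """Focal loss: cross entropy down-weighted by (1 - p_true)^gamma,
    so confident (easy) examples contribute almost nothing."""
    return -((1.0 - p_true) ** gamma) * math.log(p_true)

easy = 0.9  # model already confident on the true class
hard = 0.1  # model badly wrong on the true class

# On the easy example the focal factor (1 - 0.9)^2 = 0.01 suppresses the
# loss by 100x; on the hard example the factor 0.81 leaves it near CE.
```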
audino: A Modern Annotation Tool for Audio and Speech
The tool allows audio data and their corresponding annotations to be uploaded and assigned to a user through a key-based API.
RespVAD: Voice Activity Detection via Video-Extracted Respiration Patterns
The respiration pattern is first extracted from the video using an optical-flow-based method focused on the abdominal-thoracic region of the speaker.