Activity Detection
63 papers with code • 1 benchmark • 12 datasets
Detecting activities in extended videos.
Latest papers
Online speaker diarization of meetings guided by speech separation
The results show that our system improves on the state of the art on the AMI headset mix, using no oracle information and under full evaluation (no collar and including overlapped speech).
Advanced Image Segmentation Techniques for Neural Activity Detection via C-fos Immediate Early Gene Expression
This research contributes to the development of more efficient and automated image segmentation methods, advancing the understanding of neural function in neuroscience research.
Integrating Emotion Recognition with Speech Recognition and Speaker Diarisation for Conversations
Two metrics are proposed to evaluate automatic emotion recognition (AER) performance with automatic segmentation, based on time-weighted emotion and speaker classification errors.
ivrit.ai: A Comprehensive Dataset of Hebrew Speech for AI Research and Development
We introduce "ivrit.ai", a comprehensive Hebrew speech dataset, addressing the distinct lack of extensive, high-quality resources for advancing Automated Speech Recognition (ASR) technology in Hebrew.
Long-term Conversation Analysis: Exploring Utility and Privacy
The analysis of conversations recorded in everyday life requires privacy protection.
FunASR: A Fundamental End-to-End Speech Recognition Toolkit
FunASR offers models trained on large-scale industrial corpora and the ability to deploy them in applications.
Evaluation of Noise Reduction Methods for Sentence Recognition by Sinhala Speaking Listeners
Noise reduction is a crucial aspect of hearing aids, which researchers have been striving to address over the years.
Token Turing Machines
The model's memory module ensures that a new observation will only be processed with the contents of the memory (and not the entire history), meaning that it can efficiently process long sequences with a bounded computational cost at each step.
Multi-Speaker and Wide-Band Simulated Conversations as Training Data for End-to-End Neural Diarization
End-to-end diarization presents an attractive alternative to standard cascaded diarization systems because a single system can handle all aspects of the task at once.
SG-VAD: Stochastic Gates Based Speech Activity Detection
Our key idea is to model VAD as a denoising task, and construct a network that is designed to identify nuisance features for a speech classification task.