We empirically demonstrate a general and robust grid schedule that yields a significant out-of-the-box training speedup without a loss in accuracy for different models (I3D, non-local, SlowFast), datasets (Kinetics, Something-Something, Charades), and training settings (with and without pre-training, 128 GPUs or 1 GPU).
Ranked #1 on Video Classification on Kinetics
In this paper, we introduce SalsaNext for the uncertainty-aware semantic segmentation of a full 3D LiDAR point cloud in real time.
Ranked #1 on 3D Semantic Segmentation on SemanticKITTI
Document classification is a challenging task with important applications.
Ranked #1 on Text Classification on Yelp-5
We confirm that most existing video datasets are statistically biased, capturing action videos from only a limited number of countries.
Ranked #1 on Action Detection on Charades (using extra training data)
While the multi-branch architecture is one of the key ingredients in the success of computer vision tasks, it has not been well investigated in natural language processing, especially in sequence learning tasks.
Ranked #2 on Machine Translation on IWSLT2014 German-English