In this work, we introduce a Region Proposal Network (RPN) that shares full-image convolutional features with the detection network, thus enabling nearly cost-free region proposals.
Ranked #2 on Vessel Detection on Vessel detection Dataset
Fast R-CNN builds on previous work to efficiently classify object proposals using deep convolutional networks.
Ranked #23 on Object Detection on PASCAL VOC 2007 (using extra training data)
Both convolutional and recurrent operations are building blocks that process one local neighborhood at a time.
Ranked #8 on Action Classification on Toyota Smarthome dataset (using extra training data)
Most methods for object instance segmentation require all training examples to be labeled with segmentation masks.
In contrast to previous region-based detectors such as Fast/Faster R-CNN that apply a costly per-region subnetwork hundreds of times, our region-based detector is fully convolutional with almost all computation shared on the entire image.
Ranked #4 on Real-Time Object Detection on PASCAL VOC 2007
Feature pyramids are a basic component in recognition systems for detecting objects at different scales.
Ranked #3 on Pedestrian Detection on TJU-Ped-campus
Our hypothesis is that the appearance of a person -- their pose, clothing, action -- is a powerful cue for localizing the objects they are interacting with.
Ranked #53 on Human-Object Interaction Detection on HICO-DET
Our approach efficiently detects objects in an image while simultaneously generating a high-quality segmentation mask for each instance.
Ranked #1 on Keypoint Estimation on GRIT
In this work, we establish dense correspondences between an RGB image and a surface-based representation of the human body, a task we refer to as dense human pose estimation.
Ranked #2 on Pose Estimation on DensePose-COCO
We investigate omni-supervised learning, a special regime of semi-supervised learning in which the learner exploits all available labeled data plus internet-scale sources of unlabeled data.