Object detection is the task of detecting instances of objects of a certain class within an image. State-of-the-art methods fall into two main types: one-stage methods and two-stage methods. One-stage methods prioritize inference speed; example models include YOLO, SSD and RetinaNet. Two-stage methods prioritize detection accuracy; example models include Faster R-CNN, Mask R-CNN and Cascade R-CNN.
The most popular benchmark is the MS COCO dataset. Models are typically evaluated using the mean Average Precision (mAP) metric.
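To make the metric concrete, here is a minimal sketch of Average Precision for a single class at one IoU threshold, plus the mean over classes. It is a simplified illustration, not the official COCO evaluation: the function names, the greedy matching rule and the all-point interpolation are assumptions made for this example, and COCO additionally averages over IoU thresholds from 0.50 to 0.95.

```python
from typing import Dict, List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2), corners of the box
Detection = Tuple[float, Box]            # (confidence score, predicted box)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def average_precision(detections: List[Detection], ground_truth: List[Box],
                      iou_threshold: float = 0.5) -> float:
    """Area under the interpolated precision-recall curve for one class."""
    if not ground_truth:
        return 0.0
    matched = [False] * len(ground_truth)
    hits = []  # one flag per detection, in descending score order
    for _, box in sorted(detections, key=lambda d: d[0], reverse=True):
        # Assumed matching rule: take the unmatched ground-truth box with highest IoU.
        best_iou, best_idx = 0.0, -1
        for idx, gt in enumerate(ground_truth):
            overlap = iou(box, gt)
            if not matched[idx] and overlap > best_iou:
                best_iou, best_idx = overlap, idx
        is_hit = best_idx >= 0 and best_iou >= iou_threshold
        if is_hit:
            matched[best_idx] = True
        hits.append(is_hit)

    precisions, recalls, true_positives = [], [], 0
    for rank, is_hit in enumerate(hits, start=1):
        true_positives += int(is_hit)
        precisions.append(true_positives / rank)
        recalls.append(true_positives / len(ground_truth))

    # Make precision non-increasing in recall (the usual interpolation step).
    for i in range(len(precisions) - 2, -1, -1):
        precisions[i] = max(precisions[i], precisions[i + 1])

    ap, previous_recall = 0.0, 0.0
    for precision, recall in zip(precisions, recalls):
        ap += (recall - previous_recall) * precision
        previous_recall = recall
    return ap


def mean_average_precision(per_class: Dict[str, Tuple[List[Detection], List[Box]]],
                           iou_threshold: float = 0.5) -> float:
    """Mean of the per-class AP values."""
    if not per_class:
        return 0.0
    scores = [average_precision(dets, gts, iou_threshold) for dets, gts in per_class.values()]
    return sum(scores) / len(scores)
```

A false positive ranked above a true positive lowers AP, which is why the metric rewards well-calibrated confidence scores as well as accurate boxes.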
(Image credit: Detectron)
With the recent progress of deep learning, advanced industrial object detectors are built for smart industrial applications.
Ranked #16 on Weakly Supervised Object Detection on PASCAL VOC 2007
Our radar object proposal network uses radar point clouds to generate 3D proposals from a set of 3D prior boxes.
In each layer of the GNN, in addition to the linear transformation that maps each node's input features to the corresponding higher-level features, a per-node masked attention is performed, which assigns different weights to the different nodes in that node's first-ring neighborhood.
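The layer described above can be sketched roughly as follows. This is a minimal illustration in the style of a graph attention layer, not the authors' implementation: the scoring function (a LeakyReLU over a learned vector applied to the concatenated pair of projected features), the self-loop, and all names (`masked_attention_layer`, `weight`, `attn`) are assumptions made for this example.

```python
import math
from typing import Dict, List

Vector = List[float]
Matrix = List[List[float]]


def linear(weight: Matrix, x: Vector) -> Vector:
    """Apply the shared linear map to one node's features."""
    return [sum(w * v for w, v in zip(row, x)) for row in weight]


def leaky_relu(x: float, slope: float = 0.2) -> float:
    return x if x > 0 else slope * x


def masked_attention_layer(features: List[Vector], neighbors: Dict[int, List[int]],
                           weight: Matrix, attn: Vector) -> List[Vector]:
    """One layer: project every node, then attend over its first-ring neighborhood only."""
    projected = [linear(weight, h) for h in features]
    outputs = []
    for i, h_i in enumerate(projected):
        # The mask: only node i itself (assumed self-loop) and its direct neighbors take part.
        ring = [i] + [j for j in neighbors.get(i, []) if j != i]
        # Assumed scoring rule: attn . [h_i || h_j], passed through LeakyReLU.
        scores = [leaky_relu(sum(a * v for a, v in zip(attn, h_i + projected[j])))
                  for j in ring]
        # Softmax over the ring, shifted by the maximum for numerical stability.
        top = max(scores)
        exps = [math.exp(s - top) for s in scores]
        total = sum(exps)
        alphas = [e / total for e in exps]
        # Attention-weighted sum of the neighbors' projected features.
        outputs.append([sum(alpha * projected[j][k] for alpha, j in zip(alphas, ring))
                        for k in range(len(h_i))])
    return outputs
```

Because the softmax runs only over each node's ring, nodes outside that ring contribute nothing to its output, which is what makes the attention "masked".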
Our POMP method takes as input the current pose of an agent (e.g., a robot) and an RGB-D frame.
Domain adaptation for object detection tries to adapt the detector from labeled datasets to unlabeled ones for better performance.
Finally, we reconstruct the feature extractor to ensure that our model can obtain richer and more robust features.
Video object detection is a tough task due to the deteriorated quality of video sequences captured under complex environments.
Advances in deep learning have enabled the development of models that show a remarkable ability to recognize and even localize actions in videos.
It dynamically adjusts the cropping size to balance the cover proportion between objects and chips, which narrows object scale variation in training and improves performance without bells and whistles. In addition, we introduce mosaic to effectively solve the object sparsity and background similarity problems in aerial datasets. To balance categories, we present mask resampling in chips, which provides higher-quality training samples. Our model achieves state-of-the-art performance on two popular aerial image datasets, VisDrone and UAVDT.