ME R-CNN: Multi-Expert R-CNN for Object Detection

4 Apr 2017 · Hyungtae Lee, Sungmin Eum, Heesung Kwon

We introduce the Multi-Expert Region-based Convolutional Neural Network (ME R-CNN), which is equipped with multiple experts (ME), each trained to process a certain type of region of interest (RoI). This architecture better captures the appearance variations of RoIs caused by different shapes, poses, and viewing angles. To direct each RoI to the appropriate expert, we devise a novel "learnable" network, which we call the expert assignment network (EAN). EAN automatically learns the optimal RoI-expert relationship without any supervision of expert assignment. Because the major components of ME R-CNN, ME and EAN, affect each other while tied to a shared network, neither an alternating nor a naive end-to-end optimization is likely to succeed. To address this problem, we introduce a practical training strategy tailored to optimize ME, EAN, and the shared network in an end-to-end fashion. We show that both of the architectures provide a considerable performance increase over the baselines on the PASCAL VOC 07, PASCAL VOC 12, and MS COCO datasets.
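
As a rough illustration of the routing idea described in the abstract, the sketch below sends each RoI feature vector to one of several experts chosen by a small scoring network. It is a minimal sketch under stated assumptions, not the authors' implementation: the abstract does not specify how EAN selects an expert, so the hard argmax choice, the linear scoring layer, the linear experts, and all names (`assign_expert`, `route_rois`, `ean_w`, `expert_w`) are hypothetical, and no training is shown.

```python
import random

def linear(weights, x):
    """Apply a weight matrix (a list of rows) to a feature vector."""
    return [sum(w * v for w, v in zip(row, x)) for row in weights]

def assign_expert(ean_weights, roi_feature):
    """Hypothetical stand-in for EAN: score every expert, return the best index.

    Assumption: hard argmax assignment over linear scores.
    """
    scores = linear(ean_weights, roi_feature)
    return max(range(len(scores)), key=scores.__getitem__)

def route_rois(roi_features, ean_weights, expert_weights):
    """Send each RoI feature to exactly one expert and collect its output."""
    outputs, assignments = [], []
    for feat in roi_features:
        k = assign_expert(ean_weights, feat)
        assignments.append(k)
        # Each expert is modeled here as a single linear layer (assumption).
        outputs.append(linear(expert_weights[k], feat))
    return outputs, assignments

# Toy setup with random weights; sizes are arbitrary illustration values.
rng = random.Random(0)
dim, num_experts, num_classes = 4, 3, 2
ean_w = [[rng.uniform(-1, 1) for _ in range(dim)] for _ in range(num_experts)]
expert_w = [
    [[rng.uniform(-1, 1) for _ in range(dim)] for _ in range(num_classes)]
    for _ in range(num_experts)
]
rois = [[rng.uniform(-1, 1) for _ in range(dim)] for _ in range(5)]

outputs, assignments = route_rois(rois, ean_w, expert_w)
```

In this toy version the shared network is assumed to have already produced the RoI features, so only the assignment and expert stages appear. The coupling the abstract mentions is visible even here: the expert outputs depend on which index `assign_expert` returns, and a trained assignment network would in turn depend on how well each expert handles the RoIs it receives.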
