The former networks are able to encode multi-scale contextual information by probing the incoming features with filters or pooling operations at multiple rates and multiple effective fields-of-view, while the latter networks can capture sharper object boundaries by gradually recovering the spatial information.
#2 best model for Semantic Segmentation on PASCAL VOC 2012 test
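The first entry above mentions probing features with filters at multiple rates and multiple effective fields-of-view. A minimal sketch of that idea in one dimension follows, in plain Python. The function names, the zero padding, and the example rates are illustrative assumptions and are not taken from the paper or its code.

```python
def effective_field_of_view(kernel_size, rate):
    """Span of input positions covered by a kernel whose taps are `rate` apart."""
    return (kernel_size - 1) * rate + 1


def atrous_conv1d(signal, kernel, rate):
    """Zero-padded 1D atrous (dilated) convolution; output has the input's length."""
    k = len(kernel)
    pad = effective_field_of_view(k, rate) // 2
    padded = [0.0] * pad + list(signal) + [0.0] * pad
    out = []
    for i in range(len(signal)):
        # Kernel taps are spaced `rate` samples apart instead of adjacent.
        out.append(sum(kernel[j] * padded[i + j * rate] for j in range(k)))
    return out


def multi_rate_features(signal, kernel, rates=(1, 2, 4)):
    """Apply the same kernel at several rates (hypothetical rate choices)."""
    return {rate: atrous_conv1d(signal, kernel, rate) for rate in rates}


impulse = [0, 0, 0, 1, 0, 0, 0]
features = multi_rate_features(impulse, [1, 1, 1], rates=(1, 2))
```

With the same three-tap kernel, rate 1 covers 3 input positions and rate 2 covers 5, so a larger rate widens the context seen by each output without adding parameters.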
We propose a new method for learning the structure of convolutional neural networks (CNNs) that is more efficient than recent state-of-the-art methods based on reinforcement learning and evolutionary algorithms.
#34 best model for Image Classification on ImageNet
Photorealistic image stylization concerns transferring the style of a reference photo to a content photo with the constraint that the stylized photo should remain photorealistic.
In many domains of computer vision, generative adversarial networks (GANs) have achieved great success, among which the family of Wasserstein GANs (WGANs) is considered to be state-of-the-art due to the theoretical contributions and competitive qualitative performance.
To translate an image to another domain, we recombine its content code with a random style code sampled from the style space of the target domain.
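The recombination step described above can be sketched with toy stand-ins. Everything here is an assumption made for illustration: the standard Gaussian used as the style prior, the two-dimensional style code, and the hypothetical `toy_encode_content` and `toy_decode_target` functions, which replace the learned encoder and decoder networks.

```python
import random


def sample_style_code(dim, rng):
    """Draw a random style code; a standard Gaussian prior is assumed here."""
    return [rng.gauss(0.0, 1.0) for _ in range(dim)]


def toy_encode_content(image):
    """Hypothetical content encoder: keep structure by removing the mean."""
    mean = sum(image) / len(image)
    return [pixel - mean for pixel in image]


def toy_decode_target(content, style):
    """Hypothetical target-domain decoder: style sets contrast and brightness."""
    scale = 1.0 + 0.1 * style[0]
    shift = style[1]
    return [c * scale + shift for c in content]


def translate(image, encode_content, decode_target, style_dim, rng):
    """Recombine the image's content code with a freshly sampled style code."""
    content = encode_content(image)
    style = sample_style_code(style_dim, rng)
    return decode_target(content, style)


rng = random.Random(0)
source_image = [1.0, 2.0, 3.0]
first = translate(source_image, toy_encode_content, toy_decode_target, 2, rng)
second = translate(source_image, toy_encode_content, toy_decode_target, 2, rng)
```

Because the style code is resampled on every call, `first` and `second` share the same content code but differ in appearance, which is how one input can yield multiple outputs.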
Datasets, Transforms and Models specific to Computer Vision
#121 best model for Image Classification on ImageNet
Temporal action proposal generation is an important yet challenging problem, since temporal proposals with rich action content are indispensable for analysing real-world videos with long duration and a high proportion of irrelevant content.
We propose a straightforward method that simultaneously reconstructs the 3D facial structure and provides dense alignment.
SOTA for Face Alignment on AFLW-LFPA
In this paper, we study a new task called Unified Perceptual Parsing, which requires machine vision systems to recognize as many visual concepts as possible from a given image.
#21 best model for Semantic Segmentation on ADE20K val