Scene Understanding

189 papers with code

Boundary-Seeking Generative Adversarial Networks

We introduce a method for training GANs with discrete data that uses the estimated difference measure from the discriminator to compute importance weights for generated samples, thus providing a policy gradient for training the generator.
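
The importance-weighting idea in this excerpt can be illustrated with a minimal sketch. It assumes the discriminator outputs a logit with D(x) = sigmoid(logit), so the density-ratio estimate D/(1 - D) equals exp(logit), and that weights are self-normalized over the samples drawn for one latent code. The function names `importance_weights` and `surrogate_loss` are hypothetical and are not taken from the paper's code.

```python
import math

def importance_weights(logits):
    """Self-normalized importance weights from discriminator logits.

    Assumption: D(x) = sigmoid(logit), so D / (1 - D) = exp(logit).
    Subtracting the max logit keeps the exponentials numerically stable
    and does not change the normalized result.
    """
    m = max(logits)
    exps = [math.exp(l - m) for l in logits]
    total = sum(exps)
    return [e / total for e in exps]

def surrogate_loss(logits, log_probs):
    """Policy-gradient style generator loss for discrete samples.

    log_probs[i] is log g(x_i | z), the generator's log-probability of
    sample i. The weights are treated as constants, so differentiating
    this value with respect to the generator parameters would give a
    weighted sum of score-function gradients.
    """
    weights = importance_weights(logits)
    return -sum(w * lp for w, lp in zip(weights, log_probs))
```

Samples that the discriminator scores as more realistic receive larger weights, so the generator is pushed to raise their probability. In an autodiff framework the weights would be detached from the graph before computing the loss.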

Unified Perceptual Parsing for Scene Understanding

In this paper, we study a new task called Unified Perceptual Parsing, which requires machine vision systems to recognize as many visual concepts as possible from a given image.

LinkNet: Exploiting Encoder Representations for Efficient Semantic Segmentation

As a result, existing segmentation networks are huge in terms of parameters and number of operations, and hence slow.

COCO-Text: Dataset and Benchmark for Text Detection and Recognition in Natural Images

The goal of COCO-Text is to advance the state of the art in text detection and recognition in natural images.

ERFNet: Efficient Residual Factorized ConvNet for Real-time Semantic Segmentation

A comprehensive set of experiments on the publicly available Cityscapes dataset demonstrates that our system achieves an accuracy that is similar to the state of the art, while being orders of magnitude faster to compute than other architectures that achieve top precision.

Dilated Residual Networks

Convolutional networks for image classification progressively reduce resolution until the image is represented by tiny feature maps in which the spatial structure of the scene is no longer discernible.
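
The resolution loss described here, and the way dilation avoids it, can be checked with the standard convolution output-size formula. The sketch below is illustrative arithmetic only: the 56-pixel input and the specific kernel, stride, padding, and dilation values are assumptions chosen for the example, not the configuration of any particular network.

```python
def conv_out_size(n, kernel=3, stride=1, padding=1, dilation=1):
    """Spatial output size of a convolution along one dimension."""
    return (n + 2 * padding - dilation * (kernel - 1) - 1) // stride + 1

def stack_out_size(n, layers):
    """Apply a sequence of convolution settings to an input of size n."""
    for cfg in layers:
        n = conv_out_size(n, **cfg)
    return n

# Two stride-2 layers shrink the feature map by a factor of 4.
strided = [
    {"stride": 2, "padding": 1, "dilation": 1},
    {"stride": 2, "padding": 1, "dilation": 1},
]

# Replacing the strides with dilations 2 and 4 (padding set to match)
# keeps the feature map at full size.
dilated = [
    {"stride": 1, "padding": 2, "dilation": 2},
    {"stride": 1, "padding": 4, "dilation": 4},
]

print(stack_out_size(56, strided))  # 14
print(stack_out_size(56, dilated))  # 56
```

The dilated stack keeps the 56-pixel map while each kernel still spans a wider input region, which is the trade a dilated network makes: more computation on larger feature maps in exchange for preserved spatial structure.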

SegNet: A Deep Convolutional Encoder-Decoder Architecture for Image Segmentation

We show that SegNet provides good performance with competitive inference time and more memory-efficient inference than other architectures.

From Points to Parts: 3D Object Detection from Point Cloud with Part-aware and Part-aggregation Network

3D object detection from LiDAR point clouds is a challenging problem in 3D scene understanding and has many practical applications.

Tensor Comprehensions: Framework-Agnostic High-Performance Machine Learning Abstractions

Deep learning models with convolutional and recurrent networks are now ubiquitous and analyze massive amounts of audio, image, video, text and graph data, with applications in automatic translation, speech-to-text, scene understanding, ranking user preferences, ad placement, etc.

