Search Results for author: David Ross

Found 8 papers, 4 papers with code

CAPE: CAM as a Probabilistic Ensemble for Enhanced DNN Interpretation

1 code implementation • 3 Apr 2024 • Townim Faisal Chowdhury, Kewen Liao, Vu Minh Hieu Phan, Minh-Son To, Yutong Xie, Kevin Hung, David Ross, Anton Van Den Hengel, Johan W. Verjans, Zhibin Liao

Deep Neural Networks (DNNs) are widely used for visual classification tasks, but their complex computation process and black-box nature hinder decision transparency and interpretability.

Decision Making

UnLoc: A Unified Framework for Video Localization Tasks

1 code implementation • ICCV 2023 • Shen Yan, Xuehan Xiong, Arsha Nagrani, Anurag Arnab, Zhonghao Wang, Weina Ge, David Ross, Cordelia Schmid

While large-scale image-text pretrained models such as CLIP have been used for multiple video-level tasks on trimmed videos, their use for temporal localization in untrimmed videos is still a relatively unexplored task.

Ranked #1 on Action Segmentation on COIN

Action Segmentation, Moment Retrieval, +5

im2nerf: Image to Neural Radiance Field in the Wild

no code implementations • 8 Sep 2022 • Lu Mi, Abhijit Kundu, David Ross, Frank Dellaert, Noah Snavely, Alireza Fathi

We take a step towards addressing this shortcoming by introducing a model that encodes the input image into a disentangled object representation that contains a code for object shape, a code for object appearance, and an estimated camera pose from which the object image is captured.

Novel View Synthesis, Object

Optical Mouse: 3D Mouse Pose From Single-View Video

no code implementations • 17 Jun 2021 • Bo Hu, Bryan Seybold, Shan Yang, David Ross, Avneesh Sud, Graham Ruby, Yi Liu

We present a method to infer the 3D pose of mice, including the limbs and feet, from monocular videos.

Virtual Multi-view Fusion for 3D Semantic Segmentation

1 code implementation • ECCV 2020 • Abhijit Kundu, Xiaoqi Yin, Alireza Fathi, David Ross, Brian Brewington, Thomas Funkhouser, Caroline Pantofaru

Features from multiple per-view predictions are finally fused on 3D mesh vertices to predict mesh semantic segmentation labels.

Ranked #12 on Semantic Segmentation on ScanNet

2D Semantic Segmentation, 3D Semantic Segmentation, +2

Pillar-based Object Detection for Autonomous Driving

1 code implementation • ECCV 2020 • Yue Wang, Alireza Fathi, Abhijit Kundu, David Ross, Caroline Pantofaru, Thomas Funkhouser, Justin Solomon

We present a simple and flexible object detection framework optimized for autonomous driving.

3D Object Detection, Autonomous Driving, +2

DOPS: Learning to Detect 3D Objects and Predict their 3D Shapes

no code implementations • CVPR 2020 • Mahyar Najibi, Guangda Lai, Abhijit Kundu, Zhichao Lu, Vivek Rathod, Thomas Funkhouser, Caroline Pantofaru, David Ross, Larry S. Davis, Alireza Fathi

In contrast, we propose a general-purpose method that works on both indoor and outdoor scenes.

3D Object Detection, Autonomous Driving, +2

Speech2Action: Cross-modal Supervision for Action Recognition

no code implementations • CVPR 2020 • Arsha Nagrani, Chen Sun, David Ross, Rahul Sukthankar, Cordelia Schmid, Andrew Zisserman

We train a BERT-based Speech2Action classifier on over a thousand movie screenplays, to predict action labels from transcribed speech segments.

Action Recognition
