no code implementations • 20 Mar 2024 • Ruozhen He, Paola Cascante-Bonilla, Ziyan Yang, Alexander C. Berg, Vicente Ordonez
We introduce SynGround, a novel framework that combines data-driven learning and knowledge transfer from various large-scale pretrained models to enhance the visual grounding capabilities of a pretrained vision-and-language model.
no code implementations • 7 Dec 2023 • Ruozhen He, Paola Cascante-Bonilla, Ziyan Yang, Alexander C. Berg, Vicente Ordonez
Vision-and-language models trained to match images with text can be combined with visual explanation methods to point to the locations of specific objects in an image.
no code implementations • 31 Oct 2023 • Mykhailo Shvets, Dongxu Zhao, Marc Niethammer, Roni Sengupta, Alexander C. Berg
Multi-task approaches to joint depth and segmentation prediction are well-studied for monocular images.
18 code implementations • ICCV 2023 • Alexander Kirillov, Eric Mintun, Nikhila Ravi, Hanzi Mao, Chloe Rolland, Laura Gustafson, Tete Xiao, Spencer Whitehead, Alexander C. Berg, Wan-Yen Lo, Piotr Dollár, Ross Girshick
We introduce the Segment Anything (SA) project: a new task, model, and dataset for image segmentation.
Ranked #2 on Zero-Shot Instance Segmentation on LVIS v1.0 val
1 code implementation • CVPR 2022 • Yutong Bai, Xinlei Chen, Alexander Kirillov, Alan Yuille, Alexander C. Berg
In this work we present point-level region contrast, a self-supervised pre-training approach for the task of object detection.
no code implementations • NeurIPS 2021 • Aldo Pacchiano, Shaun Singh, Edward Chou, Alexander C. Berg, Jakob Foerster
The lender only observes whether a customer will repay a loan if the loan is issued to begin with, and thus modeled decisions affect what data is available to the lender for future decisions.
2 code implementations • CVPR 2021 • Bowen Cheng, Ross Girshick, Piotr Dollár, Alexander C. Berg, Alexander Kirillov
We perform an extensive analysis across different error types and object sizes and show that Boundary IoU is significantly more sensitive than the standard Mask IoU measure to boundary errors for large objects and does not over-penalize errors on smaller objects.
1 code implementation • ICCV 2021 • Ronghang Hu, Nikhila Ravi, Alexander C. Berg, Deepak Pathak
We present Worldsheet, a method for novel view synthesis using just a single RGB image as input.
1 code implementation • 30 Jun 2020 • Cody Coleman, Edward Chou, Julian Katz-Samuels, Sean Culatana, Peter Bailis, Alexander C. Berg, Robert Nowak, Roshan Sumbaly, Matei Zaharia, I. Zeki Yalniz
Many active learning and search approaches are intractable for large-scale industrial settings with billions of unlabeled examples.
no code implementations • 9 Aug 2019 • Phil Ammirato, Alexander C. Berg
The Probabilistic Object Detection Challenge evaluates object detection methods using a new evaluation measure, Probability-based Detection Quality (PDQ), on a new synthetic image dataset.
no code implementations • ICCV 2019 • Cheng-Yang Fu, Tamara L. Berg, Alexander C. Berg
In addition, the instance mask projection operator works well on other (non-clothing) datasets, providing an improvement of 3 points in mIOU on Thing classes of Cityscapes, a self-driving dataset, on top of a state-of-the-art approach.
no code implementations • 15 Apr 2019 • Sergei Alyamkin, Matthew Ardi, Alexander C. Berg, Achille Brighton, Bo Chen, Yiran Chen, Hsin-Pai Cheng, Zichen Fan, Chen Feng, Bo Fu, Kent Gauen, Abhinav Goel, Alexander Goncharenko, Xuyang Guo, Soonhoi Ha, Andrew Howard, Xiao Hu, Yuanjun Huang, Donghyun Kang, Jaeyoun Kim, Jong Gook Ko, Alexander Kondratyev, Junhyeok Lee, Seungjae Lee, Suwoong Lee, Zichao Li, Zhiyu Liang, Juzheng Liu, Xin Liu, Yang Lu, Yung-Hsiang Lu, Deeptanshu Malik, Hong Hanh Nguyen, Eunbyung Park, Denis Repin, Liang Shen, Tao Sheng, Fei Sun, David Svitov, George K. Thiruvathukal, Baiwu Zhang, Jingchi Zhang, Xiaopeng Zhang, Shaojie Zhuo
In addition to mobile phones, many autonomous systems rely on visual data for making decisions, and some of these systems, such as unmanned aerial vehicles (also called drones) and mobile robots, have limited energy.
no code implementations • 12 Mar 2019 • Chen Feng, Tao Sheng, Zhiyu Liang, Shaojie Zhuo, Xiaopeng Zhang, Liang Shen, Matthew Ardi, Alexander C. Berg, Yiran Chen, Bo Chen, Kent Gauen, Yung-Hsiang Lu
The IEEE Low-Power Image Recognition Challenge (LPIRC) is an annual competition started in 2015 that encourages joint hardware and software solutions for computer vision systems with low latency and power.
52 code implementations • 10 Jan 2019 • Cheng-Yang Fu, Mykhailo Shvets, Alexander C. Berg
COCO test-dev results are up to 41.4 mAP for RetinaMask-101 vs 39.1 mAP for RetinaNet-101, while the runtime is the same during evaluation.
Ranked #154 on Object Detection on COCO minival
no code implementations • 3 Oct 2018 • Sergei Alyamkin, Matthew Ardi, Achille Brighton, Alexander C. Berg, Yiran Chen, Hsin-Pai Cheng, Bo Chen, Zichen Fan, Chen Feng, Bo Fu, Kent Gauen, Jongkook Go, Alexander Goncharenko, Xuyang Guo, Hong Hanh Nguyen, Andrew Howard, Yuanjun Huang, Donghyun Kang, Jaeyoun Kim, Alexander Kondratyev, Seungjae Lee, Suwoong Lee, Junhyeok Lee, Zhiyu Liang, Xin Liu, Juzheng Liu, Zichao Li, Yang Lu, Yung-Hsiang Lu, Deeptanshu Malik, Eunbyung Park, Denis Repin, Tao Sheng, Liang Shen, Fei Sun, David Svitov, George K. Thiruvathukal, Baiwu Zhang, Jingchi Zhang, Xiaopeng Zhang, Shaojie Zhuo
The Low-Power Image Recognition Challenge (LPIRC, https://rebootingcomputing.ieee.org/lpirc) is an annual competition started in 2015.
1 code implementation • 13 Mar 2018 • Phil Ammirato, Cheng-Yang Fu, Mykhailo Shvets, Jana Kosecka, Alexander C. Berg
While state-of-the-art general object detectors are getting better and better, there are not many systems specifically designed to take advantage of the instance detection problem.
no code implementations • ECCV 2018 • Eunbyung Park, Alexander C. Berg
The meta learning is driven by the goal of deep networks that can quickly be adapted to robustly model a particular target in future frames.
no code implementations • EMNLP 2017 • Cheng-Yang Fu, Joon Lee, Mohit Bansal, Alexander C. Berg
Sports channel video portals offer an exciting domain for research on multimodal, multilingual analysis.
2 code implementations • CVPR 2017 • Eunbyung Park, Jimei Yang, Ersin Yumer, Duygu Ceylan, Alexander C. Berg
Instead of taking a 'blank slate' approach, we first explicitly infer the parts of the geometry visible both in the input and novel views and then re-cast the remaining synthesis problem as image completion.
no code implementations • 27 Feb 2017 • Phil Ammirato, Patrick Poirson, Eunbyung Park, Jana Kosecka, Alexander C. Berg
We present a new public dataset with a focus on simulating robotic vision tasks in everyday indoor environments using real imagery.
no code implementations • 25 Feb 2017 • Georgios Georgakis, Arsalan Mousavian, Alexander C. Berg, Jana Kosecka
In this work we explore the use of synthetically generated composite images for training state-of-the-art object detectors, especially for object instance detection.
3 code implementations • 23 Jan 2017 • Cheng-Yang Fu, Wei Liu, Ananth Ranga, Ambrish Tyagi, Alexander C. Berg
The main contribution of this paper is an approach for introducing additional context into state-of-the-art general object detection.
no code implementations • 1 Nov 2016 • Tatiana Tommasi, Arun Mallya, Bryan Plummer, Svetlana Lazebnik, Alexander C. Berg, Tamara L. Berg
This paper presents an approach for answering fill-in-the-blank multiple choice questions from the Visual Madlibs dataset.
no code implementations • 19 Sep 2016 • Patrick Poirson, Phil Ammirato, Cheng-Yang Fu, Wei Liu, Jana Kosecka, Alexander C. Berg
For applications in navigation and robotics, estimating the 3D pose of objects is as important as detection.
no code implementations • 12 Aug 2016 • Sirion Vittayakorn, Alexander C. Berg, Tamara L. Berg
Toward this goal, we utilize features from existing deep networks and also fine-tune new networks for temporal estimation.
no code implementations • 11 Aug 2016 • Tatiana Tommasi, Arun Mallya, Bryan Plummer, Svetlana Lazebnik, Alexander C. Berg, Tamara L. Berg
This paper focuses on answering fill-in-the-blank style multiple choice questions from the Visual Madlibs dataset.
4 code implementations • 31 Jul 2016 • Licheng Yu, Patrick Poirson, Shan Yang, Alexander C. Berg, Tamara L. Berg
Humans refer to objects in their environments all the time, especially in dialogue with other people.
223 code implementations • 8 Dec 2015 • Wei Liu, Dragomir Anguelov, Dumitru Erhan, Christian Szegedy, Scott Reed, Cheng-Yang Fu, Alexander C. Berg
Experimental results on the PASCAL VOC, MS COCO, and ILSVRC datasets confirm that SSD has comparable accuracy to methods that utilize an additional object proposal step and is much faster, while providing a unified framework for both training and inference.
Ranked #3 on Object Detection on PASCAL VOC 2012
no code implementations • ICCV 2015 • M. Hadi Kiapour, Xufeng Han, Svetlana Lazebnik, Alexander C. Berg, Tamara L. Berg
In this paper, we define a new task, Exact Street to Shop, where our goal is to match a real-world example of a garment item to the same item in an online shop.
no code implementations • ICCV 2015 • Licheng Yu, Eunbyung Park, Alexander C. Berg, Tamara L. Berg
In this paper, we introduce a new dataset consisting of 360,001 focused natural language descriptions for 10,738 images.
no code implementations • 19 Nov 2015 • Eunbyung Park, Alexander C. Berg
Although deep convolutional neural networks (CNNs) have achieved remarkable results on object detection and segmentation, pre- and post-processing steps such as region proposals and non-maximum suppression (NMS) have been required.
no code implementations • 11 Nov 2015 • Cheng-Yang Fu, Alexander C. Berg
This submission has been withdrawn by arXiv administrators because it is intentionally incomplete, which is in violation of our policies.
4 code implementations • 15 Jun 2015 • Wei Liu, Andrew Rabinovich, Alexander C. Berg
When we add our proposed global feature, and a technique for learning normalization parameters, accuracy increases consistently even over our improved versions of the baselines.
Ranked #39 on Semantic Segmentation on PASCAL VOC 2012 test
2 code implementations • CVPR 2015 • Xufeng Han, Thomas Leung, Yangqing Jia, Rahul Sukthankar, Alexander C. Berg
We perform a comprehensive set of experiments on standard datasets to carefully study the contributions of each aspect of MatchNet, with direct comparisons to established methods.
no code implementations • CVPR 2015 • Johannes L. Schonberger, Alexander C. Berg, Jan-Michael Frahm
Based on the insights of this evaluation, we propose a learning-based approach, the PAirwise Image Geometry Encoding (PAIGE), to efficiently identify image pairs with scene overlap without the need to perform exhaustive putative matching and geometric verification.
no code implementations • 31 May 2015 • Licheng Yu, Eunbyung Park, Alexander C. Berg, Tamara L. Berg
In this paper, we introduce a new dataset consisting of 360,001 focused natural language descriptions for 10,738 images.
12 code implementations • 1 Sep 2014 • Olga Russakovsky, Jia Deng, Hao Su, Jonathan Krause, Sanjeev Satheesh, Sean Ma, Zhiheng Huang, Andrej Karpathy, Aditya Khosla, Michael Bernstein, Alexander C. Berg, Li Fei-Fei
The ImageNet Large Scale Visual Recognition Challenge is a benchmark in object category classification and detection on hundreds of object categories and millions of images.
no code implementations • NeurIPS 2011 • Jia Deng, Sanjeev Satheesh, Alexander C. Berg, Fei Li
We present a novel approach to efficiently learn a label tree for large scale classification with many classes.