no code implementations • ECCV 2020 • Jinwoo Choi, Gaurav Sharma, Samuel Schulter, Jia-Bin Huang
As the first novelty, we propose an attention mechanism which focuses on more discriminative clips and directly optimizes for video-level, rather than clip-level, accuracy.
Ranked #3 on Unsupervised Domain Adaptation on UCF-HMDB
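The clip-attention idea above can be sketched as a softmax weighting over per-clip features that is pooled into a single video-level feature. This is a generic illustration in numpy, not the paper's exact model; the scoring vector `w` is a hypothetical parameterization of the attention module.

```python
import numpy as np

def attend_clips(clip_feats, w):
    """Aggregate per-clip features into one video-level feature via
    softmax attention (generic sketch, not the paper's exact model).

    clip_feats: (n_clips, d) array of clip features
    w: (d,) scoring vector (hypothetical parameterization)
    """
    scores = clip_feats @ w                        # one scalar score per clip
    scores = scores - scores.max()                 # numerical stability
    alpha = np.exp(scores) / np.exp(scores).sum()  # attention weights, sum to 1
    return alpha, alpha @ clip_feats               # weighted video-level feature

# toy example: 3 clips with 4-dim features; the first clip scores highest
feats = np.array([[1., 0., 0., 0.],
                  [0., 1., 0., 0.],
                  [0., 0., 1., 0.]])
alpha, video_feat = attend_clips(feats, w=np.array([2., 0., 0., 0.]))
```

Because the weights are a softmax, more discriminative clips dominate the pooled feature while the operation stays differentiable end to end.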
no code implementations • 23 Apr 2024 • Abhishek Aich, Yumin Suh, Samuel Schulter, Manmohan Chandraker
With efficiency being a high priority for scaling such models, we observed that the state-of-the-art method Mask2Former spends ~50% of its compute on the transformer encoder alone.
no code implementations • 6 Apr 2024 • Zaid Khan, Vijay Kumar BG, Samuel Schulter, Yun Fu, Manmohan Chandraker
We propose a method that exploits existing annotations for a vision-language task to improvise a coarse reward signal for that task, treats the LLM as a policy, and applies reinforced self-training to improve the LLM's visual program synthesis ability for that task.
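The reinforced self-training loop described above can be sketched as reward-filtered sampling: draw programs from the policy, keep only those whose coarse reward clears a threshold, and finetune on the kept set. Everything here is a hypothetical stand-in (`generate` for the LLM, `reward` for the task-derived signal); the real method finetunes the LLM between rounds.

```python
def reinforced_self_training(generate, reward, prompts, threshold=1.0, rounds=2):
    """REST-style sketch: sample programs from the policy, filter by a
    coarse task-derived reward, and accumulate a finetuning set.
    `generate` and `reward` are hypothetical stand-ins."""
    dataset = []
    for _ in range(rounds):
        for p in prompts:
            prog = generate(p)
            if reward(p, prog) >= threshold:   # keep only high-reward samples
                dataset.append((p, prog))
        # in the real method, the LLM (policy) is finetuned on `dataset` here
    return dataset

# deterministic toy run: both prompts always yield a rewarded program
progs = reinforced_self_training(
    generate=lambda p: f"solve_{p}()",                   # hypothetical sampler
    reward=lambda p, prog: 1.0 if p in prog else 0.0,    # hypothetical reward
    prompts=["add", "count"])
```

The key property is that no new human labels are needed: the reward is improvised from annotations that already exist for the task.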
no code implementations • 26 Mar 2024 • Mingfu Liang, Jong-Chyi Su, Samuel Schulter, Sparsh Garg, Shiyu Zhao, Ying Wu, Manmohan Chandraker
This necessitates an expensive process of continuously curating and annotating data with significant human effort.
1 code implementation • 29 Dec 2023 • Shiyu Zhao, Long Zhao, Vijay Kumar B. G, Yumin Suh, Dimitris N. Metaxas, Manmohan Chandraker, Samuel Schulter
The recent progress in language-based open-vocabulary object detection can be largely attributed to finding better ways of leveraging large-scale data with free-form text annotations.
no code implementations • ICCV 2023 • Abhishek Aich, Samuel Schulter, Amit K. Roy-Chowdhury, Manmohan Chandraker, Yumin Suh
Further, we present a simple but effective search algorithm that translates user constraints to runtime width configurations of both the shared encoder and task decoders, for sampling the sub-architectures.
2 code implementations • 11 Aug 2023 • Shiyu Zhao, Samuel Schulter, Long Zhao, Zhixing Zhang, Vijay Kumar B. G, Yumin Suh, Manmohan Chandraker, Dimitris N. Metaxas
This work identifies two challenges of using self-training in OVD: noisy PLs from VLMs and frequent distribution changes of PLs.
1 code implementation • CVPR 2023 • Zaid Khan, Vijay Kumar BG, Samuel Schulter, Xiang Yu, Yun Fu, Manmohan Chandraker
We introduce SelTDA (Self-Taught Data Augmentation), a strategy for finetuning large VLMs on small-scale VQA datasets.
no code implementations • CVPR 2023 • Zhixiang Min, Bingbing Zhuang, Samuel Schulter, Buyu Liu, Enrique Dunn, Manmohan Chandraker
Monocular 3D object localization in driving scenes is a crucial task, but challenging due to its ill-posed nature.
no code implementations • ICCV 2023 • Samuel Schulter, Vijay Kumar B G, Yumin Suh, Konstantinos M. Dafnis, Zhixing Zhang, Shiyu Zhao, Dimitris Metaxas
With more than 28K unique object descriptions on over 25K images, OmniLabel provides a challenging benchmark with diverse and complex object descriptions in a naturally open-vocabulary setting.
1 code implementation • 18 Jul 2022 • Shiyu Zhao, Zhixing Zhang, Samuel Schulter, Long Zhao, Vijay Kumar B. G, Anastasis Stathopoulos, Manmohan Chandraker, Dimitris Metaxas
We propose a novel method that leverages the rich semantics available in recent vision and language models to localize and classify objects in unlabeled images, effectively generating pseudo labels for object detection.
Ranked #15 on Open Vocabulary Object Detection on MSCOCO (using extra training data)
no code implementations • CVPR 2022 • Inkyu Shin, Yi-Hsuan Tsai, Bingbing Zhuang, Samuel Schulter, Buyu Liu, Sparsh Garg, In So Kweon, Kuk-Jin Yoon
In this paper, we propose and explore a new multi-modal extension of test-time adaptation for 3D semantic segmentation.
no code implementations • CVPR 2022 • Dripta S. Raychaudhuri, Yumin Suh, Samuel Schulter, Xiang Yu, Masoud Faraki, Amit K. Roy-Chowdhury, Manmohan Chandraker
In contrast to the existing dynamic multi-task approaches that adjust only the weights within a fixed architecture, our approach affords the flexibility to dynamically control the total computational cost and match the user-preferred task importance better.
1 code implementation • 27 Mar 2022 • Zaid Khan, Vijay Kumar BG, Xiang Yu, Samuel Schulter, Manmohan Chandraker, Yun Fu
Self-supervised vision-language pretraining from pure images and text with a contrastive loss is effective, but ignores fine-grained alignment due to a dual-stream architecture that aligns image and text representations only on a global level.
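The global-level contrastive alignment mentioned above is typically a symmetric InfoNCE loss over matched image/text pairs. The sketch below shows that objective in numpy; the temperature value and embeddings are illustrative assumptions.

```python
import numpy as np

def clip_style_loss(img, txt, temp=0.07):
    """Symmetric InfoNCE over matched image/text pairs -- the global-level
    alignment a dual-stream setup relies on (generic sketch)."""
    img = img / np.linalg.norm(img, axis=1, keepdims=True)
    txt = txt / np.linalg.norm(txt, axis=1, keepdims=True)
    logits = img @ txt.T / temp              # (n, n) pairwise similarities

    def xent(l):                             # cross-entropy, targets on diagonal
        l = l - l.max(axis=1, keepdims=True)
        logp = l - np.log(np.exp(l).sum(axis=1, keepdims=True))
        return -np.mean(np.diag(logp))

    return 0.5 * (xent(logits) + xent(logits.T))

# toy check: perfectly matched pairs score lower than shuffled pairs
emb = np.eye(3)
loss_aligned = clip_style_loss(emb, emb)
loss_shuffled = clip_style_loss(emb, emb[[1, 2, 0]])
```

Because only the pooled global embeddings interact, token-to-region (fine-grained) correspondences are never supervised — the gap the entry above targets.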
no code implementations • CVPR 2022 • Christian Simon, Masoud Faraki, Yi-Hsuan Tsai, Xiang Yu, Samuel Schulter, Yumin Suh, Mehrtash Harandi, Manmohan Chandraker
Humans have the ability to accumulate knowledge of new tasks in varying conditions, but deep neural networks often suffer from catastrophic forgetting of previously learned knowledge after learning a new task.
no code implementations • 22 Apr 2021 • Weizhe Liu, David Ferstl, Samuel Schulter, Lukas Zebedin, Pascal Fua, Christian Leistner
We introduce a novel approach to unsupervised and semi-supervised domain adaptation for semantic segmentation.
no code implementations • ECCV 2020 • Xiangyun Zhao, Samuel Schulter, Gaurav Sharma, Yi-Hsuan Tsai, Manmohan Chandraker, Ying Wu
To address this challenge, we design a framework which works with such partial annotations, and we exploit a pseudo labeling approach that we adapt for our specific case.
no code implementations • ECCV 2020 • Sujoy Paul, Yi-Hsuan Tsai, Samuel Schulter, Amit K. Roy-Chowdhury, Manmohan Chandraker
In this work, we propose a novel framework for domain adaptation in semantic segmentation with image-level weak labels in the target domain.
no code implementations • CVPR 2020 • Buyu Liu, Bingbing Zhuang, Samuel Schulter, Pan Ji, Manmohan Chandraker
Introducing the LSTM and FTM modules improves the prediction consistency in videos.
no code implementations • ICLR 2019 • Yi-Hsuan Tsai, Kihyuk Sohn, Samuel Schulter, Manmohan Chandraker
To this end, we propose to learn discriminative feature representations of patches based on label histograms in the source domain, through the construction of a disentangled space.
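The per-patch representation underlying the entry above is a normalized class histogram of the labels inside a patch. A minimal sketch, assuming integer label maps:

```python
import numpy as np

def patch_label_histogram(label_map, n_classes):
    """Normalized class histogram of a label patch -- the representation
    on which the discriminative patch features are learned (sketch)."""
    counts = np.bincount(label_map.ravel(), minlength=n_classes)
    return counts / counts.sum()

# toy 2x2 patch: half class 0, a quarter each of classes 1 and 2
hist = patch_label_histogram(np.array([[0, 0], [1, 2]]), n_classes=3)
```

Patches with similar histograms can then be grouped so that their feature representations cluster, which is what makes the space discriminative across domains.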
8 code implementations • ICCV 2019 • Yi-Hsuan Tsai, Kihyuk Sohn, Samuel Schulter, Manmohan Chandraker
Predicting structured outputs such as semantic segmentation relies on expensive per-pixel annotations to learn supervised models like convolutional neural networks.
Ranked #22 on Image-to-Image Translation on SYNTHIA-to-Cityscapes
no code implementations • CVPR 2019 • Ziyan Wang, Buyu Liu, Samuel Schulter, Manmohan Chandraker
In this paper, we address the problem of inferring the layout of complex road scenes given a single camera as input.
no code implementations • ICLR 2019 • Nataniel Ruiz, Samuel Schulter, Manmohan Chandraker
Simulation is a useful tool in situations where training data for machine learning models is costly to annotate or even hard to acquire.
no code implementations • ECCV 2018 • Samuel Schulter, Menghua Zhai, Nathan Jacobs, Manmohan Chandraker
Given a single RGB image of a complex outdoor road scene in the perspective view, we address the novel problem of estimating an occlusion-reasoned semantic scene layout in the top-view.
no code implementations • 28 Mar 2018 • Tuan-Hung Vu, Wongun Choi, Samuel Schulter, Manmohan Chandraker
This paper proposes a novel memory-based online video representation that is efficient, accurate and predictive.
12 code implementations • CVPR 2018 • Yi-Hsuan Tsai, Wei-Chih Hung, Samuel Schulter, Kihyuk Sohn, Ming-Hsuan Yang, Manmohan Chandraker
In this paper, we propose an adversarial learning method for domain adaptation in the context of semantic segmentation.
Ranked #3 on Domain Adaptation on Synscapes-to-Cityscapes
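The adversarial objective in the entry above pairs a discriminator, which tries to tell source from target segmentation outputs, against a segmenter trained to fool it on target predictions. The sketch below shows only the two loss terms, with `disc` as a hypothetical stand-in that maps a prediction to P(source).

```python
import numpy as np

def adversarial_alignment_loss(disc, src_pred, tgt_pred):
    """Output-space adversarial losses (generic GAN-style sketch):
    d_loss trains the discriminator to separate domains,
    g_loss trains the segmenter to make target outputs look 'source'."""
    eps = 1e-8
    d_loss = -np.log(disc(src_pred) + eps) - np.log(1 - disc(tgt_pred) + eps)
    g_loss = -np.log(disc(tgt_pred) + eps)
    return float(d_loss), float(g_loss)

# toy check with an identity 'discriminator' on scalar probabilities
d_loss, g_loss = adversarial_alignment_loss(lambda p: p, 0.9, 0.1)
```

Aligning in the output (softmax) space rather than the feature space exploits the fact that segmentation maps share strong spatial structure across domains.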
no code implementations • CVPR 2017 • Samuel Schulter, Paul Vernaza, Wongun Choi, Manmohan Chandraker
In this work, we demonstrate that it is possible to learn features for network-flow-based data association via backpropagation, by expressing the optimum of a smoothed network flow problem as a differentiable function of the pairwise association costs.
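The smoothing idea above — replacing a hard combinatorial optimum with a differentiable one — can be illustrated with an entropy-regularized assignment: Sinkhorn normalization of `exp(-cost/T)` yields a doubly-stochastic matrix that varies smoothly with the pairwise costs. This is the same smoothing principle, not the paper's exact network-flow formulation.

```python
import numpy as np

def soft_assignment(costs, temp=0.1, n_iters=50):
    """Entropy-smoothed assignment via Sinkhorn iterations: the result is
    doubly stochastic and differentiable in the pairwise costs (sketch of
    the smoothing principle, not the exact network-flow relaxation)."""
    P = np.exp(-costs / temp)                # low cost -> high affinity
    for _ in range(n_iters):
        P /= P.sum(axis=1, keepdims=True)    # row-normalize
        P /= P.sum(axis=0, keepdims=True)    # column-normalize
    return P

# toy check: with a low temperature the soft assignment approaches
# the hard minimum-cost matching (here, the identity)
P = soft_assignment(np.array([[0., 1.], [1., 0.]]))
```

Because the output depends smoothly on the costs, gradients can flow back into the features that produce those costs — the property that enables end-to-end learning for data association.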
no code implementations • ICCV 2015 • Gernot Riegler, Samuel Schulter, Matthias Ruther, Horst Bischof
However, this setting is not realistic for practical applications, because the blur is typically different for each test image.
no code implementations • 27 Oct 2015 • Georg Poier, Konstantinos Roditakis, Samuel Schulter, Damien Michel, Horst Bischof, Antonis A. Argyros
Model-based approaches to 3D hand tracking have been shown to perform well in a wide range of scenarios.
no code implementations • CVPR 2015 • Samuel Schulter, Christian Leistner, Horst Bischof
The aim of single image super-resolution is to reconstruct a high-resolution image from a single low-resolution input.
no code implementations • CVPR 2014 • Samuel Schulter, Christian Leistner, Paul Wohlhart, Peter M. Roth, Horst Bischof
In this way, a single model can simultaneously predict the object probability of a window in a sliding-window approach and regress its aspect ratio.
no code implementations • CVPR 2013 • Samuel Schulter, Paul Wohlhart, Christian Leistner, Amir Saffari, Peter M. Roth, Horst Bischof
Contrary to Boosted Trees, in our method the loss minimization is an inherent part of the tree-growing process, which allows us to keep the benefits of common Random Forests, such as parallel processing.