no code implementations • 22 Feb 2023 • Viet-Quoc Pham, Nao Mishima
Weakly supervised visual grounding aims to predict the region in an image that corresponds to a specific linguistic query, where the mapping between the target object and the query is unknown during training.
no code implementations • 7 Jun 2017 • Viet-Quoc Pham, Satoshi Ito, Tatsuo Kozakaya
We present a simple and effective framework for simultaneous semantic segmentation and instance segmentation with Fully Convolutional Networks (FCNs).
no code implementations • ICCV 2015 • Viet-Quoc Pham, Tatsuo Kozakaya, Osamu Yamaguchi, Ryuzo Okada
This paper presents a patch-based approach for crowd density estimation in public scenes.