no code implementations • 16 Jan 2024 • Anh-Cuong Pham, Van-Quang Nguyen, Thi-Hong Vuong, Quang-Thuy Ha
Image captioning is a crucial task with applications in a wide range of domains, including healthcare and education.
1 code implementation • 7 Oct 2023 • Korawat Charoenpitaks, Van-Quang Nguyen, Masanori Suganuma, Masahiro Takahashi, Ryoma Niihara, Takayuki Okatani
To enable research in this understudied area, a new dataset named the DHPR (Driving Hazard Prediction and Reasoning) dataset is created.
no code implementations • 27 Feb 2023 • Thong Bach, Thuong Nguyen Canh, Van-Quang Nguyen
Recent advancements in deep learning techniques have significantly improved the quality of compressed videos.
2 code implementations • 20 Jul 2022 • Van-Quang Nguyen, Masanori Suganuma, Takayuki Okatani
Current state-of-the-art methods for image captioning employ region-based features, as they provide object-level information that is essential to describe the content of images; they are usually extracted by an object detector such as Faster R-CNN.
Ranked #8 on Image Captioning on nocaps in-domain
1 code implementation • 1 Jun 2021 • Van-Quang Nguyen, Masanori Suganuma, Takayuki Okatani
It then integrates the prediction with the visual information and other inputs, yielding the final prediction of an action and an object.
1 code implementation • ECCV 2020 • Van-Quang Nguyen, Masanori Suganuma, Takayuki Okatani
Designing an effective attention mechanism that handles interactions between the two modalities has been a primary concern in recent studies of vision and language tasks.
Ranked #7 on Visual Dialog on Visual Dialog v1.0 test-std