1 code implementation • 29 Apr 2024 • Huy Quang Pham, Thang Kien-Bao Nguyen, Quan Van Nguyen, Dan Quang Tran, Nghia Hieu Nguyen, Kiet Van Nguyen, Ngan Luu-Thuy Nguyen
To this end, we introduce a novel dataset, ViOCRVQA (Vietnamese Optical Character Recognition - Visual Question Answering dataset), consisting of 28,000+ images and 120,000+ question-answer pairs.
Optical Character Recognition Optical Character Recognition (OCR) +2
1 code implementation • 16 Apr 2024 • Quan Van Nguyen, Dan Quang Tran, Huy Quang Pham, Thang Kien-Bao Nguyen, Nghia Hieu Nguyen, Kiet Van Nguyen, Ngan Luu-Thuy Nguyen
Visual Question Answering (VQA) is a complex task that requires processing natural language and images simultaneously.
Multimodal Deep Learning Optical Character Recognition (OCR) +5