Vietnamese Multimodal Learning

3 papers with code • 0 benchmarks • 0 datasets

Vietnamese multimodal learning covers tasks that jointly process Vietnamese text and images, such as visual question answering and multimodal aspect-category sentiment analysis.

Most implemented papers

OpenViVQA: Task, Dataset, and Multimodal Fusion Models for Visual Question Answering in Vietnamese

hieunghia-pat/openvivqa-dataset • 7 May 2023

The VQA task requires methods capable of fusing information from questions and images to produce appropriate answers.
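
As a concrete, simplified illustration of question-image fusion, the sketch below pairs a Vietnamese text encoder with an image encoder and classifies over a fixed answer vocabulary. The checkpoints (vinai/phobert-base, google/vit-base-patch16-224-in21k), the late-fusion design, the answer-classification formulation, and the example file are illustrative assumptions, not the fusion models proposed in the paper.

```python
# Hypothetical late-fusion VQA baseline (not the OpenViVQA models):
# encode the Vietnamese question with PhoBERT, the image with ViT,
# concatenate pooled features, and classify over a fixed answer vocabulary.
import torch
import torch.nn as nn
from transformers import AutoTokenizer, AutoModel, ViTImageProcessor, ViTModel
from PIL import Image

class LateFusionVQA(nn.Module):
    def __init__(self, num_answers: int):
        super().__init__()
        self.text_encoder = AutoModel.from_pretrained("vinai/phobert-base")
        self.image_encoder = ViTModel.from_pretrained("google/vit-base-patch16-224-in21k")
        hidden = self.text_encoder.config.hidden_size + self.image_encoder.config.hidden_size
        self.classifier = nn.Sequential(nn.Linear(hidden, 512), nn.ReLU(), nn.Linear(512, num_answers))

    def forward(self, text_inputs, image_inputs):
        # Use each encoder's [CLS] token as a pooled representation.
        q = self.text_encoder(**text_inputs).last_hidden_state[:, 0]
        v = self.image_encoder(**image_inputs).last_hidden_state[:, 0]
        return self.classifier(torch.cat([q, v], dim=-1))

tokenizer = AutoTokenizer.from_pretrained("vinai/phobert-base")
processor = ViTImageProcessor.from_pretrained("google/vit-base-patch16-224-in21k")
model = LateFusionVQA(num_answers=1000)  # answer-vocabulary size is an assumption

question = tokenizer("Bức ảnh này chụp gì?", return_tensors="pt")  # "What does this photo show?"
image = processor(Image.open("example.jpg").convert("RGB"), return_tensors="pt")  # hypothetical file
logits = model(question, image)  # shape: (1, num_answers)
```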

ViTextVQA: A Large-Scale Visual Question Answering Dataset for Evaluating Vietnamese Text Comprehension in Images

minhquan6203/vitextvqa-dataset • 16 Apr 2024

Visual Question Answering (VQA) is a complex task that requires processing natural language and images simultaneously.
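
Questions in this setting hinge on reading the scene text inside the image; the sketch below shows one common preprocessing step, extracting Vietnamese text with Tesseract OCR so a downstream VQA model can condition on it. The use of pytesseract, the 'vie' language pack, and the file name are assumptions for illustration, not the paper's pipeline.

```python
# Illustrative preprocessing only: pull Vietnamese scene text out of an image
# so a downstream VQA model can condition on it. Not the ViTextVQA method.
from PIL import Image
import pytesseract  # requires the Tesseract binary plus the 'vie' language data

def extract_scene_text(image_path: str) -> str:
    """Return Vietnamese text detected in the image, whitespace-normalized."""
    img = Image.open(image_path).convert("RGB")
    raw = pytesseract.image_to_string(img, lang="vie")
    return " ".join(raw.split())

# The detected text can then be appended to the question before encoding,
# e.g. "<question> [SEP] <scene text>", a common OCR-augmented VQA heuristic.
scene_text = extract_scene_text("shopfront.jpg")  # file name is hypothetical
print(scene_text)
```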

New Benchmark Dataset and Fine-Grained Cross-Modal Fusion Framework for Vietnamese Multimodal Aspect-Category Sentiment Analysis

no code yet • 1 May 2024

To address this, we introduce a new Vietnamese multimodal dataset, named ViMACSA, which consists of 4,876 text-image pairs with 14,618 fine-grained annotations for both text and image in the hotel domain.
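
To make the annotation structure concrete, the sketch below shows one plausible way to represent such text-image pairs with fine-grained aspect-category sentiment labels; the field names and the example aspect categories are assumptions for illustration, not the ViMACSA release format.

```python
# Hypothetical record layout for aspect-category sentiment over text-image pairs.
# Field names and aspect categories are illustrative, not the ViMACSA format.
from dataclasses import dataclass, field
from typing import List

@dataclass
class AspectAnnotation:
    category: str    # e.g. "Room#Cleanliness" in the hotel domain
    sentiment: str   # "positive" | "neutral" | "negative"
    span: str        # text span or image region the label is grounded in

@dataclass
class MultimodalExample:
    text: str                      # Vietnamese review text
    image_path: str                # accompanying photo
    annotations: List[AspectAnnotation] = field(default_factory=list)

example = MultimodalExample(
    text="Phòng sạch sẽ nhưng nhân viên phục vụ chậm.",  # "Room is clean but staff are slow."
    image_path="hotel_room.jpg",
    annotations=[
        AspectAnnotation("Room#Cleanliness", "positive", "Phòng sạch sẽ"),
        AspectAnnotation("Staff#Service", "negative", "nhân viên phục vụ chậm"),
    ],
)
```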