no code implementations • COLING 2022 • Tin Van Huynh, Kiet Van Nguyen, Ngan Luu-Thuy Nguyen
In this paper, we introduce ViNLI (Vietnamese Natural Language Inference), an open-domain and high-quality corpus for evaluating Vietnamese NLI models, which is created and evaluated with a strict process of quality control.
1 code implementation • 16 Apr 2024 • Quan Van Nguyen, Dan Quang Tran, Huy Quang Pham, Thang Kien-Bao Nguyen, Nghia Hieu Nguyen, Kiet Van Nguyen, Ngan Luu-Thuy Nguyen
Visual Question Answering (VQA) is a complicated task that requires the capability of simultaneously processing natural language and images.
Multimodal Deep Learning • Optical Character Recognition (OCR) • +5
no code implementations • 23 Mar 2024 • Phong Nguyen-Thuan Do, Son Quoc Tran, Phu Gia Hoang, Kiet Van Nguyen, Ngan Luu-Thuy Nguyen
The success of Natural Language Understanding (NLU) benchmarks in various languages, such as GLUE for English, CLUE for Chinese, KLUE for Korean, and IndoNLU for Indonesian, has facilitated the evaluation of new NLU models across a wide range of tasks.
1 code implementation • 5 Feb 2024 • Thinh Phuoc Ngo, Khoa Tran Anh Dang, Son T. Luu, Kiet Van Nguyen, Ngan Luu-Thuy Nguyen
This paper presents the development process of a Vietnamese spoken language corpus for machine reading comprehension (MRC) tasks and provides insights into the challenges and opportunities associated with using real-world data for these tasks.
1 code implementation • 29 Jan 2024 • Thanh-Nhi Nguyen, Thanh-Phong Le, Kiet Van Nguyen
In this work, we introduce Vietnamese Lexical Normalization (ViLexNorm), the first-ever corpus developed for the Vietnamese lexical normalization task.
Lexical Normalization • Vietnamese Social Media Text Processing
1 code implementation • 12 Nov 2023 • Anh Thi-Hoang Nguyen, Dung Ha Nguyen, Nguyet Thi Nguyen, Khanh Thanh-Duy Ho, Kiet Van Nguyen
Our dataset is accessible for research purposes.
1 code implementation • 27 Oct 2023 • Khiem Vinh Tran, Hao Phu Phan, Kiet Van Nguyen, Ngan Luu Thuy Nguyen
Neural models for VQA have made remarkable progress on large-scale datasets, with a primary focus on resource-rich languages like English.
1 code implementation • 17 Oct 2023 • Quoc-Nam Nguyen, Thang Chau Phan, Duc-Vu Nguyen, Kiet Van Nguyen
English and Chinese, known as resource-rich languages, have witnessed the strong development of transformer-based language models for natural language processing tasks.
Vietnamese Language Models • Vietnamese Social Media Text Processing • +1
no code implementations • 26 Sep 2023 • Vu Le Anh Quan, Chau Thuan Phat, Kiet Van Nguyen, Phan The Duy, Van-Hau Pham
Hence, in this work, we propose XGV-BERT, a framework that combines the pre-trained CodeBERT model and a Graph Convolutional Network (GCN) to detect software vulnerabilities.
1 code implementation • 6 Sep 2023 • Chau-Thang Phan, Quoc-Nam Nguyen, Chi-Thanh Dang, Trong-Hop Do, Kiet Van Nguyen
Our proposed ViCGCN approach demonstrates a significant improvement of up to 6.21%, 4.61%, and 2.63% over the best Contextualized Language Models, including multilingual and monolingual, on three benchmark datasets, UIT-VSMEC, UIT-ViCTSD, and UIT-VSFC, respectively.
1 code implementation • 31 Aug 2023 • Chau-Thang Phan, Quoc-Nam Nguyen, Kiet Van Nguyen
Drawing inspiration from recent advancements in natural language processing and understanding, we cast link prediction as an NLI task, wherein the presence of a link between two articles is treated as a premise, and the task is to determine whether this premise holds based on the information presented in the articles.
no code implementations • 28 Jul 2023 • Khiem Vinh Tran, Kiet Van Nguyen, Ngan Luu Thuy Nguyen
Visual Question Answering (VQA) is an intricate and demanding task that integrates natural language processing (NLP) and computer vision (CV), capturing the interest of researchers.
no code implementations • 17 Jul 2023 • Nghia Hieu Nguyen, Kiet Van Nguyen
Based on these two novel modules, we introduce the Parallel Attention Transformer (PAT), which achieves the best accuracy on the benchmark ViVQA dataset compared with all baselines and other SOTA methods, including SAAA and MCAN.
1 code implementation • 7 May 2023 • Nghia Hieu Nguyen, Duong T. D. Vo, Kiet Van Nguyen, Ngan Luu-Thuy Nguyen
The VQA task requires methods that can fuse the information from questions and images to produce appropriate answers.
1 code implementation • 31 Mar 2023 • Son T. Luu, Khoi Trong Hoang, Tuong Quang Pham, Kiet Van Nguyen, Ngan Luu-Thuy Nguyen
From the results of the error analysis, we found that the main challenge for reading comprehension models is understanding the implicit context in texts and linking the pieces together in order to find the correct answers.
no code implementations • 16 Mar 2023 • Son Quoc Tran, Phong Nguyen-Thuan Do, Kiet Van Nguyen, Ngan Luu-Thuy Nguyen
From the analysis results, we suggest new directions for developing Vietnamese language models.
no code implementations • 23 Feb 2023 • Ngan Luu-Thuy Nguyen, Nghia Hieu Nguyen, Duong T. D. Vo, Khanh Quoc Tran, Kiet Van Nguyen
Visual Question Answering (VQA) is a challenging task of natural language processing (NLP) and computer vision (CV), attracting significant attention from researchers.
1 code implementation • 24 Jan 2023 • Phu Gia Hoang, Canh Duc Luu, Khanh Quoc Tran, Kiet Van Nguyen, Ngan Luu-Thuy Nguyen
The rise in hateful and offensive language directed at other users is one of the adverse side effects of the increased use of social networking platforms.
Ranked #1 on Sequence-to-sequence Language Modeling on ViHOS
1 code implementation • 10 Nov 2022 • Nghia Hieu Nguyen, Duong T. D. Vo, Kiet Van Nguyen
Recognizing handwriting images is challenging due to the vast variation in writing style across many people and distinct linguistic aspects of writing languages.
no code implementations • 21 Sep 2022 • Luan Thanh Nguyen, Kiet Van Nguyen, Ngan Luu-Thuy Nguyen
Inspired by the success of GLUE, we introduce the Social Media Text Classification Evaluation (SMTCE) benchmark, a collection of datasets and models across a diverse set of SMTC tasks.
no code implementations • 20 Jun 2022 • Nhung Thi-Hong Nguyen, Phuong Phan-Dieu Ha, Luan Thanh Nguyen, Kiet Van Nguyen, Ngan Luu-Thuy Nguyen
Question answering (QA) systems have gained explosive attention in recent years.
1 code implementation • 1 Jun 2022 • Khanh Q. Tran, An T. Nguyen, Phu Gia Hoang, Canh Duc Luu, Trong-Hop Do, Kiet Van Nguyen
Secondly, a novel hate speech detection (HSD) model, which is the combination of a pre-trained PhoBERT model and a Text-CNN model, was proposed for solving tasks in Vietnamese.
Hate Speech Detection • Vietnamese Social Media Text Processing
no code implementations • 14 Apr 2022 • Kiet Van Nguyen, Phong Nguyen-Thuan Do, Nhat Duy Nguyen, Tin Van Huynh, Anh Gia-Tuan Nguyen, Ngan Luu-Thuy Nguyen
Question answering (QA) is a natural language understanding task within the fields of information retrieval and information extraction that has attracted much attention from the computational linguistics and artificial intelligence research community in recent years because of the strong development of machine reading comprehension-based models.
no code implementations • 22 Mar 2022 • Kiet Van Nguyen, Son Quoc Tran, Luan Thanh Nguyen, Tin Van Huynh, Son T. Luu, Ngan Luu-Thuy Nguyen
To address the weakness, we provide the research community with a benchmark dataset named UIT-ViQuAD 2.0 for evaluating the MRC task and question answering systems for the Vietnamese language.
no code implementations • PACLIC 2021 • Duc-Vu Nguyen, Linh-Bao Vo, Ngoc-Linh Tran, Kiet Van Nguyen, Ngan Luu-Thuy Nguyen
Previous studies on joint Chinese word segmentation and part-of-speech tagging mainly follow the character-based tagging model focusing on modeling n-gram features.
1 code implementation • 15 Oct 2021 • Kim Thi-Thanh Nguyen, Sieu Khai Huynh, Luong Luc Phan, Phuc Huynh Pham, Duc-Vu Nguyen, Kiet Van Nguyen
Aspect-based sentiment analysis plays an essential role in natural language processing and artificial intelligence.
Aspect-Based Sentiment Analysis • Aspect-Based Sentiment Analysis (ABSA) • +4
no code implementations • 31 Aug 2021 • Huy Quoc To, Kiet Van Nguyen, Ngan Luu-Thuy Nguyen, Anh Gia-Tuan Nguyen
Recent research has demonstrated that BERT shows potential in a wide range of natural language processing tasks.
no code implementations • 6 Aug 2021 • Thuan Trong Nguyen, Thuan Q. Nguyen, Dung Vo, Vi Nguyen, Ngoc Ho, Nguyen D. Vo, Kiet Van Nguyen, Khang Nguyen
We use 10,044 images for model training and 6,682 test images to classify each food in the VinaFood21 dataset and achieve an average accuracy of 74.81% when fine-tuning the EfficientNet-B0 CNN.
no code implementations • International Conference on Industrial, Engineering and Other Applications of Applied Intelligent Systems 2021 • Khanh Quoc Tran, Binh Van Duong, Linh Quang Tran, An Le-Hoai Tran, An Trong Nguyen, Kiet Van Nguyen
We conduct experiments with modern machine learning methods based on ensemble learning models: LightGBM, CatBoost, and Random Forest.
1 code implementation • 31 May 2021 • Luong Luc Phan, Phuc Huynh Pham, Kim Thi-Thanh Nguyen, Tham Thi Nguyen, Sieu Khai Huynh, Luan Thanh Nguyen, Tin Van Huynh, Kiet Van Nguyen
In this paper, we present a process of building a social listening system based on aspect-based sentiment analysis in Vietnamese from creating a dataset to building a real application.
Aspect-Based Sentiment Analysis • Aspect-Based Sentiment Analysis (ABSA) • +5
no code implementations • 19 May 2021 • Phong Nguyen-Thuan Do, Nhat Duy Nguyen, Tin Van Huynh, Kiet Van Nguyen, Anh Gia-Tuan Nguyen, Ngan Luu-Thuy Nguyen
We propose a conversion algorithm to create the dataset for sentence extraction-based machine reading comprehension and three types of approaches for sentence extraction-based machine reading comprehension in Vietnamese.
1 code implementation • 4 May 2021 • Son T. Luu, Mao Nguyen Bui, Loi Duc Nguyen, Khiem Vinh Tran, Kiet Van Nguyen, Ngan Luu-Thuy Nguyen
To help machines understand conversation texts, we present UIT-ViCoQA, a new corpus for conversational machine reading comprehension in the Vietnamese language.
no code implementations • 24 Apr 2021 • Nhung Thi-Hong Nguyen, Phuong Phan-Dieu Ha, Luan Thanh Nguyen, Kiet Van Nguyen, Ngan Luu-Thuy Nguyen
Customer product reviews play a role in improving the quality of products and services for business organizations or their brands.
Complaint Comment Classification • Constructive Comment Classification • +2
no code implementations • SEMEVAL 2021 • Phu Gia Hoang, Luan Thanh Nguyen, Kiet Van Nguyen
The increase in toxic comments in online spaces is having tremendous effects on vulnerable users.
2 code implementations • 22 Mar 2021 • Son T. Luu, Kiet Van Nguyen, Ngan Luu-Thuy Nguyen
On social media, hate speech has become a critical problem for social network users.
Hate Speech Detection • Vietnamese Social Media Text Processing
no code implementations • 18 Mar 2021 • Luan Thanh Nguyen, Kiet Van Nguyen, Ngan Luu-Thuy Nguyen
For these tasks, we propose a system for constructive and toxic speech detection using PhoBERT, the state-of-the-art transfer learning model in Vietnamese NLP.
Constructive Comment Classification • General Classification • +2
1 code implementation • 24 Feb 2021 • Duc-Vu Nguyen, Kiet Van Nguyen, Ngan Luu-Thuy Nguyen
In this paper, we implement this idea to improve word segmentation and part-of-speech tagging for the Vietnamese language by employing a simplified constituency parser.
no code implementations • VLSP 2020 • Kim Thi-Thanh Nguyen, Kiet Van Nguyen
This paper presents the system that we propose for the Reliable Intelligence Identification on Vietnamese Social Network Sites (ReINTEL) task of the Vietnamese Language and Speech Processing 2020 (VLSP 2020) Shared Task.
no code implementations • 21 Oct 2020 • Huy Quoc To, Kiet Van Nguyen, Ngan Luu-Thuy Nguyen, Anh Gia-Tuan Nguyen
We propose a new dataset for gender prediction based on Vietnamese names.
no code implementations • 19 Oct 2020 • Tuan-Vi Tran, Xuan-Thien Pham, Duc-Vu Nguyen, Kiet Van Nguyen, Ngan Luu-Thuy Nguyen
In this work, we use a span-based approach for Vietnamese constituency parsing.
no code implementations • 30 Sep 2020 • Kiet Van Nguyen, Duc-Vu Nguyen, Anh Gia-Tuan Nguyen, Ngan Luu-Thuy Nguyen
Due to the lack of benchmark datasets for Vietnamese, we present the Vietnamese Question Answering Dataset (UIT-ViQuAD), a new dataset for evaluating MRC models on Vietnamese, a low-resource language.
no code implementations • PACLIC 2020 • Huy Duc Huynh, Hang Thi-Thuy Do, Kiet Van Nguyen, Ngan Luu-Thuy Nguyen
There are various studies in this field in many languages, but studies on the Vietnamese language remain limited.
1 code implementation • 25 Sep 2020 • Son T. Luu, Kiet Van Nguyen, Ngan Luu-Thuy Nguyen
Thus, when collecting data about user comments on social networks, the data is usually skewed toward one label, which makes the dataset imbalanced and degrades the model's performance.
no code implementations • 23 Sep 2020 • Khang Phuoc-Quy Nguyen, Kiet Van Nguyen
Textual emotion recognition has been a promising research topic in recent years.
no code implementations • 7 Sep 2020 • Khiem Vinh Tran, Hao Phu Phan, Kiet Van Nguyen, Ngan Luu-Thuy Nguyen
Recently, COVID-19 has affected a variety of real-life aspects of the world and led to dreadful consequences.
no code implementations • 20 Aug 2020 • Son T. Luu, Kiet Van Nguyen, Anh Gia-Tuan Nguyen, Ngan Luu-Thuy Nguyen
In this paper, we conduct several experiments on a neural network-based model to understand the impact of word representation on Vietnamese multiple-choice machine reading comprehension.
no code implementations • 19 Jun 2020 • Kiet Van Nguyen, Tin Van Huynh, Duc-Vu Nguyen, Anh Gia-Tuan Nguyen, Ngan Luu-Thuy Nguyen
In particular, we develop a process for creating a corpus for Vietnamese machine reading comprehension.
1 code implementation • 14 Jun 2020 • Duc-Vu Nguyen, Dang Van Thin, Kiet Van Nguyen, Ngan Luu-Thuy Nguyen
In this paper, we approach Vietnamese word segmentation as a binary classification task using a Support Vector Machine classifier.
3 code implementations • 1 Feb 2020 • Quan Hoang Lam, Quang Duy Le, Kiet Van Nguyen, Ngan Luu-Thuy Nguyen
This paper contributes to research on the image captioning task by extending the dataset to a different language: Vietnamese.
1 code implementation • 31 Jan 2020 • Son T. Luu, Hung P. Nguyen, Kiet Van Nguyen, Ngan Luu-Thuy Nguyen
Consequently, we compare traditional machine learning and deep learning models on a large dataset of users' comments on social networks in Vietnamese and identify the advantages and disadvantages of each model by comparing their performance in terms of F1-score; we then pick the two best-performing models, one from the traditional machine learning models and one from the deep neural models.
no code implementations • 16 Jan 2020 • Kiet Van Nguyen, Khiem Vinh Tran, Son T. Luu, Anh Gia-Tuan Nguyen, Ngan Luu-Thuy Nguyen
Although Vietnamese is the 17th most popular native-speaker language in the world, there are not many research studies on Vietnamese machine reading comprehension (MRC), the task of understanding a text and answering questions about it.
no code implementations • 27 Dec 2019 • Tin Van Huynh, Kiet Van Nguyen, Ngan Luu-Thuy Nguyen, Anh Gia-Tuan Nguyen
In addition, we also proposed a simple and effective ensemble model combining different deep neural network models.
no code implementations • 21 Nov 2019 • Vong Anh Ho, Duong Huynh-Cong Nguyen, Danh Hoang Nguyen, Linh Thi-Van Pham, Duc-Vu Nguyen, Kiet Van Nguyen, Ngan Luu-Thuy Nguyen
In this task, the result is expressed neither as a polarity (positive or negative) nor as a rating (from 1 to 5), but at a more detailed level of analysis, using emotions such as sadness, enjoyment, anger, disgust, fear, and surprise.
no code implementations • 17 Nov 2019 • Binh An Nguyen, Kiet Van Nguyen, Ngan Luu-Thuy Nguyen
In recent years, Vietnamese Named Entity Recognition (NER) systems have had a great breakthrough when using Deep Neural Network methods.
no code implementations • 17 Nov 2019 • Phu X. V. Nguyen, Tham T. T. Hong, Kiet Van Nguyen, Ngan Luu-Thuy Nguyen
Student feedback is an important source of students' opinions for improving the quality of training activities.
no code implementations • 9 Nov 2019 • Kiet Van Nguyen, Ngan Luu-Thuy Nguyen
Dependency parsing is needed in different applications of natural language processing.
1 code implementation • 9 Nov 2019 • Hang Thi-Thuy Do, Huy Duc Huynh, Kiet Van Nguyen, Ngan Luu-Thuy Nguyen, Anh Gia-Tuan Nguyen
In this paper, we describe our system, which participated in the Hate Speech Detection on Social Networks shared task of the VLSP 2019 evaluation campaign.
no code implementations • 9 Nov 2019 • Kiet Van Nguyen, Ngan Luu-Thuy Nguyen
In recent years, dependency parsing has been a fascinating research topic with many applications in natural language processing.
1 code implementation • 9 Nov 2019 • Tin Van Huynh, Vu Duc Nguyen, Kiet Van Nguyen, Ngan Luu-Thuy Nguyen, Anh Gia-Tuan Nguyen
In recent years, Hate Speech Detection has become one of the interesting fields in natural language processing or computational linguistics.
Hate Speech Detection • Vietnamese Social Media Text Processing
no code implementations • 30 Oct 2019 • Binh Duc Nguyen, Kiet Van Nguyen, Ngan Luu-Thuy Nguyen
In Vietnamese dependency parsing, several methods have been proposed.