TASK	DATASET	MODEL	METRIC NAME	METRIC VALUE	GLOBAL RANK
Named Entity Recognition In Vietnamese	PhoNER COVID19	ViHealthBERT	F1 (%)	96.7	# 3

ViHealthBERT: Pre-trained Language Models for Vietnamese in Health Text Mining

LREC 2022 · Minh, Nguyen and Tran, Vu Hoang and Hoang, Vu and Ta, Huy Duc and Bui, Trung Huu and Truong, Steven Quoc Hung

Pre-trained language models have become crucial to achieving competitive results across many Natural Language Processing (NLP) problems. The number of monolingual pre-trained models for low-resource languages has increased significantly. However, most of them target the general domain, and strong baseline language models for specific domains remain limited. We introduce ViHealthBERT, the first domain-specific pre-trained language model for Vietnamese healthcare. Our model achieves strong results, outperforming general-domain language models on all health-related datasets. Moreover, we present Vietnamese healthcare datasets for two tasks: Acronym Disambiguation (AD) and Frequently Asked Questions (FAQ) Summarization. We release ViHealthBERT to facilitate future research and downstream applications in domain-specific Vietnamese NLP.
