Chinese Word Segmentation

48 papers with code • 6 benchmarks • 3 datasets

Chinese word segmentation is the task of splitting Chinese text (i.e. a sequence of Chinese characters) into words (Source: www.nlpprogress.com).
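As a rough illustration of the task, the sketch below segments a sentence with forward maximum matching, a classic dictionary-based baseline. It is not the method of any paper listed on this page. The function name `forward_max_match`, the `max_len` parameter, and the toy lexicon are all assumptions made for the example.

```python
def forward_max_match(text, lexicon, max_len=4):
    """Greedily take the longest lexicon word starting at each position."""
    words = []
    i = 0
    while i < len(text):
        # Try the longest candidate first, shrinking down to a single character.
        for size in range(min(max_len, len(text) - i), 0, -1):
            candidate = text[i:i + size]
            # A single character is always accepted as a fallback word.
            if size == 1 or candidate in lexicon:
                words.append(candidate)
                i += size
                break
    return words


# Hypothetical toy lexicon; real systems use much larger dictionaries or learned models.
TOY_LEXICON = {"我们", "喜欢", "自然", "语言", "自然语言", "处理"}

segmented = forward_max_match("我们喜欢自然语言处理", TOY_LEXICON)
print(" / ".join(segmented))
```

Because the matcher always prefers the longest entry, "自然语言" is kept as one word rather than split into "自然" and "语言". A different segmentation criterion could reasonably prefer the split, which is one reason the task is harder than dictionary lookup.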

Most implemented papers

Adversarial Transfer Learning for Chinese Named Entity Recognition with Self-Attention Mechanism

CPF-NLPR/AT4ChineseNER EMNLP 2018

However, existing methods for Chinese NER either do not exploit word boundary information from CWS or cannot filter the specific information of CWS.

Unsupervised Neural Word Segmentation for Chinese via Segmental Language Modeling

Edward-Sun/SLM EMNLP 2018

As far as we know, we are the first to propose a neural model for unsupervised CWS and to achieve performance competitive with state-of-the-art statistical models on four different datasets from the SIGHAN 2005 bakeoff.

Subword Encoding in Lattice LSTM for Chinese Word Segmentation

jiesutd/SubwordEncoding-CWS NAACL 2019

Previous lattice LSTM models take word embeddings as the lexicon input; we prove that subword encoding can give comparable performance and has the benefit of not relying on any external segmentor.

Improving Cross-Domain Chinese Word Segmentation with Word Embeddings

vatile/CWS-NAACL2019 NAACL 2019

Cross-domain Chinese Word Segmentation (CWS) remains a challenge despite recent progress in neural-based CWS.

A Graph-based Model for Joint Chinese Word Segmentation and Dependency Parsing

fastnlp/JointCwsParser TACL 2020

Our graph-based joint model achieves better performance than previous joint models and state-of-the-art results in both Chinese word segmentation and dependency parsing.

A Concise Model for Multi-Criteria Chinese Word Segmentation with Transformer Encoder

acphile/MCCWS Findings of the Association for Computational Linguistics 2020

Multi-criteria Chinese word segmentation (MCCWS) aims to exploit the relations among the multiple heterogeneous segmentation criteria and further improve the performance of each single criterion.

Attention Is All You Need for Chinese Word Segmentation

akibcmi/SAMS EMNLP 2020

Taking the greedy decoding algorithm as it should be, this work focuses on further strengthening the model itself for Chinese word segmentation (CWS), which results in an even faster and more accurate CWS model.

Neural Chinese Word Segmentation as Sequence to Sequence Translation

SourcecodeSharing/CWSpostediting 29 Nov 2019

In this paper, we cast CWS as a sequence translation problem and propose a novel sequence-to-sequence CWS model with an attention-based encoder-decoder framework.

Improving Chinese Word Segmentation with Wordhood Memory Networks

SVAIGBA/WMSeg ACL 2020

Contextual features always play an important role in Chinese word segmentation (CWS).

Joint Chinese Word Segmentation and Part-of-speech Tagging via Two-way Attentions of Auto-analyzed Knowledge

SVAIGBA/TwASP ACL 2020

Chinese word segmentation (CWS) and part-of-speech (POS) tagging are important fundamental tasks for Chinese language processing, where joint learning of them is an effective one-step solution for both tasks.