Chinese Word Segmentation

48 papers with code • 6 benchmarks • 3 datasets

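Chinese word segmentation is the task of splitting Chinese text (i.e., a sequence of Chinese characters) into words (Source: www.nlpprogress.com). Because written Chinese does not mark word boundaries with spaces, those boundaries have to be inferred.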
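
To make the task concrete, here is a minimal sketch of forward maximum matching, a classic dictionary-based baseline. It is not the method of any paper listed on this page. The function name `forward_max_match`, the `max_len` parameter, and the tiny `toy_vocab` dictionary are all assumptions made up for illustration.

```python
def forward_max_match(text, vocab, max_len=4):
    """Greedy left-to-right segmentation.

    At each position, take the longest substring (up to max_len characters)
    found in vocab; fall back to a single character if nothing matches.
    """
    words = []
    i = 0
    while i < len(text):
        # Try the longest candidate first, then shrink one character at a time.
        for size in range(min(max_len, len(text) - i), 0, -1):
            candidate = text[i:i + size]
            if size == 1 or candidate in vocab:
                words.append(candidate)
                i += size
                break
    return words


# Hypothetical toy dictionary; a real system would use a large lexicon
# or a learned model instead.
toy_vocab = {"我", "喜欢", "自然", "语言", "自然语言", "处理"}

segmented = forward_max_match("我喜欢自然语言处理", toy_vocab)
print(segmented)  # ['我', '喜欢', '自然语言', '处理']
```

The greedy rule prefers "自然语言" over the shorter "自然", which shows why segmentation is ambiguous: the same character sequence admits several valid splits, and a dictionary alone cannot always pick the intended one.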

Latest papers with no code

An Effective Incorporating Heterogeneous Knowledge Curriculum Learning for Sequence Labeling

no code yet • 21 Feb 2024

To address this challenge, we propose a two-stage curriculum learning (TCL) framework specifically designed for sequence labeling tasks.

Incorporating Deep Syntactic and Semantic Knowledge for Chinese Sequence Labeling with GCN

no code yet • 3 Jun 2023

Recently, it is quite common to integrate Chinese sequence labeling results to enhance syntactic and semantic parsing.

Joint Chinese Word Segmentation and Span-based Constituency Parsing

no code yet • 3 Nov 2022

In constituency parsing, span-based decoding is an important direction.

Mining Word Boundaries in Speech as Naturally Annotated Word Segmentation Data

no code yet • 31 Oct 2022

Inspired by early research on exploring naturally annotated data for Chinese word segmentation (CWS), and also by recent research on integration of speech and text processing, this work for the first time proposes to mine word boundaries from parallel speech/text data.

That Slepen Al the Nyght with Open Ye! Cross-era Sequence Segmentation with Switch-memory

no code yet • ACL 2022

The evolution of language follows the rule of gradual change.

A New Evaluation Method: Evaluation Data and Metrics for Chinese Grammar Error Correction

no code yet • 30 Apr 2022

In terms of the reference-based metric, we introduce sentence-level accuracy and char-level BLEU to evaluate the corrected sentences.

Chinese Word Segmentation with Heterogeneous Graph Neural Network

no code yet • 22 Jan 2022

In recent years, deep learning has achieved significant success in the Chinese word segmentation (CWS) task.

Joint Chinese Word Segmentation and Part-of-speech Tagging via Two-stage Span Labeling

no code yet • PACLIC 2021

Previous studies on joint Chinese word segmentation and part-of-speech tagging mainly follow the character-based tagging model focusing on modeling n-gram features.

Green CWS: Extreme Distillation and Efficient Decode Method Towards Industrial Application

no code yet • 17 Nov 2021

Benefiting from the strong ability of the pre-trained model, the research on Chinese Word Segmentation (CWS) has made great progress in recent years.

Unsupervised Chinese Word Segmentation with BERT Oriented Probing and Transformation

no code yet • ACL ARR November 2021

Word Segmentation is a fundamental step for understanding Chinese language.