A Novel Metric for Evaluating Semantics Preservation

ACL ARR October 2021 · Anonymous

In this paper, we leverage pre-trained language models (PLMs) to precisely evaluate how well sentence editing preserves semantics. Our metric, Neighboring Distribution Divergence (NDD), measures the disturbance that an edit causes to the predicted distributions of neighboring words under a masked language model (MLM). NDD is capable of detecting subtle semantic changes that text-similarity metrics easily miss. By exploiting this property of NDD, we implement an unsupervised, and even training-free, algorithm for extractive sentence compression. We show that the NDD-based algorithm outperforms previous perplexity-based unsupervised algorithms by a large margin. To further explore interpretability, we evaluate NDD by pruning syntactic dependency treebanks and also apply NDD to predicate detection.
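The abstract does not give the exact formula, but the core idea can be sketched: mask each word that is kept by the edit ("neighboring" word), compare the MLM's predicted distribution at that position before and after the edit, and aggregate the divergence. The sketch below is a minimal, hypothetical implementation of that idea, not the authors' code. It assumes KL divergence as the divergence measure, a simple common-prefix/suffix alignment to find the shared neighboring tokens, and `bert-base-uncased` as the MLM; the helpers `masked_distributions` and `ndd` are names introduced here for illustration.

```python
# Hypothetical sketch of an NDD-style metric (assumptions: KL divergence,
# prefix/suffix token alignment, bert-base-uncased as the MLM).
import torch
import torch.nn.functional as F
from transformers import AutoTokenizer, AutoModelForMaskedLM

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForMaskedLM.from_pretrained("bert-base-uncased")
model.eval()

def masked_distributions(tokens, positions):
    """Log-distribution the MLM predicts at each position, masking one at a time."""
    dists = []
    for pos in positions:
        masked = tokens.copy()
        masked[pos] = tokenizer.mask_token_id
        input_ids = torch.tensor([masked])
        with torch.no_grad():
            logits = model(input_ids).logits[0, pos]
        dists.append(F.log_softmax(logits, dim=-1))
    return dists

def ndd(original: str, edited: str) -> float:
    """Sum of KL divergences between MLM distributions of the neighboring
    (shared) tokens in the original sentence and its edited version."""
    orig_ids = tokenizer(original, add_special_tokens=True)["input_ids"]
    edit_ids = tokenizer(edited, add_special_tokens=True)["input_ids"]
    # Align neighboring tokens via the longest common prefix and suffix.
    prefix = 0
    while (prefix < min(len(orig_ids), len(edit_ids))
           and orig_ids[prefix] == edit_ids[prefix]):
        prefix += 1
    suffix = 0
    while (suffix < min(len(orig_ids), len(edit_ids)) - prefix
           and orig_ids[-1 - suffix] == edit_ids[-1 - suffix]):
        suffix += 1
    # Skip [CLS] (position 0) and [SEP] (last position).
    orig_pos = list(range(1, prefix)) + list(range(len(orig_ids) - suffix, len(orig_ids) - 1))
    edit_pos = list(range(1, prefix)) + list(range(len(edit_ids) - suffix, len(edit_ids) - 1))
    p = masked_distributions(orig_ids, orig_pos)
    q = masked_distributions(edit_ids, edit_pos)
    # KL(p || q) summed over shared positions; F.kl_div takes log q as input
    # and, with log_target=True, log p as target.
    return sum(F.kl_div(qi, pi, log_target=True, reduction="sum").item()
               for pi, qi in zip(p, q))

score = ndd("the cat sat on the mat", "the cat sat on a mat")
print(score)  # lower suggests the edit disturbs neighboring predictions less
```

Under this reading, the training-free compression algorithm from the abstract could plausibly work by greedily deleting the span whose removal yields the smallest NDD at each step; the actual search procedure is not specified here.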
