A semi-supervised model for Persian rumor verification based on content information

A rumor is a collective attempt to interpret a vague but attractive situation by using the power of words. In social networks, false rumors may have significantly different contextual characteristics from true rumors at the lexical, syntactic, and semantic levels. This study therefore presents BERT-SAWS, a semi-supervised learning model for early verification of Persian rumors that examines content-based and context-based features from three views: Contextual Word Embeddings (CWE), speech act, and Writing Style (WS). The model is built by loading pre-trained Bidirectional Encoder Representations from Transformers (BERT) as an unsupervised language representation, fine-tuning it on a small Persian rumor dataset, and combining it with a supervised learning model to provide an enriched text representation of the rumor content. This representation gives the model a better understanding of rumor language and lets it verify rumors better than baseline models for two reasons: (i) it verifies rumors early by focusing on content-based and context-based features of the source rumor; and (ii) it overcomes the shortage of training data for deep neural networks by loading pre-trained BERT, fine-tuning it on the Persian rumor dataset, and combining it with speech act and WS-based features. Empirical results on Twitter and Telegram datasets show that BERT-SAWS improves classifier performance by 2% to 18%. This indicates that speech act and WS features, alongside semantic contextual vectors, are helpful in the rumor verification task.
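The three-view representation described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the speech-act label set, the four writing-style features, and the function names are hypothetical, and `contextual_embedding` is a hashing stand-in for a fine-tuned BERT sentence vector so that the example runs without downloading a model. It only shows how the three views might be concatenated into one feature vector.

```python
import hashlib
from typing import List

# Hypothetical speech-act label set; the paper's actual taxonomy may differ.
SPEECH_ACTS = ["assertion", "question", "request", "expression", "other"]


def contextual_embedding(text: str, dim: int = 8) -> List[float]:
    """Stand-in for a fine-tuned BERT sentence vector (NOT a real BERT call).

    Hashes each whitespace token into one of `dim` buckets and normalises
    the counts to sum to 1, so the sketch needs no model download.
    """
    vec = [0.0] * dim
    for token in text.split():
        digest = hashlib.md5(token.encode("utf-8")).digest()
        vec[digest[0] % dim] += 1.0
    total = sum(vec)
    return [v / total for v in vec] if total else vec


def speech_act_features(label: str) -> List[float]:
    """One-hot encode a speech-act label assumed to come from an upstream tagger."""
    if label not in SPEECH_ACTS:
        label = "other"
    return [1.0 if label == act else 0.0 for act in SPEECH_ACTS]


def writing_style_features(text: str) -> List[float]:
    """A few illustrative surface features (assumed, not the paper's WS set)."""
    tokens = text.split()
    n_tokens = len(tokens)
    avg_len = sum(len(t) for t in tokens) / n_tokens if n_tokens else 0.0
    return [
        float(n_tokens),                            # token count
        avg_len,                                    # mean token length
        float(text.count("!")),                     # exclamation marks
        float(text.count("?") + text.count("؟")),   # Latin and Persian question marks
    ]


def fuse_views(text: str, speech_act: str, dim: int = 8) -> List[float]:
    """Concatenate the three views into a single feature vector."""
    return (
        contextual_embedding(text, dim)
        + speech_act_features(speech_act)
        + writing_style_features(text)
    )


features = fuse_views("Is it true that schools are closed tomorrow?", "question")
```

In a full pipeline, the fused vector would be passed to a supervised classifier, and the stand-in embedding would be replaced by the output of a BERT model fine-tuned on the rumor dataset.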


Results from the Paper


Task | Dataset | Model | Metric Name | Metric Value | Global Rank
Rumour Detection | Sepehr_RumTel01 | BERT-SAWS | F-Measure | 0.934 | # 2
