Using Author Embeddings to Improve Tweet Stance Classification

WS 2018 · Adrian Benton, Mark Dredze

Many social media classification tasks analyze the content of a message but do not consider its context. For example, in tweet stance classification, where a tweet is categorized according to the viewpoint it espouses, the expressed viewpoint depends on latent beliefs held by the user. In this paper we investigate whether incorporating knowledge about the author can improve tweet stance classification. Furthermore, since author information and embeddings are often unavailable for labeled training examples, we propose a semi-supervised pre-training method to predict user embeddings. Although the neural stance classifiers we learn are often outperformed by a baseline SVM, author embedding pre-training yields improvements over a non-pre-trained neural network on four out of five domains in the SemEval 2016 6A tweet stance classification task. On a gun control tweet stance classification dataset, improvements from pre-training are apparent only when training data is limited.
