Multi-View Approach to Suggest Moderation Actions in Community Question Answering Sites

With thousands of new questions posted every day on popular Q&A websites, there is a need for automated and accurate software solutions to replace manual moderation. In this paper, we address the critical drawbacks of crowdsourcing moderation actions in Q&A communities and demonstrate the ability to automate moderation using the latest machine learning models. From a technical point, we propose a multi-view approach that generates three distinct feature groups that examine a question from three different perspectives: 1) question-related features extracted using a BERT-based regression model; 2) context-related features extracted using a named-entity-recognition model; and 3) general lexical features derived using statistical and analytical methods. As a last step, we train a gradient boosting classifier to predict a moderation action. For evaluation purposes, we created a new dataset consisting of 60,000 Stack Overflow questions classified into three choices of moderation actions. Based on cross-validation on the novel dataset, our approach reaches 95.6% accuracy as a multiclass task and outperforms all state-of-the-art and previously-published models. Our results clearly demonstrate the high influence of our feature generation components on the overall success of the classifier.

PDF

Datasets


Introduced in the Paper:

60k Stack Overflow Questions
Task Dataset Model Metric Name Metric Value Global Rank Benchmark
Question Quality Assessment 60k Stack Overflow Questions Multi-view approach F1 Score .917 # 1

Methods


No methods listed for this paper. Add relevant methods here