Talla at SemEval-2017 Task 3: Identifying Similar Questions Through Paraphrase Detection

SEMEVAL 2017 · Byron Galbraith, Bhanu Pratap, Daniel Shank

This paper describes our approach to the SemEval-2017 shared task of determining question-question similarity in a community question-answering setting (Task 3B). We extracted both syntactic and semantic similarity features between candidate questions, performed pairwise-preference learning to optimize for ranking order, and then trained a random forest classifier to predict whether the candidate questions are paraphrases of each other. This approach achieved a mean average precision (MAP) of 45.7% on the test set, out of a maximum achievable 67.0%.
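For readers unfamiliar with the metric, the following is a minimal sketch of how MAP can be computed over ranked candidate lists. It is not the official task scorer. The function names and the toy data are hypothetical, and the choice to score a query with no relevant candidates as zero is an assumption made here for illustration. Under that assumption, even a perfect ranker cannot reach 100%, which is one way an upper bound such as the one reported above can arise.

```python
from typing import List


def average_precision(relevance: List[bool]) -> float:
    """Average precision for one ranked list.

    relevance[i] is True if the candidate at rank i + 1 is relevant.
    Assumption: a list with no relevant candidates scores 0.0.
    """
    hits = 0
    precision_sum = 0.0
    for rank, is_relevant in enumerate(relevance, start=1):
        if is_relevant:
            hits += 1
            # Precision at this rank, counted only at relevant positions.
            precision_sum += hits / rank
    return precision_sum / hits if hits else 0.0


def mean_average_precision(rankings: List[List[bool]]) -> float:
    """Mean of the per-query average precision values."""
    if not rankings:
        return 0.0
    return sum(average_precision(r) for r in rankings) / len(rankings)


# Toy data (hypothetical): the second query has no relevant candidates,
# so a perfectly ranked first query still yields a MAP below 1.0.
perfect_but_capped = [[True, True, False], [False, False, False]]
print(mean_average_precision(perfect_but_capped))
```

In this toy case the first query is ranked perfectly, yet the mean is halved by the second query, which has nothing relevant to retrieve.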
