Paper2vec: Citation-Context Based Document Distributed Representation for Scholar Recommendation

20 Mar 2017  ·  Han Tian, Hankz Hankui Zhuo ·

Due to the availability of references of research papers and the rich information contained in papers, various citation analysis approaches have been proposed to identify similar documents for scholar recommendation. Despite of the success of previous approaches, they are, however, based on co-occurrence of items. Once there are no co-occurrence items available in documents, they will not work well. Inspired by distributed representations of words in the literature of natural language processing, we propose a novel approach to measuring the similarity of papers based on distributed representations learned from the citation context of papers. We view the set of papers as the vocabulary, define the weighted citation context of papers, and convert it to weight matrix similar to the word-word cooccurrence matrix in natural language processing. After that we explore a variant of matrix factorization approach to train distributed representations of papers on the matrix, and leverage the distributed representations to measure similarities of papers. In the experiment, we exhibit that our approach outperforms state-of-theart citation-based approaches by 25%, and better than other distributed representation based methods.

PDF Abstract
No code implementations yet. Submit your code now

Tasks


Datasets


  Add Datasets introduced or used in this paper

Results from the Paper


  Submit results from this paper to get state-of-the-art GitHub badges and help the community compare results to other papers.

Methods


No methods listed for this paper. Add relevant methods here