Extracting Social Networks from Literary Text with Word Embedding Tools

WS 2016 · Gerhard Wohlgenannt, Ekaterina Chernyak, Dmitry Ilvovsky

This paper extracts a social network from a literary text. The network shows how frequently the characters interact and how similar their social behavior is. Two types of similarity measures are used: the first applies co-occurrence statistics, while the second uses cosine similarity on different types of word embedding vectors. The results are evaluated with a paid micro-task crowdsourcing survey. The experiments suggest that specific types of word embeddings, such as word2vec, are well suited to the task and to the specific circumstances of literary fiction text.
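The two kinds of similarity measure named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the sentence-level co-occurrence window, the function names, the character names, and the three-dimensional toy vectors are all assumptions made for illustration. In practice the vectors would come from a trained embedding model such as word2vec.

```python
from collections import Counter
from itertools import combinations
import math


def cooccurrence_counts(sentences, characters):
    """Count how often each pair of characters appears in the same sentence.

    Assumption: one sentence is the co-occurrence window.
    """
    counts = Counter()
    for sentence in sentences:
        tokens = set(sentence.lower().split())
        # Sort so that each pair is always stored in the same order.
        present = sorted(c for c in characters if c.lower() in tokens)
        for a, b in combinations(present, 2):
            counts[(a, b)] += 1
    return counts


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # similarity is undefined for a zero vector; return 0 by convention
    return dot / (norm_u * norm_v)


# Hypothetical toy corpus and character list.
sentences = [
    "alice met bob at the harbour",
    "bob wrote to carol",
    "alice and bob argued again",
]
characters = ["alice", "bob", "carol"]

# Edge weights of the interaction network: pair -> co-occurrence count.
edges = cooccurrence_counts(sentences, characters)

# Hypothetical embedding vectors; real ones would be loaded from a trained model.
vectors = {
    "alice": [1.0, 0.0, 1.0],
    "bob": [1.0, 0.0, 1.0],
    "carol": [0.0, 1.0, 0.0],
}
similarity_alice_bob = cosine_similarity(vectors["alice"], vectors["bob"])
```

Here the co-occurrence counts serve as edge weights for how often characters interact, while the cosine score compares characters by the contexts their embedding vectors encode. The two measures can therefore rank the same pair of characters differently.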
