Topic Models
210 papers with code • 6 benchmarks • 12 datasets
A topic model is a type of statistical model for discovering the abstract "topics" that occur in a collection of documents. Topic modeling is a frequently used text-mining tool for the discovery of hidden semantic structures in a text body.
Most implemented papers
A Coefficient of Determination for Probabilistic Topic Models
This research proposes a new (old) metric for evaluating goodness of fit in topic models, the coefficient of determination, or $R^2$.
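As a rough illustration of how a coefficient of determination could be computed for a topic model, the sketch below uses the familiar form $R^2 = 1 - SSE/SST$ with squared Euclidean distances between word-count vectors. It assumes that a document's fitted value is its length times the mixture of topic-word distributions. The function name `r_squared` and the arguments `counts`, `theta`, and `phi` are hypothetical, and the paper's exact definition may differ.

```python
def r_squared(counts, theta, phi):
    """Sketch of a coefficient of determination for a fitted topic model.

    counts: documents x vocabulary matrix of observed word counts
    theta:  documents x topics matrix, each row a topic distribution
    phi:    topics x vocabulary matrix, each row a word distribution
    """
    n_docs, n_words = len(counts), len(counts[0])
    n_topics = len(phi)

    # Mean document: average count of each word across the corpus.
    mean = [sum(doc[w] for doc in counts) / n_docs for w in range(n_words)]

    sse = 0.0  # squared distance from each document to its fitted value
    sst = 0.0  # squared distance from each document to the mean document
    for d, doc in enumerate(counts):
        length = sum(doc)
        for w in range(n_words):
            # Assumed fitted value: document length times mixture probability.
            expected = length * sum(theta[d][k] * phi[k][w] for k in range(n_topics))
            sse += (doc[w] - expected) ** 2
            sst += (doc[w] - mean[w]) ** 2
    return 1.0 - sse / sst


counts = [[4, 0], [0, 4]]
phi = [[1.0, 0.0], [0.0, 1.0]]

# A model that reproduces every document exactly.
perfect = r_squared(counts, [[1.0, 0.0], [0.0, 1.0]], phi)

# A model that predicts the mean document for every input.
baseline = r_squared(counts, [[0.5, 0.5], [0.5, 0.5]], phi)
```

In this toy corpus the first model scores 1 and the second scores 0, which matches the usual reading of $R^2$ as the share of variation explained beyond a mean-only predictor.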
Cross-lingual Contextualized Topic Models with Zero-shot Learning
They all cover the same content, but the linguistic differences make it impossible to use traditional, bag-of-word-based topic models.
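To see why a bag-of-words representation does not transfer across languages, consider the minimal sketch below. The two sentences and the helper `bag_of_words` are made up for illustration and are not taken from the paper. Both sentences describe the same scene, yet their vocabularies do not overlap, so a model built on word counts has nothing to link them.

```python
from collections import Counter


def bag_of_words(text):
    """Map a document to word counts, discarding word order."""
    return Counter(text.lower().split())


# Illustrative sentences with the same meaning in two languages.
english = bag_of_words("the cat sleeps on the sofa")
italian = bag_of_words("il gatto dorme sul divano")

# Words that appear in both representations.
shared = set(english) & set(italian)

print(english["the"])  # 2
print(shared)          # set()
```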
Top2Vec: Distributed Representations of Topics
Distributed representations of documents and words have gained popularity due to their ability to capture semantics of words and documents.
Using Transformer based Ensemble Learning to classify Scientific Articles
The first one is a RoBERTa [10] based model built over these abstracts.
Is Automated Topic Model Evaluation Broken?: The Incoherence of Coherence
To address the standardization gap, we systematically evaluate a dominant classical model and two state-of-the-art neural models on two commonly used datasets.
Contrastive Learning for Neural Topic Model
Recent empirical studies show that adversarial topic models (ATM) can successfully capture semantic patterns of the document by differentiating a document with another dissimilar sample.
Improving Contextualized Topic Models with Negative Sampling
Topic modeling has emerged as a dominant method for exploring large document collections.
A Survey on Neural Topic Models: Methods, Applications, and Challenges
In this paper, we present a comprehensive survey on neural topic models concerning methods, applications, and challenges.
Software Framework for Topic Modelling with Large Corpora
Large corpora are ubiquitous in today’s world and memory quickly becomes the limiting factor in practical applications of the Vector Space Model (VSM).
Supervised Topic Models
We introduce supervised latent Dirichlet allocation (sLDA), a statistical model of labelled documents.