3 code implementations • 3 Jul 2018 • Gibran Fuentes-Pineda, Ivan Vladimir Meza-Ruiz
This paper describes an alternative approach to topic discovery based on Min-Hashing, which can handle massive text corpora and large vocabularies on modest hardware and does not require the number of topics to be fixed in advance.
1 code implementation • 6 Sep 2015 • Gibran Fuentes-Pineda, Ivan Vladimir Meza-Ruiz
We present Sampled Weighted Min-Hashing (SWMH), a randomized approach to automatically mine topics from large-scale corpora.
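For readers unfamiliar with the underlying primitive, the following is a minimal sketch of plain Min-Hashing only, not of SWMH or of the authors' implementation. It assumes each word is represented by the set of document IDs it occurs in, so that words with similar document sets tend to receive matching min-hash values. The toy data, the function names (`make_hashes`, `minhash_signature`, `estimate_jaccard`), and the choice of linear hash functions are all illustrative assumptions.

```python
import random

# Hypothetical helper: draw random linear hash functions h(x) = (a*x + b) mod p.
def make_hashes(num_hashes, seed=0):
    rng = random.Random(seed)
    p = 2_147_483_647  # the Mersenne prime 2^31 - 1
    return [(rng.randrange(1, p), rng.randrange(0, p), p)
            for _ in range(num_hashes)]

# The signature keeps, for every hash function, the smallest hash value
# observed over the items of the set.
def minhash_signature(items, hashes):
    return tuple(min((a * x + b) % p for x in items) for a, b, p in hashes)

# The fraction of matching signature positions estimates the Jaccard
# similarity of the two underlying sets.
def estimate_jaccard(sig_a, sig_b):
    matches = sum(1 for u, v in zip(sig_a, sig_b) if u == v)
    return matches / len(sig_a)

# Exact Jaccard similarity, for comparison with the estimate.
def jaccard(a, b):
    return len(a & b) / len(a | b)

# Toy inverted lists (assumed data): word -> IDs of documents containing it.
word_docs = {
    "neuron": {1, 2, 3, 4, 5, 6, 7, 8},
    "synapse": {1, 2, 3, 4, 5, 6, 7, 9},
    "election": {20, 21, 22, 23},
}

hashes = make_hashes(256)
sigs = {word: minhash_signature(docs, hashes)
        for word, docs in word_docs.items()}

if __name__ == "__main__":
    for pair in [("neuron", "synapse"), ("neuron", "election")]:
        w1, w2 = pair
        print(w1, w2,
              "estimated:", estimate_jaccard(sigs[w1], sigs[w2]),
              "exact:", jaccard(word_docs[w1], word_docs[w2]))
```

In this sketch, words whose document sets overlap heavily agree on most signature positions, while words with disjoint document sets never agree, which is the property that lets hash collisions group co-occurring words without fixing a number of groups beforehand.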