Search Results for author: Michael J. Paul

Found 24 papers, 7 papers with code

User Factor Adaptation for User Embedding via Multitask Learning

1 code implementation • EACL (AdaptNLP) 2021 • Xiaolei Huang, Michael J. Paul, Robin Burke, Franck Dernoncourt, Mark Dredze

In this study, we treat the user interest as domains and empirically examine how the user language can vary across the user factor in three English social media datasets.

Clustering text-classification +1

Why Overfitting Isn't Always Bad: Retrofitting Cross-Lingual Word Embeddings to Dictionaries

no code implementations • ACL 2020 • Mozhi Zhang, Yoshinari Fujinuma, Michael J. Paul, Jordan Boyd-Graber

Cross-lingual word embeddings (CLWE) are often evaluated on bilingual lexicon induction (BLI).

Bilingual Lexicon Induction Cross-Lingual Word Embeddings +2

Multilingual Twitter Corpus and Baselines for Evaluating Demographic Bias in Hate Speech Recognition

2 code implementations • LREC 2020 • Xiaolei Huang, Linzi Xing, Franck Dernoncourt, Michael J. Paul

Existing research on fairness evaluation of document classification models mainly uses synthetic monolingual data without ground truth for author demographic attributes.

Document Classification Fairness +3

Evaluating Topic Quality with Posterior Variability

1 code implementation • IJCNLP 2019 • Linzi Xing, Michael J. Paul, Giuseppe Carenini

Probabilistic topic models such as latent Dirichlet allocation (LDA) are popularly used with Bayesian inference methods such as Gibbs sampling to learn posterior distributions over topic model parameters.

Bayesian Inference Topic Models

Overview of the Fourth Social Media Mining for Health (SMM4H) Shared Tasks at ACL 2019

no code implementations • WS 2019 • Davy Weissenbacher, Abeed Sarker, Arjun Magge, Ashlynn Daughton, Karen O'Connor, Michael J. Paul, Graciela Gonzalez-Hernandez

We present the Social Media Mining for Health Shared Tasks co-located with ACL 2019 in Florence, which address these challenges for health monitoring and surveillance, utilizing state-of-the-art techniques for processing noisy, real-world, and substantially creative language expressions from social media users.

Task 2

Neural Temporality Adaptation for Document Classification: Diachronic Word Embeddings and Domain Adaptation Models

1 code implementation • ACL 2019 • Xiaolei Huang, Michael J. Paul

Language usage can change across periods of time, but document classifier models are usually trained and tested on corpora spanning multiple years without considering temporal variations.

Classification Diachronic Word Embeddings +4

A Resource-Free Evaluation Metric for Cross-Lingual Word Embeddings Based on Graph Modularity

1 code implementation • ACL 2019 • Yoshinari Fujinuma, Jordan Boyd-Graber, Michael J. Paul

Cross-lingual word embeddings encode the meaning of words from different languages into a shared low-dimensional space.

Cross-Lingual Word Embeddings Word Embeddings +1

Neural User Factor Adaptation for Text Classification: Learning to Generalize Across Author Demographics

1 code implementation • SEMEVAL 2019 • Xiaolei Huang, Michael J. Paul

Language use varies across different demographic factors, such as gender, age, and geographic location.

Document Classification General Classification +1

Analyzing Bayesian Crosslingual Transfer in Topic Models

no code implementations • NAACL 2019 • Shudong Hao, Michael J. Paul

We introduce a theoretical analysis of crosslingual transfer in probabilistic topic models.

Topic Models

An Empirical Study on Crosslingual Transfer in Probabilistic Topic Models

no code implementations • CL 2020 • Shudong Hao, Michael J. Paul

Probabilistic topic modeling is a popular choice as the first step of crosslingual tasks to enable knowledge transfer and extract multilingual features.

Topic Models Transfer Learning

Overview of the Third Social Media Mining for Health (SMM4H) Shared Tasks at EMNLP 2018

no code implementations • WS 2018 • Davy Weissenbacher, Abeed Sarker, Michael J. Paul, Graciela Gonzalez-Hernandez

The goals of the SMM4H shared tasks are to release annotated social media based health related datasets to the research community, and to compare the performances of natural language processing and machine learning systems on tasks involving these datasets.

General Classification Task 2 +1