Search Results for author: Mark Dredze

Found 104 papers, 39 papers with code

Explaining Models of Mental Health via Clinically Grounded Auxiliary Tasks

no code implementations • NAACL (CLPsych) 2022 • Ayah Zirikly, Mark Dredze

In the case of mental health diagnosis, clinicians already rely on an assessment framework to make these decisions; that framework can help a model generate meaningful explanations. In this work we propose to use PHQ-9 categories as an auxiliary task for explaining a social media-based model of depression.

Multi-Task Learning

Civil Unrest on Twitter (CUT): A Dataset of Tweets to Support Research on Civil Unrest

1 code implementation • EMNLP (WNUT) 2020 • Justin Sech, Alexandra DeLucia, Anna L. Buczak, Mark Dredze

We present baseline systems trained on this data for the identification of tweets related to civil unrest.

Towards Understanding the Role of Gender in Deploying Social Media-Based Mental Health Surveillance Models

no code implementations • NAACL (CLPsych) 2021 • Eli Sherman, Keith Harrigian, Carlos Aguirre, Mark Dredze

Spurred by advances in machine learning and natural language processing, developing social media-based mental health surveillance models has received substantial recent attention.

Study of Manifestation of Civil Unrest on Twitter

1 code implementation • WNUT (ACL) 2021 • Abhinav Chinta, Jingyu Zhang, Alexandra DeLucia, Mark Dredze, Anna L. Buczak

Twitter is commonly used for civil unrest detection and forecasting tasks, but there is a lack of work in evaluating how civil unrest manifests on Twitter across countries and events.

Changes in Tweet Geolocation over Time: A Study with Carmen 2.0

1 code implementation • COLING (WNUT) 2022 • Jingyu Zhang, Alexandra DeLucia, Mark Dredze

Despite the importance of these tools for data curation, the impact of tweet language, country of origin, and creation date on tool performance remains largely unknown.

Updated Headline Generation: Creating Updated Summaries for Evolving News Stories

no code implementations • ACL 2022 • Sheena Panthaplackel, Adrian Benton, Mark Dredze

We propose the task of updated headline generation, in which a system generates a headline for an updated article, considering both the previous article and headline.

Headline Generation

Model Distillation for Faithful Explanations of Medical Code Predictions

no code implementations • BioNLP (ACL) 2022 • Zach Wood-Doughty, Isabel Cachola, Mark Dredze

We propose to use knowledge distillation, or training a student model that mimics the behavior of a trained teacher model, as a technique to generate faithful and plausible explanations.

Decision Making Knowledge Distillation

Qualitative Analysis of Depression Models by Demographics

no code implementations • NAACL (CLPsych) 2021 • Carlos Aguirre, Mark Dredze

Models for identifying depression using social media text exhibit biases towards different gender and racial/ethnic groups.

A Closer Look at Claim Decomposition

no code implementations • 18 Mar 2024 • Miriam Wanner, Seth Ebner, Zhengping Jiang, Mark Dredze, Benjamin Van Durme

We investigate how various methods of claim decomposition -- especially LLM-based methods -- affect the result of an evaluation approach such as the recently proposed FActScore, finding that it is sensitive to the decomposition method used.

Attribute

Evaluating Biases in Context-Dependent Health Questions

1 code implementation • 7 Mar 2024 • Sharon Levy, Tahilin Sanchez Karver, William D. Adler, Michelle R. Kaufman, Mark Dredze

We study how large language model biases are exhibited through these contextual questions in the healthcare domain.

Language Modelling Large Language Model

Benchmarking Large Language Models on Answering and Explaining Challenging Medical Questions

1 code implementation • 28 Feb 2024 • Hanjie Chen, Zhouxiang Fang, Yash Singla, Mark Dredze

To address these challenges, we construct two new datasets: JAMA Clinical Challenge and Medbullets.

Benchmarking Multiple-choice +1

An Eye on Clinical BERT: Investigating Language Model Generalization for Diabetic Eye Disease Phenotyping

1 code implementation • 15 Nov 2023 • Keith Harrigian, Tina Tang, Anthony Gonzales, Cindy X. Cai, Mark Dredze

Diabetic eye disease is a major cause of blindness worldwide.

Language Modelling

Selecting Shots for Demographic Fairness in Few-Shot Learning with Large Language Models

no code implementations • 14 Nov 2023 • Carlos Aguirre, Kuleen Sasse, Isabel Cachola, Mark Dredze

In this work, we explore the effect of shots, which directly affect the performance of models, on the fairness of LLMs as NLP classification systems.

Fairness Few-Shot Learning +1

MixCE: Training Autoregressive Language Models by Mixing Forward and Reverse Cross-Entropies

1 code implementation • 26 May 2023 • Shiyue Zhang, Shijie Wu, Ozan Irsoy, Steven Lu, Mohit Bansal, Mark Dredze, David Rosenberg

Autoregressive language models are trained by minimizing the cross-entropy of the model distribution Q relative to the data distribution P -- that is, minimizing the forward cross-entropy, which is equivalent to maximum likelihood estimation (MLE).

Transferring Fairness using Multi-Task Learning with Limited Demographic Information

no code implementations • 22 May 2023 • Carlos Aguirre, Mark Dredze

Training supervised machine learning systems with a fairness loss can improve prediction fairness across different demographic groups.

Fairness Multi-Task Learning

BloombergGPT: A Large Language Model for Finance

no code implementations • 30 Mar 2023 • Shijie Wu, Ozan Irsoy, Steven Lu, Vadim Dabravolski, Mark Dredze, Sebastian Gehrmann, Prabhanjan Kambadur, David Rosenberg, Gideon Mann

The use of NLP in the realm of financial technology is broad and complex, with applications ranging from sentiment analysis and named entity recognition to question answering.

Ranked #1 on Multiple Choice Question Answering (MCQA) on BIG-bench (Hyperbaton)

Causal Judgment Date Understanding +21

Do Text-to-Text Multi-Task Learners Suffer from Task Conflict?

1 code implementation • 13 Dec 2022 • David Mueller, Nicholas Andrews, Mark Dredze

Learning these models often requires specialized training algorithms that address task-conflict in the shared parameter updates, which otherwise can lead to negative transfer.

Language Modelling Multi-Task Learning

Using Open-Ended Stressor Responses to Predict Depressive Symptoms across Demographics

no code implementations • 15 Nov 2022 • Carlos Aguirre, Mark Dredze, Philip Resnik

Stressors are related to depression, but this relationship is complex.

Topic Models

Zero-shot Cross-lingual Transfer is Under-specified Optimization

1 code implementation • RepL4NLP (ACL) 2022 • Shijie Wu, Benjamin Van Durme, Mark Dredze

Pretrained multilingual encoders enable zero-shot cross-lingual transfer, but often produce unreliable models that exhibit high performance variance on the target language.

Zero-Shot Cross-Lingual Transfer

The Problem of Semantic Shift in Longitudinal Monitoring of Social Media: A Case Study on Mental Health During the COVID-19 Pandemic

1 code implementation • 22 Jun 2022 • Keith Harrigian, Mark Dredze

Social media allows researchers to track societal and cultural changes over time based on language analysis tools.

Then and Now: Quantifying the Longitudinal Validity of Self-Disclosed Depression Diagnoses

no code implementations • NAACL (CLPsych) 2022 • Keith Harrigian, Mark Dredze

Self-disclosed mental health diagnoses, which serve as ground truth annotations of mental health status in the absence of clinical measures, underpin the conclusions behind most computational studies of mental health language from the last decade.

Selection bias

What Makes Data-to-Text Generation Hard for Pretrained Language Models?

no code implementations • 23 May 2022 • Moniba Keymanesh, Adrian Benton, Mark Dredze

Previous work shows that pre-trained language models (PLMs) perform remarkably well on this task after fine-tuning on a significant amount of task-specific training data.

Data-to-Text Generation Few-Shot Learning +1

Enriching Unsupervised User Embedding via Medical Concepts

1 code implementation • 20 Mar 2022 • Xiaolei Huang, Franck Dernoncourt, Mark Dredze

Clinical notes in Electronic Health Records (EHR) contain richly documented patient information that can be used to infer phenotypes for disease diagnosis and to study patient characteristics for cohort selection.

Mortality Prediction Phenotype classification +1

Everything Is All It Takes: A Multipronged Strategy for Zero-Shot Cross-Lingual Information Extraction

2 code implementations • EMNLP 2021 • Mahsa Yarmohammadi, Shijie Wu, Marc Marone, Haoran Xu, Seth Ebner, Guanghui Qin, Yunmo Chen, Jialiang Guo, Craig Harman, Kenton Murray, Aaron Steven White, Mark Dredze, Benjamin Van Durme

Zero-shot cross-lingual information extraction (IE) describes the construction of an IE model for some target language, given existing annotations exclusively in some other language, typically English.

Dependency Parsing Event Extraction +4

Learning to Look Inside: Augmenting Token-Based Encoders with Character-Level Information

no code implementations • 1 Aug 2021 • Yuval Pinter, Amanda Stent, Mark Dredze, Jacob Eisenstein

Commonly-used transformer language models depend on a tokenization schema which sets an unchangeable subword vocabulary prior to pre-training, destined to be applied to all downstream tasks regardless of domain shift, novel word formations, or other sources of vocabulary mismatch.

Faithful and Plausible Explanations of Medical Code Predictions

1 code implementation • 16 Apr 2021 • Zach Wood-Doughty, Isabel Cachola, Mark Dredze

Machine learning models that offer excellent predictive performance often lack the interpretability necessary to support integrated human-machine decision-making.

Decision Making

Improving Zero-Shot Multi-Lingual Entity Linking

no code implementations • 16 Apr 2021 • Elliot Schumacher, James Mayfield, Mark Dredze

Entity linking -- the task of identifying references in free text to relevant knowledge base representations -- often focuses on single languages.

Entity Linking

Fine-tuning Encoders for Improved Monolingual and Zero-shot Polylingual Neural Topic Modeling

1 code implementation • NAACL 2021 • Aaron Mueller, Mark Dredze

Neural topic models can augment or replace bag-of-words inputs with the learned representations of deep pre-trained transformer-based word prediction models.

Classification Cross-Lingual Transfer +3

Gender and Racial Fairness in Depression Research using Social Media

no code implementations • EACL 2021 • Carlos Aguirre, Keith Harrigian, Mark Dredze

While previous research has raised concerns about possible biases in models produced from this data, no study has quantified how these biases actually manifest themselves with respect to different demographic groups, such as gender and racial/ethnic groups.

Fairness

User Factor Adaptation for User Embedding via Multitask Learning

1 code implementation • EACL (AdaptNLP) 2021 • Xiaolei Huang, Michael J. Paul, Robin Burke, Franck Dernoncourt, Mark Dredze

In this study, we treat the user interest as domains and empirically examine how the user language can vary across the user factor in three English social media datasets.

Clustering text-classification +1

Generating Synthetic Text Data to Evaluate Causal Inference Methods

no code implementations • 10 Feb 2021 • Zach Wood-Doughty, Ilya Shpitser, Mark Dredze

High-dimensional and unstructured data such as natural language complicates the evaluation of causal inference methods; such evaluations rely on synthetic datasets with known causal effects.

Causal Inference Text Generation

On the State of Social Media Data for Mental Health Research

1 code implementation • NAACL (CLPsych) 2021 • Keith Harrigian, Carlos Aguirre, Mark Dredze

Data-driven methods for mental health treatment and surveillance have become a major focus in computational science research in the last decade.

Do Models of Mental Health Based on Social Media Data Generalize?

1 code implementation • Findings of the Association for Computational Linguistics 2020 • Keith Harrigian, Carlos Aguirre, Mark Dredze

Proxy-based methods for annotating mental health status in social media have grown popular in computational research due to their ability to gather large training samples.

Cross-Lingual Transfer in Zero-Shot Cross-Language Entity Linking

1 code implementation • Findings (ACL) 2021 • Elliot Schumacher, James Mayfield, Mark Dredze

We find that the multilingual ability of BERT leads to robust performance in monolingual and multilingual settings.

Cross-Lingual Transfer Entity Linking

Demographic Representation and Collective Storytelling in the Me Too Twitter Hashtag Activism Movement

no code implementations • 13 Oct 2020 • Aaron Mueller, Zach Wood-Doughty, Silvio Amir, Mark Dredze, Alicia L. Nobles

The #MeToo movement on Twitter has drawn attention to the pervasive nature of sexual harassment and violence.

Do Explicit Alignments Robustly Improve Multilingual Encoders?

1 code implementation • EMNLP 2020 • Shijie Wu, Mark Dredze

Multilingual BERT (mBERT), XLM-RoBERTa (XLMR) and other unsupervised multilingual encoders can effectively learn cross-lingual representation.

Clinical Concept Linking with Contextualized Neural Representations

no code implementations • ACL 2020 • Elliot Schumacher, Andriy Mulyar, Mark Dredze

We propose an approach to concept linking that leverages recent work in contextualized neural models, such as ELMo (Peters et al. 2018), which create a token representation that integrates the surrounding context of the mention and concept name.

Entity Linking

Are All Languages Created Equal in Multilingual BERT?

1 code implementation • WS 2020 • Shijie Wu, Mark Dredze

Multilingual BERT (mBERT) trained on 104 languages has shown surprisingly good cross-lingual performance on several NLP tasks, even without explicit cross-lingual signals.

Cross-Lingual Transfer Dependency Parsing +4

Sources of Transfer in Multilingual Named Entity Recognition

1 code implementation • ACL 2020 • David Mueller, Nicholas Andrews, Mark Dredze

However, a straightforward implementation of this simple idea does not always work in practice: naive training of NER models using annotated data drawn from multiple languages consistently underperforms models trained on monolingual data alone, despite having access to more training data.

Multilingual Named Entity Recognition named-entity-recognition +2

Using Noisy Self-Reports to Predict Twitter User Demographics

1 code implementation • NAACL (SocialNLP) 2021 • Zach Wood-Doughty, Paiheng Xu, Xiao Liu, Mark Dredze

We present a method to identify self-reports of race and ethnicity from Twitter profile descriptions.

Phenotyping of Clinical Notes with Improved Document Classification Models Using Contextualized Neural Language Models

2 code implementations • 30 Oct 2019 • Andriy Mulyar, Elliot Schumacher, Masoud Rouhizadeh, Mark Dredze

Clinical notes contain an extensive record of a patient's health status, such as smoking status or the presence of heart conditions.

Ranked #1 on Clinical Note Phenotyping on I2B2 2006: Smoking

Document Classification General Classification

Mental Health Surveillance over Social Media with Digital Cohorts

no code implementations • WS 2019 • Silvio Amir, Mark Dredze, John W. Ayers

The ability to track mental health conditions via social media opened the doors for large-scale, automated, mental health surveillance.

Beto, Bentz, Becas: The Surprising Cross-Lingual Effectiveness of BERT

2 code implementations • IJCNLP 2019 • Shijie Wu, Mark Dredze

Pretrained contextual representation models (Peters et al., 2018; Devlin et al., 2018) have pushed forward the state-of-the-art on many NLP tasks.

Ranked #8 on Cross-Lingual NER on CoNLL Spanish

Cross-Lingual NER Dependency Parsing +6

Discriminative Candidate Generation for Medical Concept Linking

no code implementations • AKBC 2019 • Elliot Schumacher, Mark Dredze

Linking mentions of medical concepts in a clinical note to a concept in an ontology enables a variety of tasks that rely on understanding the content of a medical record, such as identifying patient populations and decision support.

Using Author Embeddings to Improve Tweet Stance Classification

no code implementations • WS 2018 • Adrian Benton, Mark Dredze

Many social media classification tasks analyze the content of a message, but do not consider the context of the message.

Classification General Classification +1

Convolutions Are All You Need (For Classifying Character Sequences)

no code implementations • WS 2018 • Zach Wood-Doughty, Nicholas Andrews, Mark Dredze

While recurrent neural networks (RNNs) are widely used for text classification, they demonstrate poor performance and slow convergence when trained on long sequences.

Document Classification General Classification +3

Challenges of Using Text Classifiers for Causal Inference

1 code implementation • EMNLP 2018 • Zach Wood-Doughty, Ilya Shpitser, Mark Dredze

Causal understanding is essential for many kinds of decision-making, but causal inference from observational data has typically only been applied to structured, low-dimensional datasets.

Causal Inference Decision Making

Johns Hopkins or johnny-hopkins: Classifying Individuals versus Organizations on Twitter

1 code implementation • WS 2018 • Zach Wood-Doughty, Praateek Mahajan, Mark Dredze

Previous work (McCorriston et al., 2015) presented a method for determining if an account was an individual or organization based on account profile and a collection of tweets.

General Classification

Predicting Twitter User Demographics from Names Alone

1 code implementation • WS 2018 • Zach Wood-Doughty, Nicholas Andrews, Rebecca Marvin, Mark Dredze

Social media analysis frequently requires tools that can automatically infer demographics to contextualize trends.

Deep Dirichlet Multinomial Regression

1 code implementation • NAACL 2018 • Adrian Benton, Mark Dredze

We present deep Dirichlet Multinomial Regression (dDMR), a generative topic model that simultaneously learns document feature representations and topics.

regression Topic Models

CADET: Computer Assisted Discovery Extraction and Translation

no code implementations • IJCNLP 2017 • Benjamin Van Durme, Tom Lippincott, Kevin Duh, Deana Burchfield, Adam Poliak, Cash Costello, Tim Finin, Scott Miller, James Mayfield, Philipp Koehn, Craig Harman, Dawn Lawrie, Chandler May, Max Thomas, Annabelle Carrell, Julianne Chaloux, Tongfei Chen, Alex Comerford, Mark Dredze, Benjamin Glass, Shudong Hao, Patrick Martin, Pushpendre Rastogi, Rashmi Sankepally, Travis Wolfe, Ying-Ying Tran, Ted Zhang

It combines a multitude of analytics together with a flexible environment for customizing the workflow for different users.

Active Learning Machine Translation +1

Constructing an Alias List for Named Entities during an Event

no code implementations • WS 2017 • Anietie Andy, Mark Dredze, Mugizi Rwebangira, Chris Callison-Burch

EntitySpike uses a temporal heuristic to identify named entities with similar context that occur in the same time period (within minutes) during an event.

Community Question Answering

How Does Twitter User Behavior Vary Across Demographic Groups?

no code implementations • WS 2017 • Zach Wood-Doughty, Michael Smith, David Broniatowski, Mark Dredze

Demographically-tagged social media messages are a common source of data for computational social science.

Pocket Knowledge Base Population

no code implementations • ACL 2017 • Travis Wolfe, Mark Dredze, Benjamin Van Durme

Existing Knowledge Base Population methods extract relations from a closed relational schema with limited coverage leading to sparse KBs.

Knowledge Base Population Open Information Extraction +1

Bayesian Modeling of Lexical Resources for Low-Resource Settings

no code implementations • ACL 2017 • Nicholas Andrews, Mark Dredze, Benjamin Van Durme, Jason Eisner

Practically, this means that we may treat the lexical resources as observations under the proposed generative model.

Low Resource Named Entity Recognition named-entity-recognition +2

Ethical Research Protocols for Social Media Health Research

no code implementations • WS 2017 • Adrian Benton, Glen Coppersmith, Mark Dredze

Social media have transformed data-driven research in political science, the social sciences, health, and medicine.

Decision Making Ethics

Feature Generation for Robust Semantic Role Labeling

no code implementations • 22 Feb 2017 • Travis Wolfe, Mark Dredze, Benjamin Van Durme

Hand-engineered feature sets are a well understood method for creating robust NLP models, but they require a lot of expertise and effort to create.

Semantic Role Labeling

Harmonic Grammar, Optimality Theory, and Syntax Learnability: An Empirical Exploration of Czech Word Order

no code implementations • 19 Feb 2017 • Ann Irvine, Mark Dredze

This work presents a systematic theoretical and empirical comparison of the major algorithms that have been proposed for learning Harmonic and Optimality Theory grammars (HG and OT, respectively).

Name Variation in Community Question Answering Systems

no code implementations • WS 2016 • Anietie Andy, Satoshi Sekine, Mugizi Rwebangira, Mark Dredze

In this paper, we propose an algorithm to reduce the number of unanswered questions in Yahoo!

Community Question Answering Entity Linking

Demographer: Extremely Simple Name Demographics

1 code implementation • WS 2016 • Rebecca Knowles, Josh Carroll, Mark Dredze

A Study of Imitation Learning Methods for Semantic Role Labeling

no code implementations • WS 2016 • Travis Wolfe, Mark Dredze, Benjamin Van Durme

Imitation Learning Semantic Role Labeling +1

Twitter at the Grammys: A Social Media Corpus for Entity Linking and Disambiguation

1 code implementation • WS 2016 • Mark Dredze, Nicholas Andrews, Jay DeYoung

Coreference Resolution Entity Linking

Multi-task Domain Adaptation for Sequence Tagging

no code implementations • WS 2017 • Nanyun Peng, Mark Dredze

Many domain adaptation approaches rely on learning cross domain shared representations to transfer the knowledge learned in one domain to other domains.

Chinese Word Segmentation Domain Adaptation +4

Learning Multiview Embeddings of Twitter Users

no code implementations • ACL 2016 • Adrian Benton, Raman Arora, Mark Dredze

Twitter as a Source of Global Mobility Patterns for Social Good

no code implementations • 20 Jun 2016 • Mark Dredze, Manuel García-Herranz, Alex Rutherford, Gideon Mann

Data on human spatial distribution and movement is essential for understanding and analyzing social systems.

Humanitarian

Geolocation for Twitter: Timing Matters

no code implementations • NAACL 2016 • Mark Dredze, Miles Osborne, Prabhanjan Kambadur

Knowledge Base Population for Organization Mentions in Email

no code implementations • WS 2016 • Ning Gao, Mark Dredze, Douglas Oard

Entity Linking Knowledge Base Population +1

Embedding Lexical Features via Low-Rank Tensors

1 code implementation • NAACL 2016 • Mo Yu, Mark Dredze, Raman Arora, Matthew Gormley

Modern NLP models rely heavily on engineered features, which often combine word and contextual information into complex lexical features.

Relation Extraction

Improving Named Entity Recognition for Chinese Social Media with Word Segmentation Representation Learning

no code implementations • ACL 2016 • Nanyun Peng, Mark Dredze

Named entity recognition, and other information extraction tasks, frequently use linguistic features such as part of speech tags or chunkings.

named-entity-recognition Named Entity Recognition +3

Named Entity Recognition for Chinese Social Media with Jointly Trained Embeddings

1 code implementation • EMNLP 2015 • Nanyun Peng, Mark Dredze

named-entity-recognition Named Entity Recognition +1

Approximation-Aware Dependency Parsing by Belief Propagation

no code implementations • TACL 2015 • Matthew R. Gormley, Mark Dredze, Jason Eisner

We show how to adjust the model parameters to compensate for the errors introduced by this approximation, by following the gradient of the actual loss on training data.

Dependency Parsing

An Empirical Study of Chinese Name Matching and Applications

1 code implementation • IJCNLP 2015 • Nanyun Peng, Mo Yu, Mark Dredze

Coreference Resolution Entity Linking +1

FrameNet+: Fast Paraphrastic Tripling of FrameNet

no code implementations • IJCNLP 2015 • Ellie Pavlick, Travis Wolfe, Pushpendre Rastogi, Chris Callison-Burch, Mark Dredze, Benjamin Van Durme

Knowledge Base Population Natural Language Inference +1

A Concrete Chinese NLP Pipeline

no code implementations • NAACL 2015 • Nanyun Peng, Francis Ferraro, Mo Yu, Nicholas Andrews, Jay DeYoung, Max Thomas, Matthew R. Gormley, Travis Wolfe, Craig Harman, Benjamin Van Durme, Mark Dredze

Coreference Resolution Entity Linking +6

CLPsych 2015 Shared Task: Depression and PTSD on Twitter

no code implementations • WS 2015 • Glen Coppersmith, Mark Dredze, Craig Harman, Kristy Hollingshead, Margaret Mitchell

From ADHD to SAD: Analyzing the Language of Mental Health on Twitter through Self-Reported Diagnoses

no code implementations • WS 2015 • Glen Coppersmith, Mark Dredze, Craig Harman, Kristy Hollingshead

Interactive Knowledge Base Population

no code implementations • 31 May 2015 • Travis Wolfe, Mark Dredze, James Mayfield, Paul McNamee, Craig Harman, Tim Finin, Benjamin Van Durme

Most work on building knowledge bases has focused on collecting entities and facts from as large a collection of documents as possible.

Knowledge Base Population

Improved Relation Extraction with Feature-Rich Compositional Embedding Models

1 code implementation • EMNLP 2015 • Matthew R. Gormley, Mo Yu, Mark Dredze

We propose a Feature-rich Compositional Embedding Model (FCM) for relation extraction that is expressive, generalizes to new domains, and is easy-to-implement.

Ranked #1 on Relation Extraction on ACE 2005 (Cross Sentence metric)

Relation Relation Classification +2

Predicate Argument Alignment using a Global Coherence Model

1 code implementation • HLT 2015 • Mark Dredze, Benjamin Van Durme, Travis Wolfe

coreference-resolution Cross Document Coreference Resolution +3

Combining Word Embeddings and Feature Embeddings for Fine-grained Relation Extraction

1 code implementation • HLT 2015 • Mark Dredze, Mo Yu, Matthew R. Gormley

Machine Translation NER +5

Entity Linking for Spoken Language

no code implementations • HLT 2015 • Mark Dredze, Adrian Benton

Automatic Speech Recognition Automatic Speech Recognition (ASR) +2

Sprite: Generalizing Topic Models with Structured Priors

no code implementations • TACL 2015 • Michael J. Paul, Mark Dredze

We introduce Sprite, a family of topic models that incorporates structure into model priors as a function of underlying components.

Topic Models

Learning Composition Models for Phrase Embeddings

1 code implementation • TACL 2015 • Mo Yu, Mark Dredze

We propose efficient unsupervised and task-specific learning objectives that scale our model to large datasets.

Language Modelling Semantic Similarity +2

Learning Polylingual Topic Models from Code-Switched Social Media Documents

no code implementations • ACL 2014 • Nanyun Peng, Yiming Wang, Mark Dredze

Machine Translation Topic Models

Improving Lexical Embeddings with Semantic Knowledge

1 code implementation • ACL 2014 • Mo Yu, Mark Dredze

Language Modelling Learning Word Embeddings +2

Low-Resource Semantic Role Labeling

no code implementations • ACL 2014 • Matthew R. Gormley, Margaret Mitchell, Benjamin Van Durme, Mark Dredze

Information Retrieval Machine Translation +2

Robust Entity Clustering via Phylogenetic Inference

no code implementations • ACL 2014 • Nicholas Andrews, Jason Eisner, Mark Dredze

Clustering Coreference Resolution

Quantifying Mental Health Signals in Twitter

no code implementations • WS 2014 • Glen Coppersmith, Mark Dredze, Craig Harman

PARMA: A Predicate Argument Aligner

1 code implementation • ACL 2013 • Travis Wolfe, Benjamin Van Durme, Mark Dredze, Nicholas Andrews, Charley Beller, Chris Callison-Burch, Jay DeYoung, Justin Snyder, Jonathan Weese, Tan Xu, Xuchen Yao

Coreference Resolution Entity Linking +2

Drug Extraction from the Web: Summarizing Drug Experiences with Multi-Dimensional Topic Models

no code implementations • NAACL 2013 • Michael J. Paul, Mark Dredze

Document Summarization Multi-Document Summarization +1

Broadly Improving User Classification via Communication-Based Name and Location Clustering on Twitter

no code implementations • NAACL 2013 • Shane Bergsma, Mark Dredze, Benjamin Van Durme, Theresa Wilson, David Yarowsky

Clustering General Classification

Separating Fact from Fear: Tracking Flu Infections on Twitter

no code implementations • NAACL 2013 • Alex Lamb, Michael J. Paul, Mark Dredze

Topic Models and Metadata for Visualizing Text Corpora

no code implementations • NAACL 2013 • Justin Snyder, Rebecca Knowles, Mark Dredze, Matthew Gormley, Travis Wolfe

Topic Models

What's in a Domain? Multi-Domain Learning for Multi-Attribute Data

no code implementations • NAACL 2013 • Mahesh Joshi, Mark Dredze, William W. Cohen, Carolyn P. Rosé

Attribute

Factorial LDA: Sparse Multi-Dimensional Text Models

no code implementations • NeurIPS 2012 • Michael Paul, Mark Dredze

Multi-dimensional latent variable models can capture the many latent factors in a text corpus, such as topic, author perspective and sentiment.

Fast Syntactic Analysis for Statistical Language Modeling via Substructure Sharing and Uptraining

no code implementations • ACL 2012 • Ariya Rastrow, Mark Dredze, Sanjeev Khudanpur

Language Modelling Machine Translation +1

Multi-Domain Learning: When Do Domains Matter?

no code implementations • EMNLP 2012 • Mahesh Joshi, Mark Dredze, William W. Cohen, Carolyn Rosé

Domain Adaptation Sentiment Analysis

Name Phylogeny: A Generative Model of String Variation

no code implementations • EMNLP 2012 • Nicholas Andrews, Jason Eisner, Mark Dredze

Coreference Resolution Transliteration

Revisiting the Case for Explicit Syntactic Information in Language Models

no code implementations • WS 2012 • Ariya Rastrow, Sanjeev Khudanpur, Mark Dredze

Language Modelling Machine Translation +1

Entity Clustering Across Languages

no code implementations • NAACL 2012 • Spence Green, Nicholas Andrews, Matthew R. Gormley, Mark Dredze, Christopher D. Manning

Clustering Coreference Resolution +1

Shared Components Topic Models

no code implementations • NAACL 2012 • Matthew R. Gormley, Mark Dredze, Benjamin Van Durme, Jason Eisner

Topic Models

Adaptive Regularization of Weight Vectors

no code implementations • NeurIPS 2009 • Koby Crammer, Alex Kulesza, Mark Dredze

We present AROW, a new online learning algorithm that combines several useful properties: large margin training, confidence weighting, and the capacity to handle non-separable data.

Exact Convex Confidence-Weighted Learning

no code implementations • NeurIPS 2008 • Koby Crammer, Mark Dredze, Fernando Pereira

Confidence-weighted (CW) learning [6], an online learning method for linear classifiers, maintains a Gaussian distribution over weight vectors, with a covariance matrix that represents uncertainty about weights and correlations.

Biographies, Bollywood, Boom-boxes and Blenders: Domain Adaptation for Sentiment Classification

no code implementations • 1 Jun 2007 • John Blitzer, Mark Dredze, Fernando Pereira

Automatic sentiment classification has been extensively studied and applied in recent years.

Domain Adaptation General Classification +2
