Search Results for author: Catherine Finegan-Dollak

Found 8 papers, 2 papers with code

Layout-Aware Text Representations Harm Clustering Documents by Type

no code implementations • EMNLP (insights) 2020 • Catherine Finegan-Dollak, Ashish Verma

Clustering documents by type—grouping invoices with invoices and articles with articles—is a desirable first step for organizing large collections of document scans.

Clustering, Vocal Bursts Type Prediction


GVdoc: Graph-based Visual Document Classification

1 code implementation • 26 May 2023 • Fnu Mohbat, Mohammed J. Zaki, Catherine Finegan-Dollak, Ashish Verma

Visual document classifiers have shown impressive performance on in-distribution test sets.

Classification, Document Classification +3


Position Masking for Improved Layout-Aware Document Understanding

no code implementations • 1 Sep 2021 • Anik Saha, Catherine Finegan-Dollak, Ashish Verma

Natural language processing for document scans and PDFs has the potential to enormously improve the efficiency of business processes.

document understanding, Position +1


Label Noise in Context

no code implementations • ACL 2020 • Michael Desmond, Catherine Finegan-Dollak, Jeff Boston, Matt Arnold

Label noise (incorrectly or ambiguously labeled training examples) can negatively impact model performance.

text-classification, Text Classification


Improving Text-to-SQL Evaluation Methodology

1 code implementation • ACL 2018 • Catherine Finegan-Dollak, Jonathan K. Kummerfeld, Li Zhang, Karthik Ramanathan, Sesh Sadasivam, Rui Zhang, Dragomir Radev

Second, we show that the current division of data into training and test sets measures robustness to variations in the way questions are asked, but only partially tests how well systems generalize to new queries; therefore, we propose a complementary dataset split for evaluation of future work.

Ranked #1 on SQL Parsing on IMDb

SQL Parsing, Text-To-SQL


Effective Crowdsourcing for a New Type of Summarization Task

no code implementations • NAACL 2018 • Youxuan Jiang, Catherine Finegan-Dollak, Jonathan K. Kummerfeld, Walter Lasecki

Most summarization research focuses on summarizing the entire given text, but in practice readers are often interested in only one aspect of the document or conversation.

Vocal Bursts Type Prediction


Effects of Creativity and Cluster Tightness on Short Text Clustering Performance

no code implementations • ACL 2016 • Catherine Finegan-Dollak, Reed Coke, Rui Zhang, Xiangyi Ye, Dragomir Radev

Clustering, Semantic Textual Similarity +2


Content Models for Survey Generation: A Factoid-Based Evaluation

no code implementations • IJCNLP 2015 • Rahul Jha, Catherine Finegan-Dollak, Ben King, Reed Coke, Dragomir Radev

Dependency Parsing, Information Retrieval +1

