Search Results for author: Catherine Finegan-Dollak

Found 8 papers, 2 papers with code

Layout-Aware Text Representations Harm Clustering Documents by Type

no code implementations EMNLP (insights) 2020 Catherine Finegan-Dollak, Ashish Verma

Clustering documents by type—grouping invoices with invoices and articles with articles—is a desirable first step for organizing large collections of document scans.

Clustering Vocal Bursts Type Prediction

Position Masking for Improved Layout-Aware Document Understanding

no code implementations 1 Sep 2021 Anik Saha, Catherine Finegan-Dollak, Ashish Verma

Natural language processing for document scans and PDFs has the potential to enormously improve the efficiency of business processes.

document understanding Position

Label Noise in Context

no code implementations ACL 2020 Michael Desmond, Catherine Finegan-Dollak, Jeff Boston, Matt Arnold

Label noise (incorrectly or ambiguously labeled training examples) can negatively impact model performance.

Text Classification

Improving Text-to-SQL Evaluation Methodology

1 code implementation ACL 2018 Catherine Finegan-Dollak, Jonathan K. Kummerfeld, Li Zhang, Karthik Ramanathan, Sesh Sadasivam, Rui Zhang, Dragomir Radev

Second, we show that the current division of data into training and test sets measures robustness to variations in the way questions are asked, but only partially tests how well systems generalize to new queries; therefore, we propose a complementary dataset split for evaluation of future work.

SQL Parsing Text-To-SQL

Effective Crowdsourcing for a New Type of Summarization Task

no code implementations NAACL 2018 Youxuan Jiang, Catherine Finegan-Dollak, Jonathan K. Kummerfeld, Walter Lasecki

Most summarization research focuses on summarizing the entire given text, but in practice readers are often interested in only one aspect of the document or conversation.

Vocal Bursts Type Prediction
