1 code implementation • 23 Mar 2024 • Nan Zhang, Connor Heaton, Sean Timothy Okonsky, Prasenjit Mitra, Hilal Ezgi Toraman
To mitigate this gap, we present the Printed English and Chemical Equations (PEaCE) dataset, containing both synthetic and real-world records, and evaluate the efficacy of transformer-based OCR models when trained on this resource.
Optical Character Recognition (OCR)
no code implementations • 15 Jan 2024 • Saptarshi Sengupta, Connor Heaton, Prasenjit Mitra, Soumalya Sarkar
Machine Reading Comprehension (MRC) has been a long-standing problem in NLP and, with the recent introduction of the BERT family of transformer-based language models, it has come a long way toward being solved.
no code implementations • 25 Oct 2023 • Saptarshi Sengupta, Connor Heaton, Shreya Ghosh, Preslav Nakov, Prasenjit Mitra
Domain adaptation, the process of training a model in one domain and applying it to another, has been extensively explored in machine learning.
1 code implementation • 11 Sep 2021 • Connor Heaton, Prasenjit Mitra
Major League Baseball (MLB) has a storied history of using statistics to better understand and discuss the game of baseball, with an entire discipline of statistics dedicated to the craft, known as sabermetrics.