no code implementations • ALTA 2018 • Xavier Holt, Andrew Chisholm
Business documents encode a wealth of information in a format tailored to human consumption, i.e. aesthetically dispersed natural language text, graphics and tables.
BIG-bench Machine Learning Optical Character Recognition (OCR)
1 code implementation • EACL 2017 • Andrew Chisholm, Will Radford, Ben Hachey
We investigate the generation of one-sentence Wikipedia biographies from facts derived from Wikidata slot-value pairs.
no code implementations • 20 Feb 2017 • Bo Han, Will Radford, Anaïs Cadilhac, Art Harol, Andrew Chisholm, Ben Hachey
Text generation is increasingly common but often requires manual post-editing where high precision is critical to end users.
no code implementations • 7 Nov 2016 • Will Radford, Andrew Chisholm, Ben Hachey, Bo Han
We report on an exploratory analysis of Emoji Dick, a project that leverages crowdsourcing to translate Melville's Moby Dick into emoji.
1 code implementation • TACL 2015 • Andrew Chisholm, Ben Hachey
Entity disambiguation with Wikipedia relies on structured information from redirect pages, article text, inter-article links, and categories.