Search Results for author: Leander Heldring

Found 1 paper, 0 papers with code

American Stories: A Large-Scale Structured Text Dataset of Historical U.S. Newspapers

No code implementations • NeurIPS 2023 • Melissa Dell, Jacob Carlson, Tom Bryan, Emily Silcock, Abhishek Arora, Zejiang Shen, Luca D'Amico-Wong, Quan Le, Pablo Querubin, Leander Heldring

The resulting American Stories dataset provides high-quality data that could be used for pre-training a large language model to achieve better understanding of historical English and historical world knowledge.

Language Modelling, Large Language Model (+3 more)
