Generative Models are Unsupervised Predictors of Page Quality: A Colossal-Scale Study

17 Aug 2020 Dara Bahri Yi Tay Che Zheng Donald Metzler Cliff Brunk Andrew Tomkins

Large generative language models such as GPT-2 are well-known for their ability to generate text as well as their utility in supervised downstream tasks via fine-tuning. Our work is twofold: firstly we demonstrate via human evaluation that classifiers trained to discriminate between human and machine-generated text emerge as unsupervised predictors of "page quality", able to detect low quality content without any training... (read more)

PDF Abstract
No code implementations yet. Submit your code now

Tasks


Results from the Paper


  Submit results from this paper to get state-of-the-art GitHub badges and help the community compare results to other papers.

Methods used in the Paper