Search Results for author: Natalia Vassilieva

Found 4 papers, 1 paper with code

SlimPajama-DC: Understanding Data Combinations for LLM Training

no code implementations • 19 Sep 2023 • Zhiqiang Shen, Tianhua Tao, Liqun Ma, Willie Neiswanger, Zhengzhong Liu, Hongyi Wang, Bowen Tan, Joel Hestness, Natalia Vassilieva, Daria Soboleva, Eric Xing

This paper aims to understand the impacts of various data combinations (e.g., web text, Wikipedia, GitHub, books) on the training of large language models using SlimPajama.
