31 Oct 2023 • Mattia Opper, J. Morrison, N. Siddharth
Using BabyBERTa as a probe, we find that grammar acquisition is largely driven by exposure to speech data, in particular to two of the BabyLM training corpora: AO-Childes and Open Subtitles.