no code implementations • 13 Feb 2024 • Aishwarya P S, Pranav Ajit Nair, Yashas Samaga, Toby Boyd, Sanjiv Kumar, Prateek Jain, Praneeth Netrapalli
On the PaLM2 pretraining dataset, a tandem of PaLM2-Bison and PaLM2-Gecko demonstrates a 3. 3% improvement in next-token prediction accuracy over a standalone PaLM2-Gecko, offering a 1. 16x speedup compared to a PaLM2-Otter model with comparable downstream performance.
1 code implementation • 9 Feb 2021 • Albin Cassirer, Gabriel Barth-Maron, Eugene Brevdo, Sabela Ramos, Toby Boyd, Thibault Sottiaux, Manuel Kroiss
A central component of training in Reinforcement Learning (RL) is Experience: the data used for training.