no code implementations • 29 Apr 2024 • Scott Viteri, Max Lamparth, Peter Chatain, Clark Barrett
We formalize the idea that the truthfulness of a sender to a receiver LM is the degree to which the sender helps the receiver predict their future observations.
1 code implementation • 25 Oct 2023 • Gabriel Mukobi, Peter Chatain, Su Fong, Robert Windesheim, Gitta Kutyniok, Kush Bhatia, Silas Alberti
Here, we focus on two prevalent methods used to align these models, Supervised Fine-Tuning (SFT) and Reinforcement Learning from Human Feedback (RLHF).
no code implementations • 14 Feb 2023 • Michael Sun, Peter Chatain
In recent years, neural networks (NNs) have made giant leaps in a wide variety of domains.