Search Results for author: Gatlen Culp

Found 1 paper, 1 paper with code

Explore, Establish, Exploit: Red Teaming Language Models from Scratch

3 code implementations • 15 Jun 2023 • Stephen Casper, Jason Lin, Joe Kwon, Gatlen Culp, Dylan Hadfield-Menell

Using a pre-existing classifier does not allow for red-teaming to be tailored to the target model.
