14 Feb 2024 • Matthieu Meeus, Igor Shilov, Manuel Faysse, Yves-Alexandre de Montjoye
We propose the use of copyright traps, the inclusion of fictitious entries in original content, to detect the use of copyrighted material in the training of LLMs, focusing on models where memorization does not naturally occur.
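A minimal sketch of the idea, under assumptions not taken from the abstract: here a trap is a deterministic sequence of invented tokens that would never appear in natural text, and "memorization" is stood in for by a toy bigram model rather than a real LLM. All names (`make_trap`, `inject_trap`, `BigramLM`) are hypothetical illustrations, not the authors' method.

```python
import random

def make_trap(seed: str, length: int = 8) -> str:
    # Hypothetical trap generator: a deterministic pseudo-random sequence of
    # invented tokens, highly unlikely to occur in natural text.
    rng = random.Random(seed)
    vocab = ["zorv", "qilp", "mextu", "flons", "drave", "plim", "stog", "brelk"]
    return " ".join(rng.choice(vocab) for _ in range(length))

def inject_trap(document: str, trap: str) -> str:
    # Append the fictitious entry to the original content before release.
    return document + "\n" + trap

class BigramLM:
    """Toy stand-in for an LLM: 'memorizes' the bigrams of its training corpus."""
    def __init__(self, corpus: list[str]):
        self.bigrams = set()
        for doc in corpus:
            toks = doc.split()
            self.bigrams.update(zip(toks, toks[1:]))

    def recall(self, text: str) -> float:
        # Fraction of the text's bigrams the model has seen during training.
        toks = text.split()
        pairs = list(zip(toks, toks[1:]))
        return sum(p in self.bigrams for p in pairs) / max(len(pairs), 1)

def trap_detected(model: BigramLM, trap: str, threshold: float = 0.9) -> bool:
    # Strong recall of a sequence that cannot occur naturally is evidence
    # that the trapped content was part of the training data.
    return model.recall(trap) >= threshold
```

In the real setting, the detection statistic would be the LLM's loss or perplexity on the trap sequence rather than bigram recall; the point illustrated is only the inject-then-test workflow.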
23 Oct 2023 • Matthieu Meeus, Shubham Jain, Marek Rei, Yves-Alexandre de Montjoye
First, we propose a procedure for developing and evaluating document-level membership inference for LLMs, leveraging data sources commonly used for training together with the model's release date.
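One way to read the release-date idea is as a labeling rule: documents from commonly used training sources published before the model's release are candidate members, while documents published after it are guaranteed non-members. A sketch of that rule, with hypothetical field names:

```python
from datetime import date

def split_by_release_date(documents: list[dict], model_release: date):
    """Split documents into candidate members (published before the model's
    release, so possibly in its training data) and guaranteed non-members
    (published after the release). Field names are illustrative."""
    members = [d for d in documents if d["published"] < model_release]
    non_members = [d for d in documents if d["published"] >= model_release]
    return members, non_members
```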
4 Jul 2023 • Florent Guépin, Matthieu Meeus, Ana-Maria Cretu, Yves-Alexandre de Montjoye
While membership inference attacks (MIAs) based on shadow modeling have become the standard for evaluating the privacy of synthetic data, they currently assume the attacker has access to an auxiliary dataset sampled from a distribution similar to that of the training data.
17 Jun 2023 • Matthieu Meeus, Florent Guépin, Ana-Maria Cretu, Yves-Alexandre de Montjoye
When evaluating the privacy of synthetic data releases, including from a legal perspective, choosing which vulnerable records to target is as important as making MIAs more accurate.