‘User Inference Attacks on Large Language Models’

“In this paper, we study the privacy implications of fine-tuning LLMs on user data. To this end, we define a realistic threat model, called user inference, wherein an attacker infers whether or not a user’s data was used for fine-tuning. We implement attacks for this threat model that require only a small set of samples from a user (possibly different from the samples used for training). … We find that LLMs are susceptible to user inference attacks across a variety of fine-tuning datasets, at times with near perfect attack success rates.”

Find the paper and the full list of authors on arXiv.
