Welcome to LLM-PBE’s documentation!

LLM-PBE is a toolkit for assessing the data privacy of LLMs. It has the following features:

  • Comprehensive attack approaches (data extraction attacks, membership inference attacks, jailbreaking attacks, prompt injection attacks).

  • Practical defense approaches (differential privacy, machine unlearning, defensive prompting).

  • Multiple types of data (personally identifiable information, copyrighted work, domain knowledge, prompts).

  • Access to different LLMs (GPT-3.5/4, HuggingFace, TogetherAI, etc.).

[Figure: images/components.png]