API Documentation
=================

This section provides detailed information about the APIs available in the Large Language Model Privacy Benchmark (LLM-PBE) library, covering Models, Attacks, Defenses, and Data modules.

Models
------

The Models module provides interfaces to interact with various large language models, both open-source and proprietary.

ChatGPT
^^^^^^^

The `ChatGPT` class provides specific functionalities to interact with the ChatGPT model.

**Parameters:**

- **api_key** (str): API key for accessing ChatGPT. Required for proprietary models.
- **model** (object): Model name, e.g., "gpt-3.5-turbo".
- **max_attempts** (int): Maximum number of attempts to generate text in case of a failure.
- **max_tokens** (int): Maximum number of tokens to generate.
- **temperature** (float): Sampling temperature for text generation.

**Methods:**

- `query(text)`: Generates a response to the input text.


TogetherAI
^^^^^^^^^^

The `Together` class provides specific functionalities to interact with models hosted at [TogetherAI](https://www.together.ai/).

**Parameters:**

- **api_key** (str): API key for accessing TogetherAI. Required for proprietary models.
- **model** (str): Model name, e.g., "meta-llama/Llama-2-7b-hf".
- **max_attempts** (int): Maximum number of attempts to generate text in case of a failure.
- **max_tokens** (int): Maximum number of tokens to generate.
- **temperature** (float): Sampling temperature for text generation.
- **top_p** (float): The cumulative probability of the top tokens to consider at each step.
- **top_k** (int): The maximum number of tokens to consider at each step.
- **repetition_penalty** (float): The parameter for repetition penalty.

**Methods:**

- `query(text)`: Generates a response to the input text.


Data
----

The Data module provides interfaces to interact with various datasets and data sources.

Enron
^^^^^

The `Enron` class provides specific functionalities to interact with the Enron dataset.

**Parameters:**

- **sample_duplication_rate** (float): The rate of sample duplication.
- **pseudonymize** (bool): True or False to pseudonymize the dataset.
- **mode** (str): "scrubbed" or "undefended"  for the dataset.

**Methods:**

- `train_set()`: Return the training dataset.
- `test_set()`: Return the test dataset.

Echr
^^^^

The `Echr` class provides specific functionalities to interact with the Echr dataset.

**Parameters:**

- **sample_duplication_rate** (float): The rate of sample duplication.
- **pseudonymize** (bool): True or False to pseudonymize the dataset.
- **mode** (str): "scrubbed" or "undefended"  for the dataset.

**Methods:**

- `train_set()`: Return the training dataset.
- `test_set()`: Return the test dataset.


Attacks
-------

The Attacks module implements various techniques to test the privacy and security of large language models.

DEA
^^^

The `DEA` class provides specific functionalities for data extraction attacks.

**Parameters:**

- **model** (object): The model class to attack.
- **data** (object): The data class to attack.
- **attack_method** (str): The attack method to use.


**Methods:**

- `execute(data, model)`: Generates a response to the input text.
  

Jailbreak
^^^^^^^^^

The `Jailbreak` class provides specific functionalities for data extraction attacks.


**Methods:**

- `execute(data, model)`: Return the responses of the model to the input text with jailbreaking prompts.