
Data Privacy in AI — Differential Privacy, GDPR & Model Leakage

Protecting individuals from AI systems that learn too much about them.


Definition

AI systems are voracious consumers of personal data — training on medical records, financial transactions, private messages, and browsing history. Data privacy in AI addresses three concerns: (1) Training data privacy — preventing models from memorising and leaking personal information. (2) Inference privacy — protecting the data of individuals whose information is used for predictions. (3) Model inversion — the risk that model outputs reveal training data. Differential privacy provides mathematically rigorous privacy guarantees. GDPR and CCPA create legal frameworks. The tension between AI capability (which often improves with more data) and privacy (which requires limiting data) is a central AI ethics challenge.

The model memorisation problem

Large language models memorise training data. Carlini et al. (2021) demonstrated that carefully crafted prompts can extract verbatim memorised training text from GPT-2, including individuals' names, email addresses, phone numbers, and other personal information scraped from the public web. Rare, unique strings that appear in the training corpus are especially likely to be memorised and regurgitated. This is not a bug — it is an inherent risk of training on real-world data containing private information.
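The mechanism can be illustrated with a toy stand-in for a language model — a bigram lookup table, not a neural network, but the failure mode is analogous: a unique string seen once during training is completed verbatim from its prefix. The corpus and secret here are invented for illustration.

```python
from collections import defaultdict

# Toy "model": a bigram table built from a corpus containing a fake secret.
# Real LLM memorisation is analogous: rare, unique strings seen during
# training can be completed verbatim from their prefix.
corpus = "the user wrote my card number is 4111 1111 1111 1111 thanks".split()

bigrams = defaultdict(list)
for prev, nxt in zip(corpus, corpus[1:]):
    bigrams[prev].append(nxt)

def complete(prompt: str, steps: int = 8) -> str:
    """Greedily continue the prompt using the most common next token."""
    tokens = prompt.split()
    for _ in range(steps):
        candidates = bigrams.get(tokens[-1])
        if not candidates:
            break
        tokens.append(max(set(candidates), key=candidates.count))
    return " ".join(tokens)

# The unique secret is reproduced verbatim from a three-word prefix:
print(complete("card number is", steps=4))
```

The greedy completion recovers the full fake card number because the secret occurs exactly once, so its continuation is unambiguous — the same reason low-frequency personal data is disproportionately at risk in real models.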

Differential Privacy — mathematical privacy guarantee

A mechanism M is (ε, δ)-differentially private if, for any two datasets D and D' differing in one individual's data, and any set of outputs S:

Pr[M(D) ∈ S] ≤ e^ε · Pr[M(D') ∈ S] + δ

ε (epsilon) is the privacy budget: smaller ε means the output distributions on D and D' are closer — more privacy, bought with more noise. δ is the probability that the guarantee fails entirely (typically set well below 1/dataset size). In DP-SGD, the Gaussian mechanism adds calibrated noise to clipped per-sample gradients at each training step.

Differential Privacy in ML training with Opacus

# Opacus: differentially private training for PyTorch
# pip install opacus

import torch
import torch.nn as nn
from opacus import PrivacyEngine
from opacus.validators import ModuleValidator

# Standard model and training setup
model     = nn.Sequential(nn.Linear(10, 64), nn.ReLU(), nn.Linear(64, 2))
model     = ModuleValidator.fix(model)   # Fix any incompatible layers
optimizer = torch.optim.SGD(model.parameters(), lr=0.05)
loss_fn   = nn.CrossEntropyLoss()

# train_loader is assumed defined: a standard torch.utils.data.DataLoader
# over your training dataset.
# Wrap with PrivacyEngine — adds per-sample clipping and Gaussian noise to gradients
privacy_engine = PrivacyEngine()
model, optimizer, data_loader = privacy_engine.make_private_with_epsilon(
    module=model,
    optimizer=optimizer,
    data_loader=train_loader,
    epochs=10,
    target_epsilon=1.0,       # Privacy budget: lower = more private
    target_delta=1e-5,        # Probability of privacy failure
    max_grad_norm=1.0,        # Clip gradients to bound sensitivity
)

# Training loop is identical to non-private training
for epoch in range(10):
    for X, y in data_loader:
        optimizer.zero_grad()
        loss = loss_fn(model(X), y)
        loss.backward()
        optimizer.step()

    # Report current privacy expenditure
    epsilon = privacy_engine.get_epsilon(delta=1e-5)
    print(f"Epoch {epoch}: ε = {epsilon:.3f} (target: 1.0)")

# Privacy guarantee: knowing the model output tells an adversary almost nothing
# about whether any specific individual was in the training data

# ── Membership Inference Attack (testing if data was used for training) ──
def membership_inference_attack(model, target_point, threshold=0.7):
    """
    Simple membership inference: if model is very confident on this point,
    it may have memorised it (was in training set).
    """
    with torch.no_grad():
        probs = torch.softmax(model(target_point), dim=-1)
        confidence = probs.max().item()
    return confidence > threshold, confidence

# With differential privacy: confidence should be roughly equal for
# training and non-training data — membership inference becomes hard

GDPR compliance for AI systems

| GDPR requirement | AI implication | Technical implementation |
| --- | --- | --- |
| Lawful basis for processing | Training data must have consent or legitimate interest | Data provenance tracking, consent management systems |
| Purpose limitation | Data collected for one purpose cannot be used for another | Separate training pipelines, use-case documentation |
| Data minimisation | Collect only what is necessary | Feature selection, anonymisation before training |
| Right to erasure (right to be forgotten) | Delete a person's data from the model | Machine unlearning algorithms (active research area) |
| Right to explanation (Article 22) | Explain automated decisions | XAI systems (SHAP, LIME, rule-based fallbacks) |
| Data Protection Impact Assessment | Assess privacy risks before high-risk AI deployment | Pre-deployment risk assessment documentation |
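The first two rows — lawful basis and purpose limitation — can be enforced mechanically before training ever starts. Below is a hedged sketch of such a filter; the `Record` schema and `consented_purposes` field are hypothetical, illustrating the idea of provenance-aware data selection rather than any particular library.

```python
from dataclasses import dataclass, field

@dataclass
class Record:
    user_id: str
    data: str
    consented_purposes: set = field(default_factory=set)  # hypothetical schema

def select_training_data(records, purpose: str):
    """Keep only records whose owner consented to this specific purpose
    (lawful basis + purpose limitation), and log every decision so the
    provenance of the training set is auditable."""
    selected, audit_log = [], []
    for r in records:
        allowed = purpose in r.consented_purposes
        audit_log.append((r.user_id, purpose, allowed))
        if allowed:
            selected.append(r)
    return selected, audit_log

records = [
    Record("u1", "...", {"model_training", "analytics"}),
    Record("u2", "...", {"analytics"}),   # consented, but not to training
    Record("u3", "...", {"model_training"}),
]
train, log = select_training_data(records, "model_training")
print([r.user_id for r in train])
```

Note that u2's consent to analytics does not carry over to model training — purpose limitation means consent is scoped per use, not per user.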

Machine unlearning — the hardest GDPR problem

Right to erasure (GDPR Article 17) requires that personal data be deleted on request. For traditional databases this is straightforward. For an ML model trained on that data, however, the information is distributed across millions of parameters in ways that cannot be directly "deleted." Machine unlearning — removing a specific training example's influence from a trained model — is an active research problem with no fully satisfactory solution yet. Current approaches: retrain from scratch (complete but impractical for LLMs), approximate unlearning (perturb weights to remove most of the example's influence), or certified unlearning (provable bounds on residual information).
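One practical middle ground is sharded training in the style of SISA (Bourtoule et al., 2021): split the data into shards, train one sub-model per shard, and aggregate. Erasing a point then only requires retraining its shard. The sketch below uses a trivial per-shard mean as a stand-in for any learner — the point is the retraining cost, not the model.

```python
# SISA-style exact unlearning sketch: deleting a point retrains only the
# shard that contained it, never the full ensemble.
def train_shard(shard):
    """Stand-in 'model' for one shard: just the shard mean."""
    return sum(shard) / len(shard) if shard else 0.0

def build(data, n_shards=3):
    shards = [data[i::n_shards] for i in range(n_shards)]
    models = [train_shard(s) for s in shards]
    return shards, models

def predict(models):
    """Aggregate the ensemble: average the sub-models."""
    return sum(models) / len(models)

def unlearn(shards, models, point):
    """Erase `point` exactly: retrain only the shard that contained it."""
    for i, shard in enumerate(shards):
        if point in shard:
            shard.remove(point)
            models[i] = train_shard(shard)   # cost: one shard, not all data
            return
    # point was never trained on: nothing to do

data = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
shards, models = build(data)
unlearn(shards, models, 6.0)
assert all(6.0 not in s for s in shards)
```

This is "exact" unlearning — the resulting ensemble is identical to one trained without the deleted point — but the cost is paid up front in reduced per-model data and aggregation overhead, which is why it does not straightforwardly scale to monolithic LLMs.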

Practice questions

  1. What is a membership inference attack? (Answer: An adversary queries a trained model to determine whether a specific data point was in the training set. If the model is more confident on training data than held-out data, the adversary can infer membership. Used to reveal whether medical records, private messages, or financial data were used for training.)
  2. Differential privacy with ε=0.1 vs ε=10 — which is more private and what is the cost? (Answer: ε=0.1 is far more private (smaller privacy budget = more noise added). Cost: more noise = lower model accuracy. ε=0.1 makes the model nearly indistinguishable regardless of any individual's data but may reduce accuracy significantly. ε=10 provides weak privacy guarantees but minimal accuracy loss. Typical production: ε=1-3.)
  3. An AI company trains a model on user emails without explicit consent but argues they have "legitimate interest" under GDPR. Is this valid? (Answer: Disputed. GDPR's legitimate interest basis requires that the company's interest not be overridden by the individual's privacy interests. Training commercial AI on private personal communications would likely fail this balancing test. European regulators have generally been sceptical that legitimate interest can justify using private communications for commercial AI training without consent, though case law is still developing.)
  4. What makes "right to erasure" for ML models technically difficult? (Answer: Neural network weights are holistically trained — no single data point has a dedicated location. The influence of one training example is spread across all parameters. You cannot "delete" a data point from a trained model. Retraining from scratch is the only complete solution — infeasible for LLMs. Active research: certified unlearning, gradient-based approximate unlearning.)
  5. Max grad norm clipping in differential privacy serves what purpose? (Answer: Gradient clipping bounds the sensitivity of each training step to any single data point. Without clipping, one outlier data point can have unbounded influence on gradient updates. Clipping to max_grad_norm=1.0 ensures each sample contributes at most 1.0 to the gradient norm — allowing precisely calibrated noise to achieve (ε,δ)-privacy.)
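The clipping mechanism asked about in question 5 can be sketched in a few lines. This is a simplified, pure-Python version of the core DP-SGD step (Abadi et al., 2016) that Opacus implements internally — clip each per-sample gradient, sum, add Gaussian noise — not Opacus's actual code.

```python
import math
import random

def clip_and_noise(per_sample_grads, max_grad_norm=1.0, noise_multiplier=1.0):
    """One DP-SGD aggregation step over a batch of per-sample gradients:
    1. clip each sample's gradient to L2 norm <= max_grad_norm, bounding
       any single individual's influence (the sensitivity of the sum);
    2. sum the clipped gradients and add Gaussian noise with
       std = noise_multiplier * max_grad_norm;
    3. average over the batch."""
    clipped = []
    for g in per_sample_grads:
        norm = math.sqrt(sum(x * x for x in g))
        scale = min(1.0, max_grad_norm / norm) if norm > 0 else 1.0
        clipped.append([x * scale for x in g])

    dim = len(per_sample_grads[0])
    summed = [sum(g[j] for g in clipped) for j in range(dim)]
    sigma = noise_multiplier * max_grad_norm
    noisy = [s + random.gauss(0.0, sigma) for s in summed]
    return [x / len(per_sample_grads) for x in noisy]

random.seed(42)
grads = [[3.0, 4.0], [0.1, 0.2], [-10.0, 0.0]]   # last sample is an outlier
step = clip_and_noise(grads, max_grad_norm=1.0, noise_multiplier=1.0)
```

Without clipping, the outlier gradient of norm 10 would dominate the update and no finite noise scale could hide its presence; with clipping, every sample contributes at most norm 1.0, so noise of std 1.0 is enough to mask any individual.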

On LumiChats

Anthropic publicly commits to not training on user conversations by default and provides opt-out mechanisms. Understanding differential privacy explains why privacy guarantees are mathematically bounded rather than absolute — and why truly private AI requires fundamentally different training approaches than standard gradient descent.
