## Overview
The GDPR compliance endpoints (`/api/v1/users/me/*`) require persistent storage to meet data subject rights requirements under GDPR Articles 15, 16, 17, 20, and 21.

**CRITICAL:** In-memory storage is NOT production-ready and will cause data loss on server restart, violating GDPR compliance requirements.
## GDPR Data Subject Rights Flow
## Environment Variables

### Required Configuration
Set these environment variables in your production deployment:
```bash
# Required: Set environment to production
ENVIRONMENT=production

# Required: Choose a persistent backend (postgres or redis)
GDPR_STORAGE_BACKEND=postgres  # or "redis"
```
### Development/Testing
For local development and testing only:
```bash
ENVIRONMENT=development
GDPR_STORAGE_BACKEND=memory
```
## Storage Backend Options

### Option 1: PostgreSQL (Recommended) ✅
PostgreSQL provides ACID compliance and is ideal for GDPR data subject rights. Fully implemented as of ADR-0041.
**Production Ready:** The PostgreSQL storage backend is fully implemented with a factory pattern, migrations, and comprehensive testing.
**Configuration:**

```bash
GDPR_STORAGE_BACKEND=postgres
GDPR_POSTGRES_URL=postgresql://user:pass@localhost:5432/gdpr
```
**Architecture Benefits** (see ADR-0041):
- **ACID Compliance**: Atomic GDPR Article 17 deletions across all user data
- **Cost-Effective**: 14x cheaper than Redis for 7-year retention ($50/month vs $720/month)
- **Audit Trail**: Time-series queries for compliance reports
- **Already in Stack**: Keycloak and OpenFGA use PostgreSQL
- **5-10 ms Latency**: Acceptable for user-initiated GDPR operations
**Database Schema:**

The PostgreSQL schema includes 5 tables optimized for GDPR compliance:
```sql
-- User profiles (GDPR Articles 15, 16, 17)
CREATE TABLE user_profiles (
    user_id VARCHAR(255) PRIMARY KEY,
    username VARCHAR(255) NOT NULL,
    name VARCHAR(255),
    email VARCHAR(255),
    preferences JSONB,
    created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
    updated_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);

-- User preferences (GDPR Articles 16, 17)
CREATE TABLE user_preferences (
    user_id VARCHAR(255) PRIMARY KEY REFERENCES user_profiles(user_id) ON DELETE CASCADE,
    preferences JSONB NOT NULL,
    created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
    updated_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);

-- Consent records (GDPR Article 21, 7-year retention)
CREATE TABLE consent_records (
    consent_id VARCHAR(255) PRIMARY KEY,
    user_id VARCHAR(255) NOT NULL,
    consent_type VARCHAR(50) NOT NULL,
    granted BOOLEAN NOT NULL,
    timestamp TIMESTAMP NOT NULL,
    ip_address VARCHAR(45),
    user_agent TEXT,
    created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);

CREATE INDEX idx_consent_user_id ON consent_records(user_id);
CREATE INDEX idx_consent_timestamp ON consent_records(timestamp);

-- Conversations (GDPR Articles 15, 20; 90-day retention)
CREATE TABLE conversations (
    conversation_id VARCHAR(255) PRIMARY KEY,
    user_id VARCHAR(255) NOT NULL,
    thread_id VARCHAR(255),
    messages JSONB NOT NULL,
    created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
    last_message_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);

CREATE INDEX idx_conversations_user_id ON conversations(user_id);
CREATE INDEX idx_conversations_last_message_at ON conversations(last_message_at);

-- Audit logs (HIPAA 7-year retention, GDPR Article 5(2))
CREATE TABLE audit_logs (
    audit_id VARCHAR(255) PRIMARY KEY,
    user_id VARCHAR(255),
    action VARCHAR(100) NOT NULL,
    resource_type VARCHAR(100),
    resource_id VARCHAR(255),
    metadata JSONB,
    timestamp TIMESTAMP NOT NULL,
    ip_address VARCHAR(45),
    user_agent TEXT
);

CREATE INDEX idx_audit_user_id ON audit_logs(user_id);
CREATE INDEX idx_audit_timestamp ON audit_logs(timestamp);
CREATE INDEX idx_audit_action ON audit_logs(action);
```
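An Article 15 (right of access) export has to read from every table above. A minimal sketch of how the per-table queries could be assembled (the helper name and return shape are illustrative, not the project's actual API); the queries are parameterized so the user ID is never interpolated into SQL:

```python
# Tables from the schema above that hold per-user data.
USER_DATA_TABLES = [
    "user_profiles",
    "user_preferences",
    "consent_records",
    "conversations",
    "audit_logs",
]


def build_export_queries(user_id: str) -> dict:
    """Return one parameterized SELECT per table holding this user's data.

    Illustrative helper for an Article 15 export; each value is a
    (sql, params) pair ready to pass to a DB driver such as psycopg.
    """
    return {
        table: (f"SELECT * FROM {table} WHERE user_id = %s", (user_id,))
        for table in USER_DATA_TABLES
    }
```

The results of these queries would then be serialized (e.g. to JSON) to satisfy both Article 15 (access) and Article 20 (portability).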
**Kubernetes Deployment:**
The schema is deployed via ConfigMap in Kubernetes:
```bash
# View the schema ConfigMap
kubectl get configmap postgres-gdpr-schema -o yaml
```
The ConfigMap is defined in `deployments/base/postgres-gdpr-schema-configmap.yaml` and is applied automatically during PostgreSQL initialization.
**Factory Pattern Usage:**

```python
from fastapi import Depends

from mcp_server_langgraph.compliance.gdpr.factory import get_gdpr_storage


# Dependency injection in FastAPI
async def my_endpoint(
    gdpr_storage: GDPRStorage = Depends(get_gdpr_storage),
):
    # Access user profiles
    profile = await gdpr_storage.user_profiles.get(user_id)

    # Access consents
    consents = await gdpr_storage.consents.get_user_consents(user_id)

    # Access conversations
    conversations = await gdpr_storage.conversations.list_by_user(user_id)

    # Log audit event
    await gdpr_storage.audit_logs.log(audit_entry)
```
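The factory itself is not shown here; conceptually it dispatches on `GDPR_STORAGE_BACKEND`. A simplified sketch of that selection logic (the class names are placeholders; the real `get_gdpr_storage` returns initialized storage objects, not names):

```python
import os
from typing import Optional


def select_backend(backend: Optional[str] = None) -> str:
    """Map GDPR_STORAGE_BACKEND to a storage implementation.

    Illustrative only: returns a class name string instead of an
    initialized storage object, to show the dispatch shape.
    """
    backend = backend or os.getenv("GDPR_STORAGE_BACKEND", "memory")
    implementations = {
        "postgres": "PostgresGDPRStorage",
        "redis": "RedisGDPRStorage",
        "memory": "InMemoryGDPRStorage",
    }
    if backend not in implementations:
        raise ValueError(f"Unknown GDPR storage backend: {backend!r}")
    return implementations[backend]
```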
### Option 2: Redis
Redis provides fast, persistent key-value storage suitable for consent management.
**Configuration:**

```bash
GDPR_STORAGE_BACKEND=redis
GDPR_REDIS_URL=redis://localhost:6379/2  # Use a separate DB from sessions
```
**Implementation Required:**

- Create a `RedisConsentStore` class
- Use Redis hashes for user profiles: `user:profile:{user_id}`
- Use Redis hashes for consents: `user:consents:{user_id}`
- Configure TTL if retention policies apply
**Example Implementation:**

```python
import json
from typing import Any, Dict

import redis


class RedisConsentStore:
    def __init__(self, redis_url: str):
        self.redis = redis.from_url(redis_url)

    def set_consent(self, user_id: str, consent_type: str, data: Dict[str, Any]) -> None:
        key = f"user:consents:{user_id}"
        self.redis.hset(key, consent_type, json.dumps(data))

    def get_consents(self, user_id: str) -> Dict[str, dict]:
        key = f"user:consents:{user_id}"
        consents = self.redis.hgetall(key)
        return {k.decode(): json.loads(v) for k, v in consents.items()}
```
## Production Guard

The application includes a runtime guard that refuses to start when `ENVIRONMENT=production` and `GDPR_STORAGE_BACKEND=memory` are both set.
**Error Message:**

```text
RuntimeError: CRITICAL: GDPR endpoints cannot use in-memory storage in production.
Set GDPR_STORAGE_BACKEND=postgres or GDPR_STORAGE_BACKEND=redis,
or set ENVIRONMENT=development for testing.
Data subject rights (GDPR compliance) require persistent storage.
```
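The guard reduces to a simple environment check at startup. A sketch of the logic described above (the function name is illustrative; the real check lives in the application's startup path):

```python
import os


def check_gdpr_storage_guard() -> None:
    """Refuse to start with volatile GDPR storage in production.

    Mirrors the guard described above: production deployments must use
    a persistent backend, otherwise startup fails fast.
    """
    environment = os.getenv("ENVIRONMENT", "development")
    backend = os.getenv("GDPR_STORAGE_BACKEND", "memory")
    if environment == "production" and backend == "memory":
        raise RuntimeError(
            "CRITICAL: GDPR endpoints cannot use in-memory storage in production. "
            "Set GDPR_STORAGE_BACKEND=postgres or GDPR_STORAGE_BACKEND=redis, "
            "or set ENVIRONMENT=development for testing."
        )
```

Failing at startup, rather than on the first GDPR request, keeps a misconfigured deployment from silently accepting data it will lose on restart.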
## Migration Checklist

Before deploying GDPR endpoints to production:

**New in v2.8.0:** PostgreSQL storage is fully implemented and production-ready. The factory pattern automatically initializes the correct storage backend based on the `GDPR_STORAGE_BACKEND` environment variable.
## Database Migrations

The `migrations/` directory contains schema migrations for PostgreSQL GDPR storage:

```text
migrations/
├── 001_initial_schema.sql      # Initial GDPR tables
├── 002_add_audit_indexes.sql   # Performance optimization indexes
└── README.md                   # Migration instructions
```
**Applying Migrations Manually:**

```bash
# Connect to the GDPR database
psql $GDPR_POSTGRES_URL

# Run migrations in order
\i migrations/001_initial_schema.sql
\i migrations/002_add_audit_indexes.sql
```
**Automated Migration (Kubernetes):**

Migrations are applied automatically via the `postgres-gdpr-schema` ConfigMap during PostgreSQL StatefulSet initialization. See `deployments/base/postgres-statefulset.yaml` for implementation details.
Future schema changes will be added as numbered migrations. Always apply migrations in sequential order.
## GDPR Compliance Requirements

### Data Retention

Configure retention policies based on your legal requirements:
```bash
# Example: 7-year retention for consent records (matches the schema above)
CONSENT_RETENTION_DAYS=2555
PROFILE_RETENTION_DAYS=365
```
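A cleanup job would turn these settings into a cutoff timestamp and purge older rows. A minimal sketch, assuming the retention variables shown above (the helper name is hypothetical):

```python
import os
from datetime import datetime, timedelta, timezone


def retention_cutoff(env_var: str, default_days: int) -> datetime:
    """Compute the timestamp before which records are eligible for purging.

    Illustrative: shows how a retention env var could drive a cleanup
    job, e.g. DELETE FROM consent_records WHERE timestamp < cutoff.
    """
    days = int(os.getenv(env_var, str(default_days)))
    return datetime.now(timezone.utc) - timedelta(days=days)
```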
### Audit Trail
All GDPR operations are logged with:
- User ID
- Operation type (access, rectification, erasure, etc.)
- Timestamp
- GDPR article (15, 16, 17, 20, 21)
**Log Example:**

```json
{
  "message": "User consent updated",
  "user_id": "user:alice",
  "consent_type": "analytics",
  "granted": true,
  "gdpr_article": "21",
  "timestamp": "2025-10-18T12:34:56Z"
}
```
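A small helper makes it easy to emit entries in exactly this shape. An illustrative sketch (the field names match the example; the project's actual logging API may differ):

```python
import json
from datetime import datetime, timezone


def consent_audit_entry(user_id: str, consent_type: str, granted: bool) -> str:
    """Serialize a consent-update audit entry in the documented format."""
    entry = {
        "message": "User consent updated",
        "user_id": user_id,
        "consent_type": consent_type,
        "granted": granted,
        "gdpr_article": "21",  # right to object
        "timestamp": datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ"),
    }
    return json.dumps(entry)
```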
### Data Deletion (Article 17)

When users exercise the right to erasure:
- User profile and preferences are deleted
- Consent records are deleted
- Audit logs are anonymized (user_id replaced with hash)
- Sessions are revoked
- Conversations are deleted
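For the audit-log step, replacing `user_id` with a keyed hash keeps related audit entries correlatable while removing the identifier. A sketch, assuming a secret salt held outside the database (the function name and `anon:` prefix are illustrative):

```python
import hashlib
import hmac


def anonymize_user_id(user_id: str, secret_salt: bytes) -> str:
    """Replace a user_id with a stable, non-reversible pseudonym.

    A keyed HMAC rather than a bare hash, so the mapping cannot be
    brute-forced from a list of known user IDs without the salt.
    """
    digest = hmac.new(secret_salt, user_id.encode("utf-8"), hashlib.sha256)
    return f"anon:{digest.hexdigest()[:16]}"
```

The same salt must be used for all entries so that an anonymized user's audit trail remains internally consistent after erasure.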
**Retention for Compliance:**
Some data may be retained for legal/compliance reasons:
- Anonymized audit logs (for GDPR compliance proof)
- Aggregated analytics (no PII)
- Financial records (tax law requirements)
## Testing

### Unit Tests

```bash
pytest tests/integration/test_gdpr_endpoints.py -v
```
### Integration Tests
Test with real database:
```bash
# Start PostgreSQL
docker-compose up -d postgres

# Run integration tests
GDPR_STORAGE_BACKEND=postgres \
GDPR_POSTGRES_URL=postgresql://test:test@localhost:5432/test_db \
pytest tests/integration/test_gdpr_endpoints.py
```
### Production Guard Test

Verify the production guard:
```bash
# Should fail
ENVIRONMENT=production GDPR_STORAGE_BACKEND=memory python -c "import mcp_server_langgraph.api.gdpr"

# Should succeed
ENVIRONMENT=production GDPR_STORAGE_BACKEND=postgres python -c "import mcp_server_langgraph.api.gdpr"
```
## Deployment Examples

### Docker Compose

```yaml
services:
  app:
    image: mcp-server-langgraph:latest
    environment:
      - ENVIRONMENT=production
      - GDPR_STORAGE_BACKEND=postgres
      - GDPR_POSTGRES_URL=postgresql://gdpr:${DB_PASSWORD}@postgres:5432/gdpr_db
    depends_on:
      - postgres

  postgres:
    image: postgres:15
    environment:
      - POSTGRES_DB=gdpr_db
      - POSTGRES_USER=gdpr
      - POSTGRES_PASSWORD=${DB_PASSWORD}
    volumes:
      - postgres_data:/var/lib/postgresql/data

volumes:
  postgres_data:
```
### Kubernetes

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: gdpr-config
data:
  ENVIRONMENT: "production"
  GDPR_STORAGE_BACKEND: "postgres"
---
apiVersion: v1
kind: Secret
metadata:
  name: gdpr-secrets
type: Opaque
stringData:
  GDPR_POSTGRES_URL: "postgresql://gdpr:password@postgres-service:5432/gdpr_db"
```
## Troubleshooting

### Error: "GDPR endpoints use in-memory storage"

**Cause:** `GDPR_STORAGE_BACKEND` is not set, or is set to `memory`

**Solution:**

```bash
export GDPR_STORAGE_BACKEND=postgres
export GDPR_POSTGRES_URL=postgresql://...
```
### Error: "RuntimeError: CRITICAL: GDPR endpoints cannot use in-memory storage"

**Cause:** Running in production with the memory backend

**Solution:** Change the backend or the environment:

```bash
# Option 1: Use a persistent backend
export GDPR_STORAGE_BACKEND=postgres

# Option 2: Switch to development (NOT for production!)
export ENVIRONMENT=development
```
### Data Loss on Restart

**Cause:** Using in-memory storage in a non-development environment

**Solution:** Migrate to PostgreSQL or Redis immediately.
## References
## Support

For implementation assistance:

- Check existing issues: https://github.com/vishnu2kmohan/mcp-server-langgraph/issues
- Create a new issue with the `gdpr` and `compliance` labels
- Include environment details and error messages