## Overview
The GDPR compliance endpoints (`/api/v1/users/me/*`) require persistent storage to meet data subject rights requirements under GDPR Articles 15, 16, 17, 20, and 21.

**CRITICAL:** In-memory storage is NOT production-ready and will cause data loss on server restart, violating GDPR compliance requirements.
## GDPR Data Subject Rights Flow
## Environment Variables

### Required Configuration
Set these environment variables in your production deployment:
```bash
# Required: Set environment to production
ENVIRONMENT=production

# Required: Choose a persistent backend (postgres or redis)
GDPR_STORAGE_BACKEND=postgres  # or "redis"
```
### Development/Testing
For local development and testing only:
```bash
ENVIRONMENT=development
GDPR_STORAGE_BACKEND=memory
```
## Storage Backend Options

### Option 1: PostgreSQL (Recommended) ✅
PostgreSQL provides ACID compliance and is ideal for GDPR data subject rights. Fully implemented as of ADR-0041.
**Production Ready:** The PostgreSQL storage backend is fully implemented with a factory pattern, migrations, and comprehensive testing.
**Configuration:**

```bash
GDPR_STORAGE_BACKEND=postgres
GDPR_POSTGRES_URL=postgresql://user:pass@localhost:5432/gdpr
```
**Architecture Benefits** (see ADR-0041):
- **ACID Compliance**: Atomic GDPR Article 17 deletions across all user data
- **Cost-Effective**: 14x cheaper than Redis for 7-year retention ($50/month vs $720/month)
- **Audit Trail**: Time-series queries for compliance reports
- **Already in Stack**: Keycloak and OpenFGA use PostgreSQL
- **5-10 ms Latency**: Acceptable for user-initiated GDPR operations
**Database Schema:**

The PostgreSQL schema includes 5 tables optimized for GDPR compliance:
```sql
-- User profiles (GDPR Articles 15, 16, 17)
CREATE TABLE user_profiles (
    user_id VARCHAR(255) PRIMARY KEY,
    username VARCHAR(255) NOT NULL,
    name VARCHAR(255),
    email VARCHAR(255),
    preferences JSONB,
    created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
    updated_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);

-- User preferences (GDPR Articles 16, 17)
CREATE TABLE user_preferences (
    user_id VARCHAR(255) PRIMARY KEY REFERENCES user_profiles(user_id) ON DELETE CASCADE,
    preferences JSONB NOT NULL,
    created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
    updated_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);

-- Consent records (GDPR Article 21, 7-year retention)
CREATE TABLE consent_records (
    consent_id VARCHAR(255) PRIMARY KEY,
    user_id VARCHAR(255) NOT NULL,
    consent_type VARCHAR(50) NOT NULL,
    granted BOOLEAN NOT NULL,
    timestamp TIMESTAMP NOT NULL,
    ip_address VARCHAR(45),
    user_agent TEXT,
    created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);

CREATE INDEX idx_consent_user_id ON consent_records(user_id);
CREATE INDEX idx_consent_timestamp ON consent_records(timestamp);

-- Conversations (GDPR Articles 15, 20; 90-day retention)
CREATE TABLE conversations (
    conversation_id VARCHAR(255) PRIMARY KEY,
    user_id VARCHAR(255) NOT NULL,
    thread_id VARCHAR(255),
    messages JSONB NOT NULL,
    created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
    last_message_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);

CREATE INDEX idx_conversations_user_id ON conversations(user_id);
CREATE INDEX idx_conversations_last_message_at ON conversations(last_message_at);

-- Audit logs (HIPAA 7-year retention, GDPR Article 5(2))
CREATE TABLE audit_logs (
    audit_id VARCHAR(255) PRIMARY KEY,
    user_id VARCHAR(255),
    action VARCHAR(100) NOT NULL,
    resource_type VARCHAR(100),
    resource_id VARCHAR(255),
    metadata JSONB,
    timestamp TIMESTAMP NOT NULL,
    ip_address VARCHAR(45),
    user_agent TEXT
);

CREATE INDEX idx_audit_user_id ON audit_logs(user_id);
CREATE INDEX idx_audit_timestamp ON audit_logs(timestamp);
CREATE INDEX idx_audit_action ON audit_logs(action);
```
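An Article 15 (right of access) export has to read from every table above. A minimal sketch of how the per-table queries could be assembled (the helper name and return shape are illustrative, not the project's actual API); the queries are parameterized so the user ID is never interpolated into SQL:

```python
# Tables from the schema above that hold per-user data.
USER_DATA_TABLES = [
    "user_profiles",
    "user_preferences",
    "consent_records",
    "conversations",
    "audit_logs",
]


def build_export_queries(user_id: str) -> dict:
    """Return one parameterized SELECT per table holding this user's data.

    Illustrative helper for an Article 15 export; each value is a
    (sql, params) pair ready to pass to a DB driver such as psycopg.
    """
    return {
        table: (f"SELECT * FROM {table} WHERE user_id = %s", (user_id,))
        for table in USER_DATA_TABLES
    }
```

The results of these queries would then be serialized (e.g. to JSON) to satisfy both Article 15 (access) and Article 20 (portability).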
**Kubernetes Deployment:**
The schema is deployed via ConfigMap in Kubernetes:
```bash
# View the schema ConfigMap
kubectl get configmap postgres-gdpr-schema -o yaml
```
The ConfigMap is defined in `deployments/base/postgres-gdpr-schema-configmap.yaml` and is applied automatically during PostgreSQL initialization.
**Factory Pattern Usage:**

```python
from fastapi import Depends

from mcp_server_langgraph.compliance.gdpr.factory import get_gdpr_storage


# Dependency injection in FastAPI
async def my_endpoint(
    gdpr_storage: GDPRStorage = Depends(get_gdpr_storage),
):
    # Access user profiles
    profile = await gdpr_storage.user_profiles.get(user_id)

    # Access consents
    consents = await gdpr_storage.consents.get_user_consents(user_id)

    # Access conversations
    conversations = await gdpr_storage.conversations.list_by_user(user_id)

    # Log audit event
    await gdpr_storage.audit_logs.log(audit_entry)
```
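The factory itself is not shown here; conceptually it dispatches on `GDPR_STORAGE_BACKEND`. A simplified sketch of that selection logic (the class names are placeholders; the real `get_gdpr_storage` returns initialized storage objects, not names):

```python
import os
from typing import Optional


def select_backend(backend: Optional[str] = None) -> str:
    """Map GDPR_STORAGE_BACKEND to a storage implementation.

    Illustrative only: returns a class name string instead of an
    initialized storage object, to show the dispatch shape.
    """
    backend = backend or os.getenv("GDPR_STORAGE_BACKEND", "memory")
    implementations = {
        "postgres": "PostgresGDPRStorage",
        "redis": "RedisGDPRStorage",
        "memory": "InMemoryGDPRStorage",
    }
    if backend not in implementations:
        raise ValueError(f"Unknown GDPR storage backend: {backend!r}")
    return implementations[backend]
```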
### Option 2: Redis
Redis provides fast, persistent key-value storage suitable for consent management.
**Configuration:**

```bash
GDPR_STORAGE_BACKEND=redis
GDPR_REDIS_URL=redis://localhost:6379/2  # Use a separate DB from sessions
```
**Implementation Required:**

- Create a `RedisConsentStore` class
- Use Redis hashes for user profiles: `user:profile:{user_id}`
- Use Redis hashes for consents: `user:consents:{user_id}`
- Configure TTL if retention policies apply
**Example Implementation:**

```python
import json
from typing import Any, Dict

import redis


class RedisConsentStore:
    def __init__(self, redis_url: str):
        self.redis = redis.from_url(redis_url)

    def set_consent(self, user_id: str, consent_type: str, data: Dict[str, Any]) -> None:
        key = f"user:consents:{user_id}"
        self.redis.hset(key, consent_type, json.dumps(data))

    def get_consents(self, user_id: str) -> Dict[str, dict]:
        key = f"user:consents:{user_id}"
        consents = self.redis.hgetall(key)
        return {k.decode(): json.loads(v) for k, v in consents.items()}
```
## Production Guard

The application includes a runtime guard that refuses to start when `ENVIRONMENT=production` and `GDPR_STORAGE_BACKEND=memory` are both set.
**Error Message:**

```text
RuntimeError: CRITICAL: GDPR endpoints cannot use in-memory storage in production.
Set GDPR_STORAGE_BACKEND=postgres or GDPR_STORAGE_BACKEND=redis,
or set ENVIRONMENT=development for testing.
Data subject rights (GDPR compliance) require persistent storage.
```
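The guard reduces to a simple environment check at startup. A sketch of the logic described above (the function name is illustrative; the real check lives in the application's startup path):

```python
import os


def check_gdpr_storage_guard() -> None:
    """Refuse to start with volatile GDPR storage in production.

    Mirrors the guard described above: production deployments must use
    a persistent backend, otherwise startup fails fast.
    """
    environment = os.getenv("ENVIRONMENT", "development")
    backend = os.getenv("GDPR_STORAGE_BACKEND", "memory")
    if environment == "production" and backend == "memory":
        raise RuntimeError(
            "CRITICAL: GDPR endpoints cannot use in-memory storage in production. "
            "Set GDPR_STORAGE_BACKEND=postgres or GDPR_STORAGE_BACKEND=redis, "
            "or set ENVIRONMENT=development for testing."
        )
```

Failing at startup, rather than on the first GDPR request, keeps a misconfigured deployment from silently accepting data it will lose on restart.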
## Migration Checklist

Before deploying GDPR endpoints to production:

**New in v2.8.0:** PostgreSQL storage is fully implemented and production-ready. The factory pattern automatically initializes the correct storage backend based on the `GDPR_STORAGE_BACKEND` environment variable.
## Database Migrations

The `migrations/` directory contains schema migrations for PostgreSQL GDPR storage:

```text
migrations/
├── 001_initial_schema.sql      # Initial GDPR tables
├── 002_add_audit_indexes.sql   # Performance optimization indexes
└── README.md                   # Migration instructions
```
**Applying Migrations Manually:**

```bash
# Connect to the GDPR database
psql $GDPR_POSTGRES_URL

# Run migrations in order
\i migrations/001_initial_schema.sql
\i migrations/002_add_audit_indexes.sql
```
**Automated Migration (Kubernetes):**

Migrations are applied automatically via the `postgres-gdpr-schema` ConfigMap during PostgreSQL StatefulSet initialization. See `deployments/base/postgres-statefulset.yaml` for implementation details.
Future schema changes will be added as numbered migrations. Always apply migrations in sequential order.
## GDPR Compliance Requirements

### Data Retention

Configure retention policies based on your legal requirements:
```bash
# Example: 7-year retention for consent records (matches the schema above)
CONSENT_RETENTION_DAYS=2555
PROFILE_RETENTION_DAYS=365
```
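A cleanup job would turn these settings into a cutoff timestamp and purge older rows. A minimal sketch, assuming the retention variables shown above (the helper name is hypothetical):

```python
import os
from datetime import datetime, timedelta, timezone


def retention_cutoff(env_var: str, default_days: int) -> datetime:
    """Compute the timestamp before which records are eligible for purging.

    Illustrative: shows how a retention env var could drive a cleanup
    job, e.g. DELETE FROM consent_records WHERE timestamp < cutoff.
    """
    days = int(os.getenv(env_var, str(default_days)))
    return datetime.now(timezone.utc) - timedelta(days=days)
```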
### Audit Trail
All GDPR operations are logged with:
- User ID
- Operation type (access, rectification, erasure, etc.)
- Timestamp
- GDPR article (15, 16, 17, 20, 21)
**Log Example:**

```json
{
  "message": "User consent updated",
  "user_id": "user:alice",
  "consent_type": "analytics",
  "granted": true,
  "gdpr_article": "21",
  "timestamp": "2025-10-18T12:34:56Z"
}
```
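A small helper makes it easy to emit entries in exactly this shape. An illustrative sketch (the field names match the example; the project's actual logging API may differ):

```python
import json
from datetime import datetime, timezone


def consent_audit_entry(user_id: str, consent_type: str, granted: bool) -> str:
    """Serialize a consent-update audit entry in the documented format."""
    entry = {
        "message": "User consent updated",
        "user_id": user_id,
        "consent_type": consent_type,
        "granted": granted,
        "gdpr_article": "21",  # right to object
        "timestamp": datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ"),
    }
    return json.dumps(entry)
```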
### Data Deletion (Article 17)

When users exercise the right to erasure:
- User profile and preferences are deleted
- Consent records are deleted
- Audit logs are anonymized (user_id replaced with hash)
- Sessions are revoked
- Conversations are deleted
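For the audit-log step, replacing `user_id` with a keyed hash keeps related audit entries correlatable while removing the identifier. A sketch, assuming a secret salt held outside the database (the function name and `anon:` prefix are illustrative):

```python
import hashlib
import hmac


def anonymize_user_id(user_id: str, secret_salt: bytes) -> str:
    """Replace a user_id with a stable, non-reversible pseudonym.

    A keyed HMAC rather than a bare hash, so the mapping cannot be
    brute-forced from a list of known user IDs without the salt.
    """
    digest = hmac.new(secret_salt, user_id.encode("utf-8"), hashlib.sha256)
    return f"anon:{digest.hexdigest()[:16]}"
```

The same salt must be used for all entries so that an anonymized user's audit trail remains internally consistent after erasure.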
**Retention for Compliance:**
Some data may be retained for legal/compliance reasons:
- Anonymized audit logs (for GDPR compliance proof)
- Aggregated analytics (no PII)
- Financial records (tax law requirements)
## Testing

### Unit Tests

```bash
pytest tests/integration/test_gdpr_endpoints.py -v
```
### Integration Tests
Test with real database:
```bash
# Start PostgreSQL
docker-compose up -d postgres

# Run integration tests
GDPR_STORAGE_BACKEND=postgres \
GDPR_POSTGRES_URL=postgresql://test:test@localhost:5432/test_db \
pytest tests/integration/test_gdpr_endpoints.py
```
### Production Guard Test

Verify the production guard:
```bash
# Should fail
ENVIRONMENT=production GDPR_STORAGE_BACKEND=memory python -c "import mcp_server_langgraph.api.gdpr"

# Should succeed
ENVIRONMENT=production GDPR_STORAGE_BACKEND=postgres python -c "import mcp_server_langgraph.api.gdpr"
```
## Deployment Examples

### Docker Compose

```yaml
services:
  app:
    image: mcp-server-langgraph:latest
    environment:
      - ENVIRONMENT=production
      - GDPR_STORAGE_BACKEND=postgres
      - GDPR_POSTGRES_URL=postgresql://gdpr:${DB_PASSWORD}@postgres:5432/gdpr_db
    depends_on:
      - postgres

  postgres:
    image: postgres:15
    environment:
      - POSTGRES_DB=gdpr_db
      - POSTGRES_USER=gdpr
      - POSTGRES_PASSWORD=${DB_PASSWORD}
    volumes:
      - postgres_data:/var/lib/postgresql/data

volumes:
  postgres_data:
```
### Kubernetes

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: gdpr-config
data:
  ENVIRONMENT: "production"
  GDPR_STORAGE_BACKEND: "postgres"
---
apiVersion: v1
kind: Secret
metadata:
  name: gdpr-secrets
type: Opaque
stringData:
  GDPR_POSTGRES_URL: "postgresql://gdpr:password@postgres-service:5432/gdpr_db"
```
## Troubleshooting

### Error: "GDPR endpoints use in-memory storage"

**Cause:** `GDPR_STORAGE_BACKEND` is not set, or is set to `memory`

**Solution:**

```bash
export GDPR_STORAGE_BACKEND=postgres
export GDPR_POSTGRES_URL=postgresql://...
```
### Error: "RuntimeError: CRITICAL: GDPR endpoints cannot use in-memory storage"

**Cause:** Running in production with the memory backend

**Solution:** Change the backend or the environment:

```bash
# Option 1: Use a persistent backend
export GDPR_STORAGE_BACKEND=postgres

# Option 2: Switch to development (NOT for production!)
export ENVIRONMENT=development
```
### Data Loss on Restart

**Cause:** Using in-memory storage in a non-development environment

**Solution:** Migrate to PostgreSQL or Redis immediately.
## References
## Support

For implementation assistance:

- Check existing issues: https://github.com/vishnu2kmohan/mcp-server-langgraph/issues
- Create a new issue with the `gdpr` and `compliance` labels
- Include environment details and error messages