API Gateway Deployment Guide

Complete guide for deploying OpenAPI-compliant APIs to production API Gateways.

Prerequisites
Kong API Gateway
AWS API Gateway
Google Cloud API Gateway
Azure API Management
Kubernetes Ingress

Prerequisites

OpenAPI Specification

File: openapi/v1.json
Version: OpenAPI 3.1.0
Endpoints: 22
Schemas: 42

Required Configuration

JWT secret key
Keycloak SSO (optional)
OpenFGA authorization
PostgreSQL database (for GDPR storage)
Redis (for rate limiting)

Kong API Gateway

1. Install Kong Gateway

# Using Docker
docker run -d --name kong-gateway \
  --network=kong-net \
  -e "KONG_DATABASE=postgres" \
  -e "KONG_PG_HOST=postgres" \
  -e "KONG_PROXY_ACCESS_LOG=/dev/stdout" \
  -e "KONG_ADMIN_ACCESS_LOG=/dev/stdout" \
  -e "KONG_PROXY_ERROR_LOG=/dev/stderr" \
  -e "KONG_ADMIN_ERROR_LOG=/dev/stderr" \
  -e "KONG_ADMIN_LISTEN=0.0.0.0:8001" \
  -p 8000:8000 \
  -p 8443:8443 \
  -p 8001:8001 \
  kong:latest

2. Import OpenAPI Specification

# Install deck (Kong's declarative configuration tool)
brew install deck

# Convert OpenAPI to Kong config
deck file openapi2kong \
  -s openapi/v1.json \
  -o kong.yaml

# Apply configuration
deck gateway sync kong.yaml

3. Configure API Key Plugin

Kong configuration is already integrated:

See src/mcp_server_langgraph/api/api_keys.py:validate endpoint
Kong calls /api/v1/api-keys/validate to exchange API key for JWT

Kong Plugin Config:

plugins:
  - name: request-transformer
    config:
      add:
        headers:
          - "Authorization: Bearer {{jwt_token}}"

4. Enable Rate Limiting

curl -X POST http://localhost:8001/services/mcp-server/plugins \
  --data "name=rate-limiting" \
  --data "config.minute=60" \
  --data "config.policy=redis" \
  --data "config.redis_host=redis"

AWS API Gateway

1. Import OpenAPI Spec

# Using AWS CLI
aws apigateway import-rest-api \
  --body file://openapi/v1.json \
  --fail-on-warnings \
  --endpoint-configuration types=REGIONAL

# Save API ID
export API_ID=<api-id-from-output>

2. Configure JWT Authorizer

# Create Lambda authorizer for JWT validation
aws apigateway create-authorizer \
  --rest-api-id $API_ID \
  --name JWTAuthorizer \
  --type TOKEN \
  --authorizer-uri arn:aws:apigateway:region:lambda:path/2015-03-31/functions/arn:aws:lambda:region:account-id:function:jwt-authorizer/invocations \
  --identity-source method.request.header.Authorization

3. Deploy to Stage

# Create deployment
aws apigateway create-deployment \
  --rest-api-id $API_ID \
  --stage-name production

# Get invoke URL
echo "https://${API_ID}.execute-api.region.amazonaws.com/production"

4. Add Usage Plan (Rate Limiting)

# Create usage plan
aws apigateway create-usage-plan \
  --name "Production Tier" \
  --throttle burstLimit=100,rateLimit=50 \
  --quota limit=10000,period=DAY

# Associate with API stage
aws apigateway create-usage-plan-key \
  --usage-plan-id $USAGE_PLAN_ID \
  --key-type API_KEY \
  --key-id $API_KEY_ID

Google Cloud API Gateway

1. Create API Config

# Deploy OpenAPI spec
gcloud api-gateway api-configs create mcp-api-v1 \
  --api=mcp-server \
  --openapi-spec=openapi/v1.json \
  --project=$PROJECT_ID \
  --backend-auth-service-account=$SERVICE_ACCOUNT

2. Create Gateway

# Create gateway
gcloud api-gateway gateways create mcp-gateway \
  --api=mcp-server \
  --api-config=mcp-api-v1 \
  --location=us-central1 \
  --project=$PROJECT_ID

# Get gateway URL
gcloud api-gateway gateways describe mcp-gateway \
  --location=us-central1 \
  --format="value(defaultHostname)"

3. Configure Authentication

Option A: API Keys

# Add to openapi/v1.json
securityDefinitions:
  api_key:
    type: apiKey
    name: x-api-key
    in: header

security:
  - api_key: []

Option B: JWT (Firebase Auth)

securityDefinitions:
  firebase:
    authorizationUrl: ""
    flow: "implicit"
    type: "oauth2"
    x-google-issuer: "https://securetoken.google.com/PROJECT_ID"
    x-google-jwks_uri: "https://www.googleapis.com/service_accounts/v1/metadata/x509/securetoken@system.gserviceaccount.com"
    x-google-audiences: "PROJECT_ID"

Azure API Management

1. Import OpenAPI

# Create API Management instance
az apim create \
  --name mcp-apim \
  --resource-group mcp-resources \
  --publisher-email admin@example.com \
  --publisher-name "MCP API"

# Import OpenAPI spec
az apim api import \
  --resource-group mcp-resources \
  --service-name mcp-apim \
  --path /api/v1 \
  --specification-format OpenApi \
  --specification-path openapi/v1.json \
  --api-id mcp-server-api

2. Configure JWT Validation

<!-- Add to inbound policy -->
<validate-jwt header-name="Authorization" failed-validation-httpcode="401">
  <openid-config url="https://YOUR_KEYCLOAK/realms/YOUR_REALM/.well-known/openid-configuration" />
  <required-claims>
    <claim name="aud" match="any">
      <value>mcp-api</value>
    </claim>
  </required-claims>
</validate-jwt>

3. Enable Rate Limiting

<rate-limit calls="100" renewal-period="60" />
<quota calls="10000" renewal-period="86400" />

Kubernetes Ingress (NGINX/Traefik)

Option 1: NGINX Ingress

apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: mcp-api-ingress
  annotations:
    nginx.ingress.kubernetes.io/rewrite-target: /
    nginx.ingress.kubernetes.io/rate-limit: "100"
    nginx.ingress.kubernetes.io/cors-allow-origin: "*"
    nginx.ingress.kubernetes.io/auth-url: "http://mcp-server.default.svc.cluster.local:8000/auth/validate"
spec:
  ingressClassName: nginx
  rules:
    - host: api.example.com
      http:
        paths:
          - path: /api/v1
            pathType: Prefix
            backend:
              service:
                name: mcp-server
                port:
                  number: 8000
  tls:
    - hosts:
        - api.example.com
      secretName: api-tls-cert

Option 2: Traefik Ingress

apiVersion: traefik.containo.us/v1alpha1
kind: IngressRoute
metadata:
  name: mcp-api-route
spec:
  entryPoints:
    - websecure
  routes:
    - match: Host(`api.example.com`) && PathPrefix(`/api/v1`)
      kind: Rule
      services:
        - name: mcp-server
          port: 8000
      middlewares:
        - name: rate-limit
        - name: jwt-auth
  tls:
    certResolver: letsencrypt

Environment Configuration

Required Environment Variables

# JWT Authentication
export JWT_SECRET_KEY="your-secret-key-here"

# Keycloak SSO
export KEYCLOAK_SERVER_URL="https://keycloak.example.com"
export KEYCLOAK_REALM="mcp-realm"
export KEYCLOAK_CLIENT_ID="mcp-api"
export KEYCLOAK_CLIENT_SECRET="client-secret"

# OpenFGA Authorization
export OPENFGA_API_URL="https://openfga.example.com"
export OPENFGA_STORE_ID="your-store-id"
export OPENFGA_API_TOKEN="openfga-token"

# Database
export DATABASE_URL="postgresql://user:pass@postgres:5432/mcp"

# Redis (Rate Limiting)
export REDIS_URL="redis://redis:6379/0"

# GDPR Storage
export GDPR_STORAGE_BACKEND="postgres"

# CORS
export CORS_ORIGINS="https://app.example.com,https://admin.example.com"

Testing Deployment

1. Health Check

curl https://api.example.com/health
# Expected: {"status": "healthy"}

curl https://api.example.com/health/ready
# Expected: {"status": "ready", "checks": {...}}

2. OpenAPI Documentation

# Swagger UI
https://api.example.com/docs

# ReDoc
https://api.example.com/redoc

# OpenAPI JSON
https://api.example.com/openapi.json

3. Authentication Test

# Login
curl -X POST https://api.example.com/auth/login \
  -H "Content-Type: application/json" \
  -d '{
    "username": "alice",
    "password": "secret"
  }'

# Use JWT token
export TOKEN="<access_token from response>"

curl https://api.example.com/api/v1/api-keys/ \
  -H "Authorization: Bearer $TOKEN"

4. Rate Limiting Test

# Make 101 requests (rate limit is 100/minute)
for i in {1..101}; do
  curl -s -o /dev/null -w "%{http_code}\n" \
    https://api.example.com/api/version
done

# Should see 200 responses, then 429 (Too Many Requests)

Monitoring & Observability

Prometheus Metrics

# Endpoint: /metrics/prometheus

# Metrics exposed:
# - http_requests_total
# - http_request_duration_seconds
# - circuit_breaker_state
# - rate_limit_exceeded_total

Grafana Dashboard

# Import dashboard from grafana.com
# ID: 14282 (FastAPI dashboard)

# Configure Prometheus data source
# URL: http://prometheus:9090

Traces (OpenTelemetry)

# Export to Jaeger
export OTEL_EXPORTER_OTLP_ENDPOINT="http://jaeger:4318"
export OTEL_SERVICE_NAME="mcp-server"

Security Checklist

Before production deployment:

API Gateway Features Comparison

Feature	Kong	AWS	GCP	Azure	NGINX
OpenAPI Import	✅	✅	✅	✅	⚠️
JWT Validation	✅	✅	✅	✅	✅
Rate Limiting	✅	✅	✅	✅	✅
API Key Management	✅	✅	✅	✅	❌
SCIM Support	✅	⚠️	⚠️	✅	❌
Monitoring	✅	✅	✅	✅	⚠️
Auto-scaling	✅	✅	✅	✅	⚠️

Legend: ✅ Full Support | ⚠️ Partial | ❌ Not Supported

Production Deployment Example (Kong + Kubernetes)

Complete Setup

# mcp-deployment.yaml
apiVersion: v1
kind: Service
metadata:
  name: mcp-server
spec:
  selector:
    app: mcp-server
  ports:
    - port: 8000
      targetPort: 8000
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: mcp-server
spec:
  replicas: 3
  selector:
    matchLabels:
      app: mcp-server
  template:
    metadata:
      labels:
        app: mcp-server
    spec:
      containers:
        - name: mcp-server
          image: your-registry/mcp-server:2.8.0
          ports:
            - containerPort: 8000
          env:
            - name: JWT_SECRET_KEY
              valueFrom:
                secretKeyRef:
                  name: mcp-secrets
                  key: jwt-secret
            - name: DATABASE_URL
              valueFrom:
                secretKeyRef:
                  name: mcp-secrets
                  key: database-url
          livenessProbe:
            httpGet:
              path: /health
              port: 8000
            initialDelaySeconds: 10
            periodSeconds: 30
          readinessProbe:
            httpGet:
              path: /health/ready
              port: 8000
            initialDelaySeconds: 5
            periodSeconds: 10
---
# Kong Ingress
apiVersion: configuration.konghq.com/v1
kind: KongPlugin
metadata:
  name: rate-limiting
config:
  minute: 100
  policy: redis
  redis_host: redis
---
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: mcp-api
  annotations:
    konghq.com/plugins: rate-limiting
spec:
  ingressClassName: kong
  rules:
    - host: api.example.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: kong-proxy
                port:
                  number: 80

API Versioning Strategy

URL-Based Versioning (Current)

/api/v1/* - Version 1 (current)
/api/v2/* - Version 2 (future)

Header-Based Versioning (Optional)

curl https://api.example.com/api/v1/users/me/data \
  -H "X-API-Version: 1.0" \
  -H "Authorization: Bearer $TOKEN"

Deprecation Policy

Announce deprecation - 6 months notice
Mark as deprecated - Add deprecated: true to OpenAPI
Update /api/version - Add to deprecated_versions with sunset date
Return headers - Sunset: Sat, 01 Jun 2026 00:00:00 GMT
Remove version - After sunset date

Breaking Change Prevention

Automated Validation

# Run before deploying
python scripts/validation/validate_openapi.py

# Check for breaking changes
python scripts/validation/detect_breaking_changes.py \
  --baseline openapi/v1-baseline.json \
  --current openapi/v1.json

Breaking Changes (Require Version Bump)

❌ Removing endpoints
❌ Removing required fields
❌ Changing field types
❌ Renaming fields
❌ Changing authentication methods

Safe Changes (No Version Bump)

✅ Adding new endpoints
✅ Adding optional fields
✅ Adding new response fields
✅ Adding query parameters (optional)
✅ Expanding enum values

CI/CD Integration

GitHub Actions

# .github/workflows/api-deploy.yml
name: Deploy API

on:
  push:
    branches: [main]

jobs:
  validate-openapi:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - name: Validate OpenAPI Spec
        run: |
          python scripts/validation/validate_openapi.py

      - name: Detect Breaking Changes
        run: |
          python scripts/validation/detect_breaking_changes.py \
            --baseline openapi/v1-baseline.json \
            --current openapi/v1.json

  generate-sdks:
    needs: validate-openapi
    runs-on: ubuntu-latest
    steps:
      - name: Generate Python SDK
        run: |
          docker run --rm \
            -v $(pwd):/local \
            openapitools/openapi-generator-cli generate \
            -i /local/openapi/v1.json \
            -g python \
            -o /local/clients/python

      - name: Publish to PyPI
        run: |
          cd clients/python
          python -m build
          twine upload dist/*

  deploy-to-kong:
    needs: validate-openapi
    runs-on: ubuntu-latest
    steps:
      - name: Deploy to Kong
        run: |
          deck file openapi2kong -s openapi/v1.json -o kong.yaml
          deck gateway sync kong.yaml --kong-addr https://kong-admin.example.com

Troubleshooting

Issue: 404 on API Endpoints

Cause: Routers not registered Solution: Verify all routers in server_streamable.py:1262-1271

Issue: OpenAPI Generation Fails

Cause: Pydantic model issues ($ref conflicts) Solution: See scim/schema.py:68-77 for $ref workaround

Issue: Authentication Fails

Cause: JWT secret not configured Solution: Set JWT_SECRET_KEY environment variable Cause: In-memory storage in production Solution: Set GDPR_STORAGE_BACKEND=postgres

Issue: Rate Limiting Not Working

Cause: Redis not configured Solution: Set REDIS_URL and configure SlowAPI

Performance Tuning

Recommended Settings

# Gunicorn workers
workers = (2 * cpu_cores) + 1  # e.g., 9 workers for 4 cores

# Worker class
worker_class = "uvicorn.workers.UvicornWorker"

# Connection limits
max_requests = 10000  # Restart workers after N requests
max_requests_jitter = 1000  # Add randomness to prevent thundering herd
keepalive = 65  # Seconds

# Timeouts
timeout = 120  # Request timeout
graceful_timeout = 30  # Graceful shutdown

Database Connection Pooling

# asyncpg pool settings
min_size = 10
max_size = 50
max_queries = 50000
max_inactive_connection_lifetime = 300

Support & Resources

OpenAPI Spec: openapi/v1.json
MCP Tools Spec: openapi/mcp-tools.json
API Documentation: /docs (Swagger UI)
Compliance Report: docs/api-compliance-report.mdx
SDK Usage: clients/README.md

Generated: 2025-11-02 Version: 2.8.0

Getting Started

Deployment Options

LangGraph Platform

Kubernetes - GKE

Kubernetes - EKS & AKS

Kubernetes - Best Practices

Infrastructure as Code

Monitoring & Observability

Advanced Deployment

Configuration

Operations

​API Gateway Deployment Guide

​Table of Contents

​Prerequisites

​OpenAPI Specification

​Required Configuration

​Kong API Gateway

​1. Install Kong Gateway

​2. Import OpenAPI Specification

​3. Configure API Key Plugin

​4. Enable Rate Limiting

​AWS API Gateway

​1. Import OpenAPI Spec

​2. Configure JWT Authorizer

​3. Deploy to Stage

​4. Add Usage Plan (Rate Limiting)

​Google Cloud API Gateway

​1. Create API Config

​2. Create Gateway

​3. Configure Authentication

​Azure API Management

​1. Import OpenAPI

​2. Configure JWT Validation

​3. Enable Rate Limiting

​Kubernetes Ingress (NGINX/Traefik)

​Option 1: NGINX Ingress

​Option 2: Traefik Ingress

​Environment Configuration

​Required Environment Variables

​Testing Deployment

​1. Health Check

​2. OpenAPI Documentation

​3. Authentication Test

​4. Rate Limiting Test

​Monitoring & Observability

​Prometheus Metrics

​Grafana Dashboard

​Traces (OpenTelemetry)

​Security Checklist

​API Gateway Features Comparison

​Production Deployment Example (Kong + Kubernetes)

​Complete Setup

​API Versioning Strategy

​URL-Based Versioning (Current)

​Header-Based Versioning (Optional)

​Deprecation Policy

​Breaking Change Prevention

​Automated Validation

​Breaking Changes (Require Version Bump)

​Safe Changes (No Version Bump)

​CI/CD Integration

​GitHub Actions

​Troubleshooting

​Issue: 404 on API Endpoints

​Issue: OpenAPI Generation Fails

​Issue: Authentication Fails

​Issue: GDPR Operations Fail

​Issue: Rate Limiting Not Working

​Performance Tuning

​Recommended Settings

​Database Connection Pooling

​Support & Resources

API Gateway Deployment Guide

Table of Contents

Prerequisites

OpenAPI Specification

Required Configuration

Kong API Gateway

1. Install Kong Gateway

2. Import OpenAPI Specification

3. Configure API Key Plugin

4. Enable Rate Limiting

AWS API Gateway

1. Import OpenAPI Spec

2. Configure JWT Authorizer

3. Deploy to Stage

4. Add Usage Plan (Rate Limiting)

Google Cloud API Gateway

1. Create API Config

2. Create Gateway

3. Configure Authentication

Azure API Management

1. Import OpenAPI

2. Configure JWT Validation

3. Enable Rate Limiting

Kubernetes Ingress (NGINX/Traefik)

Option 1: NGINX Ingress

Option 2: Traefik Ingress

Environment Configuration

Required Environment Variables

Testing Deployment

1. Health Check

2. OpenAPI Documentation

3. Authentication Test

4. Rate Limiting Test

Monitoring & Observability

Prometheus Metrics

Grafana Dashboard

Traces (OpenTelemetry)

Security Checklist

API Gateway Features Comparison

Production Deployment Example (Kong + Kubernetes)

Complete Setup

API Versioning Strategy

URL-Based Versioning (Current)

Header-Based Versioning (Optional)

Deprecation Policy

Breaking Change Prevention

Automated Validation

Breaking Changes (Require Version Bump)

Safe Changes (No Version Bump)

CI/CD Integration

GitHub Actions

Troubleshooting

Issue: 404 on API Endpoints

Issue: OpenAPI Generation Fails

Issue: Authentication Fails

Issue: GDPR Operations Fail

Issue: Rate Limiting Not Working

Performance Tuning

Recommended Settings

Database Connection Pooling

Support & Resources