
Deployment Options

Choose the deployment method that best fits your needs:
*Time estimates assume prerequisites are configured (accounts, credentials, CLI tools installed). First-time setup may take longer. See individual deployment guides for detailed requirements.

Platform Comparison

| Feature | LangGraph Platform | Cloud Run | Kubernetes | Docker |
|---|---|---|---|---|
| Setup Time | ~2 min* | ~10 min* | ~1-2 hrs* | ~15 min* |
| Infrastructure | ✅ None | ⚠️ Minimal | ❌ Complex | ⚠️ Basic |
| Scaling | ✅ Auto | ✅ Auto | ⚠️ Manual config | ❌ Manual |
| LangSmith Integration | ✅ Built-in | ⚠️ Manual | ⚠️ Manual | ⚠️ Manual |
| Versioning | ✅ Built-in | ⚠️ Manual | ⚠️ Manual | ❌ None |
| Cost | Pay-per-use | Pay-per-use | Fixed + usage | Hosting only |
| Best For | Quick production | GCP apps | Enterprise | Development |
Recommendation: Start with LangGraph Platform for fastest time-to-production, or Cloud Run if you’re already on GCP. Use Kubernetes for enterprise self-hosted deployments.

Architecture

Supported Platforms

  • Google Cloud
  • AWS
  • Azure
  • Other

Google Kubernetes Engine (GKE)

Fully supported and tested
Features:
  • ✅ Autopilot and Standard clusters
  • ✅ Workload Identity for secrets
  • ✅ Cloud Armor for DDoS protection
  • ✅ Cloud Load Balancing
  • ✅ Managed Prometheus
Deploy to GKE →
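Provisioning can be sketched with the gcloud CLI; the cluster name and region below are placeholder assumptions, not values from this repo.

```shell
# Sketch: create an Autopilot cluster (Workload Identity is enabled by
# default on Autopilot). Name and region are placeholders.
CLUSTER="mcp-server-langgraph"
REGION="us-central1"
echo "creating Autopilot cluster ${CLUSTER} in ${REGION}"
# gcloud container clusters create-auto "${CLUSTER}" --region "${REGION}"
# gcloud container clusters get-credentials "${CLUSTER}" --region "${REGION}"
```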

Pre-Deployment Checklist

1. Review Security Audit

Run the security audit checklist:
python scripts/validate_production.py --strict
Ensure all checks pass before deploying.
2. Configure Secrets

Set up secrets management:
  • Configure Infisical project
  • Generate JWT secret: openssl rand -base64 32
  • Get LLM API keys
  • Setup OpenFGA store and model IDs
See Secrets Management
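The JWT step above can be sketched as a small script. The length check relies on the fact that 32 random bytes base64-encode to exactly 44 characters; the helper name is illustrative, not part of the repo.

```shell
# Generate a JWT signing secret and sanity-check its length before use.
generate_jwt_secret() {
  openssl rand -base64 32
}

secret="$(generate_jwt_secret)"
# 32 bytes of entropy base64-encode to exactly 44 characters.
if [ "${#secret}" -ne 44 ]; then
  echo "unexpected secret length: ${#secret}" >&2
  exit 1
fi
echo "JWT secret OK (${#secret} chars)"
```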
3. Build Container Image

docker build -t your-registry/mcp-server-langgraph:1.0.0 .
docker push your-registry/mcp-server-langgraph:1.0.0
Or use GitHub Actions for automated builds.
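Tags pinned to a release version (as above) or to the git commit make rollbacks unambiguous. A sketch, assuming a placeholder registry name:

```shell
# Sketch: build an immutable image tag from the current commit.
REGISTRY="your-registry"                                   # placeholder
GIT_SHA="$(git rev-parse --short HEAD 2>/dev/null || echo dev)"
IMAGE="${REGISTRY}/mcp-server-langgraph:${GIT_SHA}"
echo "image: ${IMAGE}"
# docker build -t "${IMAGE}" .
# docker push "${IMAGE}"
```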
4. Prepare Infrastructure

  • Provision Kubernetes cluster
  • Setup DNS records
  • Configure TLS certificates
  • Deploy monitoring stack
  • Setup OpenFGA with PostgreSQL backend
5. Review Configuration

Update configuration for production:
  • Set ENVIRONMENT=production
  • Configure resource limits
  • Enable autoscaling
  • Setup network policies
  • Configure ingress and TLS

Quick Deploy (Docker)

For a quick test deployment:
# 1. Configure environment
cp .env.example .env.production
# Edit .env.production with your values

# 2. Start services
docker compose -f docker-compose.yml up -d

# 3. Verify
curl http://localhost:8000/health
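The health check may fail for the first few seconds while services start. A small retry helper (a sketch, not part of the repo) makes the verification step robust:

```shell
# Retry a command until it succeeds, with a fixed delay between attempts.
retry() {
  local attempts="$1" delay="$2"
  shift 2
  local i
  for i in $(seq 1 "$attempts"); do
    if "$@"; then
      return 0
    fi
    sleep "$delay"
  done
  return 1
}

# Usage: wait up to ~60s for the service to come up.
# retry 30 2 curl -fsS http://localhost:8000/health
```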

Production Deploy (Kubernetes)

For production deployment:
# 1. Create namespace
kubectl create namespace mcp-server-langgraph

# 2. Create secrets
kubectl create secret generic mcp-server-langgraph-secrets \
  --from-literal=jwt-secret-key="$(openssl rand -base64 32)" \
  --from-literal=google-api-key="YOUR_KEY" \
  -n mcp-server-langgraph

# 3. Deploy with Helm
helm upgrade --install mcp-server-langgraph ./helm/mcp-server-langgraph \
  --namespace mcp-server-langgraph \
  --values values-production.yaml \
  --wait

# 4. Verify deployment
kubectl rollout status deployment/mcp-server-langgraph -n mcp-server-langgraph

Environment Configuration

  • Development
  • Staging
  • Production
Development values, for example:
environment: development
replicaCount: 1
autoscaling:
  enabled: false
resources:
  limits:
    cpu: 500m
    memory: 512Mi
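For contrast, production values might look like the following. This is a sketch: the replica and autoscaling numbers mirror the HPA settings shown later in this guide, while the resource limits are assumptions to adjust for your workload.

```yaml
environment: production
replicaCount: 3
autoscaling:
  enabled: true
  minReplicas: 3
  maxReplicas: 10
resources:
  limits:
    cpu: "1"
    memory: 1Gi
```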

Monitoring & Observability

Deploy the monitoring stack:
# Prometheus + Grafana
helm install prometheus prometheus-community/kube-prometheus-stack \
  --namespace monitoring --create-namespace

# Jaeger
helm install jaeger jaegertracing/jaeger \
  --namespace monitoring

# OpenTelemetry Collector
kubectl apply -f kubernetes/otel-collector/
Access the Grafana and Jaeger dashboards via kubectl port-forward or your configured ingress once the pods are ready.

Scaling

Horizontal Pod Autoscaling

apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: mcp-server-langgraph
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: mcp-server-langgraph
  minReplicas: 3
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 70
  - type: Resource
    resource:
      name: memory
      target:
        type: Utilization
        averageUtilization: 80

Vertical Pod Autoscaling

kubectl apply -f kubernetes/vpa/

High Availability

Ensure high availability with:
  1. Multiple Replicas: Run at least 3 replicas
  2. Pod Disruption Budget: Maintain minimum availability
  3. Multi-Zone Deployment: Spread across availability zones
  4. Health Checks: Liveness and readiness probes
  5. Graceful Shutdown: Handle SIGTERM properly
  6. Circuit Breakers: Fail fast with timeouts
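Item 2 above can be expressed as a PodDisruptionBudget. A sketch; the label selector assumes the chart labels pods with `app: mcp-server-langgraph`, so verify it against your chart's actual labels.

```yaml
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: mcp-server-langgraph
  namespace: mcp-server-langgraph
spec:
  minAvailable: 2
  selector:
    matchLabels:
      app: mcp-server-langgraph
```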

Disaster Recovery

1. Backup Strategy

  • OpenFGA PostgreSQL database
  • Configuration and secrets
  • Persistent volumes
2. Recovery Procedure

  • Restore database from backup
  • Redeploy application
  • Verify functionality
3. Testing

  • Test disaster recovery quarterly
  • Document RTO/RPO
  • Update runbooks
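A dated backup of the OpenFGA PostgreSQL database can be sketched as follows; the connection details are placeholders for your environment.

```shell
# Write a date-stamped, compressed dump of the OpenFGA PostgreSQL database.
BACKUP_FILE="openfga-$(date +%Y%m%d).sql.gz"
echo "backing up to ${BACKUP_FILE}"
# pg_dump -h "${PGHOST:-localhost}" -U openfga openfga | gzip > "${BACKUP_FILE}"
```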

Security Hardening

Production security checklist:
  • TLS enabled for all endpoints
  • Network policies applied
  • Pod security policies enforced
  • RBAC configured with least privilege
  • Secrets encrypted at rest
  • Audit logging enabled
  • Regular security scans
  • Rate limiting configured
See Security Best Practices
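The network-policy item can start from a default-deny baseline like the one below. This is a sketch: allow rules for your ingress controller and inter-service traffic must be layered on top, or the deployment will be unreachable.

```yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-ingress
  namespace: mcp-server-langgraph
spec:
  podSelector: {}
  policyTypes:
  - Ingress
```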

Cost Optimization

  • GKE
  • EKS
  • AKS
For GKE:
  • Use Autopilot for hands-off management
  • Enable cluster autoscaling
  • Use preemptible (Spot) nodes for dev/staging
  • Configure resource requests accurately
  • Use committed use discounts

Next Steps