
Deployment Options

Choose the deployment method that best fits your needs:
*Time estimates assume prerequisites are configured (accounts, credentials, CLI tools installed). First-time setup may take longer. See individual deployment guides for detailed requirements.

Platform Comparison

| Feature | LangGraph Platform | Cloud Run | Kubernetes | Docker |
|---|---|---|---|---|
| Setup Time | ~2 min* | ~10 min* | ~1-2 hrs* | ~15 min* |
| Infrastructure | ✅ None | ⚠️ Minimal | ❌ Complex | ⚠️ Basic |
| Scaling | ✅ Auto | ✅ Auto | ⚠️ Manual config | ❌ Manual |
| LangSmith Integration | ✅ Built-in | ⚠️ Manual | ⚠️ Manual | ⚠️ Manual |
| Versioning | ✅ Built-in | ⚠️ Manual | ⚠️ Manual | ❌ None |
| Cost | Pay-per-use | Pay-per-use | Fixed + usage | Hosting only |
| Best For | Quick production | GCP apps | Enterprise | Development |
Recommendation: Start with LangGraph Platform for fastest time-to-production, or Cloud Run if you’re already on GCP. Use Kubernetes for enterprise self-hosted deployments.

Architecture

Supported Platforms

  • Google Cloud
  • AWS
  • Azure
  • Other

Google Kubernetes Engine (GKE)

Fully supported and tested
Features:
  • ✅ Autopilot and Standard clusters
  • ✅ Workload Identity for secrets
  • ✅ Cloud Armor for DDoS protection
  • ✅ Cloud Load Balancing
  • ✅ Managed Prometheus
Deploy to GKE →
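Provisioning can be sketched with the gcloud CLI; the cluster name and region below are placeholder assumptions, not values from this repo.

```shell
# Sketch: create an Autopilot cluster (Workload Identity is enabled by
# default on Autopilot). Name and region are placeholders.
CLUSTER="mcp-server-langgraph"
REGION="us-central1"
echo "creating Autopilot cluster ${CLUSTER} in ${REGION}"
# gcloud container clusters create-auto "${CLUSTER}" --region "${REGION}"
# gcloud container clusters get-credentials "${CLUSTER}" --region "${REGION}"
```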

Pre-Deployment Checklist

1. Review Security Audit

Run the security audit checklist:
python scripts/validate_production.py --strict
Ensure all checks pass before deploying.
2. Configure Secrets

Set up secrets management:
  • Configure Infisical project
  • Generate JWT secret: openssl rand -base64 32
  • Get LLM API keys
  • Setup OpenFGA store and model IDs
See Secrets Management
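The JWT step above can be sketched as a small script. The length check relies on the fact that 32 random bytes base64-encode to exactly 44 characters; the helper name is illustrative, not part of the repo.

```shell
# Generate a JWT signing secret and sanity-check its length before use.
generate_jwt_secret() {
  openssl rand -base64 32
}

secret="$(generate_jwt_secret)"
# 32 bytes of entropy base64-encode to exactly 44 characters.
if [ "${#secret}" -ne 44 ]; then
  echo "unexpected secret length: ${#secret}" >&2
  exit 1
fi
echo "JWT secret OK (${#secret} chars)"
```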
3. Build Container Image

docker build -t your-registry/mcp-server-langgraph:1.0.0 .
docker push your-registry/mcp-server-langgraph:1.0.0
Or use GitHub Actions for automated builds.
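Tags pinned to a release version (as above) or to the git commit make rollbacks unambiguous. A sketch, assuming a placeholder registry name:

```shell
# Sketch: build an immutable image tag from the current commit.
REGISTRY="your-registry"                                   # placeholder
GIT_SHA="$(git rev-parse --short HEAD 2>/dev/null || echo dev)"
IMAGE="${REGISTRY}/mcp-server-langgraph:${GIT_SHA}"
echo "image: ${IMAGE}"
# docker build -t "${IMAGE}" .
# docker push "${IMAGE}"
```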
4. Prepare Infrastructure

  • Provision Kubernetes cluster
  • Setup DNS records
  • Configure TLS certificates
  • Deploy monitoring stack
  • Setup OpenFGA with PostgreSQL backend
5. Review Configuration

Update configuration for production:
  • Set ENVIRONMENT=production
  • Configure resource limits
  • Enable autoscaling
  • Setup network policies
  • Configure ingress and TLS

Quick Deploy (Docker)

For a quick test deployment:
# 1. Configure environment
cp .env.example .env.production
# Edit .env.production with your values

# 2. Start services
docker compose -f docker-compose.yml up -d

# 3. Verify
curl http://localhost:8000/health
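The health check may fail for the first few seconds while services start. A small retry helper (a sketch, not part of the repo) makes the verification step robust:

```shell
# Retry a command until it succeeds, with a fixed delay between attempts.
retry() {
  local attempts="$1" delay="$2"
  shift 2
  local i
  for i in $(seq 1 "$attempts"); do
    if "$@"; then
      return 0
    fi
    sleep "$delay"
  done
  return 1
}

# Usage: wait up to ~60s for the service to come up.
# retry 30 2 curl -fsS http://localhost:8000/health
```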

Production Deploy (Kubernetes)

For production deployment:
# 1. Create namespace
kubectl create namespace mcp-server-langgraph

# 2. Create secrets
kubectl create secret generic mcp-server-langgraph-secrets \
  --from-literal=jwt-secret-key="$(openssl rand -base64 32)" \
  --from-literal=google-api-key="YOUR_KEY" \
  -n mcp-server-langgraph

# 3. Deploy with Helm
helm upgrade --install mcp-server-langgraph ./helm/mcp-server-langgraph \
  --namespace mcp-server-langgraph \
  --values values-production.yaml \
  --wait

# 4. Verify deployment
kubectl rollout status deployment/mcp-server-langgraph -n mcp-server-langgraph

Environment Configuration

  • Development
  • Staging
  • Production
Development values, for example:
environment: development
replicaCount: 1
autoscaling:
  enabled: false
resources:
  limits:
    cpu: 500m
    memory: 512Mi
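For contrast, production values might look like the following. This is a sketch: the replica and autoscaling numbers mirror the HPA settings shown later in this guide, while the resource limits are assumptions to adjust for your workload.

```yaml
environment: production
replicaCount: 3
autoscaling:
  enabled: true
  minReplicas: 3
  maxReplicas: 10
resources:
  limits:
    cpu: "1"
    memory: 1Gi
```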

Monitoring & Observability

Deploy the monitoring stack:
# Prometheus + Grafana
helm install prometheus prometheus-community/kube-prometheus-stack \
  --namespace monitoring --create-namespace

# Jaeger
helm install jaeger jaegertracing/jaeger \
  --namespace monitoring

# OpenTelemetry Collector
kubectl apply -f kubernetes/otel-collector/
Access the Grafana and Jaeger dashboards via kubectl port-forward or your configured ingress once the pods are ready.

Scaling

Horizontal Pod Autoscaling

apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: mcp-server-langgraph
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: mcp-server-langgraph
  minReplicas: 3
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 70
  - type: Resource
    resource:
      name: memory
      target:
        type: Utilization
        averageUtilization: 80

Vertical Pod Autoscaling

kubectl apply -f kubernetes/vpa/

High Availability

Ensure high availability with:
  1. Multiple Replicas: Run at least 3 replicas
  2. Pod Disruption Budget: Maintain minimum availability
  3. Multi-Zone Deployment: Spread across availability zones
  4. Health Checks: Liveness and readiness probes
  5. Graceful Shutdown: Handle SIGTERM properly
  6. Circuit Breakers: Fail fast with timeouts
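Item 2 above can be expressed as a PodDisruptionBudget. A sketch; the label selector assumes the chart labels pods with `app: mcp-server-langgraph`, so verify it against your chart's actual labels.

```yaml
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: mcp-server-langgraph
  namespace: mcp-server-langgraph
spec:
  minAvailable: 2
  selector:
    matchLabels:
      app: mcp-server-langgraph
```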

Disaster Recovery

1. Backup Strategy

  • OpenFGA PostgreSQL database
  • Configuration and secrets
  • Persistent volumes
2. Recovery Procedure

  • Restore database from backup
  • Redeploy application
  • Verify functionality
3. Testing

  • Test disaster recovery quarterly
  • Document RTO/RPO
  • Update runbooks
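A dated backup of the OpenFGA PostgreSQL database can be sketched as follows; the connection details are placeholders for your environment.

```shell
# Write a date-stamped, compressed dump of the OpenFGA PostgreSQL database.
BACKUP_FILE="openfga-$(date +%Y%m%d).sql.gz"
echo "backing up to ${BACKUP_FILE}"
# pg_dump -h "${PGHOST:-localhost}" -U openfga openfga | gzip > "${BACKUP_FILE}"
```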

Security Hardening

Production security checklist:
  • TLS enabled for all endpoints
  • Network policies applied
  • Pod security policies enforced
  • RBAC configured with least privilege
  • Secrets encrypted at rest
  • Audit logging enabled
  • Regular security scans
  • Rate limiting configured
See Security Best Practices
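The network-policy item can start from a default-deny baseline like the one below. This is a sketch: allow rules for your ingress controller and inter-service traffic must be layered on top, or the deployment will be unreachable.

```yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-ingress
  namespace: mcp-server-langgraph
spec:
  podSelector: {}
  policyTypes:
  - Ingress
```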

Cost Optimization

  • GKE
  • EKS
  • AKS
For GKE:
  • Use Autopilot for hands-off management
  • Enable cluster autoscaling
  • Use preemptible (Spot) nodes for dev/staging
  • Configure resource requests accurately
  • Use committed use discounts

Next Steps