Overview
Deploy the MCP Server with LangGraph to Google Cloud Run for fully managed, serverless hosting on Google Cloud Platform.
Serverless GCP: Auto-scaling from 0 to 100+ instances with pay-per-use pricing.
Quick Start
One-Command Deployment
The setup script (cd cloudrun && ./deploy.sh --setup) will:
- Enable required GCP APIs
- Create service account
- Configure Secret Manager
- Build Docker image
- Deploy to Cloud Run
- Output the service URL
Manual Deployment
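The steps listed under One-Command Deployment could be run by hand roughly as follows. This is a sketch under stated assumptions, not the contents of deploy.sh: the project ID, region, service name, service account name, secret name, the LLM_API_KEY variable, and the Artifact Registry repository called containers are all placeholders. The commands are wrapped in a function so that nothing runs until you call it.

```shell
# All names below are placeholders (assumptions), not values taken from deploy.sh.
PROJECT_ID="my-project"
REGION="us-central1"
SERVICE="mcp-server"
SA_NAME="mcp-server-sa"
SA_EMAIL="${SA_NAME}@${PROJECT_ID}.iam.gserviceaccount.com"
SECRET="llm-api-key"
# Assumes an Artifact Registry Docker repository named "containers" already exists.
IMAGE="${REGION}-docker.pkg.dev/${PROJECT_ID}/containers/${SERVICE}:latest"

manual_deploy() {
  # 1. Enable required GCP APIs.
  gcloud services enable run.googleapis.com secretmanager.googleapis.com \
    cloudbuild.googleapis.com artifactregistry.googleapis.com \
    --project "$PROJECT_ID" || return

  # 2. Create the service account the service will run as.
  gcloud iam service-accounts create "$SA_NAME" --project "$PROJECT_ID" || return

  # 3. Store the API key (read from $LLM_API_KEY) and let the service account read it.
  printf '%s' "$LLM_API_KEY" | gcloud secrets create "$SECRET" \
    --replication-policy automatic --data-file=- --project "$PROJECT_ID" || return
  gcloud secrets add-iam-policy-binding "$SECRET" --project "$PROJECT_ID" \
    --member "serviceAccount:${SA_EMAIL}" \
    --role "roles/secretmanager.secretAccessor" || return

  # 4. Build the Docker image from the current directory with Cloud Build.
  gcloud builds submit --tag "$IMAGE" --project "$PROJECT_ID" || return

  # 5. Deploy to Cloud Run, exposing the secret as an environment variable.
  gcloud run deploy "$SERVICE" --image "$IMAGE" --region "$REGION" \
    --project "$PROJECT_ID" --service-account "$SA_EMAIL" \
    --set-secrets "LLM_API_KEY=${SECRET}:latest" \
    --no-allow-unauthenticated || return

  # 6. Output the service URL.
  gcloud run services describe "$SERVICE" --region "$REGION" \
    --project "$PROJECT_ID" --format 'value(status.url)'
}
```

After exporting LLM_API_KEY and adjusting the placeholders, calling manual_deploy runs the six steps in order and stops at the first one that fails.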
Benefits
Serverless
Scale to zero when idle. Auto-scale to 100+ instances under load.
Pay-Per-Use
Only charged during request processing. No idle costs.
Automatic HTTPS
Free SSL certificates and automatic certificate renewal.
Secret Manager
Integrated with Google Cloud Secret Manager for secure configuration.
Fast Deployment
Deploy in 2-3 minutes with automatic rollback on failure.
Global
Deploy to regions worldwide for low latency.
Prerequisites
- Google Cloud account with billing enabled
- gcloud CLI installed: Install Guide
- Docker (optional, for local testing)
Configuration
Secret Manager Setup
Store API keys securely:
Service Configuration
Edit cloudrun/service.yaml:
Deployment
Initial Deployment
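One way to perform the first deployment, assuming the service is defined declaratively in cloudrun/service.yaml as the Configuration section indicates, is gcloud run services replace. The sketch below is an assumption about the workflow rather than a description of what deploy.sh does. The region is a placeholder, and the service name passed to service_url must match the name set in your YAML file.

```shell
REGION="us-central1"   # assumption: replace with your region

# Create the service, or replace its configuration, from the YAML definition.
initial_deploy() {
  gcloud run services replace cloudrun/service.yaml --region "$REGION"
}

# Print the URL of a deployed service; $1 is the service name set in service.yaml.
service_url() {
  gcloud run services describe "$1" --region "$REGION" --format 'value(status.url)'
}
```

Run initial_deploy from the repository root so the relative path to service.yaml resolves, then call service_url with your service name to confirm the endpoint.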
Update Deployment
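Rolling out a new build usually means pointing the existing service at a new image, which creates a new revision. A minimal sketch follows. The region, the service name, and the image reference are placeholders you supply, and the function name is hypothetical.

```shell
REGION="us-central1"   # assumption: replace with your region

# $1: service name, $2: full image reference of the new build (both supplied by you).
update_deployment() {
  gcloud run services update "$1" --image "$2" --region "$REGION"
}
```

For example, update_deployment mcp-server followed by the new image reference would create a revision from that image. Cloud Run only shifts traffic to a new revision once it reports ready.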
Rollback
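A manual rollback can be sketched as two steps: list the revisions of the service, then route all traffic to a known-good one. The region, function names, and revision names here are illustrative assumptions. Real revision names come from the output of the list command.

```shell
REGION="us-central1"   # assumption: replace with your region

# Show the revisions of a service; $1 is the service name.
list_revisions() {
  gcloud run revisions list --service "$1" --region "$REGION"
}

# Send 100% of traffic to one revision; $1: service name, $2: revision name.
rollback_to() {
  gcloud run services update-traffic "$1" --to-revisions "$2=100" --region "$REGION"
}
```

Once the problem is fixed, gcloud run services update-traffic with the --to-latest flag returns traffic to the newest revision.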
Monitoring
View Logs
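Logs for a Cloud Run service can be read from Cloud Logging with gcloud logging read. The sketch below builds the filter in a separate helper so it can be reused. The helper names and the limit of 50 entries are assumptions, and the service name is whatever you deployed.

```shell
# Build a Cloud Logging filter for one Cloud Run service.
# $1: service name, $2: optional minimum severity such as ERROR.
log_filter() {
  printf 'resource.type="cloud_run_revision" AND resource.labels.service_name="%s"' "$1"
  if [ -n "${2:-}" ]; then
    printf ' AND severity>=%s' "$2"
  fi
}

# Print up to 50 of the most recent matching entries.
view_logs() {
  gcloud logging read "$(log_filter "$1" "${2:-}")" --limit 50
}
```

Calling view_logs with a service name shows recent entries of any severity. Adding ERROR as the second argument restricts the output to errors and above.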
Cloud Console
Access metrics in Cloud Console:
- Request count and latency
- CPU and memory utilization
- Instance count over time
- Error rates
Scaling
Configure Autoscaling
Scale to Zero (Cost-Optimized)
- Pros: Lowest cost
- Cons: 1-2s cold start
- Use: Dev, low-traffic apps
Warm Pool (Low Latency)
- Pros: No cold starts
- Cons: Higher baseline cost
- Use: Production, latency-sensitive
Hybrid (Balanced)
- Pros: Balance cost and latency
- Cons: Some cold starts during spikes
- Use: Most production workloads
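The three strategies above differ mainly in the minimum instance count. The sketch below maps each strategy name to autoscaling flags and applies them to a service. The specific counts (0, 5, and 1 minimum instances, 100 maximum) are illustrative assumptions, not values from this guide, and the function names and region are hypothetical.

```shell
REGION="us-central1"   # assumption: replace with your region

# Map a strategy name to autoscaling flags. The instance counts are illustrative only.
scaling_flags() {
  case "$1" in
    scale-to-zero) echo "--min-instances=0 --max-instances=100" ;;
    warm-pool)     echo "--min-instances=5 --max-instances=100" ;;
    hybrid)        echo "--min-instances=1 --max-instances=100" ;;
    *) echo "unknown strategy: $1" >&2; return 1 ;;
  esac
}

# Apply a strategy to a service; $1: service name, $2: strategy name.
apply_scaling() {
  flags="$(scaling_flags "$2")" || return
  # Intentionally unquoted so the two flags are passed as separate arguments.
  gcloud run services update "$1" --region "$REGION" $flags
}
```

For example, apply_scaling with a service name and hybrid keeps one instance warm while still allowing scale-out under load. An unknown strategy name returns a non-zero status without calling gcloud.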
Security
Network Security
VPC Access
Connect to private resources:
Authentication
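For authentication, one common setup is to keep the service private and grant the Cloud Run Invoker role only to the principals that need it. That setup is an assumption here, since this page does not state which mode the service uses. The sketch below shows granting access and then calling the service with an identity token. The function names, region, member, and URL are placeholders.

```shell
REGION="us-central1"   # assumption: replace with your region

# Allow one principal to invoke the service.
# $1: service name, $2: member such as user:alice@example.com.
grant_invoker() {
  gcloud run services add-iam-policy-binding "$1" --region "$REGION" \
    --member "$2" --role "roles/run.invoker"
}

# Call the service with the caller's identity token; $1 is the service URL.
call_authenticated() {
  curl -sS -H "Authorization: Bearer $(gcloud auth print-identity-token)" "$1"
}
```

The token comes from the account currently active in gcloud, so that account must hold the invoker role on the service.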
Cost Optimization
Optimization tips:
- Scale to zero when idle (--min-instances 0)
- CPU throttling for cost savings
- Right-size resources (start with 1 CPU, 1Gi memory)
- Request bundling to reduce request count charges
- Caching to reduce LLM API calls
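The first three tips translate directly into service settings. A minimal sketch, assuming a placeholder region and a service name you supply. The function name is hypothetical, and the 1 CPU and 1Gi values are only the suggested starting point from the list above.

```shell
REGION="us-central1"   # assumption: replace with your region

# Apply cost-oriented settings to a service; $1 is the service name.
# min-instances 0 scales to zero when idle, cpu and memory set the starting size,
# and cpu-throttling allocates CPU only while requests are being processed.
apply_cost_settings() {
  gcloud run services update "$1" --region "$REGION" \
    --min-instances 0 \
    --cpu 1 --memory 1Gi \
    --cpu-throttling
}
```

Request bundling and caching are application-level changes and are not covered by these flags.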
Troubleshooting
503 Service Unavailable
Secret Access Denied
Cause: The service account lacks permission to access the secret.
Solution:
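A typical fix is to find out which service account the service runs as and grant it the Secret Manager Secret Accessor role on the affected secret. The sketch below assumes a placeholder region. The function names are hypothetical, and the secret and service names are whatever your deployment uses.

```shell
REGION="us-central1"   # assumption: replace with your region

# Print the service account a Cloud Run service runs as; $1 is the service name.
service_account_of() {
  gcloud run services describe "$1" --region "$REGION" \
    --format 'value(spec.template.spec.serviceAccountName)'
}

# Grant read access on one secret; $1: secret name, $2: service account email.
grant_secret_access() {
  gcloud secrets add-iam-policy-binding "$1" \
    --member "serviceAccount:$2" --role "roles/secretmanager.secretAccessor"
}
```

After granting access, deploy a new revision so the service retries reading the secret at startup.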
Cold Start Latency
Solutions:
- Use --min-instances 1 or higher
- Enable startup CPU boost (default in service.yaml)
- Optimize Docker image size
Complete Guide
This page provides comprehensive Cloud Run deployment instructions. For additional deployment options:
Production Checklist
Pre-deployment security and compliance checklist
Monitoring Setup
Configure observability for Cloud Run
Next Steps
Deploy Now
Follow quick start guide above
Configure Monitoring
Set up observability and alerts
Production Checklist
Pre-deployment security checklist
Compare Platforms
Choose the right deployment option
Ready to deploy? Run cd cloudrun && ./deploy.sh --setup to deploy to Google Cloud Run!