Cost Optimization - MCP Server with LangGraph

Overview

Optimize infrastructure costs while maintaining performance, security, and reliability. This guide provides actionable strategies that can reduce costs by 30-66%.

GKE Autopilot

40-60% savings vs. Standard GKE

Committed Use

25-52% additional discounts

Right-sizing

10-30% from resource optimization

GCP Cost Optimization

Current vs. Optimized

Production:

Approach	Monthly Cost	Annual Cost	Savings
Baseline GKE	$1,290	$15,480	-
Autopilot	$970	$11,640	25%
+ 1yr CUD	$728	$8,736	44%
+ 3yr CUD	$466	$5,592	64%

Maximum Savings: $9,888/year (64%)

All Environments:

Environment	Baseline	Optimized	Savings
Production	$1,290	$970	$320
Staging	$480	$310	$170
Development	$200	$100	$100
Total	$1,970	$1,380	$590
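The production rows follow from applying each commitment discount to the Autopilot bill. A minimal sketch of that arithmetic, assuming the CUD percentage applies to the whole Autopilot cost (the function name is illustrative and not part of any GCP tooling):

```python
def apply_discount(monthly_cost: float, discount: float) -> float:
    """Return the monthly cost after a fractional discount (0.25 means 25%)."""
    return monthly_cost * (1.0 - discount)


baseline = 1290.0   # Baseline GKE, $/month (from the table)
autopilot = 970.0   # Autopilot, $/month (from the table)

one_year = apply_discount(autopilot, 0.25)    # 1-year CUD
three_year = apply_discount(autopilot, 0.52)  # 3-year CUD

for label, cost in [("Autopilot", autopilot),
                    ("+ 1yr CUD", one_year),
                    ("+ 3yr CUD", three_year)]:
    annual_savings = (baseline - cost) * 12
    print(f"{label:<10} ${cost:,.0f}/month  saves ${annual_savings:,.0f}/year")
```

The table rounds each monthly figure to whole dollars before annualizing, so the annual numbers printed here can differ from the table by a few dollars.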

Strategy 1: GKE Autopilot

Pay-per-pod pricing eliminates idle node costs:

How Autopilot Saves Money

Traditional GKE Standard:

Pay for entire nodes (even when idle)
Typical utilization: 50-70%
Waste: 30-50% of spend

GKE Autopilot:

Pay only for pod resources used
No idle costs
Automatic bin-packing

Savings: 40-60%

Right-size Pod Resources:

```yaml Before (Over-provisioned)
resources:
  requests:
    cpu: 1000m      # $35/month
    memory: 2Gi     # $8/month
# Per pod: $43/month
```

```yaml After (Optimized)
resources:
  requests:
    cpu: 250m       # $9/month
    memory: 512Mi   # $2/month
# Per pod: $11/month (74% savings!)
```
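The per-pod figures can be reproduced from the unit prices the comments imply: about $35 per vCPU-month and $4 per GiB-month. A rough sketch under that assumption (these are rates inferred from this guide, not official Autopilot pricing):

```python
# Unit prices implied by the comments above; illustrative only.
CPU_PRICE_PER_VCPU = 35.0   # $/month per vCPU (1000m -> $35)
MEM_PRICE_PER_GIB = 4.0     # $/month per GiB (2Gi -> $8)


def pod_monthly_cost(cpu_millicores: int, memory_mib: int) -> float:
    """Estimate the monthly cost of one pod from its resource requests."""
    cpu_cost = cpu_millicores / 1000 * CPU_PRICE_PER_VCPU
    mem_cost = memory_mib / 1024 * MEM_PRICE_PER_GIB
    return cpu_cost + mem_cost


before = pod_monthly_cost(1000, 2048)   # over-provisioned requests
after = pod_monthly_cost(250, 512)      # right-sized requests
savings = 1 - after / before
print(f"before=${before:.2f} after=${after:.2f} savings={savings:.0%}")
```

Unrounded, the optimized pod comes to $10.75, which the guide rounds to $11; multiply by the replica count to estimate the effect on a whole deployment.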

Action:

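# Profile actual usage
kubectl top pods -n mcp-production --containers

# Update the CPU and memory requests to the right-sized values
kubectl patch deployment production-mcp-server-langgraph \
  -n mcp-production \
  --type='json' \
  -p='[
    {"op": "replace", "path": "/spec/template/spec/containers/0/resources/requests/cpu", "value": "250m"},
    {"op": "replace", "path": "/spec/template/spec/containers/0/resources/requests/memory", "value": "512Mi"}
  ]'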

Strategy 2: Committed Use Discounts (CUDs)

Commit to 1-year or 3-year usage for significant discounts:

Resource	1-Year	3-Year
GKE Autopilot	25%	52%
Cloud SQL	25%	52%
Memorystore	25%	52%

Purchase via Console:

Go to: Billing → Commitments → Purchase Commitment
Select resources and term
Enable auto-renew

ROI: Pays for itself in first month
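A commitment is billed whether or not the capacity is used, so it only pays off while usage stays above a break-even level. The sketch below works that level out from the discount alone; the function names are illustrative.

```python
def committed_monthly_cost(on_demand_cost: float, discount: float) -> float:
    """Monthly bill for a commitment sized to cover `on_demand_cost` of usage."""
    return on_demand_cost * (1.0 - discount)


def break_even_utilization(discount: float) -> float:
    """Fraction of the committed usage that must actually be consumed
    before the commitment is cheaper than paying on demand."""
    return 1.0 - discount


for term, discount in [("1-year", 0.25), ("3-year", 0.52)]:
    cost = committed_monthly_cost(970.0, discount)
    needed = break_even_utilization(discount)
    print(f"{term}: ${cost:,.2f}/month, break-even at {needed:.0%} utilization")
```

With a 25% discount the commitment wins as long as at least 75% of the committed usage is consumed, which is why commitments fit steady base load better than bursty or short-lived environments.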

Complete Cost Guide

Comprehensive guide with 8 cost reduction strategies

Strategy 3: Development Cost Controls

Auto-shutdown dev clusters after hours:

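# Cloud Scheduler: scale the dev node pool to 0 at 6 PM on weekdays.
# NODE_POOL and SERVICE_ACCOUNT_EMAIL are placeholders; the service account
# needs permission to resize the cluster's node pool.
gcloud scheduler jobs create http scale-down-dev \
  --schedule="0 18 * * 1-5" \
  --uri="https://container.googleapis.com/v1/projects/PROJECT_ID/locations/ZONE/clusters/mcp-dev-gke/nodePools/NODE_POOL:setSize" \
  --http-method=POST \
  --headers="Content-Type=application/json" \
  --oauth-service-account-email=SERVICE_ACCOUNT_EMAIL \
  --message-body='{"nodeCount":0}'

# Scale back up at 6 AM on weekdays
gcloud scheduler jobs create http scale-up-dev \
  --schedule="0 6 * * 1-5" \
  --uri="https://container.googleapis.com/v1/projects/PROJECT_ID/locations/ZONE/clusters/mcp-dev-gke/nodePools/NODE_POOL:setSize" \
  --http-method=POST \
  --headers="Content-Type=application/json" \
  --oauth-service-account-email=SERVICE_ACCOUNT_EMAIL \
  --message-body='{"nodeCount":3}'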

Savings: ~$50-70/month per dev environment
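To see how much of the week this schedule removes, count the hours the cluster stays up. A small sketch, assuming the cluster runs only from 6 AM to 6 PM, Monday to Friday, as in the two jobs above:

```python
HOURS_PER_WEEK = 7 * 24  # 168


def weekly_on_hours(start_hour: int, stop_hour: int, days: int = 5) -> int:
    """Hours per week the cluster is up when it runs start_hour..stop_hour on `days` days."""
    return (stop_hour - start_hour) * days


on_hours = weekly_on_hours(6, 18)        # 60
off_hours = HOURS_PER_WEEK - on_hours    # 108
off_fraction = off_hours / HOURS_PER_WEEK

print(f"on={on_hours}h off={off_hours}h ({off_fraction:.0%} of the week)")
```

The off fraction is an upper bound on the node savings only; charges that keep running while nodes are scaled to zero (for example persistent disks) are not reduced by the schedule.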

Strategy 4: Database Optimization

Right-size Cloud SQL

Check the current instance tier, then compare it against observed CPU usage:

gcloud sql instances describe mcp-prod-postgres \
  --format="yaml(settings.tier)"

If CPU utilization stays below 50% consistently, downgrade:

db-custom-4-15360 → db-custom-2-7680
Savings: $140/month (50%)
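Custom tier names encode the machine shape as `db-custom-<vCPUs>-<memory in MB>`, so the downgrade above is simply a halving of both numbers. A small sketch that derives the smaller tier name; it does not check Cloud SQL's rules for valid vCPU and memory combinations, so treat the result as a candidate to verify:

```python
import re
from typing import Tuple

_TIER_RE = re.compile(r"^db-custom-(\d+)-(\d+)$")


def parse_custom_tier(tier: str) -> Tuple[int, int]:
    """Split a custom tier name into (vCPUs, memory in MB)."""
    match = _TIER_RE.match(tier)
    if match is None:
        raise ValueError(f"not a custom tier name: {tier!r}")
    return int(match.group(1)), int(match.group(2))


def halve_custom_tier(tier: str) -> str:
    """Return the tier name with half the vCPUs and half the memory."""
    vcpus, memory_mb = parse_custom_tier(tier)
    if vcpus % 2 or memory_mb % 2:
        raise ValueError(f"cannot halve {tier!r} evenly")
    return f"db-custom-{vcpus // 2}-{memory_mb // 2}"


print(halve_custom_tier("db-custom-4-15360"))
```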

Evaluate Read Replicas

Question: Do you use the read replica? Check connections:

gcloud sql instances describe mcp-prod-postgres-replica-1

If unused, remove:

cloudsql_read_replica_count = 0

Savings: $140/month

Optimize Backups

Reduce retention if compliance allows:

backup_retention_count         = 14  # vs. 30
transaction_log_retention_days = 3   # vs. 7

Savings: $5-10/month

Strategy 5: Redis Optimization

Right-size memory:

# Check usage
gcloud monitoring time-series list \
  --filter='metric.type="redis.googleapis.com/stats/memory/usage_ratio"'

If usage < 60%, downgrade:

5 GB → 3 GB: Save $88/month (40%)

For non-critical caching:

STANDARD_HA → BASIC: Save $110/month (50%)

BASIC tier has no SLA and no auto-failover. Only use for non-critical workloads.
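Before shrinking the instance, it helps to project what the usage ratio would be at the new size. The sketch below does that; the 80% ceiling is an assumed safety margin, not a Memorystore recommendation.

```python
def usage_ratio_after_resize(current_gb: float, usage_ratio: float, target_gb: float) -> float:
    """Project the memory usage ratio if the same data moves to `target_gb`."""
    used_gb = current_gb * usage_ratio
    return used_gb / target_gb


def safe_to_downsize(current_gb: float, usage_ratio: float, target_gb: float,
                     max_ratio: float = 0.8) -> bool:
    """True if the projected ratio stays at or below `max_ratio` (assumed margin)."""
    return usage_ratio_after_resize(current_gb, usage_ratio, target_gb) <= max_ratio


for ratio in (0.60, 0.45):
    projected = usage_ratio_after_resize(5, ratio, 3)
    verdict = "ok" if safe_to_downsize(5, ratio, 3) else "too tight"
    print(f"{ratio:.0%} of 5 GB -> {projected:.0%} of 3 GB ({verdict})")
```

At exactly 60% of 5 GB the data would fill a 3 GB instance, so 60% is better read as a hard upper bound than as a comfortable threshold.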

Cost Monitoring

Set Up Budget Alerts

gcloud billing budgets create \
  --billing-account=BILLING_ACCOUNT_ID \
  --display-name="MCP Production Monthly Budget" \
  --budget-amount=1200USD \
  --threshold-rule=percent=0.5 \
  --threshold-rule=percent=0.9 \
  --threshold-rule=percent=1.0

Alerts fire at 50%, 90%, and 100% of budget

Cost Allocation Labels

Already configured in Terraform:

labels = {
  environment = "production"
  team        = "platform"
  cost_center = "engineering"
  application = "mcp-server"
}

View costs by label in BigQuery:

SELECT
  label.value AS team,
  SUM(cost) AS total_cost
FROM
  `PROJECT_ID.billing_dataset.gcp_billing_export_*`,
  UNNEST(labels) AS label
WHERE
  label.key = 'team'
GROUP BY team
ORDER BY total_cost DESC
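The billing export stores labels as a repeated key/value field, so every row carries a list of labels rather than a single one. The same aggregation in plain Python may make that shape easier to see; the sample rows and the `data` team are invented for illustration.

```python
from collections import defaultdict


def cost_by_label(rows, key):
    """Sum `cost` per value of the label `key`, highest total first."""
    totals = defaultdict(float)
    for row in rows:
        for label in row["labels"]:          # one row can carry many labels
            if label["key"] == key:
                totals[label["value"]] += row["cost"]
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)


# Hypothetical rows shaped like the billing export.
rows = [
    {"cost": 120.0, "labels": [{"key": "team", "value": "platform"},
                               {"key": "environment", "value": "production"}]},
    {"cost": 30.0, "labels": [{"key": "team", "value": "data"}]},
    {"cost": 80.0, "labels": [{"key": "team", "value": "platform"}]},
    {"cost": 5.0, "labels": []},             # unlabeled spend is not attributed
]

for team, total in cost_by_label(rows, "team"):
    print(f"{team}: ${total:.2f}")
```

Rows without the label drop out of the result, which is a quick way to spot how much spend is still untagged.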

Quick Wins Checklist

Use GKE Autopilot

✅ Already configured

Savings: $200-400/month vs. Standard GKE

Right-size Pod Resources

kubectl top pods -n mcp-production --containers
# Adjust requests based on actual usage

Savings: $50-150/month

Purchase 1-year CUD

Console → Billing → Commitments

Savings: $242/month (25%)

Optimize Dev/Staging

Use zonal clusters (not regional)
Smaller instances
Auto-shutdown after hours

Savings: $100-200/month

Enable VPA

Vertical Pod Autoscaler automatically optimizes requests:

apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: production-mcp-server-vpa
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: production-mcp-server-langgraph
  updatePolicy:
    updateMode: "Auto"

Savings: 10-30%

Cost Optimization Roadmap

Month 1: Quick Wins ($150-250/month)

GKE Autopilot (vs. Standard): $200 saved
Right-size pod resources: $96 saved
Reduce dev/staging usage: $50 saved
Optimize log retention: $15 saved

Total: $361/month savings

Month 2: Commitments ($200-300/month)

Purchase 1-year CUD: $242/month saved
Right-size Cloud SQL: $140 saved (if applicable)
Optimize Redis tier: $88 saved (if non-critical)

Total: $470/month additional savings

Month 3-6: Advanced ($50-100/month)

Implement autoscaling schedules
Optimize network egress
Cleanup unused resources
FinOps automation

Total Potential: $580/month savings (60% of baseline)

AWS Cost Optimization

Current vs. Optimized

Production:

Approach	Monthly Cost	Annual Cost	Savings
Baseline EKS	$1,980	$23,760	-
Optimized	$803	$9,636	60%
+ Spot Instances	$688	$8,256	65%
+ Reserved Instances (1yr)	$545	$6,540	72%
+ Reserved Instances (3yr)	$425	$5,100	79%

Maximum Savings: $18,660/year (79%)

All Environments:

Environment	Baseline	Optimized	Savings
Production	$1,980	$803	$1,177
Staging	$890	$385	$505
Development	$450	$180	$270
Total	$3,320	$1,368	$1,952

Strategy 1: Spot Instances (70-90% Savings)

Use spot instances for fault-tolerant workloads:

How Spot Instances Save Money

On-Demand Pricing:

Fixed price, always available
t3.large: $0.0832/hour = $60.74/month

Spot Pricing:

Variable price (usually 70-90% discount)
t3.large: ~$0.025/hour = $18.25/month
Can be interrupted with 2-minute warning

Savings: $42.49/month per node (70%)
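The monthly figures above come from multiplying the hourly rate by 730, the average number of hours in a month. A short sketch of the conversion; the Spot rate is the approximate value quoted above and varies over time:

```python
HOURS_PER_MONTH = 730  # average hours per month (8,760 / 12)


def monthly_cost(hourly_rate: float) -> float:
    """Convert an hourly instance price to an average monthly cost."""
    return hourly_rate * HOURS_PER_MONTH


on_demand = monthly_cost(0.0832)   # t3.large on-demand
spot = monthly_cost(0.025)         # t3.large Spot (approximate)
savings = on_demand - spot
savings_pct = savings / on_demand

print(f"on-demand=${on_demand:.2f} spot=${spot:.2f} "
      f"savings=${savings:.2f} ({savings_pct:.0%})")
```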

Implementation:

# terraform/environments/prod/terraform.tfvars
enable_spot_node_group       = true
spot_node_group_desired_size = 5
spot_node_group_min_size     = 2
spot_node_group_max_size     = 10
spot_node_group_instance_types = [
  "t3.large", "t3a.large",
  "t3.xlarge", "t3a.xlarge"
]

Use Cases:

Development/staging environments
Batch processing workloads
Stateless applications with proper graceful shutdown
CI/CD build agents
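"Proper graceful shutdown" in practice means reacting to SIGTERM, which Kubernetes sends to a pod's containers when the pod is evicted from a node that is being drained. A minimal sketch of a worker loop that finishes its current item and then exits; whether the 2-minute Spot notice actually triggers a drain depends on how the cluster handles interruptions, and `process_one` is a hypothetical callback standing in for the real work:

```python
import signal
import threading

shutdown_requested = threading.Event()


def handle_termination(signum, frame):
    """Signal handler: record the request instead of exiting mid-task."""
    shutdown_requested.set()


def install_handlers():
    """Register the handler; must be called from the main thread."""
    signal.signal(signal.SIGTERM, handle_termination)
    signal.signal(signal.SIGINT, handle_termination)


def run_worker(process_one, poll_seconds=1.0):
    """Call process_one() repeatedly until shutdown is requested.

    Returns the number of items processed so the caller can log it.
    """
    processed = 0
    while not shutdown_requested.is_set():
        process_one()                          # finish the current unit of work
        processed += 1
        shutdown_requested.wait(poll_seconds)  # sleep, but wake early on shutdown
    return processed
```

A service would call `install_handlers()` at startup and then `run_worker(...)`. The pod's `terminationGracePeriodSeconds` should be long enough for one unit of work to complete.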

Strategy 2: Reserved Instances & Savings Plans

Commit to 1-year or 3-year usage for significant discounts:

Resource	On-Demand	1-Year RI	3-Year RI
EC2 Instances	$0.0832/hr	$0.0540/hr	$0.0416/hr
RDS Multi-AZ	$0.165/hr	$0.110/hr	$0.085/hr
ElastiCache	$0.075/hr	$0.055/hr	$0.042/hr
Discount	-	35%	50%
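The discount for each row can be checked directly from the hourly rates. In this sketch the rates are copied from the table; the Discount row matches the EC2 figures, while the RDS and ElastiCache rates work out somewhat lower.

```python
# (on-demand, 1-year RI, 3-year RI) hourly rates from the table above.
RATES = {
    "EC2 Instances": (0.0832, 0.0540, 0.0416),
    "RDS Multi-AZ": (0.165, 0.110, 0.085),
    "ElastiCache": (0.075, 0.055, 0.042),
}


def discount(on_demand: float, reserved: float) -> float:
    """Fractional discount of a reserved rate relative to on-demand."""
    return 1.0 - reserved / on_demand


for resource, (on_demand, one_year, three_year) in RATES.items():
    print(f"{resource:<14} 1yr={discount(on_demand, one_year):.0%} "
          f"3yr={discount(on_demand, three_year):.0%}")
```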

Purchasing Options:

Reserved Instances
Compute Savings Plans

Best for: Predictable, always-running workloads

# Purchase RDS Reserved Instance
aws rds purchase-reserved-db-instances-offering \
  --reserved-db-instances-offering-id OFFERING_ID \
  --reserved-db-instance-id mcp-langgraph-prod-ri \
  --db-instance-count 1

# Purchase EC2 Reserved Instance
aws ec2 purchase-reserved-instances-offering \
  --reserved-instances-offering-id OFFERING_ID \
  --instance-count 3

Strategy 3: Right-Sizing

Analyze and optimize resource allocation:

# Check node utilization
kubectl top nodes

# Check pod resource usage
kubectl top pods -A --containers

# Right-size based on actual usage
# Production pods: 250m CPU, 512Mi RAM = $11/month
# vs. Over-provisioned: 1000m CPU, 2Gi RAM = $43/month
# Savings: $32/month per pod (74%)

RDS Right-Sizing:

# Check CPU utilization (CloudWatch)
aws cloudwatch get-metric-statistics \
  --namespace AWS/RDS \
  --metric-name CPUUtilization \
  --dimensions Name=DBInstanceIdentifier,Value=mcp-langgraph-prod \
  --start-time 2025-01-01T00:00:00Z \
  --end-time 2025-02-01T00:00:00Z \
  --period 3600 \
  --statistics Average

# If avg < 40%, downsize instance class
# db.t3.large ($240/mo) → db.t3.medium ($120/mo) = $120 saved
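The CloudWatch command returns JSON with a `Datapoints` list, so the 40% rule can be applied with a few lines of code. A sketch, assuming the CLI was run with JSON output; the sample payload is invented, and averages can hide short peaks, so it is worth checking the `Maximum` statistic as well before downsizing.

```python
import json


def average_cpu(cli_output: str) -> float:
    """Mean of the hourly Average datapoints from get-metric-statistics."""
    datapoints = json.loads(cli_output).get("Datapoints", [])
    if not datapoints:
        raise ValueError("no datapoints returned for the requested window")
    return sum(point["Average"] for point in datapoints) / len(datapoints)


def should_downsize(avg_cpu_percent: float, threshold: float = 40.0) -> bool:
    """Apply the guide's rule of thumb: downsize when average CPU is below 40%."""
    return avg_cpu_percent < threshold


# Hypothetical output trimmed to two datapoints.
sample = json.dumps({
    "Label": "CPUUtilization",
    "Datapoints": [
        {"Timestamp": "2025-01-01T00:00:00Z", "Average": 22.5, "Unit": "Percent"},
        {"Timestamp": "2025-01-01T01:00:00Z", "Average": 31.5, "Unit": "Percent"},
    ],
})

avg = average_cpu(sample)
print(f"average CPU {avg:.1f}% -> downsize: {should_downsize(avg)}")
```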

Strategy 4: VPC Endpoints (70% Data Transfer Savings)

Avoid NAT gateway data transfer charges:

# Enable VPC endpoints in Terraform
enable_vpc_endpoints = true

# Creates endpoints for:
# - S3 (Gateway - free)
# - ECR API ($7.20/month)
# - ECR DKR ($7.20/month)
# - CloudWatch ($7.20/month)

# Cost: $21.60/month
# Saves: ~$50-150/month in data transfer
# ROI: Pays for itself immediately

Data Transfer Savings:

NAT Gateway data processing: $0.045/GB
VPC Endpoint data processing: $0.01/GB
Savings: $0.035/GB (78%)
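Because interface endpoints carry a fixed monthly charge, the per-GB saving only turns into a net saving above a certain traffic volume. A sketch of the break-even point using the prices quoted in this section (it models the three interface endpoints only, since the S3 gateway endpoint is free):

```python
NAT_PER_GB = 0.045               # NAT gateway data processing, $/GB
ENDPOINT_PER_GB = 0.01           # VPC endpoint data processing, $/GB
ENDPOINT_FIXED_MONTHLY = 21.60   # three interface endpoints at $7.20 each


def monthly_net_savings(gb_per_month: float) -> float:
    """Net monthly saving from routing `gb_per_month` via endpoints instead of NAT."""
    return gb_per_month * (NAT_PER_GB - ENDPOINT_PER_GB) - ENDPOINT_FIXED_MONTHLY


def break_even_gb() -> float:
    """Monthly traffic at which the endpoints pay for their fixed cost."""
    return ENDPOINT_FIXED_MONTHLY / (NAT_PER_GB - ENDPOINT_PER_GB)


print(f"break-even: {break_even_gb():.0f} GB/month")
print(f"net saving at 2,000 GB/month: ${monthly_net_savings(2000):.2f}")
```

Below the break-even volume the endpoints cost more than they save, so low-traffic dev environments may be better off without them.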

Strategy 5: Auto-Scaling

Cluster Autoscaler removes idle nodes:

# Deploy Cluster Autoscaler
kubectl apply -f https://raw.githubusercontent.com/kubernetes/autoscaler/master/cluster-autoscaler/cloudprovider/aws/examples/cluster-autoscaler-autodiscover.yaml

# Configure
kubectl set env deployment/cluster-autoscaler \
  -n kube-system \
  --containers=cluster-autoscaler \
  AWS_REGION=us-east-1

HPA scales pods based on utilization:

apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: mcp-server-langgraph
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: mcp-server-langgraph
  minReplicas: 2
  maxReplicas: 20
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 70

Savings: 20-40% on compute costs by removing idle capacity
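The HPA picks a replica count from the ratio of observed to target utilization: `desired = ceil(current * observed / target)`, clamped to the min and max. A simplified sketch of that rule; the real controller also applies a tolerance band and stabilization windows, which are left out here.

```python
import math


def desired_replicas(current_replicas: int, observed_utilization: float,
                     target_utilization: float, min_replicas: int,
                     max_replicas: int) -> int:
    """Core HPA scaling rule, clamped to the configured replica bounds."""
    raw = math.ceil(current_replicas * observed_utilization / target_utilization)
    return max(min_replicas, min(max_replicas, raw))


# With the manifest above: target 70% CPU, between 2 and 20 replicas.
print(desired_replicas(4, 90, 70, 2, 20))    # scale out under load
print(desired_replicas(2, 20, 70, 2, 20))    # idle, held at minReplicas
print(desired_replicas(10, 200, 70, 2, 20))  # capped at maxReplicas
```

Utilization is measured against the pod's CPU request, so right-sized requests also make the autoscaler's decisions more accurate.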

Strategy 6: Development Environment Controls

Stop dev/staging environments after hours:

# Lambda function to stop EKS node groups
# Stops Mon-Fri at 6 PM, starts Mon-Fri at 6 AM
# Savings: ~108 hours/week off (nights and weekends) = $200-300/month
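The Lambda function described above could look roughly like the following. This is a sketch under several assumptions: the node group is an EKS managed node group, the schedule passes an event such as `{"action": "stop"}`, and the environment variable names and sizes are placeholders. It uses boto3's `update_nodegroup_config` and keeps `maxSize` at 1 or more, since EKS does not accept a maximum of 0.

```python
import os

# Assumed sizes for the two scheduled actions.
SCHEDULES = {
    "stop": {"minSize": 0, "desiredSize": 0},
    "start": {"minSize": 1, "desiredSize": 3},
}


def build_scaling_config(action: str, max_size: int = 3) -> dict:
    """Return the scalingConfig for 'stop' or 'start'."""
    if action not in SCHEDULES:
        raise ValueError(f"unknown action: {action!r}")
    return {**SCHEDULES[action], "maxSize": max_size}


def scale_nodegroup(eks_client, cluster: str, nodegroup: str, action: str) -> dict:
    """Apply the scaling config to one managed node group."""
    return eks_client.update_nodegroup_config(
        clusterName=cluster,
        nodegroupName=nodegroup,
        scalingConfig=build_scaling_config(action),
    )


def handler(event, context):
    """Lambda entry point; CLUSTER_NAME and NODEGROUP_NAME are hypothetical env vars."""
    import boto3  # bundled with the Lambda Python runtime

    client = boto3.client("eks")
    scale_nodegroup(client, os.environ["CLUSTER_NAME"],
                    os.environ["NODEGROUP_NAME"], event["action"])
    return {"status": "ok", "action": event["action"]}
```

Two scheduled rules, one sending `stop` at 6 PM and one sending `start` at 6 AM on weekdays, would drive it. Passing the client into `scale_nodegroup` keeps the scaling logic testable without AWS credentials.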

Single NAT Gateway for Non-Production:

# terraform/environments/dev/terraform.tfvars
single_nat_gateway = true  # vs. Multi-AZ ($97.20 → $32.40 = $65 saved)

AWS Cost Breakdown

Production Environment ($803/month):

Service	Configuration	Monthly Cost	Optimization
EKS Control Plane	1 cluster	$73.00	Required
EC2 Nodes	3×t3.xlarge on-demand	$295.20	Use 1yr RI: $192
Spot Nodes	2×t3.large equiv	$14.60	✅ Optimized
RDS PostgreSQL	db.t3.medium Multi-AZ	$157.56	Use 1yr RI: $105
ElastiCache	2×cache.r6g.large	$109.50	Use 1yr RI: $80
NAT Gateway	3×Multi-AZ	$97.20	Required for HA
VPC Endpoints	6 endpoints	$21.60	Saves $50+ transfer
EBS Volumes	5×100GB gp3	$40.00	✅ Optimized (gp3)
CloudWatch	Logs + metrics	$15.00	Set retention limits
Total		$823.66	With RIs: $545
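A quick way to keep a breakdown like this honest is to sum the line items in code whenever a row changes. The sketch below copies the amounts from the table and uses `Decimal` to avoid floating-point drift in currency totals.

```python
from decimal import Decimal

# Monthly cost per service, copied from the table above.
LINE_ITEMS = {
    "EKS Control Plane": Decimal("73.00"),
    "EC2 Nodes": Decimal("295.20"),
    "Spot Nodes": Decimal("14.60"),
    "RDS PostgreSQL": Decimal("157.56"),
    "ElastiCache": Decimal("109.50"),
    "NAT Gateway": Decimal("97.20"),
    "VPC Endpoints": Decimal("21.60"),
    "EBS Volumes": Decimal("40.00"),
    "CloudWatch": Decimal("15.00"),
}

total = sum(LINE_ITEMS.values(), Decimal("0"))

# List services by share of the bill, largest first.
for service, cost in sorted(LINE_ITEMS.items(), key=lambda item: item[1], reverse=True):
    print(f"{service:<18} ${cost:>7}  {cost / total:.0%}")
print(f"{'Total':<18} ${total:>7}")
```

Sorting by share shows where the remaining optimization effort is best spent: compute and the database account for more than half of the total.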

Cost Optimization Checklist:

Compute

Enable spot instances for fault-tolerant workloads

Purchase Reserved Instances for base capacity (35-50% discount)

Deploy Cluster Autoscaler to remove idle nodes

Right-size instance types based on actual utilization

Use HPA to scale pods, not nodes

Stop dev/staging environments after hours

Storage

Use gp3 instead of gp2 (20% savings + better performance)

Enable RDS storage autoscaling (avoid over-provisioning)

Delete old EBS snapshots (lifecycle policy)

Use S3 lifecycle policies (Standard → IA → Glacier)

Network

Enable VPC endpoints (70% data transfer savings)

Use single NAT gateway in dev/staging

Optimize CloudFront caching to reduce origin requests

Use AWS PrivateLink instead of NAT for AWS services

Database

Purchase RDS Reserved Instances (35-50% discount)

Right-size RDS instance class based on CPU/memory usage

Use Aurora Serverless v2 for variable workloads

Enable automated backup retention limits

Use read replicas only when needed

Monitoring

Set CloudWatch Logs retention (7-30 days, not indefinite)

Use CloudWatch Logs Insights instead of exporting to S3

Enable AWS Cost Explorer and set budgets

Tag all resources for cost allocation

Set up billing alerts ($50, $100, $200 thresholds)

AWS Cost Monitoring

Set Up Cost Alerts:

# Create SNS topic
aws sns create-topic --name cost-alerts

# Subscribe email
aws sns subscribe \
  --topic-arn arn:aws:sns:us-east-1:ACCOUNT:cost-alerts \
  --protocol email \
  --notification-endpoint your-email@example.com

# Create budget
aws budgets create-budget \
  --account-id ACCOUNT_ID \
  --budget file://budget.json

budget.json:

{
  "BudgetName": "mcp-langgraph-monthly",
  "BudgetLimit": {
    "Amount": "900",
    "Unit": "USD"
  },
  "TimeUnit": "MONTHLY",
  "BudgetType": "COST"
}
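As written, the budget is created without notifications, so nothing is sent to the `cost-alerts` topic. One way to connect them is to pass a notifications file via `--notifications-with-subscribers`. The sketch below generates both JSON documents; the 80% and 100% thresholds are assumptions, and the topic ARN is the placeholder used above.

```python
import json


def build_budget(name: str, monthly_limit_usd: int) -> dict:
    """Budget definition matching budget.json above."""
    return {
        "BudgetName": name,
        "BudgetLimit": {"Amount": str(monthly_limit_usd), "Unit": "USD"},
        "TimeUnit": "MONTHLY",
        "BudgetType": "COST",
    }


def build_notifications(sns_topic_arn: str, thresholds=(80, 100)) -> list:
    """One notification per threshold (percent of budget), each sent to the SNS topic."""
    return [
        {
            "Notification": {
                "NotificationType": "ACTUAL",
                "ComparisonOperator": "GREATER_THAN",
                "Threshold": threshold,
                "ThresholdType": "PERCENTAGE",
            },
            "Subscribers": [{"SubscriptionType": "SNS", "Address": sns_topic_arn}],
        }
        for threshold in thresholds
    ]


budget = build_budget("mcp-langgraph-monthly", 900)
notifications = build_notifications("arn:aws:sns:us-east-1:ACCOUNT:cost-alerts")
print(json.dumps(notifications, indent=2))
```

Save the second document as `notifications.json` and add `--notifications-with-subscribers file://notifications.json` to the `create-budget` call. The SNS topic's access policy also has to allow AWS Budgets to publish to it.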

Enable Cost Allocation Tags:

tags = {
  Environment = "production"
  Project     = "mcp-langgraph"
  ManagedBy   = "terraform"
  CostCenter  = "engineering"
}

Related Documentation

GCP Infrastructure

GCP Terraform modules with cost-optimized defaults

AWS Infrastructure

AWS Terraform modules with cost breakdowns

GKE Production

GCP production deployment with cost estimates

EKS Production

AWS production deployment with cost estimates

Multi-Environment

Dev/staging cost optimization strategies

Complete Guide

Detailed GCP cost optimization strategies
