Overview

Optimize infrastructure costs while maintaining performance, security, and reliability. This guide provides actionable strategies for reducing costs by 30-66%.

GKE Autopilot

40-60% savings vs. Standard GKE

Committed Use

25-52% additional discounts

Right-sizing

10-30% from resource optimization

GCP Cost Optimization

Current vs. Optimized

Production environment:
| Approach | Monthly Cost | Annual Cost | Savings |
|---|---|---|---|
| Baseline GKE | $1,290 | $15,480 | - |
| Autopilot | $970 | $11,640 | 25% |
| + 1yr CUD | $728 | $8,736 | 44% |
| + 3yr CUD | $466 | $5,592 | 64% |

Maximum Savings: $9,888/year (64%)
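The monthly figures in this table appear to follow from applying the committed-use discount on top of the Autopilot price. The sketch below reproduces that arithmetic; the constant and function names are illustrative, and the discount rates are the ones quoted in this guide rather than official pricing.

```python
# Monthly costs taken from the table above (USD).
BASELINE = 1290.0
AUTOPILOT = 970.0

# Committed-use discount rates quoted in this guide (assumed, not official pricing).
CUD_DISCOUNT = {"1yr": 0.25, "3yr": 0.52}


def with_cud(cost: float, term: str) -> float:
    """Apply a committed-use discount to a monthly cost."""
    return cost * (1 - CUD_DISCOUNT[term])


def savings_pct(cost: float, baseline: float = BASELINE) -> float:
    """Savings relative to the baseline, as a percentage."""
    return (1 - cost / baseline) * 100


rows = [
    ("Autopilot", AUTOPILOT),
    ("+ 1yr CUD", with_cud(AUTOPILOT, "1yr")),
    ("+ 3yr CUD", with_cud(AUTOPILOT, "3yr")),
]
for label, cost in rows:
    print(f"{label:10s} ${cost:8.2f}/month  ${cost * 12:9.2f}/year  "
          f"{savings_pct(cost):.1f}% saved")
```

Substituting your own baseline and Autopilot costs shows how much of the total comes from each step.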

Strategy 1: GKE Autopilot

Pay-per-pod pricing eliminates idle node costs:
Traditional GKE Standard:
  • Pay for entire nodes (even when idle)
  • Typical utilization: 50-70%
  • Waste: 30-50% of spend
GKE Autopilot:
  • Pay only for pod resources used
  • No idle costs
  • Automatic bin-packing
Savings: 40-60%
Right-size Pod Resources:
Before (over-provisioned):
resources:
  requests:
    cpu: 1000m      # $35/month
    memory: 2Gi     # $8/month
# Per pod: $43/month
After (optimized):
resources:
  requests:
    cpu: 250m       # $9/month
    memory: 512Mi   # $2/month
# Per pod: $11/month (74% savings)
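The per-pod figures above imply unit prices of roughly $35 per vCPU-month and $4 per GiB-month. Those rates are back-calculated from this guide's numbers and are assumptions, not official Autopilot pricing. A minimal sketch of the per-pod calculation under those assumptions:

```python
# Unit prices implied by the figures above (assumed, not official pricing):
# 1000m CPU is about $35/month and 2Gi memory is about $8/month.
CPU_USD_PER_VCPU_MONTH = 35.0
MEM_USD_PER_GIB_MONTH = 4.0


def pod_monthly_cost(cpu_millicores: int, memory_mib: int) -> float:
    """Estimate the monthly cost of one pod from its resource requests."""
    cpu_cost = cpu_millicores / 1000 * CPU_USD_PER_VCPU_MONTH
    mem_cost = memory_mib / 1024 * MEM_USD_PER_GIB_MONTH
    return cpu_cost + mem_cost


before = pod_monthly_cost(1000, 2048)  # over-provisioned requests
after = pod_monthly_cost(250, 512)     # right-sized requests
print(f"before=${before:.2f} after=${after:.2f} saved={1 - after / before:.0%}")
```

Unrounded, the right-sized pod comes to $10.75, which the figures above round to $11.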
Action:
# Profile actual usage
kubectl top pods -n mcp-production --containers

# Update deployment
kubectl patch deployment production-mcp-server-langgraph \
  -n mcp-production \
  --type='json' \
  -p='[{"op": "replace", "path": "/spec/template/spec/containers/0/resources/requests/cpu", "value": "250m"}]'
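One way to turn observed usage into a new request value is to take the peak sample, add headroom, and round up to a convenient step. The helper below is a hypothetical sketch. The 25% headroom and 50m step are illustrative choices, not values from this guide.

```python
import math


def suggest_cpu_request(samples_millicores, headroom=1.25, step=50):
    """Suggest a CPU request in millicores.

    Takes the peak of the observed samples, multiplies it by `headroom`,
    and rounds the result up to the next multiple of `step`.
    """
    if not samples_millicores:
        raise ValueError("need at least one usage sample")
    target = max(samples_millicores) * headroom
    return int(math.ceil(target / step) * step)


# Samples as they might be read from repeated `kubectl top` runs (made-up values).
print(suggest_cpu_request([180, 200, 195]))
```

Collect samples over a representative period (including peak traffic) before lowering requests, since `kubectl top` only shows a point-in-time reading.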

Strategy 2: Committed Use Discounts (CUDs)

Commit to 1-year or 3-year usage for significant discounts:
| Resource | 1-Year | 3-Year |
|---|---|---|
| GKE Autopilot | 25% | 52% |
| Cloud SQL | 25% | 52% |
| Memorystore | 25% | 52% |

Purchase via Console:
  1. Go to: Billing → Commitments → Purchase Commitment
  2. Select resources and term
  3. Enable auto-renew
ROI: Pays for itself in first month
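A commitment is billed whether or not the capacity is used, so it only pays off when usage stays above a break-even level. The sketch below uses a simplified model (discounted committed spend plus on-demand overflow) with the discount rates quoted above. Actual CUD billing has more detail, so treat this as an approximation.

```python
# Discount rates quoted in this guide (assumed, not official pricing).
CUD_DISCOUNT = {"1yr": 0.25, "3yr": 0.52}


def break_even_utilization(discount: float) -> float:
    """Fraction of the commitment that must be used for it to beat on-demand."""
    return 1 - discount


def monthly_cost(committed: float, used: float, discount: float) -> float:
    """Simplified bill: discounted commitment plus on-demand overflow.

    `committed` and `used` are both expressed in on-demand dollars per month.
    """
    overflow = max(used - committed, 0.0)
    return committed * (1 - discount) + overflow


for term, discount in CUD_DISCOUNT.items():
    print(f"{term}: break-even at {break_even_utilization(discount):.0%} utilization")
```

Under this model, a 1-year commitment needs at least 75% utilization to come out ahead, so it is safest to commit only to the steady baseline and leave bursts on demand.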

Strategy 3: Development Cost Controls

Auto-shutdown dev clusters after hours:
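# Cloud Scheduler: scale the dev node pool to 0 at 6 PM on weekdays.
# NODE_POOL and SERVICE_ACCOUNT_EMAIL are placeholders; the service account
# needs permission to resize node pools. This applies to Standard clusters.
gcloud scheduler jobs create http scale-down-dev \
  --schedule="0 18 * * 1-5" \
  --uri="https://container.googleapis.com/v1/projects/PROJECT_ID/zones/ZONE/clusters/mcp-dev-gke/nodePools/NODE_POOL/setSize" \
  --http-method=POST \
  --headers="Content-Type=application/json" \
  --oauth-service-account-email=SERVICE_ACCOUNT_EMAIL \
  --message-body='{"nodeCount":0}'

# Scale back up to 3 nodes at 6 AM on weekdays
gcloud scheduler jobs create http scale-up-dev \
  --schedule="0 6 * * 1-5" \
  --uri="https://container.googleapis.com/v1/projects/PROJECT_ID/zones/ZONE/clusters/mcp-dev-gke/nodePools/NODE_POOL/setSize" \
  --http-method=POST \
  --headers="Content-Type=application/json" \
  --oauth-service-account-email=SERVICE_ACCOUNT_EMAIL \
  --message-body='{"nodeCount":3}'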
Savings: ~$50-70/month per dev environment
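To see how much of the week this schedule removes, count the hours between each 6 AM start and 6 PM stop. A small sketch, assuming the 06:00 and 18:00 weekday schedule above and no manual overrides:

```python
HOURS_PER_WEEK = 7 * 24


def weekly_on_hours(start_hour: int = 6, stop_hour: int = 18, workdays: int = 5) -> int:
    """Hours per week the cluster runs under a weekday start/stop schedule."""
    return (stop_hour - start_hour) * workdays


on_hours = weekly_on_hours()
off_hours = HOURS_PER_WEEK - on_hours
print(f"on: {on_hours} h/week, off: {off_hours} h/week "
      f"({off_hours / HOURS_PER_WEEK:.0%} of node time avoided)")
```

The cluster runs 60 hours and is off for 108 hours each week, so node charges drop by roughly two thirds. Fixed costs such as the control plane and persistent disks are not affected.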

Strategy 4: Database Optimization

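Check the current instance tier, then compare it against CPU utilization (the cloudsql.googleapis.com/database/cpu/utilization metric in Cloud Monitoring):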
gcloud sql instances describe mcp-prod-postgres \
  --format="yaml(settings.tier)"
If CPU < 50% consistently, downgrade:
  • db-custom-4-15360 → db-custom-2-7680
  • Savings: $140/month (50%)
Question: Do you use the read replica? Check connections:
gcloud sql instances describe mcp-prod-postgres-replica-1
If unused, remove:
cloudsql_read_replica_count = 0
Savings: $140/month
Reduce retention if compliance allows:
backup_retention_count         = 14  # vs. 30
transaction_log_retention_days = 3   # vs. 7
Savings: $5-10/month

Strategy 5: Redis Optimization

Right-size memory:
# Check usage
gcloud monitoring time-series list \
  --filter='metric.type="redis.googleapis.com/stats/memory/usage_ratio"'
If usage < 60%, downgrade:
  • 5 GB → 3 GB: Save $88/month (40%)
For non-critical caching:
  • STANDARD_HA → BASIC: Save $110/month (50%)
BASIC tier has no SLA and no auto-failover. Only use for non-critical workloads.

Cost Monitoring

Set Up Budget Alerts

gcloud billing budgets create \
  --billing-account=BILLING_ACCOUNT_ID \
  --display-name="MCP Production Monthly Budget" \
  --budget-amount=1200USD \
  --threshold-rule=percent=0.50 \
  --threshold-rule=percent=0.90 \
  --threshold-rule=percent=1.0
Alerts fire at 50%, 90%, and 100% of budget (gcloud expects each threshold as a decimal fraction).

Cost Allocation Labels

Already configured in Terraform:
labels = {
  environment = "production"
  team        = "platform"
  cost_center = "engineering"
  application = "mcp-server"
}
View costs by label in BigQuery:
SELECT
  label.value AS team,
  SUM(cost) AS total_cost
FROM
  `PROJECT_ID.billing_dataset.gcp_billing_export_*`,
  UNNEST(labels) AS label  -- labels is a repeated field, so flatten it first
WHERE
  label.key = 'team'
GROUP BY team
ORDER BY total_cost DESC

Quick Wins Checklist

1. Use GKE Autopilot

✅ Already configured
Savings: $200-400/month vs. Standard GKE

2. Right-size Pod Resources

kubectl top pods -n mcp-production --containers
# Adjust requests based on actual usage
Savings: $50-150/month

3. Purchase 1-year CUD

Console → Billing → Commitments
Savings: $242/month (25%)

4. Optimize Dev/Staging

  • Use zonal clusters (not regional)
  • Smaller instances
  • Auto-shutdown after hours
Savings: $100-200/month

5. Enable VPA

Vertical Pod Autoscaler automatically optimizes requests:
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: production-mcp-server-vpa
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: production-mcp-server-langgraph
  updatePolicy:
    updateMode: Auto
Savings: 10-30%

Cost Optimization Roadmap

Month 1: Quick Wins ($150-250/month)

  • GKE Autopilot (vs. Standard): $200 saved
  • Right-size pod resources: $96 saved
  • Reduce dev/staging usage: $50 saved
  • Optimize log retention: $15 saved
Total: $361/month savings

Month 2: Commitments ($200-300/month)

  • Purchase 1-year CUD: $242/month saved
  • Right-size Cloud SQL: $140 saved (if applicable)
  • Optimize Redis tier: $88 saved (if non-critical)
Total: $470/month additional savings

Month 3-6: Advanced ($50-100/month)

  • Implement autoscaling schedules
  • Optimize network egress
  • Clean up unused resources
  • FinOps automation
Total Potential: $580/month savings (60% of baseline)

AWS Cost Optimization

Current vs. Optimized

Production environment:
| Approach | Monthly Cost | Annual Cost | Savings |
|---|---|---|---|
| Baseline EKS | $1,980 | $23,760 | - |
| Optimized | $803 | $9,636 | 60% |
| + Spot Instances | $688 | $8,256 | 65% |
| + Reserved Instances (1yr) | $545 | $6,540 | 72% |
| + Reserved Instances (3yr) | $425 | $5,100 | 79% |

Maximum Savings: $18,660/year (79%)

Strategy 1: Spot Instances (70-90% Savings)

Use spot instances for fault-tolerant workloads:
On-Demand Pricing:
  • Fixed price, always available
  • t3.large: $0.0832/hour = $60.74/month
Spot Pricing:
  • Variable price (usually 70-90% discount)
  • t3.large: ~$0.025/hour = $18.25/month
  • Can be interrupted with 2-minute warning
Savings: $42.49/month per node (70%)
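The monthly figures above correspond to the hourly rate multiplied by 730 hours (8,760 hours per year divided by 12). A short sketch of that conversion, using the t3.large rates quoted in this section. The spot rate is the guide's approximate figure and varies in practice.

```python
HOURS_PER_MONTH = 730  # 8,760 hours per year / 12 months


def monthly(hourly_rate: float) -> float:
    """Convert an hourly price to an average monthly price."""
    return hourly_rate * HOURS_PER_MONTH


on_demand = monthly(0.0832)  # t3.large on-demand rate from this guide
spot = monthly(0.025)        # approximate spot rate from this guide
saving = on_demand - spot

print(f"on-demand ${on_demand:.2f}, spot ${spot:.2f}, "
      f"saving ${saving:.2f}/month per node ({saving / on_demand:.0%})")
```

Multiply the per-node saving by the number of nodes you can safely move to spot to estimate the fleet-wide effect.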
Implementation:
# terraform/environments/prod/terraform.tfvars
enable_spot_node_group       = true
spot_node_group_desired_size = 5
spot_node_group_min_size     = 2
spot_node_group_max_size     = 10
spot_node_group_instance_types = [
  "t3.large", "t3a.large",
  "t3.xlarge", "t3a.xlarge"
]
Use Cases:
  • Development/staging environments
  • Batch processing workloads
  • Stateless applications with proper graceful shutdown
  • CI/CD build agents

Strategy 2: Reserved Instances & Savings Plans

Commit to 1-year or 3-year usage for significant discounts:
| Resource | On-Demand | 1-Year RI | 3-Year RI |
|---|---|---|---|
| EC2 Instances | $0.0832/hr | $0.0540/hr | $0.0416/hr |
| RDS Multi-AZ | $0.165/hr | $0.110/hr | $0.085/hr |
| ElastiCache | $0.075/hr | $0.055/hr | $0.042/hr |
| Discount | - | 35% | 50% |

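The discount row matches the EC2 rates most closely; the effective discount for each resource can be derived from the hourly rates in the table. A minimal sketch using only those table values (the dictionary layout is illustrative):

```python
# Hourly rates from the table above: (on-demand, 1-year RI, 3-year RI).
RATES = {
    "EC2 Instances": (0.0832, 0.0540, 0.0416),
    "RDS Multi-AZ": (0.165, 0.110, 0.085),
    "ElastiCache": (0.075, 0.055, 0.042),
}


def discount_pct(on_demand: float, reserved: float) -> float:
    """Percentage saved by a reserved rate relative to on-demand."""
    return (1 - reserved / on_demand) * 100


for resource, (on_demand, one_year, three_year) in RATES.items():
    print(f"{resource:14s} 1yr: {discount_pct(on_demand, one_year):.1f}%  "
          f"3yr: {discount_pct(on_demand, three_year):.1f}%")
```

Running the numbers per resource helps decide which reservation to buy first, since the rates in the table do not all yield the same discount.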
Purchasing Options:
  • Reserved Instances
  • Compute Savings Plans
Best for: Predictable, always-running workloads
# Purchase RDS Reserved Instance
aws rds purchase-reserved-db-instances-offering \
  --reserved-db-instances-offering-id OFFERING_ID \
  --reserved-db-instance-id mcp-langgraph-prod-ri \
  --db-instance-count 1

# Purchase EC2 Reserved Instance
aws ec2 purchase-reserved-instances-offering \
  --reserved-instances-offering-id OFFERING_ID \
  --instance-count 3

Strategy 3: Right-Sizing

Analyze and optimize resource allocation:
# Check node utilization
kubectl top nodes

# Check pod resource usage
kubectl top pods -A --containers

# Right-size based on actual usage
# Production pods: 250m CPU, 512Mi RAM = $11/month
# vs. Over-provisioned: 1000m CPU, 2Gi RAM = $43/month
# Savings: $32/month per pod (74%)
RDS Right-Sizing:
# Check CPU utilization (CloudWatch)
aws cloudwatch get-metric-statistics \
  --namespace AWS/RDS \
  --metric-name CPUUtilization \
  --dimensions Name=DBInstanceIdentifier,Value=mcp-langgraph-prod \
  --start-time 2025-01-01T00:00:00Z \
  --end-time 2025-02-01T00:00:00Z \
  --period 3600 \
  --statistics Average

# If avg < 40%, downsize instance class
# db.t3.large ($240/mo) → db.t3.medium ($120/mo) = $120 saved

Strategy 4: VPC Endpoints (70% Data Transfer Savings)

Avoid NAT gateway data transfer charges:
# Enable VPC endpoints in Terraform
enable_vpc_endpoints = true

# Creates endpoints for:
# - S3 (Gateway - free)
# - ECR API ($7.20/month)
# - ECR DKR ($7.20/month)
# - CloudWatch ($7.20/month)

# Cost: $21.60/month
# Saves: ~$50-150/month in data transfer
# ROI: Pays for itself immediately
Data Transfer Savings:
  • NAT Gateway data processing: $0.045/GB
  • VPC Endpoint data processing: $0.01/GB
  • Savings: $0.035/GB (78%)
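Interface endpoints carry a fixed monthly charge, so the net benefit depends on how much traffic actually moves off the NAT gateway. The sketch below estimates the break-even volume from the prices quoted in this section. It assumes three interface endpoints at $7.20 each and counts only traffic destined for the endpoint-backed services.

```python
# Prices quoted in this section (USD).
ENDPOINT_FIXED_MONTHLY = 3 * 7.20  # ECR API, ECR DKR, CloudWatch
NAT_PER_GB = 0.045
ENDPOINT_PER_GB = 0.01


def break_even_gb() -> float:
    """Monthly traffic (GB) at which endpoint fees equal the NAT savings."""
    return ENDPOINT_FIXED_MONTHLY / (NAT_PER_GB - ENDPOINT_PER_GB)


def net_monthly_saving(gb_per_month: float) -> float:
    """Net saving after endpoint fees for a given monthly traffic volume."""
    return gb_per_month * (NAT_PER_GB - ENDPOINT_PER_GB) - ENDPOINT_FIXED_MONTHLY


print(f"break-even: {break_even_gb():.0f} GB/month")
print(f"net saving at 2,000 GB/month: ${net_monthly_saving(2000):.2f}")
```

Under these prices the endpoints start paying for themselves at roughly 617 GB per month of AWS-service traffic; below that volume the fixed fees outweigh the per-GB saving.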

Strategy 5: Auto-Scaling

Cluster Autoscaler removes idle nodes:
# Deploy Cluster Autoscaler
kubectl apply -f https://raw.githubusercontent.com/kubernetes/autoscaler/master/cluster-autoscaler/cloudprovider/aws/examples/cluster-autoscaler-autodiscover.yaml

# Configure
kubectl set env deployment/cluster-autoscaler \
  -n kube-system \
  --containers=cluster-autoscaler \
  AWS_REGION=us-east-1
HPA scales pods based on utilization:
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: mcp-server-langgraph
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: mcp-server-langgraph
  minReplicas: 2
  maxReplicas: 20
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 70
Savings: 20-40% on compute costs by removing idle capacity
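The HPA above sizes the deployment with the rule documented for Kubernetes: desired replicas = ceil(current replicas × current utilization / target utilization), clamped to the min and max. The sketch below applies that rule with this manifest's values; it ignores the controller's tolerance band and stabilization windows, so real behavior can lag or differ slightly.

```python
import math


def desired_replicas(current_replicas: int,
                     current_utilization: float,
                     target_utilization: float = 70,
                     min_replicas: int = 2,
                     max_replicas: int = 20) -> int:
    """Replica count the HPA scaling rule would request, clamped to bounds."""
    raw = math.ceil(current_replicas * current_utilization / target_utilization)
    return max(min_replicas, min(max_replicas, raw))


# 4 replicas averaging 90% CPU against a 70% target scale out;
# the same 4 replicas at 35% scale in to the minimum.
print(desired_replicas(4, 90))
print(desired_replicas(4, 35))
```

Because utilization is measured against CPU requests, right-sizing requests first makes the 70% target meaningful.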

Strategy 6: Development Environment Controls

Stop dev/staging environments after hours:
# Lambda function to stop EKS node groups
# Stops Mon-Fri at 6 PM, starts Mon-Fri at 6 AM
# Savings: ~108 of 168 hours/week off (~65%) = $200-300/month
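A minimal sketch of what such a Lambda function might look like, using the boto3 EKS client calls `list_nodegroups`, `describe_nodegroup`, and `update_nodegroup_config`. The event fields `cluster` and `desired_size` are hypothetical, pagination is ignored, and this is not the project's actual implementation.

```python
def scale_nodegroups(eks, cluster_name: str, desired_size: int) -> list:
    """Set every managed node group in the cluster to `desired_size` nodes."""
    updated = []
    for name in eks.list_nodegroups(clusterName=cluster_name)["nodegroups"]:
        current = eks.describe_nodegroup(
            clusterName=cluster_name, nodegroupName=name
        )["nodegroup"]["scalingConfig"]
        eks.update_nodegroup_config(
            clusterName=cluster_name,
            nodegroupName=name,
            scalingConfig={
                # minSize must not exceed the new desired size.
                "minSize": min(current["minSize"], desired_size),
                # EKS requires maxSize to be at least 1.
                "maxSize": max(current["maxSize"], desired_size, 1),
                "desiredSize": desired_size,
            },
        )
        updated.append(name)
    return updated


def handler(event, context):
    """Lambda entry point; expects {"cluster": "...", "desired_size": N}."""
    import boto3  # provided by the AWS Lambda Python runtime

    eks = boto3.client("eks")
    return scale_nodegroups(eks, event["cluster"], int(event["desired_size"]))
```

Two EventBridge schedules could invoke the same function, one passing `desired_size` 0 in the evening and one passing the normal size in the morning. The function's role needs permission for the three EKS calls used here.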
Single NAT Gateway for Non-Production:
# terraform/environments/dev/terraform.tfvars
single_nat_gateway = true  # vs. Multi-AZ ($97.20 → $32.40 = $65 saved)

AWS Cost Breakdown

Production Environment ($803/month):
| Service | Configuration | Monthly Cost | Optimization |
|---|---|---|---|
| EKS Control Plane | 1 cluster | $73.00 | Required |
| EC2 Nodes | 3×t3.xlarge on-demand | $295.20 | Use 1yr RI: $192 |
| Spot Nodes | 2×t3.large equiv | $14.60 | ✅ Optimized |
| RDS PostgreSQL | db.t3.medium Multi-AZ | $157.56 | Use 1yr RI: $105 |
| ElastiCache | 2×cache.r6g.large | $109.50 | Use 1yr RI: $80 |
| NAT Gateway | 3×Multi-AZ | $97.20 | Required for HA |
| VPC Endpoints | 6 endpoints | $21.60 | Saves $50+ transfer |
| EBS Volumes | 5×100GB gp3 | $40.00 | ✅ Optimized (gp3) |
| CloudWatch | Logs + metrics | $15.00 | Set retention limits |
| Total | | $823.66 | With RIs: $545 |

Cost Optimization Checklist:
  • Enable spot instances for fault-tolerant workloads
  • Purchase Reserved Instances for base capacity (35-50% discount)
  • Deploy Cluster Autoscaler to remove idle nodes
  • Right-size instance types based on actual utilization
  • Use HPA to scale pods, not nodes
  • Stop dev/staging environments after hours
  • Use gp3 instead of gp2 (20% savings + better performance)
  • Enable RDS storage autoscaling (avoid over-provisioning)
  • Delete old EBS snapshots (lifecycle policy)
  • Use S3 lifecycle policies (Standard → IA → Glacier)
  • Enable VPC endpoints (70% data transfer savings)
  • Use single NAT gateway in dev/staging
  • Optimize CloudFront caching to reduce origin requests
  • Use AWS PrivateLink instead of NAT for AWS services
  • Purchase RDS Reserved Instances (35-50% discount)
  • Right-size RDS instance class based on CPU/memory usage
  • Use Aurora Serverless v2 for variable workloads
  • Enable automated backup retention limits
  • Use read replicas only when needed
  • Set CloudWatch Logs retention (7-30 days, not indefinite)
  • Use CloudWatch Logs Insights instead of exporting to S3
  • Enable AWS Cost Explorer and set budgets
  • Tag all resources for cost allocation
  • Set up billing alerts ($50, $100, $200 thresholds)

AWS Cost Monitoring

Set Up Cost Alerts:
# Create SNS topic
aws sns create-topic --name cost-alerts

# Subscribe email
aws sns subscribe \
  --topic-arn arn:aws:sns:us-east-1:ACCOUNT_ID:cost-alerts \
  --protocol email \
  --notification-endpoint your-email@example.com

# Create budget
aws budgets create-budget \
  --account-id ACCOUNT_ID \
  --budget file://budget.json
budget.json:
{
  "BudgetName": "mcp-langgraph-monthly",
  "BudgetLimit": {
    "Amount": "900",
    "Unit": "USD"
  },
  "TimeUnit": "MONTHLY",
  "BudgetType": "COST"
}
Enable Cost Allocation Tags:
tags = {
  Environment = "production"
  Project     = "mcp-langgraph"
  ManagedBy   = "terraform"
  CostCenter  = "engineering"
}