
GKE Autopilot Resource Constraints

Overview

Google Kubernetes Engine (GKE) Autopilot enforces strict resource constraints via LimitRange policies to optimize cost and performance. Violations cause pod creation failures. Critical Rule: CPU and memory limit/request ratios must not exceed 4.0x.
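For reference, the same constraint can be expressed with the standard Kubernetes LimitRange API; a minimal sketch of a namespace-scoped policy enforcing a 4.0x ratio is shown below (the object name is illustrative, not the Autopilot-managed policy):
# Illustrative LimitRange enforcing a 4.0x limit/request ratio
apiVersion: v1
kind: LimitRange
metadata:
  name: max-burst-ratio  # hypothetical name
spec:
  limits:
  - type: Container
    maxLimitRequestRatio:
      cpu: "4"
      memory: "4"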

CPU Ratio Constraint

The 4.0x Rule

CPU Limit / CPU Request ≤ 4.0
Why: The cap prevents resource waste and keeps billing predictable. A 4.0x ratio allows reasonable burst capacity while preventing excessive over-commitment.

Examples

Compliant Configurations

  • Example 1:
    resources:
      requests:
        cpu: 250m  # Base allocation
      limits:
        cpu: 1000m  # 1000m / 250m = 4.0x ✅
    
  • Example 2:
    resources:
      requests:
        cpu: 500m
      limits:
        cpu: 2000m  # 2000m / 500m = 4.0x ✅
    
  • Example 3:
    resources:
      requests:
        cpu: 125m
      limits:
        cpu: 500m  # 500m / 125m = 4.0x ✅
    

Non-Compliant Configurations

  • Example 1 - VIOLATION:
    resources:
      requests:
        cpu: 200m
      limits:
        cpu: 1000m  # 1000m / 200m = 5.0x ❌ VIOLATION
    
  • Example 2 - VIOLATION:
    resources:
      requests:
        cpu: 100m
      limits:
        cpu: 1000m  # 1000m / 100m = 10.0x ❌ VIOLATION
    

Common Services - Resource Sizing Guide

otel-collector

Recommended: 250m request / 1000m limit (4.0x ratio)
  • Handles telemetry data collection
  • Needs burst capacity for traffic spikes
resources:
  requests:
    cpu: 250m
    memory: 256Mi
  limits:
    cpu: 1000m
    memory: 512Mi

qdrant (Vector Database)

Recommended: 250m request / 1000m limit (4.0x ratio)
  • Performs vector similarity search
  • CPU-intensive during query processing
resources:
  requests:
    cpu: 250m
    memory: 256Mi
  limits:
    cpu: 1000m
    memory: 1Gi

postgres (Database)

Recommended: 500m request / 2000m limit (4.0x ratio)
  • Primary data store
  • Higher baseline due to query processing
resources:
  requests:
    cpu: 500m
    memory: 512Mi
  limits:
    cpu: 2000m
    memory: 2Gi

redis-session (Cache)

Recommended: 125m request / 500m limit (4.0x ratio)
  • Session storage and caching
  • Lower resource requirements
resources:
  requests:
    cpu: 125m
    memory: 256Mi
  limits:
    cpu: 500m
    memory: 1Gi

mcp-server-langgraph (Application)

Recommended:
  • Dev: 125m request / 500m limit (4.0x ratio)
  • Production: 500m request / 2000m limit (4.0x ratio)
# Development
resources:
  requests:
    cpu: 125m
    memory: 256Mi
  limits:
    cpu: 500m
    memory: 512Mi

# Production
resources:
  requests:
    cpu: 500m
    memory: 1Gi
  limits:
    cpu: 2000m
    memory: 2Gi

Fixing Ratio Violations

Step 1: Calculate Current Ratio

# Example: otel-collector
Request: 200m
Limit: 1000m
Ratio: 1000m / 200m = 5.0x ❌ (exceeds the 4.0x maximum)

Step 2: Choose Fix Strategy

Option A: Increase Request (Recommended)
  • Preserves burst capacity
  • Ensures adequate baseline resources
  • Better for production workloads
# Fix: Increase request from 200m to 250m
resources:
  requests:
    cpu: 250m  # 1000m / 250m = 4.0x ✅
  limits:
    cpu: 1000m
Option B: Decrease Limit
  • Reduces burst capacity
  • Lower resource costs
  • Better for cost-sensitive environments
# Alternative: Decrease limit from 1000m to 800m
resources:
  requests:
    cpu: 200m
  limits:
    cpu: 800m  # 800m / 200m = 4.0x ✅

Step 3: Create Overlay Patch

Create deployments/overlays/{environment}/{service}-patch.yaml:
apiVersion: apps/v1
kind: Deployment  # or StatefulSet
metadata:
  name: {service-name}
spec:
  template:
    spec:
      containers:
      - name: {container-name}
        resources:
          requests:
            cpu: 250m  # Corrected value (example): 1000m / 250m = 4.0x ✅
          limits:
            cpu: 1000m
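Continuing the otel-collector example from Steps 1-2, a filled-in patch might look like the following (the overlay path and container name are assumptions; verify them against the base manifest):
# deployments/overlays/production/otel-collector-patch.yaml (path assumed)
apiVersion: apps/v1
kind: Deployment
metadata:
  name: otel-collector
spec:
  template:
    spec:
      containers:
      - name: otel-collector  # container name assumed
        resources:
          requests:
            cpu: 250m   # raised from 200m: 1000m / 250m = 4.0x ✅
          limits:
            cpu: 1000m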

Step 4: Update Kustomization

Add patch to deployments/overlays/{environment}/kustomization.yaml:
patches:
  - path: {service}-patch.yaml
    target:
      kind: Deployment  # Match resource type
      name: {service-name}
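For context, a complete overlay kustomization.yaml registering several patches might look roughly like this (the base path and patch file names are illustrative):
# deployments/overlays/production/kustomization.yaml (illustrative)
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
  - ../../base  # base path assumed
patches:
  - path: otel-collector-patch.yaml
    target:
      kind: Deployment
      name: otel-collector
  - path: redis-session-patch.yaml
    target:
      kind: StatefulSet  # StatefulSets must be targeted with the matching kind
      name: redis-session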

Step 5: Validate

# Run GKE Autopilot validator
python3 scripts/validate_gke_autopilot_compliance.py

# Build with kubectl kustomize
kubectl kustomize deployments/overlays/{environment}

# Verify resources in output
kubectl kustomize deployments/overlays/{environment} | grep -A 10 "resources:"

Validation Tools

Pre-deployment Validation

# Validate all overlays
python3 scripts/validate_gke_autopilot_compliance.py

# Validate specific overlay
python3 scripts/validate_gke_autopilot_compliance.py deployments/overlays/production

Pre-commit Hook

The repository includes automatic validation via pre-commit hooks:
# .pre-commit-config.yaml
- id: gke-autopilot-validation
  name: Validate GKE Autopilot Compliance
  entry: python3 scripts/validate_gke_autopilot_compliance.py
  language: python
  files: ^deployments/.*\.yaml$
  pass_filenames: false

Unit Tests

Test coverage ensures validator correctness:
# Run validator unit tests
uv run pytest tests/unit/test_gke_autopilot_validator.py -v

Environment Variable Conflicts

Issue: value + valueFrom

Kubernetes strategic merge can create conflicts when overlays override base configurations:
# Base deployment
env:
- name: LLM_PROVIDER
  valueFrom:
    configMapKeyRef:
      name: config
      key: llm-provider

# Overlay patch (WRONG - creates conflict)
env:
- name: LLM_PROVIDER
  value: "google"  # ❌ Now has BOTH value and valueFrom

Solution: Explicit null

Set valueFrom: null to remove the base definition:
# Overlay patch (CORRECT)
env:
- name: LLM_PROVIDER
  value: "google"
  valueFrom: null  # ✅ Removes valueFrom from base
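In a full overlay patch, the override sits under the container's env list; a minimal sketch (Deployment and container names assumed) would be:
# Illustrative overlay patch; resource and container names assumed
apiVersion: apps/v1
kind: Deployment
metadata:
  name: mcp-server-langgraph
spec:
  template:
    spec:
      containers:
      - name: mcp-server-langgraph
        env:
        - name: LLM_PROVIDER
          value: "google"
          valueFrom: null  # removes the configMapKeyRef inherited from the base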

Common Pitfalls

1. Wrong Resource Type in Patch

  • ❌ Using kind: Deployment for a StatefulSet:
    # redis-session is a StatefulSet, not Deployment
    kind: Deployment  # ❌ WRONG
    metadata:
      name: redis-session
    
  • ✅ Correct:
    kind: StatefulSet  # ✅ CORRECT
    metadata:
      name: redis-session
    

2. Forgetting Kustomization Update

After creating a patch file, you MUST update kustomization.yaml:
patches:
  - path: new-service-patch.yaml  # Don't forget this!
    target:
      kind: Deployment
      name: new-service

3. Memory Ratio Violations

The 4.0x ratio also applies to memory:
resources:
  requests:
    memory: 256Mi
  limits:
    memory: 2Gi  # 2048Mi / 256Mi = 8.0x ❌ VIOLATION
Fix:
resources:
  requests:
    memory: 512Mi  # 2048Mi / 512Mi = 4.0x ✅
  limits:
    memory: 2Gi

Quick Reference Calculator

Request | Max Limit (4.0x) | Common Limits
100m    | 400m             | 400m
125m    | 500m             | 500m
200m    | 800m             | 800m
250m    | 1000m            | 1000m
500m    | 2000m            | 2000m
1000m   | 4000m            | 4000m


Troubleshooting

Pod Creation Failed

Error: Pod creation failed: CPU limit/request ratio exceeds 4.0
Solution: Review pod resources and adjust according to this guide.

Kustomize Build Error

Error: no matches for kind "Deployment" for redis-session
Solution: redis-session is a StatefulSet, not a Deployment. Update the kind in the patch file and in the kustomization patch target.

Pre-commit Hook Failures

❌ ERRORS:
  - staging-redis-session/redis: CPU limit/request ratio 5.00 exceeds max 4.0
Solution: Fix the violation before committing. The hook prevents bad configs from reaching CI/CD.
Last Updated: 2025-11-12
Maintained By: Infrastructure Team
Contact: #infrastructure-support