# 51. Memorystore Redis ExternalName Service with Cloud DNS

Date: 2025-11-11

## Status

Accepted

## Category

Infrastructure & Deployment

## Context

In GKE staging and production environments, we use Google Cloud Memorystore for Redis instead of self-hosted Redis deployments. This architectural decision creates a challenge: how do we reference external managed services from within Kubernetes in a way that supports operational requirements like failover and environment portability?

### Problem Statement
Memorystore Redis instances are external to the GKE cluster and have static IP addresses. We need to:

- **Reference external Redis from pods** - Apps must connect to Memorystore Redis seamlessly
- **Support failover scenarios** - Ability to switch between Redis instances without redeploying manifests
- **Maintain environment portability** - Same manifests work across staging/production with different Redis instances
- **Centralize IP management** - Avoid hardcoding IPs in multiple ConfigMaps/Secrets
### Constraints

- Memorystore Redis is outside the GKE cluster (different network, managed by Google)
- Pods use service discovery (`redis-session:6378`) to find dependencies
- Zero-downtime failover requirement for production deployments
- kube-score security scanner flags ExternalName services (AVD-KSV-0108) as potential DNS rebinding risks
## Decision

We use a Kubernetes ExternalName Service plus Google Cloud DNS to reference Memorystore Redis instances.

### Architecture

Pod → `redis-session` Service (ExternalName CNAME) → Cloud DNS A record (`redis-session-staging.staging.internal`) → Memorystore Redis IP
### Implementation

File: `deployments/overlays/staging-gke/redis-session-service-patch.yaml`

- Private DNS zone: `staging.internal`
- A record: `redis-session-staging.staging.internal` → Memorystore Redis IP
- Zone visibility: limited to the GKE VPC
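A minimal sketch of what this patch defines (the Service name and port follow the `redis-session:6378` discovery name used above; the namespace would come from the overlay):

```yaml
# Sketch only: an ExternalName Service pointing at the Cloud DNS record.
# Cluster DNS answers lookups for `redis-session` with a CNAME to the
# externalName below; Cloud DNS then resolves the A record to the Redis IP.
apiVersion: v1
kind: Service
metadata:
  name: redis-session
spec:
  type: ExternalName
  externalName: redis-session-staging.staging.internal
  ports:
    - port: 6378   # Memorystore Redis port used by clients
```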
## Rationale

### Why ExternalName + Cloud DNS?
| Approach | Pros | Cons | Decision |
|---|---|---|---|
| Hardcoded IP in ConfigMap | Simple | IP changes require manifest updates + rollouts; No failover | ❌ Rejected |
| Headless Service + Endpoints | Works with ClusterIP | Manual Endpoint management; Not idiomatic | ❌ Rejected |
| ExternalName + Hardcoded DNS | Standard K8s pattern | DNS changes still require manifest updates | ❌ Rejected |
| ExternalName + Cloud DNS ✅ | Zero-manifest failover; Centralized IP management; Environment portability | Requires Cloud DNS setup; Triggers security scanner warnings | ✅ Accepted |
### Key Benefits

1. **Zero-Downtime Failover**
   - Update the Cloud DNS A record: `redis-session-staging.staging.internal` → new IP
   - No manifest changes required
   - No pod restarts needed (DNS TTL-based)

2. **Environment Portability**
   - Same Kustomize manifests work in staging and production
   - Only DNS records differ between environments
   - Simplifies multi-environment deployments

3. **Centralized IP Management**
   - All Redis IPs managed in the Cloud DNS console
   - Single source of truth for service discovery
   - Infrastructure team can manage IPs without K8s manifest access

4. **Idiomatic Kubernetes**
   - Uses the native Service abstraction
   - Apps use standard service discovery (`redis-session:6378`) - see the check below
   - No special connection logic needed in application code
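Because resolution happens entirely in DNS, a stock client can confirm the abstraction works end to end; for example, from a debug pod that has `redis-cli` available (and assuming the instance does not require AUTH or in-transit encryption):

```sh
# Connect via the Service name, not the IP. Kubernetes resolves
# `redis-session` to the ExternalName CNAME; Cloud DNS resolves that
# to the current Memorystore IP.
redis-cli -h redis-session -p 6378 PING   # expected reply: PONG
```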
## Security Considerations

### AVD-KSV-0108: ExternalName DNS Rebinding Risk

**Trivy/kube-score Finding:** "ExternalName services can be used for DNS rebinding attacks if not properly configured"

**Why This Is a False Positive for Our Use Case:**
1. **Controlled DNS Zone**
   - `staging.internal` is a private Cloud DNS zone
   - Zone visibility is limited to the GKE VPC only
   - External attackers cannot modify DNS records

2. **Internal Network Only**
   - Memorystore Redis is in the same GCP project VPC
   - Traffic never leaves Google's network
   - No external DNS resolution involved

3. **IAM-Protected DNS Management**
   - Cloud DNS updates require GCP IAM permissions
   - Only the authorized infrastructure team can modify records
   - Audit logs track all DNS changes

4. **No User-Controlled Input**
   - DNS name is hardcoded in manifests (`redis-session-staging.staging.internal`)
   - Not constructed from user input
   - No dynamic ExternalName generation
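Point 1 can be verified directly; a sketch, assuming the zone is named `staging-internal` (the actual zone name is not recorded in this ADR):

```sh
# Confirm the zone is private and scoped to the expected VPC.
gcloud dns managed-zones describe staging-internal \
  --format="yaml(dnsName, visibility, privateVisibilityConfig.networks)"
```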
### Risk Classification

- **Attack Vector:** DNS rebinding via malicious DNS updates
- **Likelihood:** Very Low (requires compromised GCP IAM + VPC access)
- **Impact:** High (Redis access)
- **Mitigation:** IAM controls + VPC isolation + audit logging
- **Residual Risk:** Acceptable for staging and production use
## Consequences

### Positive

- ✅ **Operational Excellence:** Failover without manifest changes or pod restarts
- ✅ **Environment Consistency:** Same manifests across staging/production
- ✅ **Simplified Operations:** Infrastructure team manages IPs centrally
- ✅ **Standard Pattern:** Uses the native Kubernetes Service abstraction
- ✅ **Clear Separation:** Infrastructure (DNS) vs. application (manifests)

### Negative

- ❌ **Additional Setup:** Requires Cloud DNS zone and A record configuration
- ❌ **Security Scanner Noise:** Triggers AVD-KSV-0108 (requires suppression)
- ❌ **DNS Dependency:** Failure if Cloud DNS or VPC DNS resolution breaks
- ❌ **Debugging Complexity:** Adds a DNS layer to troubleshooting (use `dig`, `nslookup`)
### Trade-offs

We accept the security scanner warning and DNS complexity in exchange for operational flexibility and environment portability. The benefits outweigh the costs for managed cloud deployments.

## Alternatives Considered
### Alternative 1: Headless Service + Manual Endpoints

Rejected because:

- Manual Endpoints management (must update both the Service and the Endpoints object; see the sketch below)
- IP hardcoded in the manifest (defeats environment portability)
- Not resilient to IP changes (requires redeployment)
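For comparison, a sketch of what this rejected approach would require — two objects that must be kept in sync by hand (the IP shown is an example):

```yaml
# Rejected approach (sketch): a selector-less Service plus a hand-maintained
# Endpoints object. Every Memorystore IP change means editing and re-applying
# the Endpoints manifest.
apiVersion: v1
kind: Service
metadata:
  name: redis-session
spec:
  ports:
    - port: 6378
---
apiVersion: v1
kind: Endpoints
metadata:
  name: redis-session   # must match the Service name exactly
subsets:
  - addresses:
      - ip: 10.0.0.5    # example Memorystore IP, hardcoded
    ports:
      - port: 6378
```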
### Alternative 2: Direct Connection String in ConfigMap

Rejected because:

- Bypasses Kubernetes service discovery
- Application code must handle direct IPs
- No service-level abstraction
- IP changes require a ConfigMap update + pod restart
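A sketch of this rejected shape (the key names and IP are illustrative):

```yaml
# Rejected approach (sketch): the Redis address baked into configuration.
apiVersion: v1
kind: ConfigMap
metadata:
  name: redis-session-config
data:
  REDIS_HOST: "10.0.0.5"   # example IP; any change forces a rollout
  REDIS_PORT: "6378"
```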
### Alternative 3: Custom External Service Controller

Deploy a controller that automatically creates/updates Endpoints for external services.

Rejected because:

- Over-engineered for a simple use case
- Additional operational complexity (controller deployment, monitoring)
- ExternalName + Cloud DNS is simpler and achieves the same goal
## Implementation Guidelines

### Deployment Checklist

- [ ] Create Cloud DNS private zone (once per environment)
- [ ] Create DNS A record (once per Redis instance)
- [ ] Deploy ExternalName Service (via Kustomize)
- [ ] Verify DNS resolution (from within a pod) - all four steps are sketched below
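A sketch of the four steps, assuming a zone named `staging-internal`, a VPC named `staging-vpc`, and an example Redis IP of `10.0.0.5` (all illustrative; substitute real values):

```sh
# 1. Create the private Cloud DNS zone, visible only to the GKE VPC.
gcloud dns managed-zones create staging-internal \
  --dns-name="staging.internal." \
  --visibility=private \
  --networks="staging-vpc" \
  --description="Private zone for staging service discovery"

# 2. Create the A record pointing at the Memorystore Redis IP.
gcloud dns record-sets create "redis-session-staging.staging.internal." \
  --zone=staging-internal --type=A --ttl=300 --rrdatas="10.0.0.5"

# 3. Deploy the ExternalName Service via the Kustomize overlay.
kubectl apply -k deployments/overlays/staging-gke/

# 4. Verify resolution from inside the cluster.
kubectl run dns-test --rm -it --restart=Never --image=busybox:1.36 \
  -- nslookup redis-session-staging.staging.internal
```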
### Failover Procedure

1. Update the Cloud DNS A record to point at the new Memorystore Redis instance (see the sketch below)
2. Wait for the DNS TTL to expire (300 seconds max, or reduce the TTL beforehand)
3. Verify pods connect to the new instance (monitor logs, metrics)

No manifest changes or pod restarts are needed.
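A sketch of step 1 using Cloud DNS transactions (the zone name and IPs are examples; newer gcloud releases can also do this in one `gcloud dns record-sets update` call):

```sh
# Repoint the A record from the old instance (10.0.0.5) to the new one (10.0.0.6).
gcloud dns record-sets transaction start --zone=staging-internal
gcloud dns record-sets transaction remove "10.0.0.5" \
  --name="redis-session-staging.staging.internal." --type=A --ttl=300 \
  --zone=staging-internal
gcloud dns record-sets transaction add "10.0.0.6" \
  --name="redis-session-staging.staging.internal." --type=A --ttl=300 \
  --zone=staging-internal
gcloud dns record-sets transaction execute --zone=staging-internal
```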
### Suppression Justification

Add the finding to `.trivyignore` with a pointer back to this ADR.
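A plausible entry (the comment wording is illustrative; `ADR-0051` refers to this document):

```
# AVD-KSV-0108: ExternalName DNS rebinding - accepted risk, see ADR-0051
# (private Cloud DNS zone, VPC-only visibility, IAM-protected records)
AVD-KSV-0108
```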
## References

- Kubernetes ExternalName Services: https://kubernetes.io/docs/concepts/services-networking/service/#externalname
- Cloud DNS Private Zones: https://cloud.google.com/dns/docs/zones/zones-overview#private-zones
- Memorystore Redis: https://cloud.google.com/memorystore/docs/redis
- kube-score AVD-KSV-0108: ExternalName service DNS rebinding risk
- Implementation: `deployments/overlays/staging-gke/redis-session-service-patch.yaml`
- Related ADR: ADR-0013 (Multi-Deployment Target Strategy)
## Revision History

- 2025-11-11: Initial version documenting the ExternalName + Cloud DNS decision