Overview
Azure Kubernetes Service (AKS) is Microsoft’s managed Kubernetes offering. It integrates with Azure services such as Azure Database for PostgreSQL, Azure Cache for Redis, Azure Key Vault, and Azure Active Directory. This guide covers deploying to AKS with a production-ready configuration, including Azure AD Workload Identity, Azure Database for PostgreSQL, Azure Cache for Redis, and Azure Key Vault integration.
Prerequisites
- Azure subscription
- Azure CLI (az) installed and configured
- kubectl installed
- helm installed
Install Prerequisites
## Install Azure CLI (macOS, using Homebrew)
brew install azure-cli
## Login to Azure
az login
## Set subscription
az account set --subscription "your-subscription-id"
## Install kubectl
az aks install-cli
Create Resource Group
## Create resource group
az group create \
--name langgraph-rg \
--location eastus
## Set default resource group
az configure --defaults group=langgraph-rg location=eastus
Create AKS Cluster
Using Azure CLI
## Create AKS cluster
az aks create \
--name langgraph-cluster \
--resource-group langgraph-rg \
--location eastus \
--node-count 3 \
--node-vm-size Standard_D4s_v3 \
--enable-cluster-autoscaler \
--min-count 3 \
--max-count 10 \
--network-plugin azure \
--enable-managed-identity \
--enable-aad \
--enable-azure-rbac \
--enable-addons monitoring \
--enable-msi-auth-for-monitoring \
--enable-secret-rotation \
--kubernetes-version 1.28.0 \
--zones 1 2 3
## Get credentials
az aks get-credentials \
--name langgraph-cluster \
--resource-group langgraph-rg
## Verify
kubectl get nodes
Advanced Configuration
## Create with custom networking
az aks create \
--name langgraph-cluster \
--resource-group langgraph-rg \
--location eastus \
--node-count 3 \
--node-vm-size Standard_D4s_v3 \
--node-osdisk-size 100 \
--node-osdisk-type Managed \
--enable-cluster-autoscaler \
--min-count 3 \
--max-count 10 \
--network-plugin azure \
--vnet-subnet-id /subscriptions/${SUBSCRIPTION_ID}/resourceGroups/langgraph-rg/providers/Microsoft.Network/virtualNetworks/langgraph-vnet/subnets/aks-subnet \
--service-cidr 10.0.0.0/16 \
--dns-service-ip 10.0.0.10 \
--docker-bridge-address 172.17.0.1/16 \
--enable-managed-identity \
--enable-aad \
--enable-azure-rbac \
--enable-addons monitoring,azure-keyvault-secrets-provider \
--enable-oidc-issuer \
--enable-workload-identity \
--enable-secret-rotation \
--kubernetes-version 1.28.0 \
--zones 1 2 3 \
--load-balancer-sku standard \
--outbound-type loadBalancer \
--attach-acr /subscriptions/${SUBSCRIPTION_ID}/resourceGroups/langgraph-rg/providers/Microsoft.ContainerRegistry/registries/langgraphacr
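The custom networking flags above have to be mutually consistent: AKS expects the DNS service IP to lie inside the service CIDR (and not be its first address), and the service CIDR should not overlap the node subnet. The following is a minimal sketch of such a pre-flight check using only the Python standard library. The function name and the node subnet value `10.240.0.0/16` are hypothetical, since the guide does not state the address range of `aks-subnet`.

```python
import ipaddress
from typing import List


def check_service_network(service_cidr: str, dns_service_ip: str, node_subnet_cidr: str) -> List[str]:
    """Return a list of problems; an empty list means the values look consistent."""
    problems = []
    service_net = ipaddress.ip_network(service_cidr)
    dns_ip = ipaddress.ip_address(dns_service_ip)
    node_net = ipaddress.ip_network(node_subnet_cidr)

    if dns_ip not in service_net:
        problems.append(f"{dns_service_ip} is outside the service CIDR {service_cidr}")
    elif dns_ip == service_net.network_address + 1:
        # The first usable address is normally taken by the kubernetes API service.
        problems.append(f"{dns_service_ip} is the first address of {service_cidr}")

    if service_net.overlaps(node_net):
        problems.append(f"service CIDR {service_cidr} overlaps node subnet {node_subnet_cidr}")

    return problems


# Values from the command above; the node subnet range is an assumption.
print(check_service_network("10.0.0.0/16", "10.0.0.10", "10.240.0.0/16"))
```

An empty list suggests the `--service-cidr` and `--dns-service-ip` values can be used together with that subnet.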
Azure AD Workload Identity
Enable Workload Identity
## Enable OIDC issuer
az aks update \
--name langgraph-cluster \
--resource-group langgraph-rg \
--enable-oidc-issuer \
--enable-workload-identity
## Get OIDC issuer URL
export AKS_OIDC_ISSUER=$(az aks show \
--name langgraph-cluster \
--resource-group langgraph-rg \
--query "oidcIssuerProfile.issuerUrl" \
--output tsv)
Create Managed Identity
## Create user-assigned managed identity
az identity create \
--name mcp-server-langgraph-identity \
--resource-group langgraph-rg \
--location eastus
## Get identity details
export IDENTITY_CLIENT_ID=$(az identity show \
--name mcp-server-langgraph-identity \
--resource-group langgraph-rg \
--query 'clientId' \
--output tsv)
export IDENTITY_OBJECT_ID=$(az identity show \
--name mcp-server-langgraph-identity \
--resource-group langgraph-rg \
--query 'principalId' \
--output tsv)
Create Federated Credentials
## Create federated credential
az identity federated-credential create \
--name mcp-server-langgraph-federated-credential \
--identity-name mcp-server-langgraph-identity \
--resource-group langgraph-rg \
--issuer $AKS_OIDC_ISSUER \
--subject system:serviceaccount:mcp-server-langgraph:mcp-server-langgraph \
--audience api://AzureADTokenExchange
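The `--subject` value above follows the Kubernetes service account naming pattern `system:serviceaccount:<namespace>:<name>`, and it has to match the ServiceAccount that the pods actually run as. A small sketch of how that string could be derived, so the namespace and account name are defined in one place (the helper name is hypothetical):

```python
def federated_subject(namespace: str, service_account: str) -> str:
    """Build the federated credential subject for a Kubernetes service account."""
    if not namespace or not service_account:
        raise ValueError("namespace and service_account must be non-empty")
    return f"system:serviceaccount:{namespace}:{service_account}"


# Namespace and service account name used throughout this guide.
print(federated_subject("mcp-server-langgraph", "mcp-server-langgraph"))
```

If the namespace or service account name changes, the federated credential needs to be recreated with the new subject; otherwise the token exchange is expected to fail.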
Create Kubernetes Service Account
apiVersion: v1
kind: ServiceAccount
metadata:
  name: mcp-server-langgraph
  namespace: mcp-server-langgraph
  annotations:
    azure.workload.identity/client-id: "${IDENTITY_CLIENT_ID}"
  labels:
    azure.workload.identity/use: "true"
Azure Database for PostgreSQL
Create PostgreSQL Server
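## Generate an admin password and keep it in a variable for later steps
export POSTGRES_ADMIN_PASSWORD="$(openssl rand -base64 32)"
## Create PostgreSQL flexible server
az postgres flexible-server create \
--name langgraph-postgres \
--resource-group langgraph-rg \
--location eastus \
--admin-user adminuser \
--admin-password "$POSTGRES_ADMIN_PASSWORD" \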
--sku-name Standard_D4s_v3 \
--tier GeneralPurpose \
--version 15 \
--storage-size 128 \
--backup-retention 7 \
--geo-redundant-backup Enabled \
--high-availability ZoneRedundant \
--zone 1 \
--public-access None \
--vnet langgraph-vnet \
--subnet postgres-subnet
## Create databases (Keycloak, OpenFGA, GDPR compliance)
az postgres flexible-server db create \
--resource-group langgraph-rg \
--server-name langgraph-postgres \
--database-name keycloak
az postgres flexible-server db create \
--resource-group langgraph-rg \
--server-name langgraph-postgres \
--database-name openfga
## NEW: GDPR compliance database (ADR-0041)
az postgres flexible-server db create \
--resource-group langgraph-rg \
--server-name langgraph-postgres \
--database-name gdpr
## Create firewall rule for AKS
az postgres flexible-server firewall-rule create \
--resource-group langgraph-rg \
--name langgraph-postgres \
--rule-name allow-aks \
--start-ip-address 10.0.0.0 \
--end-ip-address 10.0.255.255
## Get connection string
POSTGRES_HOST=$(az postgres flexible-server show \
--resource-group langgraph-rg \
--name langgraph-postgres \
--query fullyQualifiedDomainName \
--output tsv)
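The firewall rule above takes a start and an end address rather than a CIDR block. The pair `10.0.0.0` to `10.0.255.255` corresponds to the range `10.0.0.0/16`. A minimal sketch of that conversion with the standard library, useful if the allowed range changes (the function name is hypothetical):

```python
import ipaddress
from typing import Tuple


def cidr_to_firewall_range(cidr: str) -> Tuple[str, str]:
    """Return (start_ip, end_ip) covering every address in the CIDR block."""
    network = ipaddress.ip_network(cidr)
    return str(network.network_address), str(network.broadcast_address)


start_ip, end_ip = cidr_to_firewall_range("10.0.0.0/16")
print(f"--start-ip-address {start_ip} --end-ip-address {end_ip}")
```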
Three databases are required: Keycloak (identity), OpenFGA (authorization), and GDPR (compliance data storage per ADR-0041).
Initialize GDPR Schema
After creating the databases, initialize the GDPR schema:
## Apply GDPR schema (5 tables: user_profiles, user_preferences, consent_records, conversations, audit_logs)
PGPASSWORD=your-admin-password psql \
-h $POSTGRES_HOST \
-U adminuser \
-d gdpr \
-f deployments/base/postgres-gdpr-schema.sql
The schema creates five tables:
- user_profiles: User profile data (GDPR Article 15, 16, 17)
- user_preferences: User preferences (GDPR Article 16, 17)
- consent_records: Consent audit trail, 7-year retention (GDPR Article 21, Article 7)
- conversations: Conversation history, 90-day retention (GDPR Article 15, 20)
- audit_logs: Compliance audit trail, 7-year retention (HIPAA §164.316(b)(2)(i), SOC2 CC6.6)
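The retention periods listed above (90 days for conversations, 7 years for consent records and audit logs) translate into cutoff timestamps that a cleanup job could compare against. The sketch below only computes those cutoffs. It makes no assumption about the column names in `postgres-gdpr-schema.sql`, and the function and dictionary names are hypothetical.

```python
from datetime import datetime, timedelta, timezone


def years_ago(moment: datetime, years: int) -> datetime:
    """Subtract whole calendar years, mapping Feb 29 to Feb 28 when needed."""
    try:
        return moment.replace(year=moment.year - years)
    except ValueError:
        return moment.replace(year=moment.year - years, day=28)


# Retention rules taken from the table list above.
RETENTION_RULES = {
    "conversations": lambda now: now - timedelta(days=90),
    "consent_records": lambda now: years_ago(now, 7),
    "audit_logs": lambda now: years_ago(now, 7),
}


def retention_cutoff(table: str, now: datetime) -> datetime:
    """Rows older than the returned timestamp are past their retention period."""
    return RETENTION_RULES[table](now)


now = datetime.now(timezone.utc)
for table in RETENTION_RULES:
    print(table, retention_cutoff(table, now).date())
```

Tables without a stated retention period (user_profiles, user_preferences) are deliberately left out of the mapping.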
Configure Private Endpoint
## Create private endpoint
az network private-endpoint create \
--name postgres-private-endpoint \
--resource-group langgraph-rg \
--vnet-name langgraph-vnet \
--subnet aks-subnet \
--private-connection-resource-id $(az postgres flexible-server show \
--resource-group langgraph-rg \
--name langgraph-postgres \
--query id \
--output tsv) \
--group-id postgresqlServer \
--connection-name postgres-connection
## Create private DNS zone
az network private-dns zone create \
--resource-group langgraph-rg \
--name privatelink.postgres.database.azure.com
## Link DNS zone to VNet
az network private-dns link vnet create \
--resource-group langgraph-rg \
--zone-name privatelink.postgres.database.azure.com \
--name postgres-dns-link \
--virtual-network langgraph-vnet \
--registration-enabled false
Azure Cache for Redis
Create Redis Cache
## Create Redis cache
az redis create \
--name langgraph-redis \
--resource-group langgraph-rg \
--location eastus \
--sku Premium \
--vm-size P1 \
--minimum-tls-version 1.2 \
--replicas-per-master 1 \
--zones 1 2 \
--subnet-id /subscriptions/${SUBSCRIPTION_ID}/resourceGroups/langgraph-rg/providers/Microsoft.Network/virtualNetworks/langgraph-vnet/subnets/redis-subnet
## Get Redis connection details
REDIS_HOST=$(az redis show \
--name langgraph-redis \
--resource-group langgraph-rg \
--query hostName \
--output tsv)
REDIS_KEY=$(az redis list-keys \
--name langgraph-redis \
--resource-group langgraph-rg \
--query primaryKey \
--output tsv)
## Configure the weekly patch (maintenance) schedule
az redis patch-schedule create \
--name langgraph-redis \
--resource-group langgraph-rg \
--schedule-entries '[{"dayOfWeek":"Sunday","startHourUtc":3,"maintenanceWindow":"PT5H"}]'
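The host name and access key retrieved above are later combined into a `rediss://` URL on the TLS port 6380. Azure Redis access keys are base64 strings and can contain characters such as `+`, `/`, and `=`, which may confuse URL parsers if embedded verbatim. A minimal sketch of building the URL with the key percent-encoded (the helper name is hypothetical, and the key below is a made-up placeholder):

```python
from urllib.parse import quote


def build_redis_url(host: str, access_key: str, port: int = 6380) -> str:
    """Build a TLS Redis URL with an empty username and a percent-encoded key."""
    encoded_key = quote(access_key, safe="")
    return f"rediss://:{encoded_key}@{host}:{port}"


# Placeholder key for illustration only; use the value of $REDIS_KEY in practice.
print(build_redis_url("langgraph-redis.redis.cache.windows.net", "ab+c/d="))
```

Whether the client library decodes the password again depends on the Redis client in use, so this is worth verifying against the client's URL handling before relying on it.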
Azure Key Vault
Create Key Vault
## Create Key Vault
az keyvault create \
--name langgraph-keyvault \
--resource-group langgraph-rg \
--location eastus \
--enable-rbac-authorization true \
--enable-purge-protection true \
--retention-days 90
## Assign permissions to managed identity
az role assignment create \
--role "Key Vault Secrets User" \
--assignee $IDENTITY_OBJECT_ID \
--scope $(az keyvault show \
--name langgraph-keyvault \
--resource-group langgraph-rg \
--query id \
--output tsv)
Store Secrets
## Store secrets
az keyvault secret set \
--vault-name langgraph-keyvault \
--name anthropic-api-key \
--value "sk-ant-your-key"
az keyvault secret set \
--vault-name langgraph-keyvault \
--name jwt-secret \
--value "$(openssl rand -base64 32)"
az keyvault secret set \
--vault-name langgraph-keyvault \
--name redis-password \
--value "$REDIS_KEY"
az keyvault secret set \
--vault-name langgraph-keyvault \
--name postgres-password \
--value "your-postgres-password"
CSI Driver Integration
## Enable Key Vault addon
az aks enable-addons \
--name langgraph-cluster \
--resource-group langgraph-rg \
--addons azure-keyvault-secrets-provider \
--enable-secret-rotation
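## Look up the tenant ID referenced by the SecretProviderClass
export AZURE_TENANT_ID=$(az account show --query tenantId --output tsv)
## Create SecretProviderClass (unquoted EOF so the shell expands ${IDENTITY_CLIENT_ID} and ${AZURE_TENANT_ID})
cat << EOF | kubectl apply -f -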
apiVersion: secrets-store.csi.x-k8s.io/v1
kind: SecretProviderClass
metadata:
  name: azure-keyvault
  namespace: mcp-server-langgraph
spec:
  provider: azure
  parameters:
    usePodIdentity: "false"
    useVMManagedIdentity: "false"
    clientID: "${IDENTITY_CLIENT_ID}"
    keyvaultName: langgraph-keyvault
    cloudName: ""
    objects: |
      array:
        - |
          objectName: anthropic-api-key
          objectType: secret
          objectVersion: ""
        - |
          objectName: jwt-secret
          objectType: secret
          objectVersion: ""
        - |
          objectName: redis-password
          objectType: secret
          objectVersion: ""
    tenantId: "${AZURE_TENANT_ID}"
  secretObjects:
    - secretName: mcp-server-langgraph-secrets
      type: Opaque
      data:
        - objectName: anthropic-api-key
          key: ANTHROPIC_API_KEY
        - objectName: jwt-secret
          key: JWT_SECRET
        - objectName: redis-password
          key: REDIS_PASSWORD
EOF
Use Secrets in Deployment
apiVersion: apps/v1
kind: Deployment
metadata:
  name: mcp-server-langgraph
  namespace: mcp-server-langgraph
spec:
  template:
    spec:
      serviceAccountName: mcp-server-langgraph
      containers:
        - name: agent
          image: langgraphacr.azurecr.io/langgraph/agent:latest
          envFrom:
            - secretRef:
                name: mcp-server-langgraph-secrets
          volumeMounts:
            - name: secrets-store
              mountPath: "/mnt/secrets-store"
              readOnly: true
      volumes:
        - name: secrets-store
          csi:
            driver: secrets-store.csi.k8s.io
            readOnly: true
            volumeAttributes:
              secretProviderClass: "azure-keyvault"
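Besides the synced Kubernetes Secret consumed through `envFrom`, the CSI volume exposes each Key Vault object as a file under the mount path, named after its `objectName`. The following sketch shows how an application might read those files directly. The function name is hypothetical and the default path simply mirrors the `mountPath` above.

```python
from pathlib import Path
from typing import Dict


def read_mounted_secrets(mount_path: str = "/mnt/secrets-store") -> Dict[str, str]:
    """Read every regular file in the mount directory into a name -> value dict."""
    secrets = {}
    for entry in Path(mount_path).iterdir():
        # Skip hidden bookkeeping entries that the driver may create.
        if entry.name.startswith(".") or not entry.is_file():
            continue
        secrets[entry.name] = entry.read_text().strip()
    return secrets


if __name__ == "__main__":
    if Path("/mnt/secrets-store").is_dir():
        # Print only the names so that secret values never reach the logs.
        print(sorted(read_mounted_secrets()))
```

Reading from the files rather than from environment variables lets a process pick up rotated values without a restart, provided it re-reads the files.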
Azure Container Registry (ACR)
Create ACR
## Create container registry
az acr create \
--name langgraphacr \
--resource-group langgraph-rg \
--sku Premium \
--location eastus \
--admin-enabled false
## Attach ACR to AKS
az aks update \
--name langgraph-cluster \
--resource-group langgraph-rg \
--attach-acr langgraphacr
## Login to ACR
az acr login --name langgraphacr
Build and Push Images
## Build in ACR (no local Docker needed)
az acr build \
--registry langgraphacr \
--image langgraph/agent:latest \
--file Dockerfile .
## Or build locally and push
docker build -t mcp-server-langgraph:latest .
docker tag mcp-server-langgraph:latest langgraphacr.azurecr.io/langgraph/agent:latest
docker push langgraphacr.azurecr.io/langgraph/agent:latest
Application Gateway Ingress Controller
Install AGIC
## Create Application Gateway
az network application-gateway create \
--name langgraph-appgw \
--resource-group langgraph-rg \
--location eastus \
--sku WAF_v2 \
--capacity 2 \
--vnet-name langgraph-vnet \
--subnet appgw-subnet \
--public-ip-address langgraph-appgw-pip
## Install AGIC addon
az aks enable-addons \
--name langgraph-cluster \
--resource-group langgraph-rg \
--addons ingress-appgw \
--appgw-id $(az network application-gateway show \
--name langgraph-appgw \
--resource-group langgraph-rg \
--query id \
--output tsv)
Configure Ingress
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: langgraph-ingress
  namespace: mcp-server-langgraph
  annotations:
    kubernetes.io/ingress.class: azure/application-gateway
    appgw.ingress.kubernetes.io/ssl-redirect: "true"
    appgw.ingress.kubernetes.io/backend-path-prefix: "/"
    appgw.ingress.kubernetes.io/health-probe-path: "/health/ready"
    appgw.ingress.kubernetes.io/health-probe-interval: "15"
    appgw.ingress.kubernetes.io/health-probe-timeout: "5"
    appgw.ingress.kubernetes.io/health-probe-unhealthy-threshold: "3"
spec:
  tls:
    - hosts:
        - api.yourdomain.com
      secretName: langgraph-tls
  rules:
    - host: api.yourdomain.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: mcp-server-langgraph
                port:
                  number: 8000
Azure Monitor Integration
Enable Container Insights
## Enable monitoring
az aks enable-addons \
--name langgraph-cluster \
--resource-group langgraph-rg \
--addons monitoring \
--workspace-resource-id $(az monitor log-analytics workspace create \
--resource-group langgraph-rg \
--workspace-name langgraph-logs \
--query id \
--output tsv)
Custom Metrics
import os

from azure.monitor.opentelemetry import configure_azure_monitor
from opentelemetry import trace

## Configure Azure Monitor; the connection string is read from the environment, e.g.
## "InstrumentationKey=<key>;IngestionEndpoint=https://eastus-8.in.applicationinsights.azure.com/"
configure_azure_monitor(
    connection_string=os.environ["APPLICATIONINSIGHTS_CONNECTION_STRING"]
)
tracer = trace.get_tracer(__name__)
## Create spans
with tracer.start_as_current_span("llm_request") as span:
    span.set_attribute("provider", "anthropic")
    span.set_attribute("model", "claude-sonnet-4-5")
    # Your code here
Complete Deployment
apiVersion: apps/v1
kind: Deployment
metadata:
  name: mcp-server-langgraph
  namespace: mcp-server-langgraph
  labels:
    app: mcp-server-langgraph
    azure.workload.identity/use: "true"
spec:
  replicas: 3
  selector:
    matchLabels:
      app: mcp-server-langgraph
  template:
    metadata:
      labels:
        app: mcp-server-langgraph
        azure.workload.identity/use: "true"
    spec:
      serviceAccountName: mcp-server-langgraph
      containers:
        - name: agent
          image: langgraphacr.azurecr.io/langgraph/agent:latest
          imagePullPolicy: Always
          ports:
            - containerPort: 8000
              name: http
            - containerPort: 9090
              name: metrics
          env:
            - name: ENV
              value: production
            - name: AUTH_PROVIDER
              value: keycloak
            - name: SESSION_PROVIDER
              value: redis
            - name: REDIS_URL
              value: rediss://:$(REDIS_PASSWORD)@langgraph-redis.redis.cache.windows.net:6380
            - name: KC_DB_URL
              value: jdbc:postgresql://langgraph-postgres.postgres.database.azure.com:5432/keycloak?sslmode=require
          envFrom:
            - secretRef:
                name: mcp-server-langgraph-secrets
          volumeMounts:
            - name: secrets-store
              mountPath: "/mnt/secrets-store"
              readOnly: true
          livenessProbe:
            httpGet:
              path: /health/live
              port: 8000
            initialDelaySeconds: 30
            periodSeconds: 10
          readinessProbe:
            httpGet:
              path: /health/ready
              port: 8000
            initialDelaySeconds: 10
            periodSeconds: 5
          resources:
            requests:
              memory: "1Gi"
              cpu: "500m"
            limits:
              memory: "2Gi"
              cpu: "1000m"
      volumes:
        - name: secrets-store
          csi:
            driver: secrets-store.csi.k8s.io
            readOnly: true
            volumeAttributes:
              secretProviderClass: "azure-keyvault"
---
apiVersion: v1
kind: Service
metadata:
  name: mcp-server-langgraph
  namespace: mcp-server-langgraph
spec:
  type: ClusterIP
  selector:
    app: mcp-server-langgraph
  ports:
    - name: http
      port: 8000
      targetPort: 8000
    - name: metrics
      port: 9090
      targetPort: 9090
## Create namespace
kubectl create namespace mcp-server-langgraph
## Apply deployment
kubectl apply -f deployment.yaml
## Verify
kubectl get pods -n mcp-server-langgraph
kubectl logs -f deployment/mcp-server-langgraph -n mcp-server-langgraph
Auto-Scaling
Cluster Autoscaler
## Enable cluster autoscaler
az aks update \
--name langgraph-cluster \
--resource-group langgraph-rg \
--enable-cluster-autoscaler \
--min-count 3 \
--max-count 20
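Whether a maximum of 20 nodes is enough depends on how many pods fit on each node. A rough way to estimate this is to divide a node's allocatable CPU and memory by the pod's resource requests (500m CPU and 1Gi memory in the Deployment above) and take the smaller result. The sketch below does that arithmetic. The allocatable figures are assumptions for illustration only, since the real values for a Standard_D4s_v3 node (4 vCPU, 16 GiB) are lower than its raw capacity and should be read from `kubectl describe node`.

```python
import math


def nodes_needed(replicas: int, cpu_request_m: int, mem_request_mib: int,
                 node_alloc_cpu_m: int, node_alloc_mem_mib: int) -> int:
    """Estimate the node count needed to schedule the replicas by requests alone."""
    pods_per_node = min(node_alloc_cpu_m // cpu_request_m,
                        node_alloc_mem_mib // mem_request_mib)
    if pods_per_node < 1:
        raise ValueError("a single pod does not fit on one node")
    return math.ceil(replicas / pods_per_node)


# Hypothetical allocatable values: 3860m CPU and 12880 MiB memory per node.
print(nodes_needed(replicas=50, cpu_request_m=500, mem_request_mib=1024,
                   node_alloc_cpu_m=3860, node_alloc_mem_mib=12880))
```

The estimate ignores system pods, DaemonSets, and other workloads, so it is a lower bound rather than a sizing guarantee.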
Horizontal Pod Autoscaler
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: mcp-server-langgraph
  namespace: mcp-server-langgraph
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: mcp-server-langgraph
  minReplicas: 3
  maxReplicas: 50
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70
    - type: Resource
      resource:
        name: memory
        target:
          type: Utilization
          averageUtilization: 80
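To reason about how this autoscaler reacts to load, it helps to recall the scaling rule described in the Kubernetes documentation: the desired replica count is the current count multiplied by the ratio of observed to target utilization, rounded up, and with several metrics the largest result is used. The sketch below applies that rule to the CPU (70%) and memory (80%) targets above. It is a simplified model that assumes the default 10% tolerance and ignores stabilization windows and pod readiness.

```python
import math


def desired_replicas(current_replicas: int, current_utilization: float,
                     target_utilization: float, min_replicas: int = 3,
                     max_replicas: int = 50, tolerance: float = 0.1) -> int:
    """Simplified HPA rule: ceil(current * observed / target), clamped to bounds."""
    ratio = current_utilization / target_utilization
    if abs(ratio - 1.0) <= tolerance:
        return current_replicas  # close enough to the target, no scaling
    desired = math.ceil(current_replicas * ratio)
    return max(min_replicas, min(max_replicas, desired))


def hpa_decision(current_replicas: int, cpu_utilization: float,
                 memory_utilization: float) -> int:
    """With multiple metrics, the largest proposed replica count is used."""
    return max(desired_replicas(current_replicas, cpu_utilization, 70),
               desired_replicas(current_replicas, memory_utilization, 80))


# Three replicas at 90% average CPU and 60% average memory.
print(hpa_decision(3, cpu_utilization=90, memory_utilization=60))
```

Utilization here is measured relative to the container's resource requests, which is why the requests in the Deployment matter for autoscaling behavior.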
Cost Optimization
Use Spot Node Pools
az aks nodepool add \
--cluster-name langgraph-cluster \
--resource-group langgraph-rg \
--name spot \
--priority Spot \
--eviction-policy Delete \
--spot-max-price -1 \
--enable-cluster-autoscaler \
--min-count 0 \
--max-count 10 \
--node-vm-size Standard_D4s_v3
Azure Reservations
- 1-year: Up to 40% savings
- 3-year: Up to 60% savings
- Apply to VMs, databases, and other services
Use Azure Advisor
# Get cost recommendations
az advisor recommendation list \
--category Cost \
--output table
Security Best Practices
Enable Azure Policy
az aks enable-addons \
--name langgraph-cluster \
--resource-group langgraph-rg \
--addons azure-policy
Use Azure Defender
az security pricing create \
--name KubernetesService \
--tier standard
Network Policies
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-agent
  namespace: mcp-server-langgraph
spec:
  podSelector:
    matchLabels:
      app: mcp-server-langgraph
  policyTypes:
    - Ingress
    # Listing Egress without any egress rules blocks all outbound traffic from the selected pods
    - Egress
  ingress:
    - from:
        # Matches namespaces labeled name=ingress-nginx
        - namespaceSelector:
            matchLabels:
              name: ingress-nginx
      ports:
        - protocol: TCP
          port: 8000
Next Steps
- Monitoring: Set up comprehensive monitoring
- Disaster Recovery: Backup and recovery strategies
- Security: Security hardening guide
- Kustomize: Configuration management
AKS Deployment Ready: Production-grade deployment on Azure Kubernetes Service!