Overview

Azure Kubernetes Service (AKS) is Microsoft’s managed Kubernetes offering. It integrates with Azure services such as Azure Database for PostgreSQL, Azure Cache for Redis, Azure Key Vault, and Azure Active Directory.
This guide covers deploying to AKS with a production-ready configuration, including Azure AD Workload Identity, Azure Database for PostgreSQL, Azure Cache for Redis, and Azure Key Vault integration.

Prerequisites

  • Azure subscription
  • Azure CLI (az) installed and configured
  • kubectl installed
  • helm installed

Install Prerequisites

## Install Azure CLI (macOS with Homebrew; other platforms use their own package manager)
brew install azure-cli

## Login to Azure
az login

## Set subscription
az account set --subscription "your-subscription-id"

## Install kubectl
az aks install-cli

Create Resource Group

## Create resource group
az group create \
  --name langgraph-rg \
  --location eastus

## Set default resource group
az configure --defaults group=langgraph-rg location=eastus

Create AKS Cluster

Using Azure CLI

## Create AKS cluster
az aks create \
  --name langgraph-cluster \
  --resource-group langgraph-rg \
  --location eastus \
  --node-count 3 \
  --node-vm-size Standard_D4s_v3 \
  --enable-cluster-autoscaler \
  --min-count 3 \
  --max-count 10 \
  --network-plugin azure \
  --enable-managed-identity \
  --enable-aad \
  --enable-azure-rbac \
  --enable-addons monitoring \
  --enable-msi-auth-for-monitoring \
  --kubernetes-version 1.28.0 \
  --zones 1 2 3

## Get credentials
az aks get-credentials \
  --name langgraph-cluster \
  --resource-group langgraph-rg

## Verify
kubectl get nodes

Advanced Configuration

## Create with custom networking
az aks create \
  --name langgraph-cluster \
  --resource-group langgraph-rg \
  --location eastus \
  --node-count 3 \
  --node-vm-size Standard_D4s_v3 \
  --node-osdisk-size 100 \
  --node-osdisk-type Managed \
  --enable-cluster-autoscaler \
  --min-count 3 \
  --max-count 10 \
  --network-plugin azure \
  --vnet-subnet-id /subscriptions/${SUBSCRIPTION_ID}/resourceGroups/langgraph-rg/providers/Microsoft.Network/virtualNetworks/langgraph-vnet/subnets/aks-subnet \
  --service-cidr 10.0.0.0/16 \
  --dns-service-ip 10.0.0.10 \
  --enable-managed-identity \
  --enable-aad \
  --enable-azure-rbac \
  --enable-addons monitoring,azure-keyvault-secrets-provider \
  --enable-oidc-issuer \
  --enable-workload-identity \
  --enable-secret-rotation \
  --kubernetes-version 1.28.0 \
  --zones 1 2 3 \
  --load-balancer-sku standard \
  --outbound-type loadBalancer \
  --attach-acr /subscriptions/${SUBSCRIPTION_ID}/resourceGroups/langgraph-rg/providers/Microsoft.ContainerRegistry/registries/langgraphacr

Azure AD Workload Identity

Enable Workload Identity

## Enable OIDC issuer
az aks update \
  --name langgraph-cluster \
  --resource-group langgraph-rg \
  --enable-oidc-issuer \
  --enable-workload-identity

## Get OIDC issuer URL
export AKS_OIDC_ISSUER=$(az aks show \
  --name langgraph-cluster \
  --resource-group langgraph-rg \
  --query "oidcIssuerProfile.issuerUrl" \
  --output tsv)

Create Managed Identity

## Create user-assigned managed identity
az identity create \
  --name mcp-server-langgraph-identity \
  --resource-group langgraph-rg \
  --location eastus

## Get identity details
export IDENTITY_CLIENT_ID=$(az identity show \
  --name mcp-server-langgraph-identity \
  --resource-group langgraph-rg \
  --query 'clientId' \
  --output tsv)

export IDENTITY_OBJECT_ID=$(az identity show \
  --name mcp-server-langgraph-identity \
  --resource-group langgraph-rg \
  --query 'principalId' \
  --output tsv)

Create Federated Credentials

## Create federated credential
az identity federated-credential create \
  --name mcp-server-langgraph-federated-credential \
  --identity-name mcp-server-langgraph-identity \
  --resource-group langgraph-rg \
  --issuer $AKS_OIDC_ISSUER \
  --subject system:serviceaccount:mcp-server-langgraph:mcp-server-langgraph \
  --audience api://AzureADTokenExchange
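
The `--subject` value must match the service account's namespace and name exactly, otherwise the token exchange is rejected. As a minimal sketch, the subject string can be built from its two parts so that the same values are reused when the service account is created. `sa_subject` is a hypothetical helper name used only for illustration.

```shell
# Sketch: build the federated-credential subject for a Kubernetes service account.
# sa_subject is a hypothetical helper, not part of the Azure CLI or kubectl.
sa_subject() {
  # $1 = namespace, $2 = service account name
  printf 'system:serviceaccount:%s:%s\n' "$1" "$2"
}

# Namespace and service account name as used in this guide.
SUBJECT="$(sa_subject mcp-server-langgraph mcp-server-langgraph)"
echo "$SUBJECT"
```

The resulting `$SUBJECT` can then be passed to `--subject` in the command above.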

Create Kubernetes Service Account

apiVersion: v1
kind: ServiceAccount
metadata:
  name: mcp-server-langgraph
  namespace: mcp-server-langgraph
  annotations:
    azure.workload.identity/client-id: "${IDENTITY_CLIENT_ID}"
  labels:
    azure.workload.identity/use: "true"
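
The manifest above contains `${IDENTITY_CLIENT_ID}`, which `kubectl` does not expand on its own. One possible approach, sketched here, is to render the manifest from a shell heredoc so the client ID is filled in before it is applied. `render_service_account` is a hypothetical helper name.

```shell
# Sketch: render the ServiceAccount manifest with the client ID substituted.
# render_service_account is a hypothetical helper used only for illustration.
render_service_account() {
  # $1 = client ID of the user-assigned managed identity
  cat <<EOF
apiVersion: v1
kind: ServiceAccount
metadata:
  name: mcp-server-langgraph
  namespace: mcp-server-langgraph
  annotations:
    azure.workload.identity/client-id: "$1"
  labels:
    azure.workload.identity/use: "true"
EOF
}

# Example (requires kubectl and an existing namespace):
#   render_service_account "$IDENTITY_CLIENT_ID" | kubectl apply -f -
```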

Azure Database for PostgreSQL

Create PostgreSQL Server

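## Generate the admin password and keep it (it is needed again for psql and Key Vault)
POSTGRES_ADMIN_PASSWORD=$(openssl rand -base64 32)

## Create PostgreSQL flexible server
az postgres flexible-server create \
  --name langgraph-postgres \
  --resource-group langgraph-rg \
  --location eastus \
  --admin-user adminuser \
  --admin-password "$POSTGRES_ADMIN_PASSWORD" \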
  --sku-name Standard_D4s_v3 \
  --tier GeneralPurpose \
  --version 15 \
  --storage-size 128 \
  --backup-retention 7 \
  --geo-redundant-backup Enabled \
  --high-availability ZoneRedundant \
  --zone 1 \
  --public-access None \
  --vnet langgraph-vnet \
  --subnet postgres-subnet

## Create databases (Keycloak, OpenFGA, GDPR compliance)
az postgres flexible-server db create \
  --resource-group langgraph-rg \
  --server-name langgraph-postgres \
  --database-name keycloak

az postgres flexible-server db create \
  --resource-group langgraph-rg \
  --server-name langgraph-postgres \
  --database-name openfga

## NEW: GDPR compliance database (ADR-0041)
az postgres flexible-server db create \
  --resource-group langgraph-rg \
  --server-name langgraph-postgres \
  --database-name gdpr

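## Create firewall rule for AKS (firewall rules apply only to servers using public access; skip this step for a VNet-integrated server)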
az postgres flexible-server firewall-rule create \
  --resource-group langgraph-rg \
  --name langgraph-postgres \
  --rule-name allow-aks \
  --start-ip-address 10.0.0.0 \
  --end-ip-address 10.0.255.255

## Get the server hostname for connection strings
POSTGRES_HOST=$(az postgres flexible-server show \
  --resource-group langgraph-rg \
  --name langgraph-postgres \
  --query fullyQualifiedDomainName \
  --output tsv)

Three databases are required: Keycloak (identity), OpenFGA (authorization), and GDPR (compliance data storage per ADR-0041).
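
As an illustration of how the hostname turns into per-database connection strings, the sketch below prints one PostgreSQL URI for each of the three databases. `pg_url` is a hypothetical helper. The password is deliberately left out of the URI on the assumption that it is supplied separately, for example through `PGPASSWORD`.

```shell
# Sketch: build a PostgreSQL connection URI for each database in this guide.
# pg_url is a hypothetical helper; the password is not embedded in the URI.
pg_url() {
  # $1 = host, $2 = database, $3 = user (defaults to adminuser)
  printf 'postgresql://%s@%s:5432/%s?sslmode=require\n' "${3:-adminuser}" "$1" "$2"
}

# Falls back to the server name used in this guide when POSTGRES_HOST is unset.
for db in keycloak openfga gdpr; do
  pg_url "${POSTGRES_HOST:-langgraph-postgres.postgres.database.azure.com}" "$db"
done
```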

Initialize GDPR Schema

After creating the databases, initialize the GDPR schema:

## Apply GDPR schema (5 tables: user_profiles, user_preferences, consent_records, conversations, audit_logs)
PGPASSWORD=your-admin-password psql \
  -h "$POSTGRES_HOST" \
  -U adminuser \
  -d gdpr \
  -f deployments/base/postgres-gdpr-schema.sql

Schema Details:
  • user_profiles: User profile data (GDPR Article 15, 16, 17)
  • user_preferences: User preferences (GDPR Article 16, 17)
  • consent_records: Consent audit trail, 7-year retention (GDPR Article 21, Article 7)
  • conversations: Conversation history, 90-day retention (GDPR Article 15, 20)
  • audit_logs: Compliance audit trail, 7-year retention (HIPAA §164.316(b)(2)(i), SOC2 CC6.6)
See GDPR Storage Configuration for retention policies and backup procedures.
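
To confirm that the schema was applied, the five table names listed above can be compared against what the database reports. The sketch below assumes the schema file creates its tables in the `public` schema, which the source does not state. `missing_tables` is a hypothetical helper that reads table names on stdin and prints any expected table that is absent.

```shell
# Sketch: report which of the five GDPR tables are missing.
# Assumption: the tables live in the "public" schema.
EXPECTED_TABLES="user_profiles user_preferences consent_records conversations audit_logs"

# Reads actual table names (one per line) on stdin, prints the missing ones.
missing_tables() {
  actual="$(cat)"
  for table in $EXPECTED_TABLES; do
    printf '%s\n' "$actual" | grep -qx "$table" || printf '%s\n' "$table"
  done
}

# Example (requires psql and network access to the server):
#   psql -h "$POSTGRES_HOST" -U adminuser -d gdpr -Atc \
#     "SELECT tablename FROM pg_tables WHERE schemaname = 'public'" | missing_tables
```

Empty output means all five tables exist.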

Configure Private Endpoint

## Create private endpoint
az network private-endpoint create \
  --name postgres-private-endpoint \
  --resource-group langgraph-rg \
  --vnet-name langgraph-vnet \
  --subnet aks-subnet \
  --private-connection-resource-id $(az postgres flexible-server show \
    --resource-group langgraph-rg \
    --name langgraph-postgres \
    --query id \
    --output tsv) \
  --group-id postgresqlServer \
  --connection-name postgres-connection

## Create private DNS zone
az network private-dns zone create \
  --resource-group langgraph-rg \
  --name privatelink.postgres.database.azure.com

## Link DNS zone to VNet
az network private-dns link vnet create \
  --resource-group langgraph-rg \
  --zone-name privatelink.postgres.database.azure.com \
  --name postgres-dns-link \
  --virtual-network langgraph-vnet \
  --registration-enabled false

Azure Cache for Redis

Create Redis Cache

## Create Redis cache
az redis create \
  --name langgraph-redis \
  --resource-group langgraph-rg \
  --location eastus \
  --sku Premium \
  --vm-size P1 \
  --minimum-tls-version 1.2 \
  --replicas-per-master 1 \
  --zones 1 2 \
  --subnet-id /subscriptions/${SUBSCRIPTION_ID}/resourceGroups/langgraph-rg/providers/Microsoft.Network/virtualNetworks/langgraph-vnet/subnets/redis-subnet

## Get Redis connection details
REDIS_HOST=$(az redis show \
  --name langgraph-redis \
  --resource-group langgraph-rg \
  --query hostName \
  --output tsv)

REDIS_KEY=$(az redis list-keys \
  --name langgraph-redis \
  --resource-group langgraph-rg \
  --query primaryKey \
  --output tsv)

## Configure the patch (maintenance) schedule
az redis patch-schedule create \
  --name langgraph-redis \
  --resource-group langgraph-rg \
  --schedule-entries '[{"dayOfWeek":"Sunday","startHourUtc":3,"maintenanceWindow":"PT5H"}]'
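
The host and key retrieved above are typically combined into a `rediss://` URL. Access keys are base64 strings and can contain `+`, `/`, and `=`, which some URL parsers may misread unless they are percent-encoded. The following is a sketch under that assumption, using hypothetical helper names `urlencode` and `redis_url`. Whether encoding is needed depends on the Redis client in use.

```shell
# Sketch: percent-encode a Redis access key and build a TLS connection URL.
# urlencode and redis_url are hypothetical helpers; ASCII input is assumed.
urlencode() (
  s="$1"
  out=""
  while [ -n "$s" ]; do
    rest="${s#?}"        # everything after the first character
    c="${s%"$rest"}"     # the first character
    case "$c" in
      [a-zA-Z0-9.~_-]) out="$out$c" ;;                  # unreserved: keep as-is
      *) out="$out$(printf '%%%02X' "'$c")" ;;          # everything else: %XX
    esac
    s="$rest"
  done
  printf '%s\n' "$out"
)

redis_url() {
  # $1 = host, $2 = access key; 6380 is the TLS port used in this guide
  printf 'rediss://:%s@%s:6380\n' "$(urlencode "$2")" "$1"
}

# Example: redis_url "$REDIS_HOST" "$REDIS_KEY"
```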

Azure Key Vault

Create Key Vault

## Create Key Vault
az keyvault create \
  --name langgraph-keyvault \
  --resource-group langgraph-rg \
  --location eastus \
  --enable-rbac-authorization true \
  --enable-purge-protection true \
  --retention-days 90

## Assign permissions to managed identity
az role assignment create \
  --role "Key Vault Secrets User" \
  --assignee $IDENTITY_OBJECT_ID \
  --scope $(az keyvault show \
    --name langgraph-keyvault \
    --resource-group langgraph-rg \
    --query id \
    --output tsv)

Store Secrets

## Store secrets
az keyvault secret set \
  --vault-name langgraph-keyvault \
  --name anthropic-api-key \
  --value "sk-ant-your-key"

az keyvault secret set \
  --vault-name langgraph-keyvault \
  --name jwt-secret \
  --value $(openssl rand -base64 32)

az keyvault secret set \
  --vault-name langgraph-keyvault \
  --name redis-password \
  --value "$REDIS_KEY"

az keyvault secret set \
  --vault-name langgraph-keyvault \
  --name postgres-password \
  --value "your-postgres-password"

CSI Driver Integration

## Enable Key Vault addon
az aks enable-addons \
  --name langgraph-cluster \
  --resource-group langgraph-rg \
  --addons azure-keyvault-secrets-provider \
  --enable-secret-rotation

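## Look up the tenant ID referenced by the SecretProviderClass
export AZURE_TENANT_ID=$(az account show --query tenantId --output tsv)

## Create SecretProviderClass (unquoted heredoc so the shell expands ${IDENTITY_CLIENT_ID} and ${AZURE_TENANT_ID})
cat << EOF | kubectl apply -f -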
apiVersion: secrets-store.csi.x-k8s.io/v1
kind: SecretProviderClass
metadata:
  name: azure-keyvault
  namespace: mcp-server-langgraph
spec:
  provider: azure
  parameters:
    usePodIdentity: "false"
    useVMManagedIdentity: "false"
    clientID: "${IDENTITY_CLIENT_ID}"
    keyvaultName: langgraph-keyvault
    cloudName: ""
    objects: |
      array:
        - |
          objectName: anthropic-api-key
          objectType: secret
          objectVersion: ""
        - |
          objectName: jwt-secret
          objectType: secret
          objectVersion: ""
        - |
          objectName: redis-password
          objectType: secret
          objectVersion: ""
    tenantId: "${AZURE_TENANT_ID}"
  secretObjects:
  - secretName: mcp-server-langgraph-secrets
    type: Opaque
    data:
    - objectName: anthropic-api-key
      key: ANTHROPIC_API_KEY
    - objectName: jwt-secret
      key: JWT_SECRET
    - objectName: redis-password
      key: REDIS_PASSWORD
EOF

Use Secrets in Deployment

apiVersion: apps/v1
kind: Deployment
metadata:
  name: mcp-server-langgraph
  namespace: mcp-server-langgraph
spec:
  template:
    spec:
      serviceAccountName: mcp-server-langgraph
      containers:
      - name: agent
        image: langgraphacr.azurecr.io/langgraph/agent:latest
        envFrom:
        - secretRef:
            name: mcp-server-langgraph-secrets
        volumeMounts:
        - name: secrets-store
          mountPath: "/mnt/secrets-store"
          readOnly: true
      volumes:
      - name: secrets-store
        csi:
          driver: secrets-store.csi.k8s.io
          readOnly: true
          volumeAttributes:
            secretProviderClass: "azure-keyvault"

Azure Container Registry (ACR)

Create ACR

## Create container registry
az acr create \
  --name langgraphacr \
  --resource-group langgraph-rg \
  --sku Premium \
  --location eastus \
  --admin-enabled false

## Attach ACR to AKS
az aks update \
  --name langgraph-cluster \
  --resource-group langgraph-rg \
  --attach-acr langgraphacr

## Login to ACR
az acr login --name langgraphacr

Build and Push Images

## Build in ACR (no local Docker needed)
az acr build \
  --registry langgraphacr \
  --image langgraph/agent:latest \
  --file Dockerfile .

## Or build locally and push
docker build -t mcp-server-langgraph:latest .
docker tag mcp-server-langgraph:latest langgraphacr.azurecr.io/langgraph/agent:latest
docker push langgraphacr.azurecr.io/langgraph/agent:latest

Application Gateway Ingress Controller

Install AGIC

## Create Application Gateway
az network application-gateway create \
  --name langgraph-appgw \
  --resource-group langgraph-rg \
  --location eastus \
  --sku WAF_v2 \
  --capacity 2 \
  --vnet-name langgraph-vnet \
  --subnet appgw-subnet \
  --public-ip-address langgraph-appgw-pip

## Install AGIC addon
az aks enable-addons \
  --name langgraph-cluster \
  --resource-group langgraph-rg \
  --addons ingress-appgw \
  --appgw-id $(az network application-gateway show \
    --name langgraph-appgw \
    --resource-group langgraph-rg \
    --query id \
    --output tsv)

Configure Ingress

apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: langgraph-ingress
  namespace: mcp-server-langgraph
  annotations:
    kubernetes.io/ingress.class: azure/application-gateway
    appgw.ingress.kubernetes.io/ssl-redirect: "true"
    appgw.ingress.kubernetes.io/backend-path-prefix: "/"
    appgw.ingress.kubernetes.io/health-probe-path: "/health/ready"
    appgw.ingress.kubernetes.io/health-probe-interval: "15"
    appgw.ingress.kubernetes.io/health-probe-timeout: "5"
    appgw.ingress.kubernetes.io/health-probe-unhealthy-threshold: "3"
spec:
  tls:
  - hosts:
    - api.yourdomain.com
    secretName: langgraph-tls
  rules:
  - host: api.yourdomain.com
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: mcp-server-langgraph
            port:
              number: 8000

Azure Monitor Integration

Enable Container Insights

## Enable monitoring
az aks enable-addons \
  --name langgraph-cluster \
  --resource-group langgraph-rg \
  --addons monitoring \
  --workspace-resource-id $(az monitor log-analytics workspace create \
    --resource-group langgraph-rg \
    --workspace-name langgraph-logs \
    --query id \
    --output tsv)

Custom Metrics

import os

from azure.monitor.opentelemetry import configure_azure_monitor
from opentelemetry import trace

## Configure Azure Monitor; the connection string is read from the environment
## (copy it from the Application Insights resource)
configure_azure_monitor(
    connection_string=os.environ["APPLICATIONINSIGHTS_CONNECTION_STRING"]
)

tracer = trace.get_tracer(__name__)

## Create spans
with tracer.start_as_current_span("llm_request") as span:
    span.set_attribute("provider", "anthropic")
    span.set_attribute("model", "claude-sonnet-4-5")
    # Your code here

Complete Deployment

apiVersion: apps/v1
kind: Deployment
metadata:
  name: mcp-server-langgraph
  namespace: mcp-server-langgraph
  labels:
    app: mcp-server-langgraph
    azure.workload.identity/use: "true"
spec:
  replicas: 3
  selector:
    matchLabels:
      app: mcp-server-langgraph
  template:
    metadata:
      labels:
        app: mcp-server-langgraph
        azure.workload.identity/use: "true"
    spec:
      serviceAccountName: mcp-server-langgraph
      containers:
      - name: agent
        image: langgraphacr.azurecr.io/langgraph/agent:latest
        imagePullPolicy: Always
        ports:
        - containerPort: 8000
          name: http
        - containerPort: 9090
          name: metrics
        env:
        - name: ENV
          value: production
        - name: AUTH_PROVIDER
          value: keycloak
        - name: SESSION_PROVIDER
          value: redis
        - name: REDIS_URL
          value: rediss://:$(REDIS_PASSWORD)@langgraph-redis.redis.cache.windows.net:6380
        - name: KC_DB_URL
          value: jdbc:postgresql://langgraph-postgres.postgres.database.azure.com:5432/keycloak?sslmode=require
        envFrom:
        - secretRef:
            name: mcp-server-langgraph-secrets
        volumeMounts:
        - name: secrets-store
          mountPath: "/mnt/secrets-store"
          readOnly: true
        livenessProbe:
          httpGet:
            path: /health/live
            port: 8000
          initialDelaySeconds: 30
          periodSeconds: 10
        readinessProbe:
          httpGet:
            path: /health/ready
            port: 8000
          initialDelaySeconds: 10
          periodSeconds: 5
        resources:
          requests:
            memory: "1Gi"
            cpu: "500m"
          limits:
            memory: "2Gi"
            cpu: "1000m"
      volumes:
      - name: secrets-store
        csi:
          driver: secrets-store.csi.k8s.io
          readOnly: true
          volumeAttributes:
            secretProviderClass: "azure-keyvault"
---
apiVersion: v1
kind: Service
metadata:
  name: mcp-server-langgraph
  namespace: mcp-server-langgraph
spec:
  type: ClusterIP
  selector:
    app: mcp-server-langgraph
  ports:
  - name: http
    port: 8000
    targetPort: 8000
  - name: metrics
    port: 9090
    targetPort: 9090

Deploy:

## Create namespace
kubectl create namespace mcp-server-langgraph

## Apply deployment
kubectl apply -f deployment.yaml

## Verify
kubectl get pods -n mcp-server-langgraph
kubectl logs -f deployment/mcp-server-langgraph -n mcp-server-langgraph

Auto-Scaling

Cluster Autoscaler

## Update cluster autoscaler limits (the autoscaler was already enabled at cluster creation)
az aks update \
  --name langgraph-cluster \
  --resource-group langgraph-rg \
  --update-cluster-autoscaler \
  --min-count 3 \
  --max-count 20

Horizontal Pod Autoscaler

apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: mcp-server-langgraph
  namespace: mcp-server-langgraph
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: mcp-server-langgraph
  minReplicas: 3
  maxReplicas: 50
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 70
  - type: Resource
    resource:
      name: memory
      target:
        type: Utilization
        averageUtilization: 80
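
For a rough sense of how these targets translate into replica counts, Kubernetes documents the core calculation as `desired = ceil(currentReplicas * currentMetric / targetMetric)`, clamped to `minReplicas` and `maxReplicas`. The sketch below reproduces only that arithmetic. It ignores the controller's tolerance band, stabilization windows, and the rule that the largest result across metrics wins. `hpa_desired_replicas` is a hypothetical helper name.

```shell
# Sketch: core HPA arithmetic with integer math.
# desired = ceil(current * utilization / target), clamped to [min, max].
hpa_desired_replicas() (
  current="$1"       # current replica count
  utilization="$2"   # observed average utilization (percent)
  target="$3"        # target average utilization (percent)
  min="${4:-3}"      # minReplicas from the manifest above
  max="${5:-50}"     # maxReplicas from the manifest above
  desired=$(( (current * utilization + target - 1) / target ))
  if [ "$desired" -lt "$min" ]; then desired="$min"; fi
  if [ "$desired" -gt "$max" ]; then desired="$max"; fi
  echo "$desired"
)

hpa_desired_replicas 3 90 70     # CPU at 90% vs 70% target: prints 4
hpa_desired_replicas 40 140 70   # would be 80, capped at maxReplicas: prints 50
hpa_desired_replicas 3 20 70     # would be 1, raised to minReplicas: prints 3
```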

Cost Optimization

## Add a Spot node pool
az aks nodepool add \
  --cluster-name langgraph-cluster \
  --resource-group langgraph-rg \
  --name spot \
  --priority Spot \
  --eviction-policy Delete \
  --spot-max-price -1 \
  --enable-cluster-autoscaler \
  --min-count 0 \
  --max-count 10 \
  --node-vm-size Standard_D4s_v3

Reserved capacity (1-year or 3-year commitment):
  • 1-year: Up to 40% savings
  • 3-year: Up to 60% savings
  • Apply to VMs, databases, and other services

## Get cost recommendations
az advisor recommendation list \
  --category Cost \
  --output table

Security Best Practices

## Enable the Azure Policy add-on
az aks enable-addons \
  --name langgraph-cluster \
  --resource-group langgraph-rg \
  --addons azure-policy

## Enable Microsoft Defender for Cloud for Kubernetes resources
az security pricing create \
  --name KubernetesService \
  --tier standard

apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-agent
  namespace: mcp-server-langgraph
spec:
  podSelector:
    matchLabels:
      app: mcp-server-langgraph
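  policyTypes:
  - Ingress
  # Listing Egress without any egress rules denies all outbound traffic from these pods.
  # Add egress rules (DNS, PostgreSQL, Redis, Key Vault, LLM API) before applying.
  - Egress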
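  ingress:
  # This rule assumes an ingress controller in a namespace labeled name=ingress-nginx.
  # With AGIC, traffic reaches pods from the Application Gateway subnet, so an ipBlock rule for that subnet may be needed instead.
  - from:
    - namespaceSelector:
        matchLabels:
          name: ingress-nginx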
    ports:
    - protocol: TCP
      port: 8000

AKS Deployment Ready: Production-grade deployment on Azure Kubernetes Service!