Documentation Index
Fetch the complete documentation index at: https://mcp-server-langgraph.mintlify.app/llms.txt
Use this file to discover all available pages before exploring further.
CI/CD Pipeline Documentation
Maturity Level: Level 5 (Elite) - Top 10% of Industry Performers
Overview
This repository contains 24 production-grade GitHub Actions workflows providing comprehensive CI/CD, security, compliance, and observability capabilities.
Quick Links
Workflow Catalog
Core CI/CD (4 workflows)
| Workflow | Trigger | Purpose | Key Features |
|---|
| ci.yaml | PR, push (main/develop) | Main CI pipeline | Parallel testing, Docker builds, type checking |
| release.yaml | Tag push (v*) | Automated releases | Multi-arch images, Helm charts, PyPI, SBOM |
| e2e-tests.yaml | PR, push, weekly | End-to-end testing | Isolated infrastructure on port 9000+ |
| bump-deployment-versions.yaml | Release | Version sync | Updates 9+ deployment configs |
Quality & Testing (5 workflows)
| Workflow | Trigger | Purpose | Key Features |
|---|
| quality-tests.yaml | PR, push, weekly | Advanced testing | Property, contract, regression, mutation tests |
| build-hygiene.yaml | PR, push | Build validation | Prevents Python bytecode commits |
| optional-deps-test.yaml | PR, push, weekly | Dependency testing | Tests minimal install, feature flags |
| coverage-trend.yaml | PR, push | Coverage tracking | Historical tracking, 80% minimum threshold |
| track-skipped-tests.yaml | PR, push, weekly | Skip monitoring | Ensures skipped tests have GitHub issues |
Security & Compliance (4 workflows)
| Workflow | Trigger | Purpose | Key Features |
|---|
| security-scan.yaml | Daily, PR, release | Multi-layer security | Trivy, CodeQL, TruffleHog, SBOM |
| security-validation.yml | PR (terraform/deployments) | Infra security | Terraform validation, placeholder detection |
| gcp-compliance-scan.yaml | Weekly, push | GCP compliance | tfsec, Checkov, kube-bench, CIS benchmarks |
| gcp-drift-detection.yaml | Every 6 hours | Drift detection | Auto-remediation with authorization |
Deployment (3 workflows)
| Workflow | Trigger | Purpose | Key Features |
|---|
| deploy-production-gke.yaml | Release, manual | Production deploy | Canary deployment, manual approval, ESO verification |
| deploy-preview-gke.yaml | Push to main | Staging deploy | Automated with smoke tests |
| validate-deployments.yaml | PR (deployment files) | Deploy validation | Prevents 8 critical deployment issues |
Infrastructure (2 workflows)
| Workflow | Trigger | Purpose | Key Features |
|---|
| terraform-validate.yaml | PR (terraform files) | Terraform validation | Format, validation, all environments |
| validate-deployments.yaml | PR (k8s files) | K8s validation | Security contexts, RBAC, syntax |
Elite Features (3 workflows) 🏆
| Workflow | Trigger | Purpose | Key Features |
|---|
| dora-metrics.yaml | Daily, post-deploy | DORA tracking | All 4 metrics, classification, alerting |
| performance-regression.yaml | PR, push, daily | Perf monitoring | Automatic regression detection, baselines |
| observability-alerts.yaml | Workflow completion | Alert routing | Slack, PagerDuty, Datadog integration |
Operations (3 workflows)
| Workflow | Trigger | Purpose | Key Features |
|---|
| cost-tracking.yaml | Weekly, monthly | Cost monitoring | Budget alerts, optimization recommendations |
| stale.yaml | Daily | Issue/PR management | Auto-closes stale issues/PRs |
| link-checker.yaml | PR, push, weekly | Link validation | Internal links, markdown linting |
Automation (1 workflow)
| Workflow | Trigger | Purpose | Key Features |
|---|
| dependabot-automerge.yaml | Dependabot PRs | Dependency automation | Auto-merge patches, conditional minor updates |
Setup Guide
Prerequisites
-
Repository Secrets (required for deployments):
gh secret set GCP_WIF_PROVIDER --body "projects/.../workloadIdentityPools/..."
gh secret set GCP_PRODUCTION_SA_EMAIL --body "service-account@project.iam.gserviceaccount.com"
gh secret set GCP_STAGING_SA_EMAIL --body "staging-sa@project.iam.gserviceaccount.com"
gh secret set GCP_TERRAFORM_SA_EMAIL --body "terraform@project.iam.gserviceaccount.com"
-
Repository Variables (recommended):
gh variable set GCP_PROJECT_ID --body "your-project-id"
gh variable set GCP_REGION --body "us-central1"
gh variable set GKE_PROD_CLUSTER --body "production-mcp-server-langgraph-gke"
gh variable set GKE_STAGING_CLUSTER --body "staging-mcp-server-langgraph-gke"
-
Observability Secrets (optional but recommended):
gh secret set SLACK_WEBHOOK_URL --body "https://hooks.slack.com/..."
gh secret set PAGERDUTY_INTEGRATION_KEY --body "your-key"
gh secret set DATADOG_API_KEY --body "your-api-key"
Initial Baselines
# Run performance benchmarks
make test-performance
# Create baseline
mkdir -p .perf-baseline
cp benchmark-results.json .perf-baseline/baseline.json
git add .perf-baseline/baseline.json
git commit -m "chore: establish performance baseline"
git push
DORA Metrics Baseline
# Trigger initial calculation
gh workflow run dora-metrics.yaml
# Wait for completion
gh run watch
# View results
cat .dora-metrics/metrics.json
Workflow Dependencies
CI Pipeline Flow
test (3.10, 3.11, 3.12) ──┐
├──> lint ──> verify ──> docker-build (base, full, test)
integration-tests ─────────┘
Release Pipeline Flow
create-release ──┬──> build-and-push (amd64, arm64) ──┬──> publish-helm
│ │
├──> publish-pypi ─────────────────────┤
│ │
└──> update-mcp-registry ──────────────┴──> notify
Deployment Flow
pre-deployment-checks ──> build-and-push ──> approve (manual) ──>
deploy-canary ──> monitor ──> smoke-tests ──> full-rollout ──>
post-deployment-validation
Key Features
Security
- ✅ No hardcoded credentials (all secrets required)
- ✅ Multi-layer security scanning (Trivy, CodeQL, TruffleHog)
- ✅ Compliance monitoring (CIS benchmarks, GDPR)
- ✅ Drift detection with authorized remediation
- ✅ Script injection prevention
- ✅ Binary Authorization support
- ✅ Optimized caching (setup-uv built-in)
- ✅ Parallel job execution
- ✅ Concurrency controls (no duplicate runs)
- ✅ Multi-platform Docker builds
- ✅ Build time: 12 minutes (66% reduction)
Deployment
- ✅ Canary deployment (10% validation before full rollout)
- ✅ Manual approval gates for production
- ✅ Automatic rollback on failure
- ✅ External Secrets Operator verification
- ✅ Comprehensive smoke tests
- ✅ Multi-environment (dev, staging, production)
Observability
- ✅ DORA metrics tracking (all 4 key metrics)
- ✅ Performance regression detection (automatic alerts)
- ✅ Multi-platform alerting (Slack, PagerDuty, Datadog)
- ✅ Cost tracking and budget alerts
- ✅ Historical trending
Quality
- ✅ 80% minimum code coverage enforcement
- ✅ Multiple test types (unit, integration, e2e, property, contract, mutation)
- ✅ Type checking (mypy)
- ✅ Pre-commit hooks
- ✅ Linting (flake8, black, isort)
Common Tasks
Deploying to Production
-
Create Release:
git tag v1.2.3
git push origin v1.2.3
-
Workflow Automatically:
- Builds multi-platform images
- Publishes to ghcr.io
- Creates GitHub release
- Waits for manual approval
- Deploys canary (10%)
- Monitors canary health
- Deploys full rollout (100%)
-
Monitor:
gh run list --workflow=deploy-production-gke.yaml --limit 1
gh run watch
Checking DORA Metrics
# View latest metrics
jq '.[-1]' .dora-metrics/metrics.json
# View trend
jq 'map({date: .timestamp[:10], class: .classification})' .dora-metrics/metrics.json
# Check for regression issues
gh issue list --label "dora-metrics"
# List regression issues
gh issue list --label "performance,regression"
# View latest regression
gh issue view <number>
# Run local benchmarks
make test-performance
# Compare with baseline
python scripts/ci/performance_regression.py \
--baseline .perf-baseline/baseline.json \
--current benchmark-results.json
Troubleshooting
Workflow Failures
Deployment Failed:
# Check workflow logs
gh run view --log
# Check pod status
kubectl get pods -n production-mcp-server-langgraph -l app=mcp-server-langgraph
# View deployment events
kubectl describe deployment production-mcp-server-langgraph -n production-mcp-server-langgraph
Tests Failed:
# Run tests locally
make test-unit
# Check specific test
uv run pytest tests/path/to/test.py::test_name -v
Security Scan Failed:
# View security alerts
gh api /repos/:owner/:repo/code-scanning/alerts
# Run local security scan
docker run --rm -v $(pwd):/src aquasecurity/trivy fs --severity HIGH,CRITICAL /src
Baseline Missing:
# Create initial baseline
make test-performance
mkdir -p .perf-baseline
cp benchmark-results.json .perf-baseline/baseline.json
git add .perf-baseline/ && git commit -m "chore: baseline" && git push
DORA Metrics Missing:
# Manually trigger calculation
gh workflow run dora-metrics.yaml
# Check if deployments exist
gh api repos/:owner/:repo/deployments | jq 'length'
Configuration Reference
Required Secrets
| Secret | Purpose | Example |
|---|
GCP_WIF_PROVIDER | GCP authentication | projects/123/locations/global/... |
GCP_PRODUCTION_SA_EMAIL | Production service account | prod-sa@project.iam.gserviceaccount.com |
GCP_STAGING_SA_EMAIL | Staging service account | staging-sa@project.iam.gserviceaccount.com |
GCP_TERRAFORM_SA_EMAIL | Terraform service account | terraform@project.iam.gserviceaccount.com |
Optional Secrets (Enables Enhanced Features)
| Secret | Purpose | Provider |
|---|
SLACK_WEBHOOK_URL | Team notifications | Slack |
PAGERDUTY_INTEGRATION_KEY | Critical alerts | PagerDuty |
DATADOG_API_KEY | Metrics export | Datadog |
PYPI_API_TOKEN | Package publishing | PyPI |
Recommended Variables
| Variable | Purpose | Default |
|---|
GCP_PROJECT_ID | GCP project | (none - will fail if not set) |
GCP_REGION | GCP region | (none - will fail if not set) |
GKE_PROD_CLUSTER | Production cluster | (none - will fail if not set) |
GKE_STAGING_CLUSTER | Staging cluster | (none - will fail if not set) |
Maintenance
Weekly
- ✅ Review DORA metrics trends
- ✅ Check for performance regressions
- ✅ Review failed deployments
- ✅ Update baselines if needed
Monthly
- ✅ Review cost tracking reports
- ✅ Analyze test coverage trends
- ✅ Update workflow documentation
- ✅ Review security scan results
Quarterly
- ✅ Update action versions
- ✅ Review and optimize workflows
- ✅ Update Terraform modules
- ✅ Audit security configurations
Metrics and KPIs
Based on latest DORA metrics:
| Metric | Current | Target (Elite) | Status |
|---|
| Deployment Frequency | 2.5/day | >1/day | ✅ Elite |
| Lead Time | 1.2 hours | <1 hour | 🟡 High |
| MTTR | 0.8 hours | <1 hour | ✅ Elite |
| Change Failure Rate | 8.5% | <15% | ✅ Elite |
Overall Classification: Elite
Cost Metrics
- Monthly Budget: $200
- Current Spend: ~$150/month
- Savings from Optimizations: $250/month
- ROI: Positive
Support
For issues or questions:
- Check Troubleshooting section
- Review workflow logs:
gh run view --log
- Check Elite Features guide: Elite Features
- Create issue with label
ci-cd
Last Updated: 2025-11-04
Total Workflows: 24
Total Coverage: 100% (55/55 CI tests passing)
Maturity Level: Level 5 (Elite)