
Monitoring & Observability

Production-grade monitoring and observability are essential for maintaining reliability, performance, and security. This guide covers the full observability stack: metrics, traces, logs, and alerts.
The MCP Server uses a dual observability stack: OpenTelemetry for infrastructure metrics and traces, and LangSmith for LLM-specific observability.

Observability Stack

Metrics

Prometheus + Grafana
  • Resource utilization
  • Request rates
  • Error rates
  • Custom business metrics
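
As a rough illustration of the request and error counters listed above, the following standard-library sketch counts requests by label and renders them in the Prometheus text exposition format. The class name `RequestMetrics` and the metric name `mcp_requests_total` are hypothetical and not taken from the MCP Server; a real deployment would more likely use a Prometheus client library and compute rates with queries on the Prometheus side.

```python
from collections import Counter


class RequestMetrics:
    """Hypothetical in-process counter keyed by (method, status) labels."""

    def __init__(self):
        self._requests = Counter()

    def observe(self, method, status):
        # Increment the counter for this label combination.
        self._requests[(method, str(status))] += 1

    def error_rate(self):
        # Fraction of observed requests that ended in a 5xx status.
        total = sum(self._requests.values())
        errors = sum(
            count
            for (_, status), count in self._requests.items()
            if status.startswith("5")
        )
        return errors / total if total else 0.0

    def render(self):
        # Prometheus text exposition format: HELP and TYPE lines,
        # followed by one sample line per label set.
        lines = [
            "# HELP mcp_requests_total Total requests handled.",
            "# TYPE mcp_requests_total counter",
        ]
        for (method, status), count in sorted(self._requests.items()):
            lines.append(
                f'mcp_requests_total{{method="{method}",status="{status}"}} {count}'
            )
        return "\n".join(lines) + "\n"
```

Calling `observe("POST", 200)` on each handled request and serving `render()` from a `/metrics` endpoint would give Prometheus something to scrape; the endpoint path here is the conventional default rather than anything confirmed by this guide.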

Distributed Tracing

Jaeger + OpenTelemetry
  • Request flow visualization
  • Latency breakdown
  • Service dependencies
  • Performance bottlenecks

Logging

Structured JSON Logging
  • Centralized log aggregation
  • Correlation IDs
  • Error tracking
  • Audit trails
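
To make the logging bullets concrete, here is a minimal sketch of structured JSON logging with a per-request correlation ID, using only the Python standard library. The names `JsonFormatter`, `build_logger`, `handle_request`, and the `mcp.demo` logger are illustrative assumptions, not the MCP Server's actual logging setup.

```python
import contextvars
import io
import json
import logging
import uuid

# Holds the correlation ID of the request currently being handled.
correlation_id = contextvars.ContextVar("correlation_id", default="-")


class JsonFormatter(logging.Formatter):
    """Render each log record as a single JSON object per line."""

    def format(self, record):
        payload = {
            "timestamp": self.formatTime(record),
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
            "correlation_id": correlation_id.get(),
        }
        if record.exc_info:
            payload["exception"] = self.formatException(record.exc_info)
        return json.dumps(payload)


def build_logger(name, stream):
    """Create a logger that writes JSON lines to the given stream."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    logger.propagate = False
    handler = logging.StreamHandler(stream)
    handler.setFormatter(JsonFormatter())
    logger.handlers = [handler]
    return logger


def handle_request(logger):
    """Tag every log line emitted during one request with the same ID."""
    token = correlation_id.set(uuid.uuid4().hex)
    try:
        logger.info("request started")
        logger.info("request finished")
    finally:
        correlation_id.reset(token)


# Demo: capture the output in memory and parse it back as JSON.
buffer = io.StringIO()
demo_logger = build_logger("mcp.demo", buffer)
handle_request(demo_logger)
log_lines = [json.loads(line) for line in buffer.getvalue().splitlines()]
```

Both parsed entries in `log_lines` carry the same `correlation_id`, which is what lets a log aggregator group all lines belonging to one request.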

LLM Observability

LangSmith
  • Prompt tracking
  • Token usage
  • Model performance
  • Chain visualization

Monitoring Topics

Each monitoring component has its own detailed guide. The Quick Start below lists them in the recommended order.

Quick Start

For a rapid setup, follow this recommended order:
  1. Prometheus Metrics - Start with metrics collection
  2. Distributed Tracing - Add tracing for request flow
  3. Structured Logging - Implement centralized logging
  4. Grafana Dashboards - Visualize metrics and traces
  5. Alerting & SLOs - Set up alerts and health checks
  6. LangSmith Integration - Add LLM-specific observability

Next Steps


Ready to Start: Choose a monitoring component above to begin your observability setup!