
Monitoring & Observability

Monitoring and observability are essential for keeping a production deployment reliable, performant, and secure. This guide covers the full observability stack: metrics, traces, logs, and alerts.
The MCP Server uses a dual observability stack: OpenTelemetry for infrastructure metrics and traces, plus LangSmith for LLM-specific observability.

Observability Stack

Metrics

Prometheus + Grafana
  • Resource utilization
  • Request rates
  • Error rates
  • Custom business metrics
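
To make the request-rate and error-rate bullets above concrete, here is a minimal, standard-library-only sketch of a labelled request counter rendered in the Prometheus text exposition format. The class name `RequestMetrics`, the metric name `http_requests_total`, and the `route` and `status` labels are assumptions chosen for illustration, not names taken from the MCP Server. A real deployment would normally rely on an official Prometheus client library instead of hand-rolled counters.

```python
from collections import Counter


class RequestMetrics:
    """In-memory request counters keyed by (route, status). Illustrative only."""

    def __init__(self) -> None:
        self._requests = Counter()

    def observe(self, route: str, status: int) -> None:
        """Record one handled request."""
        self._requests[(route, str(status))] += 1

    def error_rate(self) -> float:
        """Fraction of recorded requests that ended in a 5xx status."""
        total = sum(self._requests.values())
        errors = sum(
            count
            for (_, status), count in self._requests.items()
            if status.startswith("5")
        )
        return errors / total if total else 0.0

    def render(self) -> str:
        """Render the counters in the Prometheus text exposition format."""
        lines = [
            "# HELP http_requests_total Total HTTP requests handled.",
            "# TYPE http_requests_total counter",
        ]
        for (route, status), count in sorted(self._requests.items()):
            lines.append(
                f'http_requests_total{{route="{route}",status="{status}"}} {count}'
            )
        return "\n".join(lines) + "\n"
```

In this sketch the error rate is computed in process for clarity. With Prometheus the usual approach is to expose only the raw counter and derive rates at query time.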

Distributed Tracing

Jaeger + OpenTelemetry
  • Request flow visualization
  • Latency breakdown
  • Service dependencies
  • Performance bottlenecks

Logging

Structured JSON Logging
  • Centralized log aggregation
  • Correlation IDs
  • Error tracking
  • Audit trails
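
Structured JSON logging with correlation IDs, as listed above, can be sketched with only the Python standard library. The names `correlation_id`, `JsonFormatter`, `build_logger`, and `handle_request`, along with the JSON field names, are hypothetical and chosen for illustration. The MCP Server's actual logging setup may differ.

```python
import contextvars
import io
import json
import logging
import uuid
from typing import Optional, TextIO

# Correlation ID for the current request context (hypothetical name).
correlation_id = contextvars.ContextVar("correlation_id", default="-")


class JsonFormatter(logging.Formatter):
    """Render each log record as one JSON object per line."""

    def format(self, record: logging.LogRecord) -> str:
        payload = {
            "timestamp": self.formatTime(record),
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
            "correlation_id": correlation_id.get(),
        }
        if record.exc_info:
            payload["exception"] = self.formatException(record.exc_info)
        return json.dumps(payload)


def build_logger(name: str, stream: TextIO) -> logging.Logger:
    """Create a logger that writes JSON lines to the given stream."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    logger.handlers.clear()
    logger.propagate = False
    handler = logging.StreamHandler(stream)
    handler.setFormatter(JsonFormatter())
    logger.addHandler(handler)
    return logger


def handle_request(logger: logging.Logger, request_id: Optional[str] = None) -> str:
    """Tag every log line emitted during one request with the same ID."""
    token = correlation_id.set(request_id or uuid.uuid4().hex)
    try:
        logger.info("request started")
        logger.info("request finished")
        return correlation_id.get()
    finally:
        # Restore the previous value so IDs never leak between requests.
        correlation_id.reset(token)
```

A context variable is used here so the ID follows the request across function calls and async tasks without being passed explicitly. Writing to an `io.StringIO` buffer instead of standard output is a convenient way to inspect the emitted lines while experimenting.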

LLM Observability

LangSmith
  • Prompt tracking
  • Token usage
  • Model performance
  • Chain visualization

Monitoring Topics

Explore detailed guides for each monitoring component:

Prometheus Metrics

Set up Prometheus for metrics collection, custom business metrics, and application monitoring

Distributed Tracing

Configure Jaeger and OpenTelemetry for distributed tracing and performance analysis

Structured Logging

Implement structured JSON logging with correlation IDs and centralized aggregation

LangSmith Integration

Track LLM performance, prompts, and chain execution with LangSmith

Grafana Dashboards

Create comprehensive dashboards for metrics visualization and alerting

Alerting & SLOs

Configure Alertmanager, define SLOs, and set up health checks

Quick Start

For a rapid setup, follow this recommended order:
  1. Prometheus Metrics - Start with metrics collection
  2. Distributed Tracing - Add tracing for request flow
  3. Structured Logging - Implement centralized logging
  4. Grafana Dashboards - Visualize metrics and traces
  5. Alerting & SLOs - Set up alerts and health checks
  6. LangSmith Integration - Add LLM-specific observability

Next Steps

Scaling

Auto-scaling configuration

Disaster Recovery

Backup and recovery

Alerting

Alert configuration with Alertmanager

Security Best Practices

Security hardening guide

Ready to start? Choose a monitoring component above to begin your observability setup.