Overview
Comprehensive observability with dual backends: OpenTelemetry for distributed tracing and metrics, plus LangSmith for LLM-specific insights. Track every request from ingress to LLM response with full context correlation.

Dual observability provides both infrastructure monitoring (OpenTelemetry) and AI-specific insights (LangSmith) in a unified platform.
Architecture
OpenTelemetry Instrumentation Flow
The following diagram illustrates how OpenTelemetry auto-instrumentation captures telemetry data from your application and routes it through the collector to various backend systems.

Auto-instrumentation captures telemetry without code changes. The collector batches, samples, and filters data before exporting to multiple backends for analysis and visualization.
Quick Start
1. Deploy Observability Stack
2. Configure Application
3. Generate Traces (steps 1-3 are sketched after this list)
4. View in Jaeger
   - Open http://localhost:16686
   - Select the service: mcp-server-langgraph
   - Click "Find Traces"
   - Click on a trace to see details
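The original commands aren't reproduced here; as a minimal sketch, assuming a Docker Compose file for the stack, the standard OpenTelemetry environment variables, and an illustrative /message endpoint:

```bash
# 1. Deploy the observability stack (compose file name is an assumption)
docker compose -f docker-compose.observability.yml up -d

# 2. Point the application at the collector via standard OTel env vars
export OTEL_SERVICE_NAME=mcp-server-langgraph
export OTEL_EXPORTER_OTLP_ENDPOINT=http://localhost:4317

# 3. Generate traces with a request (path and port are illustrative)
curl -X POST http://localhost:8000/message \
  -H 'Content-Type: application/json' \
  -d '{"message": "hello"}'
```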
OpenTelemetry Tracing
Trace Structure
Every request creates a trace with multiple spans:
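The span tree itself isn't reproduced here; an illustrative shape (span names are hypothetical) might be:

```text
POST /message                      (HTTP server span)
├── auth.validate_token            (authentication)
├── agent.invoke                   (LangGraph agent)
│   ├── llm.chat                   (model call)
│   └── tool.execute               (tool invocation)
└── response.serialize
```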
Trace Attributes
Each span includes rich metadata:
- HTTP spans: method, route, status code, and client details
- LLM spans: provider, model, and token usage
- Auth spans: user identity, auth method, and outcome
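For illustration, attributes on a span might look like the following; the http.* keys follow OpenTelemetry semantic conventions, while the llm.* names are assumptions:

```text
http.method: POST
http.route: /message
http.status_code: 200
llm.provider: anthropic          # hypothetical attribute names
llm.model: claude-3-5-sonnet
llm.tokens.prompt: 412
llm.tokens.completion: 128
```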
Custom Instrumentation
Add custom spans to your code:
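The original example isn't shown; a minimal sketch using the OpenTelemetry Python API (the function, attribute names, and `run_summarizer` helper are illustrative):

```python
from opentelemetry import trace

tracer = trace.get_tracer(__name__)

def summarize_document(document: str) -> str:
    # Wrap business logic in a custom span; it nests under the
    # active request span automatically.
    with tracer.start_as_current_span("summarize_document") as span:
        span.set_attribute("document.length", len(document))
        summary = run_summarizer(document)  # hypothetical helper
        span.set_attribute("summary.length", len(summary))
        return summary
```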
Metrics
Available Metrics
Request Metrics
HTTP request metrics:
- `http_requests_total` - Total requests by method, status
- `http_request_duration_seconds` - Request latency histogram
- `http_requests_in_progress` - Active requests gauge
Authentication Metrics
Auth metrics (30+ metrics):
- `auth_attempts_total` - Auth attempts by result
- `auth_session_created_total` - Sessions created
- `auth_session_active` - Active sessions gauge
- `auth_token_validation_duration_seconds` - Token validation latency
Authorization Metrics
LLM Metrics
LLM usage metrics:
- `llm_requests_total` - LLM requests by provider, model
- `llm_tokens_total` - Token usage by type (prompt, completion)
- `llm_latency_seconds` - LLM response time
- `llm_errors_total` - LLM errors by type
Custom Metrics
Create custom metrics:
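The original snippet isn't shown; a minimal sketch with the OpenTelemetry metrics API (the metric name and labels are illustrative):

```python
from opentelemetry import metrics

meter = metrics.get_meter(__name__)

# Counter for tool invocations; name and labels are illustrative
tool_calls = meter.create_counter(
    "tool_calls_total",
    description="Tool invocations by tool name and result",
)

tool_calls.add(1, {"tool": "search", "result": "success"})
```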
Prometheus Configuration
Scraping
Add scraping configuration:
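The original configuration isn't shown; a minimal sketch for prometheus.yml, assuming the server exposes metrics on port 8000 at /metrics (both assumptions):

```yaml
scrape_configs:
  - job_name: mcp-server-langgraph
    scrape_interval: 15s
    metrics_path: /metrics            # assumed path
    static_configs:
      - targets: ["localhost:8000"]   # assumed host:port
```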
Alerting Rules
Define alert rules on the collected metrics:
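An illustrative rule file built on the metrics listed above; the threshold is a placeholder, not a recommendation from the source:

```yaml
groups:
  - name: mcp-server-langgraph
    rules:
      - alert: HighErrorRate
        # Fire when more than 5% of requests return 5xx over 5 minutes
        expr: |
          sum(rate(http_requests_total{status=~"5.."}[5m]))
            / sum(rate(http_requests_total[5m])) > 0.05
        for: 5m
        labels:
          severity: warning
```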
Grafana Dashboards
Import Dashboards
Import the provided dashboard JSON files into Grafana (Dashboards → Import).
Key Panels
Panels span three views (Overview, Authentication, LLM) and include:
- Request rate (RED metrics)
- Error rate
- p50/p95/p99 latency
- Active sessions
- LLM token usage
LangSmith Integration
Setup
1. Create Account - Sign up at https://smith.langchain.com
2. Get API Key - Generate an API key from settings
3. Configure Application - Set the LangSmith environment variables (sketched after this list)
4. Verify - Make a request and view it in the LangSmith UI
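The original configuration block isn't shown; the standard LangSmith environment variables look like this (the project name is an assumption):

```bash
export LANGCHAIN_TRACING_V2=true
export LANGCHAIN_API_KEY=<your-langsmith-api-key>
export LANGCHAIN_PROJECT=mcp-server-langgraph   # assumed project name
```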
Features
Prompt Tracking
View full prompts and responses:
- Input messages
- System prompts
- LLM responses
- Token counts
- Latency breakdown
Chain Visualization
See execution flow:
- Agent state transitions
- Tool invocations
- LLM calls
- Conditional routing
Evaluations
Test and evaluate:
- Accuracy metrics
- Cost analysis
- Latency benchmarks
- A/B testing
Debugging
Debug issues:
- Error traces
- Failed requests
- Slow queries
- Token usage spikes
Custom Annotations
Attach tags and metadata to runs for filtering in the LangSmith UI:
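The original snippet isn't shown; a sketch using the LangSmith SDK's traceable decorator (function name, tags, and metadata values are illustrative):

```python
from langsmith import traceable

@traceable(name="route_request", tags=["agent", "routing"],
           metadata={"env": "production"})
def route_request(message: str) -> str:
    # The run appears in LangSmith annotated with the tags and
    # metadata above, making it easy to filter and compare.
    return "llm" if message else "noop"
```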
Logging
Structured Logging
All logs are JSON-formatted with trace context:
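The original sample isn't reproduced; an illustrative log line with trace correlation fields (field names are assumptions) might look like:

```json
{
  "timestamp": "2025-01-01T12:00:00Z",
  "level": "info",
  "message": "request completed",
  "trace_id": "4bf92f3577b34da6a3ce929d0e0e4736",
  "span_id": "00f067aa0ba902b7",
  "http.status_code": 200
}
```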
Log Aggregation
Ship logs to an aggregation backend:
- Loki
- ELK Stack
- Cloud Logging
Production Setup
Kubernetes
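The original manifest isn't shown; an abbreviated Deployment sketch, assuming an OpenTelemetry Collector Service named otel-collector in the same namespace:

```yaml
# Abbreviated: replicas, selector, image, probes, etc. omitted
apiVersion: apps/v1
kind: Deployment
metadata:
  name: mcp-server-langgraph
spec:
  template:
    spec:
      containers:
        - name: mcp-server-langgraph
          env:
            - name: OTEL_SERVICE_NAME
              value: mcp-server-langgraph
            - name: OTEL_EXPORTER_OTLP_ENDPOINT
              value: http://otel-collector:4317  # assumed Service name
```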
Sampling
Configure trace sampling for high-traffic environments:
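A sketch using the standard OpenTelemetry sampler environment variables; the 10% ratio is a placeholder, not a recommendation from the source:

```bash
# Sample 10% of root traces; child spans follow the parent's decision
export OTEL_TRACES_SAMPLER=parentbased_traceidratio
export OTEL_TRACES_SAMPLER_ARG=0.1
```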
Data Retention
Configure retention periods for traces, metrics, and logs to balance storage cost against debugging needs.
Troubleshooting
No traces appearing
Check that the application can reach the collector (OTEL_EXPORTER_OTLP_ENDPOINT), that the exporter is enabled, and that the sampling rate is not set to zero.
High cardinality metrics
Limit label values:
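The original snippet isn't reproduced; one common approach sketched in Python, with an illustrative allow-list:

```python
# Collapse unbounded label values into a fixed set so Prometheus
# cardinality stays bounded.
ALLOWED_MODELS = {"claude-3-5-sonnet", "gpt-4o"}  # illustrative values

def model_label(model: str) -> str:
    return model if model in ALLOWED_MODELS else "other"

# Usage: llm_requests.add(1, {"model": model_label(model_name)})
```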
Slow trace queries
- Add indexes on trace_id, span_id
- Reduce retention period
- Enable sampling
- Archive old traces
Next Steps
- Monitoring Guide - Production monitoring setup
- Alerting - Configure alerts
- Health Checks - Health check endpoints
- Production Checklist - Observability requirements
Full Visibility: Comprehensive observability with OpenTelemetry and LangSmith for complete system insights!