Overview
LangGraph Platform deployments automatically integrate with LangSmith for comprehensive observability; every request is traced with full LLM details.

Built-in Tracing: No configuration is needed. All deployments are automatically traced in LangSmith.
Viewing Traces
1. Access LangSmith: Go to smith.langchain.com
2. Select Project: Choose your project (e.g., “mcp-server-langgraph”)
3. View Traces: See all requests with:
- Full prompts and completions
- Token usage and costs
- Latency breakdown
- Error details
What’s Captured
Every trace includes:

LLM Calls
- Full prompts sent to LLM
- Complete model responses
- Token counts (input/output)
- Model parameters
- Latency per call
Agent Steps
- Routing decisions
- Tool invocations
- State transitions
- Conditional flows
- Execution order
Metadata
- User ID and session ID
- Request timestamp
- Environment (prod/staging)
- Custom tags
- Deployment version
Errors
- Full stack traces
- Input that caused error
- Error context
- Failure timing
- Retry attempts
Metrics Dashboard
View key metrics in LangSmith:

Request Volume
- Total invocations over time
- Requests per second
- Peak traffic periods
Latency
- P50 Latency: Median response time
- P95 Latency: 95th percentile
- P99 Latency: 99th percentile
- Max Latency: Slowest requests
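The percentile metrics above can be computed from raw per-request latencies; a minimal Python sketch (the sample values are illustrative):

```python
import statistics

def latency_percentiles(latencies_s):
    """Summarize raw request latencies (in seconds) into P50/P95/P99/max."""
    # statistics.quantiles with n=100 returns the 99 percentile cut points.
    q = statistics.quantiles(latencies_s, n=100, method="inclusive")
    return {
        "p50": statistics.median(latencies_s),
        "p95": q[94],  # 95th percentile
        "p99": q[98],  # 99th percentile
        "max": max(latencies_s),
    }

latencies = [0.8, 1.1, 1.3, 1.7, 2.0, 2.4, 3.1, 4.9, 6.2, 9.5]
print(latency_percentiles(latencies))
```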
Success Rate
- Successful requests (200 OK)
- Failed requests (4xx, 5xx)
- Error rate percentage
- Error types breakdown
Token Usage
- Total tokens consumed
- Input vs output tokens
- Tokens per request
- Token usage trends
Cost Tracking
- Estimated costs by model
- Cost per user/session
- Daily/monthly spend
- Cost breakdown by feature
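Cost estimates follow directly from token counts and per-token prices. A sketch of that arithmetic, where the model name and prices are placeholders rather than real provider pricing:

```python
# Per-1K-token prices below are illustrative placeholders, not real rates;
# substitute your provider's current pricing.
PRICES_PER_1K = {
    "example-model": {"input": 0.003, "output": 0.015},
}

def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Estimate USD cost of one request from its token counts."""
    p = PRICES_PER_1K[model]
    return (input_tokens / 1000) * p["input"] + (output_tokens / 1000) * p["output"]

print(round(estimate_cost("example-model", 2000, 500), 4))  # 0.0135
```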
Filtering Traces
By Status
By Latency
By User
By Tags
By Date
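Illustrative filter queries, written in the same style as the filters shown elsewhere on this page (field names such as metadata.user_id and start_time are assumptions; check your LangSmith version's filter reference for exact syntax):

```text
status:error                        # by status
latency > 5s                        # by latency
metadata.user_id:"user-123"         # by user (assumes a user_id metadata field)
tags:"premium-user"                 # by tags
start_time >= "2025-10-01"          # by date (field name is an assumption)
```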
Debugging Workflow
1. Find Failing Traces: Filter by status:error and sort by timestamp descending
2. Analyze Error: Click on the trace to see:
- Exact input that caused failure
- Full Python stack trace
- All steps before error
- Timing information
3. Compare with Success: Find a similar successful trace and compare side-by-side
4. Fix and Redeploy: Fix the issue in code and redeploy
5. Verify Fix: Monitor new traces to confirm the error is resolved
Performance Optimization
Identify Slow Traces
- Filter: latency > 5s
- Sort by latency descending
- Expand trace to see timing breakdown
- Identify bottlenecks:
- Slow LLM calls → Try faster model
- Slow tool calls → Add caching
- Redundant calls → Optimize logic
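For the slow-tool-call case, in-process memoization is often the quickest win. A minimal sketch, where lookup_account and its return value are hypothetical stand-ins for a slow tool call:

```python
from functools import lru_cache

@lru_cache(maxsize=256)
def lookup_account(account_id: str) -> dict:
    # Stand-in for a slow tool call (database or API lookup).
    # The real lookup runs only on a cache miss.
    return {"account_id": account_id, "tier": "premium"}

lookup_account("acct-1")            # miss: executes the lookup
lookup_account("acct-1")            # hit: served from cache
print(lookup_account.cache_info())  # hits=1, misses=1 after the two calls above
```

Note that lru_cache keys on function arguments, so it only helps when the same inputs recur within one process; cross-process caching needs an external store.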
Example Optimization
Before: 8.5s total latency
- LLM call 1: 3.2s
- Tool call: 2.1s
- LLM call 2: 3.2s

After: 4.5s total latency
- LLM call 1: 3.2s
- Tool call (cached): 0.1s
- LLM call 2: 1.2s (smaller context)
Alerts
Set up alerts in LangSmith:

1. Go to Project Settings: Navigate to Settings → Alerts
2. Create Alert Rule: Configure alert conditions:
- High Error Rate: Error rate > 5%
- High Latency: P95 > 5 seconds
- Budget Exceeded: Daily cost > $50
3. Configure Notifications: Choose notification channels:
- Slack
- Webhook
- PagerDuty
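The alert rules themselves are configured in the LangSmith UI; for illustration only, the rule logic from step 2 can be sketched as a plain function:

```python
def triggered_alerts(metrics: dict) -> list[str]:
    """Return the names of the example alert rules that fire for a metrics snapshot."""
    rules = [
        ("High Error Rate", metrics["error_rate"] > 0.05),      # error rate > 5%
        ("High Latency", metrics["p95_latency_s"] > 5),         # P95 > 5 seconds
        ("Budget Exceeded", metrics["daily_cost_usd"] > 50),    # daily cost > $50
    ]
    return [name for name, fired in rules if fired]

print(triggered_alerts({"error_rate": 0.08, "p95_latency_s": 2.1, "daily_cost_usd": 61.0}))
# ['High Error Rate', 'Budget Exceeded']
```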
Custom Metadata
Add custom metadata to traces for better filtering:
- tags:"premium-user"
- metadata.cost_center:"sales"
- metadata.feature:"analysis"
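Tags and metadata like these can be attached per run through the standard config dict that LangChain/LangGraph invocations accept; a minimal sketch with illustrative values (the graph object and its input are assumed):

```python
# Tags and metadata set here are recorded on the LangSmith trace for filtering.
config = {
    "tags": ["premium-user"],
    "metadata": {"cost_center": "sales", "feature": "analysis"},
}

# Passed at invocation time, e.g.:
# result = graph.invoke({"messages": [...]}, config=config)
print(config["metadata"]["cost_center"])
```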
Datasets & Evaluation
Create Dataset from Production
1. Filter Successful Traces: Filter by status:success AND tags:"production"
2. Select Examples: Choose representative traces (varied inputs/outputs)
3. Add to Dataset: Click “Add to Dataset” → Name it “prod-examples-oct-2025”
Run Evaluation
Compare model performance:
- Latency
- Token usage
- Cost
- Quality (with custom evaluators)
Viewing Logs
Via CLI
Via LangSmith UI
Logs are included in each trace; expand a trace to see the full logs.

Best Practices
Use Consistent Tagging
Add Business Context
Monitor Key Metrics Daily
Check daily:
- Error rate (should be < 1%)
- P95 latency (should be < 5s)
- Daily cost (should be within budget)
- User satisfaction (via feedback)
Set Up Alerts
Configure alerts for:
- Error rate > 5%
- P95 latency > 5s
- Daily cost > $100
- Budget 80% consumed
Troubleshooting
No traces appearing
Solution:
- Verify LANGSMITH_TRACING=true is set in the environment
- Check that the LangSmith API key is set
- Confirm correct project name
- Make test request to generate trace
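A quick way to check the first three items is to inspect the deployment's environment variables; these names are the ones commonly used by LangSmith-enabled deployments (the API key value is a placeholder, and the project name is the example used above):

```shell
# Placeholder values; set your real API key and project name.
export LANGSMITH_TRACING=true
export LANGSMITH_API_KEY="<your-api-key>"
export LANGSMITH_PROJECT="mcp-server-langgraph"
```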
Traces missing metadata
Solution: Pass tags and metadata in the run config at invocation time (see Custom Metadata above).
High latency in traces
Investigation:
- Expand trace to see timing breakdown
- Identify slowest step
- Optimize:
- LLM calls: Try faster model or smaller prompts
- Tool calls: Add caching or parallel execution
- State operations: Optimize state size
Next Steps
- LangSmith Tracing: Complete LangSmith guide
- CI/CD: Automate deployments
- Configuration: Optimize configuration
- Quickstart: Deploy your agent
All set! Your LangGraph Platform deployment is automatically monitored with comprehensive LangSmith tracing.