Overview

LangGraph Platform deployments automatically integrate with LangSmith for comprehensive observability. Every request is traced with full LLM details.
Built-in Tracing: No configuration needed - all deployments are automatically traced in LangSmith.
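
For local development or self-hosted runs outside the platform, tracing is enabled with the standard LangSmith environment variables. A minimal sketch; the API key and project name below are placeholders:

import os

# Enable LangSmith tracing for local runs; Platform deployments need none of this
os.environ["LANGSMITH_TRACING"] = "true"
os.environ["LANGSMITH_API_KEY"] = "lsv2_..."               # placeholder API key
os.environ["LANGSMITH_PROJECT"] = "mcp-server-langgraph"   # example project name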

Viewing Traces

  1. Access LangSmith: Open the LangSmith UI
  2. Select Project: Choose your project (e.g., “mcp-server-langgraph”)
  3. View Traces: See all requests with:
    • Full prompts and completions
    • Token usage and costs
    • Latency breakdown
    • Error details

What’s Captured

Every trace includes:

LLM Calls

  • Full prompts sent to LLM
  • Complete model responses
  • Token counts (input/output)
  • Model parameters
  • Latency per call

Agent Steps

  • Routing decisions
  • Tool invocations
  • State transitions
  • Conditional flows
  • Execution order

Metadata

  • User ID and session ID
  • Request timestamp
  • Environment (prod/staging)
  • Custom tags
  • Deployment version

Errors

  • Full stack traces
  • Input that caused error
  • Error context
  • Failure timing
  • Retry attempts
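
The same fields are available programmatically. A minimal sketch using the langsmith SDK's Client.list_runs; the attribute names follow the SDK's Run schema and may differ slightly across versions:

from langsmith import Client

client = Client()

# Inspect captured fields on recent runs
for run in client.list_runs(project_name="mcp-server-langgraph", limit=5):
    print(run.name, run.status)
    print("inputs:", run.inputs)        # full prompt / graph input
    print("outputs:", run.outputs)      # model response / graph output
    print("tokens:", run.total_tokens)  # input + output token count
    print("error:", run.error)          # stack trace if the run failed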

Metrics Dashboard

View key metrics in LangSmith:

Request Volume

  • Total invocations over time
  • Requests per second
  • Peak traffic periods

Latency

  • P50 Latency: Median response time
  • P95 Latency: 95th percentile
  • P99 Latency: 99th percentile
  • Max Latency: Slowest requests

Success Rate

  • Successful requests (200 OK)
  • Failed requests (4xx, 5xx)
  • Error rate percentage
  • Error types breakdown

Token Usage

  • Total tokens consumed
  • Input vs output tokens
  • Tokens per request
  • Token usage trends

Cost Tracking

  • Estimated costs by model
  • Cost per user/session
  • Daily/monthly spend
  • Cost breakdown by feature
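
Cost figures are estimates derived from token counts. The arithmetic is simple; the per-token prices below are placeholders, not any specific model's pricing:

PRICE_PER_1M_INPUT = 3.00    # USD per 1M input tokens (assumed)
PRICE_PER_1M_OUTPUT = 15.00  # USD per 1M output tokens (assumed)

def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    return (
        (input_tokens / 1_000_000) * PRICE_PER_1M_INPUT
        + (output_tokens / 1_000_000) * PRICE_PER_1M_OUTPUT
    )

print(f"${estimate_cost(1200, 350):.4f}")  # ≈ $0.0089 for 1,200 in / 350 out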

Filtering Traces

By Status

status:error
status:success

By Latency

latency > 5s
latency < 1s

By User

metadata.user_id:"alice@example.com"

By Tags

tags:"production"
tags:"high-priority"

By Date

timestamp > 2025-10-01
timestamp < 2025-10-10
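
The same filters can be applied programmatically with the langsmith SDK. A sketch; the project name is an example:

from datetime import datetime, timedelta
from langsmith import Client

client = Client()

# Failed runs from the last 24 hours (programmatic equivalent of the UI filters above)
failed = client.list_runs(
    project_name="mcp-server-langgraph",
    error=True,
    start_time=datetime.now() - timedelta(days=1),
)
for run in failed:
    print(run.id, run.name, run.error)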

Debugging Workflow

  1. Find Failing Traces: Filter by status:error and sort by timestamp descending (a programmatic version appears after this list)
  2. Analyze Error: Click on a trace to see:
    • The exact input that caused the failure
    • The full Python stack trace
    • All steps before the error
    • Timing information
  3. Compare with Success: Find a similar successful trace and compare side-by-side
  4. Fix and Redeploy: Fix the issue in code and redeploy with langgraph deploy
  5. Verify Fix: Monitor new traces to confirm the error is resolved
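
Because traces capture the exact failing input, a failure can be replayed locally. A sketch, assuming graph is a local build of your deployed graph:

from langsmith import Client

client = Client()

# Fetch the most recent failing run and replay its input locally
run = next(client.list_runs(project_name="mcp-server-langgraph", error=True, limit=1))
print("failing input:", run.inputs)

result = await graph.ainvoke(run.inputs)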

Performance Optimization

Identify Slow Traces

  1. Filter: latency > 5s
  2. Sort by latency descending
  3. Expand trace to see timing breakdown
  4. Identify bottlenecks:
    • Slow LLM calls → Try faster model
    • Slow tool calls → Add caching
    • Redundant calls → Optimize logic

Example Optimization

Before: 8.5s total latency
  • LLM call 1: 3.2s
  • Tool call: 2.1s
  • LLM call 2: 3.2s

Optimization: Add caching to the tool call.

After: 4.5s total latency
  • LLM call 1: 3.2s
  • Tool call (cached): 0.1s
  • LLM call 2: 1.2s (smaller context)
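
A minimal sketch of the caching optimization above, assuming the tool is a deterministic lookup (the tool name is hypothetical):

import time
from functools import lru_cache

@lru_cache(maxsize=1024)
def fetch_product_specs(product_id: str) -> str:
    # Stand-in for the slow external call (hypothetical tool)
    time.sleep(2.1)
    return f"specs for {product_id}"

fetch_product_specs("sku-42")  # ~2.1s on the first call (cache miss)
fetch_product_specs("sku-42")  # near-instant afterwards (served from cache)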

Alerts

Set up alerts in LangSmith:
  1. Go to Project Settings: Navigate to Settings → Alerts
  2. Create Alert Rule: Configure alert conditions:
    • High Error Rate: Error rate > 5%
    • High Latency: P95 > 5 seconds
    • Budget Exceeded: Daily cost > $50
  3. Configure Notifications: Choose notification channels:
    • Email
    • Slack
    • Webhook
    • PagerDuty

Custom Metadata

Add custom metadata to traces for better filtering:
from langchain_core.runnables import RunnableConfig

config = RunnableConfig(
    tags=["premium-user", "high-priority"],
    metadata={
        "user_id": "alice@example.com",
        "session_id": "sess_abc123",
        "feature": "analysis",
        "cost_center": "sales"
    }
)

result = await graph.ainvoke(input, config=config)
Now filter in LangSmith:
  • tags:"premium-user"
  • metadata.cost_center:"sales"
  • metadata.feature:"analysis"

Datasets & Evaluation

Create Dataset from Production

  1. Filter Successful Traces: Apply status:success AND tags:"production"
  2. Select Examples: Choose representative traces (varied inputs/outputs)
  3. Add to Dataset: Click “Add to Dataset” → Name it “prod-examples-oct-2025”
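
The same steps can be scripted. A sketch using the langsmith SDK's create_dataset and create_example; the project and dataset names are examples:

from langsmith import Client

client = Client()

# Copy successful production runs into a new dataset
dataset = client.create_dataset(dataset_name="prod-examples-oct-2025")

for run in client.list_runs(project_name="mcp-server-langgraph", error=False, limit=50):
    client.create_example(
        inputs=run.inputs,
        outputs=run.outputs,
        dataset_id=dataset.id,
    )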

Run Evaluation

Compare model performance:
from langsmith import Client

client = Client()

# Test on production dataset
results = client.run_on_dataset(
    dataset_name="prod-examples-oct-2025",
    llm_or_chain_factory=lambda: graph,
    project_name="eval-claude-vs-gpt4"
)
View results in LangSmith to compare:
  • Latency
  • Token usage
  • Cost
  • Quality (with custom evaluators)
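
Quality scoring requires a custom evaluator. Newer langsmith SDKs expose an evaluate() helper that accepts evaluator functions; the scoring logic below is a deliberately trivial placeholder:

from langsmith import evaluate

def answer_is_nonempty(run, example):
    # Placeholder check: score 1.0 when the run produced any output
    output = (run.outputs or {}).get("output", "")
    return {"key": "nonempty_answer", "score": 1.0 if output else 0.0}

results = evaluate(
    lambda inputs: graph.invoke(inputs),
    data="prod-examples-oct-2025",
    evaluators=[answer_is_nonempty],
    experiment_prefix="eval-claude-vs-gpt4",
)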

Viewing Logs

Via CLI

# Stream logs in real-time
langgraph deployment logs my-agent-prod --follow

# View recent logs
langgraph deployment logs my-agent-prod --limit 100

# Filter by level
langgraph deployment logs my-agent-prod --level ERROR

Via LangSmith UI

Logs are included in each trace: expand a trace to see its full logs.
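
Standard Python logging from inside a node is enough to get lines into the deployment logs; the assumption here is that the platform captures stdout/stderr and logger output:

import logging

logger = logging.getLogger(__name__)

def route_request(state: dict) -> dict:
    # These lines appear in the deployment logs alongside the trace
    logger.info("routing decision: %s", state.get("route"))
    return state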

Best Practices

Use consistent tags:

# Good: Consistent tags
tags=["production", "premium-tier", "chat-feature"]

# Bad: Inconsistent tags
tags=["prod", "Premium User", "CHAT"]

Attach rich metadata to every request (a helper sketch follows at the end of this section):

metadata={
    "user_id": "alice@example.com",
    "user_tier": "premium",
    "cost_center": "sales",
    "session_id": "sess_123",
    "request_source": "mobile_app"
}

Check daily:
  • Error rate (should be < 1%)
  • P95 latency (should be < 5s)
  • Daily cost (should be within budget)
  • User satisfaction (via feedback)

Configure alerts for:
  • Error rate > 5%
  • P95 latency > 5s
  • Daily cost > $100
  • Budget 80% consumed
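
One way to enforce the consistency above is a small helper that builds every config from the same vocabulary. A sketch; the helper name and tag scheme are illustrative, not part of any SDK:

from langchain_core.runnables import RunnableConfig

def make_trace_config(user_id: str, feature: str, tier: str = "standard") -> RunnableConfig:
    # Centralizes tag/metadata construction so every request is filterable
    return RunnableConfig(
        tags=["production", f"{tier}-tier", f"{feature}-feature"],
        metadata={"user_id": user_id, "user_tier": tier, "feature": feature},
    )

result = await graph.ainvoke(input, config=make_trace_config("alice@example.com", "chat"))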

Troubleshooting

Traces not appearing in LangSmith?

Solution:
  • Verify LANGSMITH_TRACING=true in the environment
  • Check that the LangSmith API key is set
  • Confirm the correct project name
  • Make a test request to generate a trace

Tags or metadata missing from traces?

Solution:

# Ensure metadata is passed to invoke
config = RunnableConfig(
    tags=["your-tags"],
    metadata={"user_id": "alice"}
)
result = await graph.ainvoke(input, config=config)

Traces showing high latency?

Investigation:
  1. Expand the trace to see the timing breakdown
  2. Identify the slowest step
  3. Optimize:
    • LLM calls: Try a faster model or smaller prompts
    • Tool calls: Add caching or parallel execution (see the sketch after this list)
    • State operations: Optimize state size
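
A sketch of the parallel-execution suggestion: independent tool calls run concurrently with asyncio.gather instead of sequentially (both tool names are hypothetical):

import asyncio

async def gather_context(query: str) -> dict:
    # Run independent tools concurrently; total latency is the max of the
    # two calls rather than their sum
    docs, profile = await asyncio.gather(
        search_documents(query),    # hypothetical async tool
        fetch_user_profile(query),  # hypothetical async tool
    )
    return {"docs": docs, "profile": profile}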

Next Steps

  • LangSmith Tracing: Complete LangSmith guide
  • CI/CD: Automate deployments
  • Configuration: Optimize configuration
  • Quickstart: Deploy your agent

All set! Your LangGraph Platform deployment is automatically monitored with comprehensive LangSmith tracing.