Overview

Let’s send your first message to the MCP Server with LangGraph! This guide walks you through authentication, making a request, and understanding the response.
Before you start: Make sure you’ve completed the Quick Start and have the services running.

Prerequisites

Verify services are running:
# Check agent health
curl http://localhost:8000/health

# Expected response
{
  "status": "healthy",
  "service": "mcp-server-langgraph",
  "version": "2.8.0"
}
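If you have just started the stack, the agent may need a moment before it reports healthy. A minimal polling sketch in Python (the 30-second budget and one-second interval are our choices, not project defaults):

import time
import httpx

def wait_for_healthy(url="http://localhost:8000/health", budget=30.0):
    """Poll the health endpoint until it reports healthy or the budget runs out."""
    deadline = time.monotonic() + budget
    while time.monotonic() < deadline:
        try:
            resp = httpx.get(url, timeout=2.0)
            if resp.status_code == 200 and resp.json().get("status") == "healthy":
                return resp.json()
        except httpx.RequestError:
            pass  # service not up yet; keep retrying
        time.sleep(1.0)
    raise TimeoutError(f"{url} not healthy after {budget}s")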
All services healthy? Let’s make your first request!

Step-by-Step Tutorial

Step 1: Get an Authentication Token

from mcp_server_langgraph.auth.middleware import AuthMiddleware

# Create auth instance
auth = AuthMiddleware()

# Get token for user 'alice'
token = auth.create_token("alice", expires_in=3600)
print(f"Token: {token}")
In production, obtain tokens through proper authentication flows (Keycloak OAuth2, etc.). See Authentication Guide.
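For illustration, a hedged sketch of fetching a token from a Keycloak OAuth2 token endpoint with the resource-owner password grant; the realm, client ID, and credentials below are placeholders, not values shipped with this project:

import httpx

# Placeholder realm and client; substitute your Keycloak deployment's values
KEYCLOAK_TOKEN_URL = "http://localhost:8080/realms/example/protocol/openid-connect/token"

resp = httpx.post(
    KEYCLOAK_TOKEN_URL,
    data={
        "grant_type": "password",       # OAuth2 resource-owner password grant
        "client_id": "example-client",  # placeholder client ID
        "username": "alice",
        "password": "alice-password",   # placeholder credential
    },
)
resp.raise_for_status()
token = resp.json()["access_token"]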

Step 2: Send Your First Message

import httpx

# API endpoint
url = "http://localhost:8000/message"

# Request with auth header
headers = {
    "Authorization": f"Bearer {token}",
    "Content-Type": "application/json"
}

# Message payload
data = {
    "query": "Hello! What can you help me with today?"
}

# Send request
response = httpx.post(url, headers=headers, json=data)
print(response.json())

Step 3: Understanding the Response

The agent returns a structured JSON response:
{
  "content": "Hello! I'm an AI assistant powered by LangGraph. I can help you with:\n- Answering questions\n- Information lookup\n- Task automation\n- And more!\n\nWhat would you like to know?",
  "role": "assistant",
  "model": "gemini-2.5-flash-002",
  "usage": {
    "prompt_tokens": 28,
    "completion_tokens": 52,
    "total_tokens": 80
  },
  "trace_id": "a1b2c3d4-e5f6-7890-abcd-ef1234567890",
  "authorized": true
}
content (string): The agent's response text.
role (string): Always "assistant" for agent responses.
model (string): The LLM model used (supports fallback to alternative models).
usage (object): Token usage statistics:
  • prompt_tokens: input tokens
  • completion_tokens: output tokens
  • total_tokens: sum of both
trace_id (string): OpenTelemetry trace ID for debugging; view it in the Jaeger UI.
authorized (boolean): Whether the user was authorized (the OpenFGA check passed).
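If you want static type hints for this response shape, a small sketch mirroring the fields above (the class names are ours, not part of the server's API):

from typing import TypedDict

class Usage(TypedDict):
    prompt_tokens: int
    completion_tokens: int
    total_tokens: int

class AgentResponse(TypedDict):
    content: str
    role: str
    model: str
    usage: Usage
    trace_id: str
    authorized: bool

# e.g. result: AgentResponse = response.json()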

Step 4: View Trace in Jaeger

Every request is traced end-to-end:
  1. Open Jaeger UI: http://localhost:16686
  2. Select service: mcp-server-langgraph
  3. Click “Find Traces”
  4. Click on your trace to see:
    • Request flow
    • LLM call with prompt
    • Authorization check
    • Response generation
    • Timing breakdown

Complete Example

Here’s a full working example:
import httpx
from mcp_server_langgraph.auth.middleware import AuthMiddleware

def main():
    # 1. Get authentication token
    auth = AuthMiddleware()
    token = auth.create_token("alice", expires_in=3600)

    # 2. Prepare request
    url = "http://localhost:8000/message"
    headers = {
        "Authorization": f"Bearer {token}",
        "Content-Type": "application/json"
    }

    # 3. Send message
    data = {"query": "What is the capital of France?"}
    response = httpx.post(url, headers=headers, json=data)

    # 4. Handle response
    if response.status_code == 200:
        result = response.json()
        print(f"Agent: {result['content']}")
        print(f"Model: {result['model']}")
        print(f"Tokens: {result['usage']['total_tokens']}")
        print(f"Trace: http://localhost:16686/trace/{result['trace_id']}")
    else:
        print(f"Error: {response.status_code} - {response.text}")

if __name__ == "__main__":
    main()

Common Use Cases

Simple Q&A

# Ask a question
response = httpx.post(url, headers=headers, json={
    "query": "Explain quantum computing in simple terms"
})

Multi-Turn Conversation

# Conversation with context
conversation = [
    {"role": "user", "content": "I'm learning Python"},
    {"role": "assistant", "content": "Great! What would you like to know?"},
    {"role": "user", "content": "How do I read a file?"}
]

response = httpx.post(url, headers=headers, json={
    "messages": conversation
})
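To keep the exchange going, append the agent's reply plus your next turn and re-post; a short sketch assuming the messages payload shown above:

# Continue the conversation: record the reply, then add the next user turn
reply = response.json()
conversation.append({"role": "assistant", "content": reply["content"]})
conversation.append({"role": "user", "content": "Can you show a pathlib example?"})

response = httpx.post(url, headers=headers, json={"messages": conversation})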

Tool Usage

# Agent can use tools automatically
response = httpx.post(url, headers=headers, json={
    "query": "Search for the latest news about AI"
})
# Agent will invoke search tool and return results

Streaming Responses

# Get streaming response (SSE)
import asyncio
import httpx

async def stream_story():
    async with httpx.AsyncClient() as client:
        async with client.stream(
            'POST',
            'http://localhost:8000/message/stream',
            headers=headers,
            json={"query": "Tell me a story"}
        ) as response:
            async for chunk in response.aiter_text():
                print(chunk, end='', flush=True)

asyncio.run(stream_story())
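If the endpoint uses standard SSE framing, each event arrives on a line beginning with "data: "; a hedged variant that extracts just the payload (the exact framing is server-dependent, so verify against your deployment):

import asyncio
import httpx

async def stream_story_sse():
    # Reuses the `headers` dict from the earlier examples
    async with httpx.AsyncClient() as client:
        async with client.stream(
            'POST',
            'http://localhost:8000/message/stream',
            headers=headers,
            json={"query": "Tell me a story"}
        ) as response:
            async for line in response.aiter_lines():
                if line.startswith("data: "):  # SSE data frame
                    print(line[len("data: "):], flush=True)

asyncio.run(stream_story_sse())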

Error Handling

Authentication Errors

# 401 Unauthorized
{
  "error": "unauthorized",
  "message": "Invalid or expired token"
}
Solution: Get a new token or check token expiration.

Authorization Errors

# 403 Forbidden
{
  "error": "forbidden",
  "message": "User 'bob' is not authorized to execute 'tool:chat'"
}
Solution: Check OpenFGA permissions. See Authorization Guide.

Rate Limiting

# 429 Too Many Requests
{
  "error": "rate_limit_exceeded",
  "message": "Rate limit exceeded. Try again in 60 seconds.",
  "retry_after": 60
}
Solution: Implement exponential backoff or reduce request rate.
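When you hit a 429, the server already tells you how long to wait. A minimal backoff sketch that honors retry_after when present and otherwise doubles the delay (the retry count and 60-second cap are our choices, not project defaults):

import time
import httpx

def post_with_backoff(url, headers, data, max_retries=5):
    """Retry on HTTP 429, preferring the server's retry_after hint."""
    delay = 1.0
    for _ in range(max_retries):
        response = httpx.post(url, headers=headers, json=data, timeout=30.0)
        if response.status_code != 429:
            return response
        # Use the server's hint if present, else back off exponentially
        wait = float(response.json().get("retry_after", delay))
        time.sleep(min(wait, 60.0))
        delay *= 2
    raise RuntimeError("Still rate-limited after retries")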

Server Errors

# 500 Internal Server Error
{
  "error": "internal_error",
  "message": "LLM provider error: quota exceeded",
  "trace_id": "abc123..."
}
Solution: Check trace in Jaeger, verify LLM API keys and quotas.

Best Practices

Always authenticate

# ✅ Good: Always send auth token
headers = {"Authorization": f"Bearer {token}"}

# ❌ Bad: Missing authentication
headers = {}

Handle errors explicitly

try:
    response = httpx.post(url, headers=headers, json=data)
    response.raise_for_status()
    result = response.json()
except httpx.HTTPStatusError as e:
    print(f"HTTP error: {e.response.status_code}")
    print(f"Details: {e.response.json()}")
except httpx.RequestError as e:
    print(f"Request failed: {e}")

Set request timeouts

# Prevent hanging requests
response = httpx.post(
    url,
    headers=headers,
    json=data,
    timeout=30.0  # 30 second timeout
)

Monitor token usage

# Monitor costs
total_tokens = 0
for response in responses:
    total_tokens += response['usage']['total_tokens']

print(f"Total tokens used: {total_tokens}")

Log trace IDs

# Log trace IDs for support requests
# (assumes a structured logger such as structlog that accepts keyword fields)
result = response.json()
logger.info(
    "Agent request completed",
    trace_id=result['trace_id'],
    user="alice",
    tokens=result['usage']['total_tokens']
)

Testing Your Integration

Unit Tests

import pytest
from unittest.mock import patch

def test_agent_request():
    with patch('httpx.post') as mock_post:
        # Mock response
        mock_post.return_value.json.return_value = {
            "content": "Test response",
            "role": "assistant",
            "model": "gemini-2.5-flash-002"
        }

        # Test your code
        result = send_agent_request("Hello")
        assert result['content'] == "Test response"
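The test above patches httpx.post and calls a send_agent_request helper that isn't defined on this page; a hypothetical version it could be exercising:

import httpx

def send_agent_request(query, token=""):
    """Hypothetical helper wrapping POST /message; adapt to your client code."""
    response = httpx.post(
        "http://localhost:8000/message",
        headers={"Authorization": f"Bearer {token}"},
        json={"query": query},
    )
    return response.json()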

Integration Tests

import httpx
from mcp_server_langgraph.auth.middleware import AuthMiddleware

def test_agent_integration():
    # Get real token
    auth = AuthMiddleware()
    token = auth.create_token("alice")

    # Real API call
    response = httpx.post(
        "http://localhost:8000/message",
        headers={"Authorization": f"Bearer {token}"},
        json={"query": "Test query"}
    )

    # Verify response
    assert response.status_code == 200
    result = response.json()
    assert "content" in result
    assert result['authorized'] is True
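Integration tests like this fail noisily when the stack is down; one option (our convention, not the project's) is to skip instead of fail when the health check is unreachable:

import httpx
import pytest

def service_up(url="http://localhost:8000/health"):
    """True if the local agent answers its health check."""
    try:
        return httpx.get(url, timeout=2.0).status_code == 200
    except httpx.RequestError:
        return False

# Apply as @requires_services on test_agent_integration
requires_services = pytest.mark.skipif(
    not service_up(), reason="agent stack not running"
)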

Next Steps

• Authentication: Learn about JWT and Keycloak authentication
• Authorization: Configure fine-grained permissions
• API Reference: Explore all API endpoints
• Multi-LLM Setup: Configure multiple LLM providers

Congratulations! You’ve sent your first request to the MCP agent. Ready to build something amazing?