
Overview

Let’s send your first message to the MCP Server with LangGraph! This guide walks you through authentication, making a request, and understanding the response.
Before you start: Make sure you’ve completed the Quick Start and have the services running.

Prerequisites

Verify services are running:
# Check agent health
curl http://localhost:8000/health

# Expected response
{
  "status": "healthy",
  "service": "mcp-server-langgraph",
  "version": "2.8.0"
}
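If you are scripting your setup, a short readiness loop saves you from racing the services. A minimal sketch with httpx; the endpoint and the "healthy" status value come from the response above:

import time
import httpx

def wait_for_healthy(url="http://localhost:8000/health", attempts=30, delay=2.0):
    """Poll the health endpoint until the service reports healthy."""
    for _ in range(attempts):
        try:
            resp = httpx.get(url, timeout=5.0)
            if resp.status_code == 200 and resp.json().get("status") == "healthy":
                return True
        except httpx.RequestError:
            pass  # Service not up yet; keep polling
        time.sleep(delay)
    return False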
All services healthy? Let’s make your first request!

Step-by-Step Tutorial

Step 1: Get an Authentication Token

from mcp_server_langgraph.auth.middleware import AuthMiddleware

# Create auth instance
auth = AuthMiddleware()

# Get token for user 'alice'
token = auth.create_token("alice", expires_in=3600)
print(f"Token: {token}")
In production, obtain tokens through proper authentication flows (Keycloak OAuth2, etc.). See Authentication Guide.
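For example, a Keycloak deployment issues tokens from its OpenID Connect token endpoint. A sketch using the OAuth2 password grant; the realm name, client ID, and credentials below are placeholders for whatever your deployment uses:

import httpx

# Placeholder Keycloak settings; substitute your own realm and client
KEYCLOAK_URL = "http://localhost:8080"
REALM = "my-realm"
TOKEN_ENDPOINT = f"{KEYCLOAK_URL}/realms/{REALM}/protocol/openid-connect/token"

response = httpx.post(
    TOKEN_ENDPOINT,
    data={
        "grant_type": "password",   # OAuth2 Resource Owner Password grant
        "client_id": "my-client",
        "username": "alice",
        "password": "alice-password",
    },
)
response.raise_for_status()
token = response.json()["access_token"]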

Step 2: Send Your First Message

import httpx

# API endpoint
url = "http://localhost:8000/message"

# Request with auth header
headers = {
    "Authorization": f"Bearer {token}",
    "Content-Type": "application/json"
}

# Message payload
data = {
    "query": "Hello! What can you help me with today?"
}

# Send request
response = httpx.post(url, headers=headers, json=data)
print(response.json())

Step 3: Understanding the Response

The agent returns a structured JSON response:
{
  "content": "Hello! I'm an AI assistant powered by LangGraph. I can help you with:\n- Answering questions\n- Information lookup\n- Task automation\n- And more!\n\nWhat would you like to know?",
  "role": "assistant",
  "model": "gemini-2.5-flash-002",
  "usage": {
    "prompt_tokens": 28,
    "completion_tokens": 52,
    "total_tokens": 80
  },
  "trace_id": "a1b2c3d4-e5f6-7890-abcd-ef1234567890",
  "authorized": true
}
  • content (string): The agent's response text
  • role (string): Always "assistant" for agent responses
  • model (string): The LLM used (supports fallback to alternative models)
  • usage (object): Token usage statistics:
    • prompt_tokens: input tokens
    • completion_tokens: output tokens
    • total_tokens: sum of both
  • trace_id (string): OpenTelemetry trace ID for debugging; view it in the Jaeger UI
  • authorized (boolean): Whether the user was authorized (the OpenFGA check passed)
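If you want static types for this schema in your client code, the fields above map directly onto a TypedDict. A sketch for convenience; these types are assumptions drawn from the response shown here, not something the server ships:

from typing import TypedDict

class Usage(TypedDict):
    prompt_tokens: int
    completion_tokens: int
    total_tokens: int

class AgentResponse(TypedDict):
    content: str       # The agent's response text
    role: str          # Always "assistant"
    model: str
    usage: Usage
    trace_id: str      # OpenTelemetry trace ID
    authorized: bool   # Result of the OpenFGA check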

Step 4: View Trace in Jaeger

Every request is traced end-to-end:
  1. Open Jaeger UI: http://localhost:16686
  2. Select service: mcp-server-langgraph
  3. Click “Find Traces”
  4. Click on your trace to see:
    • Request flow
    • LLM call with prompt
    • Authorization check
    • Response generation
    • Timing breakdown
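You can also pull traces programmatically through Jaeger's HTTP query API, served on the same host and port as the UI. That API is internal and unversioned, so treat this sketch as a convenience rather than a contract:

import httpx

# Fetch the five most recent traces for the service
resp = httpx.get(
    "http://localhost:16686/api/traces",
    params={"service": "mcp-server-langgraph", "limit": 5},
)
for trace in resp.json()["data"]:
    print(trace["traceID"], f"({len(trace['spans'])} spans)")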

Complete Example

Here’s a full working example:
import httpx
from mcp_server_langgraph.auth.middleware import AuthMiddleware

def main():
    # 1. Get authentication token
    auth = AuthMiddleware()
    token = auth.create_token("alice", expires_in=3600)

    # 2. Prepare request
    url = "http://localhost:8000/message"
    headers = {
        "Authorization": f"Bearer {token}",
        "Content-Type": "application/json"
    }

    # 3. Send message
    data = {"query": "What is the capital of France?"}
    response = httpx.post(url, headers=headers, json=data)

    # 4. Handle response
    if response.status_code == 200:
        result = response.json()
        print(f"Agent: {result['content']}")
        print(f"Model: {result['model']}")
        print(f"Tokens: {result['usage']['total_tokens']}")
        print(f"Trace: http://localhost:16686/trace/{result['trace_id']}")
    else:
        print(f"Error: {response.status_code} - {response.text}")

if __name__ == "__main__":
    main()

Common Use Cases

Simple Q&A

# Ask a question
response = httpx.post(url, headers=headers, json={
    "query": "Explain quantum computing in simple terms"
})

Multi-Turn Conversation

# Conversation with context
conversation = [
    {"role": "user", "content": "I'm learning Python"},
    {"role": "assistant", "content": "Great! What would you like to know?"},
    {"role": "user", "content": "How do I read a file?"}
]

response = httpx.post(url, headers=headers, json={
    "messages": conversation
})
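To keep the conversation going, append each reply to the list and resend the whole history. A sketch, assuming the endpoint accepts the messages payload shown above and returns the usual content field:

# Hypothetical chat loop built on the messages payload
conversation = []
for user_input in ["I'm learning Python", "How do I read a file?"]:
    conversation.append({"role": "user", "content": user_input})
    response = httpx.post(url, headers=headers, json={"messages": conversation})
    reply = response.json()["content"]
    conversation.append({"role": "assistant", "content": reply})
    print(f"Agent: {reply}")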

Tool Usage

# Agent can use tools automatically
response = httpx.post(url, headers=headers, json={
    "query": "Search for the latest news about AI"
})
# Agent will invoke the search tool and return results

Streaming Responses

# Get a streaming response (SSE)
import asyncio
import httpx

async def stream_response():
    async with httpx.AsyncClient() as client:
        async with client.stream(
            'POST',
            'http://localhost:8000/message/stream',
            headers=headers,
            json={"query": "Tell me a story"}
        ) as response:
            async for chunk in response.aiter_text():
                print(chunk, end='', flush=True)

asyncio.run(stream_response())
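If the stream is standard Server-Sent Events, each payload arrives on a "data:" line. A sketch of pulling out just the payloads; the exact wire format is an assumption, so adjust to what the endpoint actually emits:

async def stream_payloads(query: str):
    async with httpx.AsyncClient() as client:
        async with client.stream(
            'POST',
            'http://localhost:8000/message/stream',
            headers=headers,
            json={"query": query}
        ) as response:
            async for line in response.aiter_lines():
                # SSE payload lines carry a "data: " prefix
                if line.startswith("data: "):
                    print(line[len("data: "):], flush=True)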

Error Handling

Authentication Errors

# 401 Unauthorized
{
  "error": "unauthorized",
  "message": "Invalid or expired token"
}
Solution: Get a new token or check token expiration.
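A common pattern is to mint a fresh token and retry once when a 401 comes back. A sketch reusing the AuthMiddleware helper from step 1 (the function name is ours, not part of the API):

def post_with_refresh(url, data, auth, user="alice"):
    """Retry once with a fresh token if the first call returns 401."""
    token = auth.create_token(user, expires_in=3600)
    for _ in range(2):
        headers = {"Authorization": f"Bearer {token}"}
        response = httpx.post(url, headers=headers, json=data)
        if response.status_code != 401:
            return response
        token = auth.create_token(user, expires_in=3600)  # refresh and retry
    return response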

Authorization Errors

# 403 Forbidden
{
  "error": "forbidden",
  "message": "User 'bob' is not authorized to execute 'tool:chat'"
}
Solution: Check OpenFGA permissions. See Authorization Guide.

Rate Limiting

# 429 Too Many Requests
{
  "error": "rate_limit_exceeded",
  "message": "Rate limit exceeded. Try again in 60 seconds.",
  "retry_after": 60
}
Solution: Implement exponential backoff or reduce request rate.
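A minimal backoff sketch that honors the retry_after field when the server provides one (the helper name is ours):

import time
import httpx

def post_with_backoff(url, headers, data, max_retries=5):
    """Retry on 429, waiting retry_after seconds (or an exponential fallback)."""
    delay = 1.0
    for _ in range(max_retries):
        response = httpx.post(url, headers=headers, json=data)
        if response.status_code != 429:
            return response
        time.sleep(response.json().get("retry_after", delay))
        delay *= 2  # exponential fallback between attempts
    return response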

Server Errors

# 500 Internal Server Error
{
  "error": "internal_error",
  "message": "LLM provider error: quota exceeded",
  "trace_id": "abc123..."
}
Solution: Check trace in Jaeger, verify LLM API keys and quotas.

Best Practices

# ✅ Good: Always send auth token
headers = {"Authorization": f"Bearer {token}"}

# ❌ Bad: Missing authentication
headers = {}
# Handle errors explicitly (wrap the call in a helper so you can return early)
def send_checked(url, headers, data):
    try:
        response = httpx.post(url, headers=headers, json=data)
        response.raise_for_status()
        return response.json()
    except httpx.HTTPStatusError as e:
        print(f"HTTP error: {e.response.status_code}")
        print(f"Details: {e.response.json()}")
    except httpx.RequestError as e:
        print(f"Request failed: {e}")
# Prevent hanging requests
response = httpx.post(
    url,
    headers=headers,
    json=data,
    timeout=30.0  # 30 second timeout
)
# Monitor costs
total_tokens = 0
for response in responses:
    total_tokens += response['usage']['total_tokens']

print(f"Total tokens used: {total_tokens}")
# Log trace IDs for support requests
result = response.json()
logger.info(
    "Agent request completed",
    trace_id=result['trace_id'],
    user="alice",
    tokens=result['usage']['total_tokens']
)

Testing Your Integration

Unit Tests

from unittest.mock import patch

def test_agent_request():
    with patch('httpx.post') as mock_post:
        # Mock response
        mock_post.return_value.json.return_value = {
            "content": "Test response",
            "role": "assistant",
            "model": "gemini-2.5-flash-002"
        }

        # Test your code
        result = send_agent_request("Hello")
        assert result['content'] == "Test response"
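The test above patches httpx.post and calls send_agent_request, which isn't defined in this guide; a hypothetical wrapper like the following would fit (adapt it to your own client code):

import httpx

def send_agent_request(query: str, token: str = "") -> dict:
    """Hypothetical client wrapper exercised by the unit test."""
    response = httpx.post(
        "http://localhost:8000/message",
        headers={"Authorization": f"Bearer {token}"},
        json={"query": query},
    )
    return response.json()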

Integration Tests

def test_agent_integration():
    # Get real token
    token = auth.create_token("alice")

    # Real API call
    response = httpx.post(
        "http://localhost:8000/message",
        headers={"Authorization": f"Bearer {token}"},
        json={"query": "Test query"}
    )

    # Verify response
    assert response.status_code == 200
    result = response.json()
    assert "content" in result
    assert result['authorized'] is True

Next Steps


Congratulations! You’ve sent your first request to the MCP agent. Ready to build something amazing?