Overview

Let’s send your first message to the MCP Server with LangGraph! This guide walks you through authentication, making a request, and understanding the response.
Before you start: Make sure you’ve completed the Quick Start and have the services running.

Prerequisites

Verify services are running:
# Check agent health
curl http://localhost:8000/health

# Expected response
{
  "status": "healthy",
  "service": "mcp-server-langgraph",
  "version": "2.8.0"
}
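If you have just started the stack, the agent may need a moment before it reports healthy. A minimal polling sketch in Python (the 30-second budget and one-second interval are our choices, not project defaults):

import time
import httpx

def wait_for_healthy(url="http://localhost:8000/health", budget=30.0):
    """Poll the health endpoint until it reports healthy or the budget runs out."""
    deadline = time.monotonic() + budget
    while time.monotonic() < deadline:
        try:
            resp = httpx.get(url, timeout=2.0)
            if resp.status_code == 200 and resp.json().get("status") == "healthy":
                return resp.json()
        except httpx.RequestError:
            pass  # service not up yet; keep retrying
        time.sleep(1.0)
    raise TimeoutError(f"{url} not healthy after {budget}s")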
All services healthy? Let’s make your first request!

Step-by-Step Tutorial

Step 1: Get an Authentication Token

from mcp_server_langgraph.auth.middleware import AuthMiddleware

# Create auth instance
auth = AuthMiddleware()

# Get token for user 'alice'
token = auth.create_token("alice", expires_in=3600)
print(f"Token: {token}")
In production, obtain tokens through proper authentication flows (Keycloak OAuth2, etc.). See Authentication Guide.
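For illustration, a hedged sketch of fetching a token from a Keycloak OAuth2 token endpoint with the resource-owner password grant; the realm, client ID, and credentials below are placeholders, not values shipped with this project:

import httpx

# Placeholder realm and client; substitute your Keycloak deployment's values
KEYCLOAK_TOKEN_URL = "http://localhost:8080/realms/example/protocol/openid-connect/token"

resp = httpx.post(
    KEYCLOAK_TOKEN_URL,
    data={
        "grant_type": "password",       # OAuth2 resource-owner password grant
        "client_id": "example-client",  # placeholder client ID
        "username": "alice",
        "password": "alice-password",   # placeholder credential
    },
)
resp.raise_for_status()
token = resp.json()["access_token"]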

Step 2: Send Your First Message

import httpx

# API endpoint
url = "http://localhost:8000/message"

# Request with auth header
headers = {
    "Authorization": f"Bearer {token}",
    "Content-Type": "application/json"
}

# Message payload
data = {
    "query": "Hello! What can you help me with today?"
}

# Send request
response = httpx.post(url, headers=headers, json=data)
print(response.json())

Step 3: Understanding the Response

The agent returns a structured JSON response:
{
  "content": "Hello! I'm an AI assistant powered by LangGraph. I can help you with:\n- Answering questions\n- Information lookup\n- Task automation\n- And more!\n\nWhat would you like to know?",
  "role": "assistant",
  "model": "gemini-2.5-flash-002",
  "usage": {
    "prompt_tokens": 28,
    "completion_tokens": 52,
    "total_tokens": 80
  },
  "trace_id": "a1b2c3d4-e5f6-7890-abcd-ef1234567890",
  "authorized": true
}
content (string): The agent's response text.
role (string): Always "assistant" for agent responses.
model (string): The LLM model used (supports fallback to alternative models).
usage (object): Token usage statistics:
  • prompt_tokens: input tokens
  • completion_tokens: output tokens
  • total_tokens: sum of both
trace_id (string): OpenTelemetry trace ID for debugging; view it in the Jaeger UI.
authorized (boolean): Whether the user was authorized (the OpenFGA check passed).
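If you want static type hints for this response shape, a small sketch mirroring the fields above (the class names are ours, not part of the server's API):

from typing import TypedDict

class Usage(TypedDict):
    prompt_tokens: int
    completion_tokens: int
    total_tokens: int

class AgentResponse(TypedDict):
    content: str
    role: str
    model: str
    usage: Usage
    trace_id: str
    authorized: bool

# e.g. result: AgentResponse = response.json()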

Step 4: View Trace in Jaeger

Every request is traced end-to-end:
  1. Open Jaeger UI: http://localhost:16686
  2. Select service: mcp-server-langgraph
  3. Click “Find Traces”
  4. Click on your trace to see:
    • Request flow
    • LLM call with prompt
    • Authorization check
    • Response generation
    • Timing breakdown

Complete Example

Here’s a full working example:
import httpx
from mcp_server_langgraph.auth.middleware import AuthMiddleware

def main():
    # 1. Get authentication token
    auth = AuthMiddleware()
    token = auth.create_token("alice", expires_in=3600)

    # 2. Prepare request
    url = "http://localhost:8000/message"
    headers = {
        "Authorization": f"Bearer {token}",
        "Content-Type": "application/json"
    }

    # 3. Send message
    data = {"query": "What is the capital of France?"}
    response = httpx.post(url, headers=headers, json=data)

    # 4. Handle response
    if response.status_code == 200:
        result = response.json()
        print(f"Agent: {result['content']}")
        print(f"Model: {result['model']}")
        print(f"Tokens: {result['usage']['total_tokens']}")
        print(f"Trace: http://localhost:16686/trace/{result['trace_id']}")
    else:
        print(f"Error: {response.status_code} - {response.text}")

if __name__ == "__main__":
    main()

Common Use Cases

Simple Q&A

# Ask a question
response = httpx.post(url, headers=headers, json={
    "query": "Explain quantum computing in simple terms"
})

Multi-Turn Conversation

# Conversation with context
conversation = [
    {"role": "user", "content": "I'm learning Python"},
    {"role": "assistant", "content": "Great! What would you like to know?"},
    {"role": "user", "content": "How do I read a file?"}
]

response = httpx.post(url, headers=headers, json={
    "messages": conversation
})
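To keep the exchange going, append the agent's reply plus your next turn and re-post; a short sketch assuming the messages payload shown above:

# Continue the conversation: record the reply, then add the next user turn
reply = response.json()
conversation.append({"role": "assistant", "content": reply["content"]})
conversation.append({"role": "user", "content": "Can you show a pathlib example?"})

response = httpx.post(url, headers=headers, json={"messages": conversation})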

Tool Usage

# Agent can use tools automatically
response = httpx.post(url, headers=headers, json={
    "query": "Search for the latest news about AI"
})
# Agent will invoke search tool and return results

Streaming Responses

# Get streaming response (SSE)
import asyncio
import httpx

async def stream_story():
    async with httpx.AsyncClient() as client:
        async with client.stream(
            'POST',
            'http://localhost:8000/message/stream',
            headers=headers,
            json={"query": "Tell me a story"}
        ) as response:
            async for chunk in response.aiter_text():
                print(chunk, end='', flush=True)

asyncio.run(stream_story())
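If the endpoint uses standard SSE framing, each event arrives on a line beginning with "data: "; a hedged variant that extracts just the payload (the exact framing is server-dependent, so verify against your deployment):

import asyncio
import httpx

async def stream_story_sse():
    # Reuses the `headers` dict from the earlier examples
    async with httpx.AsyncClient() as client:
        async with client.stream(
            'POST',
            'http://localhost:8000/message/stream',
            headers=headers,
            json={"query": "Tell me a story"}
        ) as response:
            async for line in response.aiter_lines():
                if line.startswith("data: "):  # SSE data frame
                    print(line[len("data: "):], flush=True)

asyncio.run(stream_story_sse())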

Error Handling

Authentication Errors

# 401 Unauthorized
{
  "error": "unauthorized",
  "message": "Invalid or expired token"
}
Solution: Get a new token or check token expiration.

Authorization Errors

# 403 Forbidden
{
  "error": "forbidden",
  "message": "User 'bob' is not authorized to execute 'tool:chat'"
}
Solution: Check OpenFGA permissions. See Authorization Guide.

Rate Limiting

# 429 Too Many Requests
{
  "error": "rate_limit_exceeded",
  "message": "Rate limit exceeded. Try again in 60 seconds.",
  "retry_after": 60
}
Solution: Implement exponential backoff or reduce request rate.
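When you hit a 429, the server already tells you how long to wait. A minimal backoff sketch that honors retry_after when present and otherwise doubles the delay (the retry count and 60-second cap are our choices, not project defaults):

import time
import httpx

def post_with_backoff(url, headers, data, max_retries=5):
    """Retry on HTTP 429, preferring the server's retry_after hint."""
    delay = 1.0
    for _ in range(max_retries):
        response = httpx.post(url, headers=headers, json=data, timeout=30.0)
        if response.status_code != 429:
            return response
        # Use the server's hint if present, else back off exponentially
        wait = float(response.json().get("retry_after", delay))
        time.sleep(min(wait, 60.0))
        delay *= 2
    raise RuntimeError("Still rate-limited after retries")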

Server Errors

# 500 Internal Server Error
{
  "error": "internal_error",
  "message": "LLM provider error: quota exceeded",
  "trace_id": "abc123..."
}
Solution: Check trace in Jaeger, verify LLM API keys and quotas.

Best Practices

Always authenticate

# ✅ Good: Always send auth token
headers = {"Authorization": f"Bearer {token}"}

# ❌ Bad: Missing authentication
headers = {}

Handle errors explicitly

try:
    response = httpx.post(url, headers=headers, json=data)
    response.raise_for_status()
    result = response.json()
except httpx.HTTPStatusError as e:
    print(f"HTTP error: {e.response.status_code}")
    print(f"Details: {e.response.json()}")
except httpx.RequestError as e:
    print(f"Request failed: {e}")

Set request timeouts

# Prevent hanging requests
response = httpx.post(
    url,
    headers=headers,
    json=data,
    timeout=30.0  # 30 second timeout
)

Monitor token usage

# Monitor costs
total_tokens = 0
for response in responses:
    total_tokens += response['usage']['total_tokens']

print(f"Total tokens used: {total_tokens}")

Log trace IDs

# Log trace IDs for support requests
# (assumes a structured logger such as structlog that accepts keyword fields)
result = response.json()
logger.info(
    "Agent request completed",
    trace_id=result['trace_id'],
    user="alice",
    tokens=result['usage']['total_tokens']
)

Testing Your Integration

Unit Tests

import pytest
from unittest.mock import patch

def test_agent_request():
    with patch('httpx.post') as mock_post:
        # Mock response
        mock_post.return_value.json.return_value = {
            "content": "Test response",
            "role": "assistant",
            "model": "gemini-2.5-flash-002"
        }

        # Test your code
        result = send_agent_request("Hello")
        assert result['content'] == "Test response"
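The test above patches httpx.post and calls a send_agent_request helper that isn't defined on this page; a hypothetical version it could be exercising:

import httpx

def send_agent_request(query, token=""):
    """Hypothetical helper wrapping POST /message; adapt to your client code."""
    response = httpx.post(
        "http://localhost:8000/message",
        headers={"Authorization": f"Bearer {token}"},
        json={"query": query},
    )
    return response.json()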

Integration Tests

import httpx
from mcp_server_langgraph.auth.middleware import AuthMiddleware

def test_agent_integration():
    # Get real token
    auth = AuthMiddleware()
    token = auth.create_token("alice")

    # Real API call
    response = httpx.post(
        "http://localhost:8000/message",
        headers={"Authorization": f"Bearer {token}"},
        json={"query": "Test query"}
    )

    # Verify response
    assert response.status_code == 200
    result = response.json()
    assert "content" in result
    assert result['authorized'] is True
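Integration tests like this fail noisily when the stack is down; one option (our convention, not the project's) is to skip instead of fail when the health check is unreachable:

import httpx
import pytest

def service_up(url="http://localhost:8000/health"):
    """True if the local agent answers its health check."""
    try:
        return httpx.get(url, timeout=2.0).status_code == 200
    except httpx.RequestError:
        return False

# Apply as @requires_services on test_agent_integration
requires_services = pytest.mark.skipif(
    not service_up(), reason="agent stack not running"
)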

Next Steps

• Authentication: Learn about JWT and Keycloak authentication
• Authorization: Configure fine-grained permissions
• API Reference: Explore all API endpoints
• Multi-LLM Setup: Configure multiple LLM providers

Congratulations! You’ve sent your first request to the MCP agent. Ready to build something amazing?