
Overview

The MCP Server with LangGraph exposes a RESTful API following the Model Context Protocol (MCP) specification with additional custom endpoints for health checks and metrics.

Base URL

```bash Local
http://localhost:8000
```

```bash Production
https://your-domain.com
```

Authentication

All API requests require JWT authentication.

Getting a Token

```python
from mcp_server_langgraph.auth.middleware import AuthMiddleware

auth = AuthMiddleware(secret_key="your-secret-key")
token = auth.create_token("username", expires_in=3600)
```

Using the Token

Include the token in the Authorization header:
```
Authorization: Bearer eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9...
```
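As a sketch, an authenticated request can be built with the standard library alone. The `/v1/message` path and the `{"message": ...}` body shape below are assumptions inferred from the Versioning section, not a confirmed schema — check the OpenAPI spec at `/docs` for the actual contract:

```python
import json
import urllib.request

# Token obtained earlier via AuthMiddleware.create_token() (placeholder value)
token = "eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.example"

# Assumed endpoint and body shape -- verify against the OpenAPI spec
request = urllib.request.Request(
    "http://localhost:8000/v1/message",
    data=json.dumps({"message": "Hello"}).encode("utf-8"),
    headers={
        "Authorization": f"Bearer {token}",
        "Content-Type": "application/json",
    },
    method="POST",
)

# urllib.request.urlopen(request) would send it; here we just inspect the header
print(request.get_header("Authorization"))
```

Calling `urllib.request.urlopen(request)` performs the actual round trip.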

Rate Limiting

Rate limits are enforced via Kong API Gateway when enabled.
| Tier     | Requests/Minute | Burst |
|----------|-----------------|-------|
| Free     | 60              | 10    |
| Standard | 300             | 50    |
| Premium  | 1000            | 100   |
When rate limited, you’ll receive a 429 Too Many Requests response:
```json
{
  "error": "rate_limit_exceeded",
  "message": "Too many requests",
  "retry_after": 60
}
```
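A minimal client-side sketch of honoring that response: read `retry_after` from the error body and pause before retrying. The helper name and the 60-second default are illustrative, not part of the API:

```python
def seconds_to_wait(status_code: int, body: dict, default: int = 60) -> int:
    """Return the cooldown (in seconds) before retrying, or 0 if not rate limited."""
    if status_code != 429:
        return 0
    # The error body reports the cooldown via "retry_after"
    return int(body.get("retry_after", default))

wait = seconds_to_wait(429, {"error": "rate_limit_exceeded", "retry_after": 60})
print(wait)  # → 60
# time.sleep(wait) before issuing the next request
```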

Response Format

All successful responses follow this structure:
```json
{
  "content": "Response content",
  "role": "assistant",
  "model": "gemini-2.5-flash-002",
  "usage": {
    "prompt_tokens": 25,
    "completion_tokens": 150,
    "total_tokens": 175
  },
  "trace_id": "abc123-def456-ghi789",
  "authorized": true
}
```
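This structure maps cleanly onto typed containers on the client side. A sketch using dataclasses (the class names here are illustrative, not provided by the server):

```python
import json
from dataclasses import dataclass

@dataclass
class Usage:
    prompt_tokens: int
    completion_tokens: int
    total_tokens: int

@dataclass
class MessageResponse:
    content: str
    role: str
    model: str
    usage: Usage
    trace_id: str
    authorized: bool

raw = '''{
  "content": "Response content",
  "role": "assistant",
  "model": "gemini-2.5-flash-002",
  "usage": {"prompt_tokens": 25, "completion_tokens": 150, "total_tokens": 175},
  "trace_id": "abc123-def456-ghi789",
  "authorized": true
}'''

data = json.loads(raw)
response = MessageResponse(**{**data, "usage": Usage(**data["usage"])})
print(response.usage.total_tokens)  # → 175
```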

Error Handling

Error responses include detailed information:
```json
{
  "error": "authentication_failed",
  "message": "Invalid or expired token",
  "trace_id": "abc123-def456",
  "timestamp": "2025-10-10T12:34:56Z"
}
```

HTTP Status Codes

| Code | Meaning               | Description                     |
|------|-----------------------|---------------------------------|
| 200  | OK                    | Request successful              |
| 400  | Bad Request           | Invalid request parameters      |
| 401  | Unauthorized          | Missing or invalid token        |
| 403  | Forbidden             | Insufficient permissions        |
| 404  | Not Found             | Resource not found              |
| 429  | Too Many Requests     | Rate limit exceeded             |
| 500  | Internal Server Error | Server error                    |
| 503  | Service Unavailable   | Service temporarily unavailable |
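A common client pattern is translating these codes into typed exceptions so callers can handle auth failures and rate limits separately. A sketch — the exception names are hypothetical, not defined by the server:

```python
class APIError(Exception):
    """Base error for non-2xx responses."""

class AuthError(APIError):
    """401/403: missing token or insufficient permissions."""

class RateLimitError(APIError):
    """429: rate limit exceeded."""

# Map status codes from the table to exception classes
_ERRORS = {401: AuthError, 403: AuthError, 429: RateLimitError}

def raise_for_status(status: int, message: str = "") -> None:
    """Raise a typed exception for error codes; do nothing on success."""
    if 200 <= status < 400:
        return
    raise _ERRORS.get(status, APIError)(f"{status}: {message}")
```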

API Endpoints

SDK Libraries

```python
from langgraph_mcp import MCPClient

client = MCPClient(
    base_url="http://localhost:8000",
    api_key="your-token"
)

response = await client.send_message(
    "Hello, how can you help me?"
)
```
The Python SDK is coming soon; use the HTTP API directly for now.

OpenAPI Specification

Access the OpenAPI spec interactively:
```bash Swagger UI
http://localhost:8000/docs
```

```bash ReDoc
http://localhost:8000/redoc
```

Versioning

The API uses URL versioning:
```
/v1/message
/v1/tools
/v1/health
```
Current version: v1
Version 1 is stable and production-ready. Breaking changes will increment the version number.

Common Patterns

Streaming Responses

```python
async with client.stream_message("Tell me a long story") as stream:
    async for chunk in stream:
        print(chunk.content, end='', flush=True)
```

Batch Requests

```python
messages = [
    "What is AI?",
    "Explain machine learning",
    "What is deep learning?"
]

responses = await client.batch_send(messages)
```

Context Management

```python
# Maintain conversation context
context = {}

response1 = await client.send_message(
    "My name is Alice",
    context=context
)
context.update(response1.context)

response2 = await client.send_message(
    "What's my name?",
    context=context
)
# Response: "Your name is Alice"
```

Next Steps