Overview
The MCP Server with LangGraph exposes a RESTful API following the Model Context Protocol (MCP) specification with additional custom endpoints for health checks and metrics.
Base URL

Local Development:

```
http://localhost:8000
```

Production:

```
https://your-domain.com
```
Authentication
All API requests require JWT authentication.
Getting a Token
```python
from mcp_server_langgraph.auth.middleware import AuthMiddleware

auth = AuthMiddleware(secret_key="your-secret-key")
token = auth.create_token("username", expires_in=3600)
```
Using the Token
Include the token in the Authorization header:
```
Authorization: Bearer eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9...
```
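The sample token above is a standard JWT: three base64url-encoded segments separated by dots, where the first segment is a JSON header. A quick sketch for inspecting it with the standard library (the payload and signature segments here are placeholders):

```python
import base64
import json

# Truncated sample token; only the header segment is real.
token = "eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.payload.signature"

header_b64 = token.split(".")[0]
header_b64 += "=" * (-len(header_b64) % 4)  # restore base64 padding
header = json.loads(base64.urlsafe_b64decode(header_b64))
print(header)  # {'alg': 'HS256', 'typ': 'JWT'}
```

Note this only decodes the header; verifying the signature requires the server's secret key.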
Rate Limiting
Rate limits are enforced via Kong API Gateway when enabled.
| Tier     | Requests/Minute | Burst |
|----------|-----------------|-------|
| Free     | 60              | 10    |
| Standard | 300             | 50    |
| Premium  | 1000            | 100   |
When rate limited, you’ll receive a 429 Too Many Requests response:
```json
{
  "error": "rate_limit_exceeded",
  "message": "Too many requests",
  "retry_after": 60
}
```
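Clients should honor the `retry_after` field before retrying. A minimal sketch (not part of the API) of a retry loop, where `send_fn` stands in for whatever call you make:

```python
import time

def send_with_retry(send_fn, max_attempts=3):
    """Call send_fn() -> (status, body); sleep and retry on 429."""
    for _ in range(max_attempts):
        status, body = send_fn()
        if status != 429:
            return status, body
        time.sleep(body.get("retry_after", 1))  # server-suggested wait, seconds
    return status, body
```

For production use, prefer adding jitter and a cap on total wait time.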
Response Format

All successful responses follow this structure:
```json
{
  "content": "Response content",
  "role": "assistant",
  "model": "gemini-2.5-flash-002",
  "usage": {
    "prompt_tokens": 25,
    "completion_tokens": 150,
    "total_tokens": 175
  },
  "trace_id": "abc123-def456-ghi789",
  "authorized": true
}
```
Error Handling
Error responses include detailed information:
```json
{
  "error": "authentication_failed",
  "message": "Invalid or expired token",
  "trace_id": "abc123-def456",
  "timestamp": "2025-10-10T12:34:56Z"
}
```
HTTP Status Codes
| Code | Meaning               | Description                     |
|------|-----------------------|---------------------------------|
| 200  | OK                    | Request successful              |
| 400  | Bad Request           | Invalid request parameters      |
| 401  | Unauthorized          | Missing or invalid token        |
| 403  | Forbidden             | Insufficient permissions        |
| 404  | Not Found             | Resource not found              |
| 429  | Too Many Requests     | Rate limit exceeded             |
| 500  | Internal Server Error | Server error                    |
| 503  | Service Unavailable   | Service temporarily unavailable |
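One common way for client code to handle these status codes is to map them to exceptions. A hedged sketch; the class names here are illustrative, not part of the API:

```python
class APIError(Exception):
    """Base class for API errors (status >= 400)."""

class AuthError(APIError):
    """401 Unauthorized / 403 Forbidden."""

class RateLimitError(APIError):
    """429 Too Many Requests."""

_ERROR_CLASSES = {401: AuthError, 403: AuthError, 429: RateLimitError}

def raise_for_status(status: int, message: str = "") -> None:
    """Raise a typed exception for error status codes; no-op on success."""
    if status >= 400:
        raise _ERROR_CLASSES.get(status, APIError)(f"{status}: {message}")
```

Typed exceptions let callers catch `RateLimitError` for retries while treating other `APIError`s as fatal.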
API Endpoints
SDK Libraries

The Python SDK is coming soon; use the HTTP API directly for now. The example below shows the planned client interface:

```python
from langgraph_mcp import MCPClient

client = MCPClient(
    base_url="http://localhost:8000",
    api_key="your-token"
)

response = await client.send_message(
    "Hello, how can you help me?"
)
```
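Until the SDK ships, the HTTP API can be called with the standard library alone. A minimal sketch, assuming a POST `/v1/message` endpoint that accepts a JSON body with the message content (the request body shape is an assumption, not confirmed by the spec):

```python
import json
import urllib.request

def build_message_request(base_url: str, token: str, content: str) -> urllib.request.Request:
    """Build an authenticated POST request for the /v1/message endpoint."""
    body = json.dumps({"content": content}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/v1/message",
        data=body,
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )

req = build_message_request("http://localhost:8000", "your-token", "Hello")
# response = urllib.request.urlopen(req)  # uncomment against a running server
```

Check the OpenAPI spec at `/docs` for the authoritative request and response schemas.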
OpenAPI Specification

Access the OpenAPI spec interactively:

Swagger UI:

```
http://localhost:8000/docs
```

ReDoc:

```
http://localhost:8000/redoc
```
Versioning
The API uses URL versioning:
```
/v1/message
/v1/tools
/v1/health
```
Current version: v1
Version 1 is stable and production-ready. Breaking changes will increment the version number.
Common Patterns
Streaming Responses
```python
async with client.stream_message("Tell me a long story") as stream:
    async for chunk in stream:
        print(chunk.content, end="", flush=True)
```
Batch Requests
```python
messages = [
    "What is AI?",
    "Explain machine learning",
    "What is deep learning?",
]

responses = await client.batch_send(messages)
```
Context Management
```python
# Maintain conversation context across messages
context = {}

response1 = await client.send_message(
    "My name is Alice",
    context=context
)
context.update(response1.context)

response2 = await client.send_message(
    "What's my name?",
    context=context
)
# Response: "Your name is Alice"
```
Next Steps