# LiteLLM Integration Guide

Complete guide for using multiple LLM providers with the MCP Server with LangGraph.

## Table of Contents
- Overview
- Supported Providers
- Configuration
- Provider Setup
- Model Examples
- Fallback Strategy
- Best Practices
- Testing Different Providers
- Monitoring
- Troubleshooting
- Resources
- Support
## Overview
The MCP Server with LangGraph uses LiteLLM to support 100+ LLM providers through a unified interface. This allows you to:

- ✅ Switch between providers without code changes
- ✅ Use open-source models (Llama, Qwen, Mistral, etc.)
- ✅ Implement automatic fallback between models
- ✅ Optimize costs by provider/model selection
- ✅ Test locally with Ollama before deploying
## Supported Providers
### Cloud Providers
| Provider | Models | Configuration Required |
|---|---|---|
| Anthropic | Claude Sonnet 4.5, Claude Opus 4.1, Claude Haiku 4.5 | ANTHROPIC_API_KEY |
| OpenAI | GPT-5, GPT-5 Pro, GPT-5 Mini, GPT-5 Nano | OPENAI_API_KEY |
| Google | Gemini 2.5 Flash, Gemini 2.5 Pro, Gemini 2.0 Pro | GOOGLE_API_KEY |
| Azure OpenAI | GPT-4, GPT-3.5 | AZURE_API_KEY, AZURE_API_BASE |
| AWS Bedrock | Claude, Llama, Titan | AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY |
### Open-Source (Ollama)
| Model Family | Models | Local Setup |
|---|---|---|
| Llama | Llama 3.1, Llama 2 (7B-70B) | Install Ollama |
| Qwen | Qwen 2.5 (0.5B-72B) | Install Ollama |
| Mistral | Mistral 7B, Mixtral 8x7B | Install Ollama |
| DeepSeek | DeepSeek Coder, DeepSeek LLM | Install Ollama |
| Others | Phi-3, Gemma, Yi, etc. | Install Ollama |
## Configuration

### Environment Variables

Create or update `.env`:
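A minimal sketch of the relevant entries, assuming the server loads `.env` at startup; set only the keys for the providers you plan to use (Ollama needs no key):

```bash
# API keys (cloud providers)
ANTHROPIC_API_KEY=sk-ant-...
OPENAI_API_KEY=sk-...
GOOGLE_API_KEY=...

# Azure OpenAI
AZURE_API_KEY=...
AZURE_API_BASE=https://<your-resource>.openai.azure.com/

# AWS Bedrock
AWS_ACCESS_KEY_ID=...
AWS_SECRET_ACCESS_KEY=...
```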
## Provider Setup
### 1. Google Gemini (Default - Recommended)
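Create a key in Google AI Studio and export it; model strings then use LiteLLM's `gemini/` prefix (e.g., `gemini/gemini-2.5-flash`):

```bash
export GOOGLE_API_KEY="your-google-api-key"
```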
### 2. Anthropic (Claude)
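Export the key from the Anthropic Console; models use the `anthropic/` prefix:

```bash
export ANTHROPIC_API_KEY="sk-ant-your-key"
```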
### 3. OpenAI
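Export the key from the OpenAI dashboard; models use the `openai/` prefix (or plain model names):

```bash
export OPENAI_API_KEY="sk-your-key"
```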
### 4. Azure OpenAI
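Azure needs both a key and the resource endpoint; with LiteLLM, Azure models are addressed by deployment name as `azure/<deployment-name>`:

```bash
export AZURE_API_KEY="your-azure-key"
export AZURE_API_BASE="https://<your-resource>.openai.azure.com/"
```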
### 5. AWS Bedrock
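Bedrock uses standard AWS credentials; LiteLLM additionally reads a region, shown here with an assumed default:

```bash
export AWS_ACCESS_KEY_ID="your-access-key-id"
export AWS_SECRET_ACCESS_KEY="your-secret-access-key"
export AWS_REGION_NAME="us-east-1"  # a region where you have Bedrock model access
```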
### 6. Ollama (Local/Open-Source)
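No API key is required; install Ollama, pull a model, and make sure the daemon is running:

```bash
curl -fsSL https://ollama.com/install.sh | sh   # Linux/macOS installer
ollama pull llama3.1                            # download a model
ollama serve                                    # serves on http://localhost:11434
```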
## Model Examples
### Anthropic Models
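A minimal call via LiteLLM; the model ids below follow the `anthropic/` naming and should be checked against the current LiteLLM model list:

```python
from litellm import completion

# Claude via LiteLLM; swap in claude-opus-4-1 or claude-haiku-4-5 as needed
response = completion(
    model="anthropic/claude-sonnet-4-5",
    messages=[{"role": "user", "content": "Summarize LiteLLM in one sentence."}],
)
print(response.choices[0].message.content)
```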
### OpenAI Models
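The same interface covers the GPT-5 family; this sketch streams the response (model id illustrative):

```python
from litellm import completion

# Stream tokens as they arrive
for chunk in completion(
    model="openai/gpt-5-mini",
    messages=[{"role": "user", "content": "Say hello."}],
    stream=True,
):
    print(chunk.choices[0].delta.content or "", end="")
```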
### Google Gemini Models (Default/Recommended)
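The default provider; the `gemini/` prefix targets the Google AI Studio API:

```python
from litellm import completion

response = completion(
    model="gemini/gemini-2.5-flash",   # or gemini/gemini-2.5-pro
    messages=[{"role": "user", "content": "What is MCP?"}],
    temperature=0.2,
    max_tokens=256,
)
print(response.choices[0].message.content)
```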
### Ollama (Open-Source)
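Local models go through the `ollama/` prefix; `api_base` is only needed if the daemon is not on the default port:

```python
from litellm import completion

response = completion(
    model="ollama/llama3.1",
    messages=[{"role": "user", "content": "Hello!"}],
    api_base="http://localhost:11434",  # default Ollama address
)
print(response.choices[0].message.content)
```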
## Fallback Strategy
The agent automatically falls back to alternative models if the primary fails.

### Fallback Order Example
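The exact chain is configured in the agent; the sketch below shows the pattern with an illustrative model order (primary first, cheaper or local models after):

```python
from litellm import completion

# Illustrative fallback order -- adjust to your configuration
FALLBACK_MODELS = [
    "gemini/gemini-2.5-flash",
    "anthropic/claude-haiku-4-5",
    "ollama/llama3.1",
]

def complete_with_fallback(messages):
    last_error = None
    for model in FALLBACK_MODELS:
        try:
            return completion(model=model, messages=messages)
        except Exception as exc:  # rate limit, timeout, model unavailable
            last_error = exc
    raise RuntimeError("All models in the fallback chain failed") from last_error
```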
### Fallback Behavior
1. Primary model fails → try the first fallback
2. First fallback fails → try the second fallback
3. All fallbacks fail → return an error

Fallbacks are triggered by:

- API rate limits
- Model unavailability
- Network errors
- Timeout errors
## Best Practices
### 1. Cost Optimization

Route routine traffic to the cheaper tier of each provider (e.g., Claude Haiku 4.5, GPT-5 Mini/Nano, Gemini 2.5 Flash) and reserve flagship models for hard tasks; Ollama models run locally at no per-token cost.

### 2. Latency Optimization

Fastest models: the smallest tier of each provider, such as Claude Haiku 4.5, GPT-5 Nano, and Gemini 2.5 Flash.

### 3. Context Length

Large context needs: prefer models with the largest context windows, such as the Gemini 2.5 family.

### 4. Multilingual Support

Best for non-English: Qwen 2.5 is a strong open-source choice; the flagship cloud models also handle most major languages well.

### 5. Code Generation

Best for coding: DeepSeek Coder (local via Ollama) or the flagship cloud models.

## Testing Different Providers
### Quick Test Script
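A sketch that smoke-tests each provider you have configured; trim the list to the keys you actually set:

```python
"""Quick smoke test across providers via LiteLLM."""
from litellm import completion

TEST_MODELS = [
    "gemini/gemini-2.5-flash",
    "anthropic/claude-haiku-4-5",
    "openai/gpt-5-mini",
    "ollama/llama3.1",
]

for model in TEST_MODELS:
    try:
        response = completion(
            model=model,
            messages=[{"role": "user", "content": "Reply with the word: ok"}],
        )
        print(f"{model}: {response.choices[0].message.content.strip()}")
    except Exception as exc:
        print(f"{model}: FAILED ({exc})")
```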
### Test with MCP Server
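The entry point depends on your checkout; the command below is a hypothetical placeholder (`your_mcp_server` is not a real module name here), shown only to illustrate launching the server with a provider key set:

```bash
# Hypothetical invocation -- substitute the server's real entry point
export GOOGLE_API_KEY="your-google-api-key"
python -m your_mcp_server
```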
## Monitoring
LiteLLM usage is automatically tracked with OpenTelemetry.
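A minimal sketch of wiring LiteLLM's OpenTelemetry callback, assuming an OTLP exporter is configured via the standard `OTEL_*` environment variables (the server may already do this for you):

```python
import litellm

# Send LiteLLM spans (latency, model, token usage) to the OTel exporter
litellm.callbacks = ["otel"]
```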
## Troubleshooting

### API Key Not Working

Confirm the key is present in the environment the server actually runs in (a stale shell or an unloaded `.env` is the usual cause), that the variable name matches the table above, and that the value has no stray whitespace or quotes.
### Ollama Connection Failed

Make sure the Ollama daemon is running (`ollama serve`) and reachable at `http://localhost:11434`; if the server runs in a container, point it at the host's address instead of `localhost`.
### Model Not Found

Check that the model string uses LiteLLM's provider prefix (e.g., `gemini/`, `anthropic/`, `ollama/`) and, for Ollama, that the model has been pulled first with `ollama pull <model>`.
## Resources

- [LiteLLM Documentation](https://docs.litellm.ai/)
- [LiteLLM GitHub](https://github.com/BerriAI/litellm)
- [Ollama](https://ollama.com/)
## Support

For LiteLLM issues, open a ticket on the [LiteLLM GitHub issue tracker](https://github.com/BerriAI/litellm/issues).

---

Last Updated: 2025-01-10