1. Multi-Provider LLM Support via LiteLLM
Date: 2025-10-11
Status: Accepted
Category: Core Architecture
Context
The MCP server needs to support multiple LLM providers to offer flexibility, avoid vendor lock-in, and enable fallback mechanisms for high availability. Users may want to:
- Use different providers based on cost, performance, or features
- Switch providers without code changes
- Implement automatic fallback when one provider fails
- Support both cloud and local/open-source models
Without a unified abstraction, supporting multiple providers would mean:
- Maintaining separate code paths for each provider
- Handling different message formats and APIs
- Implementing complex fallback logic
- Difficulty adding new providers
Decision
We will use LiteLLM as the unified interface for all LLM providers (see the example below). LiteLLM provides:
- Single API interface compatible with 100+ LLM providers
- Automatic message format translation
- Built-in retry and fallback logic
- Support for both cloud and local models (Ollama)
- OpenAI-compatible API format
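For illustration, a minimal sketch of the unified call shape (model names here are examples, not the server's configured defaults):

```python
# Sketch: the same litellm.completion() call works across providers;
# only the "provider/model" string changes. Model names are illustrative.
import litellm

messages = [{"role": "user", "content": "Summarize this ADR in one sentence."}]

cloud = litellm.completion(model="anthropic/claude-3-5-sonnet-20241022", messages=messages)
other = litellm.completion(model="openai/gpt-4o-mini", messages=messages)
local = litellm.completion(model="ollama/llama3", messages=messages)  # local model via Ollama

# Responses follow the OpenAI format regardless of provider.
print(cloud.choices[0].message.content)
```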
The implementation lives in llm_factory.py (see Implementation Details below):
- LLMFactory class wraps LiteLLM
- Provider selection via configuration
- Automatic fallback to alternative models
- Consistent interface regardless of provider
Consequences
Positive Consequences
- Flexibility: Easy to switch providers via configuration
- Reliability: Automatic fallback increases uptime
- Simplicity: Single code path for all providers
- Extensibility: New providers supported automatically
- Local Development: Can use Ollama for offline development
- Cost Optimization: Easy to use cheaper models as fallbacks
Negative Consequences
- Abstraction Layer: An additional layer sits between our code and the provider APIs
- Feature Limitations: Provider-specific features may not be exposed
- Debugging Complexity: Errors may be obscured by abstraction
- Dependency Risk: Reliant on LiteLLM maintenance
Neutral Consequences
- Performance: Minimal overhead from abstraction layer
- Learning Curve: Developers must learn LiteLLM patterns
Alternatives Considered
1. Direct SDK Integration
Description: Use native SDKs (anthropic, openai, google-generativeai) directly
Pros:
- Full access to provider-specific features
- No abstraction layer
- Direct control over API calls
Cons:
- Separate code paths for each provider (2-5x the code; see the sketch below)
- Complex fallback logic to implement
- Difficult to add new providers
- Harder to maintain consistency
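To make the duplication concrete, compare the native call shapes for two providers (an illustrative sketch, not project code):

```python
# Sketch: two providers, two SDKs, two different request and response shapes.
# Assumes ANTHROPIC_API_KEY and OPENAI_API_KEY are set in the environment.
import anthropic
import openai

prompt = "Hello"

# Anthropic messages API: max_tokens is required; text is in content[0].text.
a_resp = anthropic.Anthropic().messages.create(
    model="claude-3-5-sonnet-20241022",
    max_tokens=1024,
    messages=[{"role": "user", "content": prompt}],
)
print(a_resp.content[0].text)

# OpenAI chat completions API: different parameters and response structure.
o_resp = openai.OpenAI().chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": prompt}],
)
print(o_resp.choices[0].message.content)
```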
2. LangChain ChatModels
Description: Use LangChain’s ChatModel abstraction (sketched below)
Pros:
- Already using LangChain for the agent
- Built-in provider support
- Good integration with LangGraph
Cons:
- Heavier dependency (full LangChain)
- Less flexible fallback logic
- Slower to add new provider support
- More opinionated architecture
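For comparison, a sketch of what the rejected ChatModel approach would look like (class and model names are illustrative):

```python
# Sketch: LangChain's ChatModel abstraction. Each provider still requires
# its own class and package, though invoke() is shared.
from langchain_openai import ChatOpenAI
# from langchain_anthropic import ChatAnthropic  # swapping providers means swapping classes

llm = ChatOpenAI(model="gpt-4o-mini")
result = llm.invoke("Hello")
print(result.content)
```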
3. Custom Abstraction Layer
Description: Build our own provider abstraction
Pros:
- Full control over implementation
- Exactly what we need
- No external dependencies for core logic
Cons:
- Significant development effort
- Maintenance burden
- Reinventing the wheel
- Slower to add provider support
Implementation Details
Provider Configuration
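A minimal sketch of provider configuration, assuming environment-driven settings (keys and defaults are illustrative, not the actual schema):

```python
# Hypothetical configuration sketch; the real schema lives in the project's
# settings module and may differ.
import os

LLM_CONFIG = {
    # Primary model in LiteLLM's "provider/model" format.
    "model": os.getenv("LLM_MODEL", "anthropic/claude-3-5-sonnet-20241022"),
    # Ordered fallbacks tried when the primary provider fails.
    "fallback_models": [
        "openai/gpt-4o-mini",
        "ollama/llama3",  # local model for offline development
    ],
    "temperature": float(os.getenv("LLM_TEMPERATURE", "0.0")),
}
```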
Factory Pattern
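A minimal sketch of the LLMFactory pattern, assuming the configuration above; method names and signatures are illustrative, not the actual llm_factory.py implementation:

```python
# Sketch of the factory: wraps litellm.completion() with provider selection
# and ordered fallbacks. Names and signatures are assumptions.
import litellm


class LLMFactory:
    """Wraps LiteLLM with a configured primary model and ordered fallbacks."""

    def __init__(self, model: str, fallback_models: list[str] | None = None, **defaults):
        self.model = model
        self.fallback_models = fallback_models or []
        self.defaults = defaults  # e.g. temperature, max_tokens

    def complete(self, messages: list[dict]) -> str:
        """Call the primary model, falling back to alternatives on failure."""
        last_error: Exception | None = None
        for model in (self.model, *self.fallback_models):
            try:
                response = litellm.completion(model=model, messages=messages, **self.defaults)
                return response.choices[0].message.content
            except Exception as exc:  # LiteLLM raises OpenAI-style exceptions for all providers
                last_error = exc
        raise RuntimeError(f"All configured models failed: {last_error}")
```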
Usage
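Hypothetical usage, building on the sketches above:

```python
# Construct the factory from configuration; callers never touch provider SDKs.
factory = LLMFactory(
    model=LLM_CONFIG["model"],
    fallback_models=LLM_CONFIG["fallback_models"],],
    temperature=LLM_CONFIG["temperature"],
)

answer = factory.complete([{"role": "user", "content": "Which providers are supported?"}])
print(answer)
```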
References
- LiteLLM Documentation: https://docs.litellm.ai/
- Supported Providers: https://docs.litellm.ai/docs/providers
- integrations/litellm.md
- Related ADRs: 0005 (Pydantic AI uses LiteLLM)