1. Multi-Provider LLM Support via LiteLLM

Date: 2025-10-11

Status

Accepted

Category

Core Architecture

Context

The MCP server needs to support multiple LLM providers to offer flexibility, avoid vendor lock-in, and enable fallback mechanisms for high availability. Users may want to:
  • Use different providers based on cost, performance, or features
  • Switch providers without code changes
  • Implement automatic fallback when one provider fails
  • Support both cloud and local/open-source models
Direct integration with each provider (Anthropic SDK, OpenAI SDK, Google SDK, etc.) would require:
  • Maintaining separate code paths for each provider
  • Different message formats and APIs
  • Complex fallback logic
  • Difficulty adding new providers

Decision

We will use LiteLLM as the unified interface for all LLM providers (an illustrative call is shown after the list below). LiteLLM provides:
  • Single API interface compatible with 100+ LLM providers
  • Automatic message format translation
  • Built-in retry and fallback logic
  • Support for both cloud and local models (Ollama)
  • OpenAI-compatible API format
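For illustration, the call shape LiteLLM exposes is the same for every provider; only the model string changes. The model identifiers below are examples rather than the project's configured defaults.

# Illustrative only: LiteLLM's provider-agnostic completion call
import litellm

messages = [{"role": "user", "content": "Summarize this ticket in one sentence."}]

# Same call, different providers, selected purely by the model string prefix
gemini_resp = litellm.completion(model="gemini/gemini-2.5-flash-002", messages=messages)
claude_resp = litellm.completion(model="anthropic/claude-sonnet-4-5", messages=messages)
local_resp = litellm.completion(model="ollama/llama3", messages=messages)

# Responses use the OpenAI-compatible shape regardless of provider
print(gemini_resp.choices[0].message.content)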
Implementation in llm_factory.py (sketched after this list):
  • LLMFactory class wraps LiteLLM
  • Provider selection via configuration
  • Automatic fallback to alternative models
  • Consistent interface regardless of provider
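The actual llm_factory.py is not reproduced in this ADR. The following is a minimal sketch, assuming the factory wraps litellm.acompletion, accepts OpenAI-style message dicts, and walks the fallback list in order when a call fails; the constructor parameters mirror those passed by create_llm_from_config below, but the internals are an assumption.

# Sketch only: assumed shape of the LLMFactory wrapper, not the actual implementation
import litellm

class LLMFactory:
    def __init__(self, provider: str, model_name: str,
                 enable_fallback: bool = True,
                 fallback_models: list[str] | None = None):
        self.provider = provider
        self.model_name = model_name
        self.enable_fallback = enable_fallback
        self.fallback_models = fallback_models or []

    async def ainvoke(self, messages: list[dict]):
        # Try the primary model first, then each configured fallback in order
        candidates = [self.model_name]
        if self.enable_fallback:
            candidates += self.fallback_models
        last_error: Exception | None = None
        for model in candidates:
            try:
                return await litellm.acompletion(model=model, messages=messages)
            except Exception as exc:  # e.g. provider outage, rate limit
                last_error = exc
        raise last_error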

Consequences

Positive Consequences

  • Flexibility: Easy to switch providers via configuration
  • Reliability: Automatic fallback increases uptime
  • Simplicity: Single code path for all providers
  • Extensibility: New providers become available as soon as LiteLLM supports them, with no code changes on our side
  • Local Development: Can use Ollama for offline development
  • Cost Optimization: Easy to use cheaper models as fallbacks

Negative Consequences

  • Abstraction Layer: Adds an extra layer, and an extra dependency, between our code and the provider APIs
  • Feature Limitations: Provider-specific features may not be exposed
  • Debugging Complexity: Errors may be obscured by abstraction
  • Dependency Risk: Reliant on LiteLLM maintenance

Neutral Consequences

  • Performance: Minimal overhead from abstraction layer
  • Learning Curve: Developers must learn LiteLLM patterns

Alternatives Considered

1. Direct SDK Integration

Description: Use native SDKs (anthropic, openai, google-generativeai) directly
Pros:
  • Full access to provider-specific features
  • No abstraction layer
  • Direct control over API calls
Cons:
  • Separate code paths for each provider (2-5x code)
  • Complex fallback logic to implement
  • Difficult to add new providers
  • Harder to maintain consistency
Why Rejected: Too much duplication and complexity

2. LangChain ChatModels

Description: Use LangChain’s ChatModel abstraction
Pros:
  • Already using LangChain for agent
  • Built-in provider support
  • Good integration with LangGraph
Cons:
  • Heavier dependency (full LangChain)
  • Less flexible fallback logic
  • Slower to add new provider support
  • More opinionated architecture
Why Rejected: LiteLLM is more lightweight and flexible

3. Custom Abstraction Layer

Description: Build our own provider abstraction
Pros:
  • Full control over implementation
  • Exactly what we need
  • No external dependencies for core logic
Cons:
  • Significant development effort
  • Maintenance burden
  • Reinventing the wheel
  • Slower to add provider support
Why Rejected: Not worth reinventing when a good solution already exists

Implementation Details

Provider Configuration

# src/mcp_server_langgraph/core/config.py
llm_provider: str = "google"  # google, anthropic, openai, ollama
model_name: str = "gemini-2.5-flash-002"
fallback_models: list[str] = ["gemini-2.5-pro", "claude-sonnet-4-5"]
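How these fields are wired up is not shown in this ADR; one plausible sketch, assuming a pydantic-settings class (the class name and the BaseSettings usage are assumptions), is:

# Assumed sketch of the settings class these fields live in (pydantic-settings style)
from pydantic_settings import BaseSettings

class Settings(BaseSettings):
    llm_provider: str = "google"  # google, anthropic, openai, ollama
    model_name: str = "gemini-2.5-flash-002"
    fallback_models: list[str] = ["gemini-2.5-pro", "claude-sonnet-4-5"]

# With BaseSettings, each field can be overridden via an environment variable,
# e.g. LLM_PROVIDER=anthropic, so switching providers needs no code change.
settings = Settings()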

Factory Pattern

# llm_factory.py
def create_llm_from_config(config) -> LLMFactory:
    return LLMFactory(
        provider=config.llm_provider,
        model_name=config.model_name,
        enable_fallback=True,
        fallback_models=config.fallback_models
    )

Usage

llm = create_llm_from_config(settings)
response = await llm.ainvoke(messages)  # Works with any provider
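A slightly fuller example, with assumed import paths and an OpenAI-style message list (both are assumptions; the ADR itself only specifies the two lines above):

# Illustration with assumed import paths and message format
import asyncio

from mcp_server_langgraph.core.config import settings  # assumed settings instance
from mcp_server_langgraph.llm_factory import create_llm_from_config  # assumed module path

async def main():
    llm = create_llm_from_config(settings)
    messages = [{"role": "user", "content": "Ping"}]
    # If the primary model fails, the factory falls back to the configured alternatives
    response = await llm.ainvoke(messages)
    print(response)

asyncio.run(main())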

References