Google Gemini 2.5 Setup Guide
Quick setup guide for using Google Gemini 2.5 models (default configuration).Why Gemini 2.5?
The MCP Server with LangGraph defaults to Gemini 2.5 Flash for several reasons:- ✅ Latest Technology - Google’s newest model family (2025)
- ✅ Fastest Performance - Sub-second response times
- ✅ Cost Efficient - More affordable than GPT-4 or Claude
- ✅ Multimodal - Native support for text, images, video, audio
- ✅ Large Context - 1M+ token context window
- ✅ High Quality - Competitive with GPT-4o and Claude 3.5 Sonnet
Quick Start
1. Get API Key
- Visit: https://aistudio.google.com/apikey
- Sign in with your Google account
- Click “Get API key”
- Create a new API key or use existing one
- Copy the key (starts with
AI...)
2. Configure Environment
3. Test Connection
4. Run MCP Server
Gemini 2.5 Models
gemini-2.5-flash (Default - Recommended)
- Speed: Fastest Gemini model (sub-second responses)
- Cost: Most cost-effective
- Context: 1M+ tokens
- Use Cases: Production applications, chatbots, real-time apps
- Best For: 95% of use cases
gemini-2.5-pro (Most Capable)
- Speed: Slower but more capable
- Cost: Higher cost, premium quality
- Context: 1M+ tokens
- Use Cases: Complex reasoning, research, analysis
- Best For: High-complexity tasks requiring deep reasoning
Model Comparison
| Model | Speed | Cost | Quality | Context | Best For |
|---|---|---|---|---|---|
| Gemini 2.5 Flash | ⚡⚡⚡ | 💰 | ⭐⭐⭐⭐ | 1M+ | Production (Default) |
| Gemini 2.5 Pro | ⚡⚡ | 💰💰💰 | ⭐⭐⭐⭐⭐ | 1M+ | Complex reasoning |
| Claude 3.5 Sonnet | ⚡⚡ | 💰💰 | ⭐⭐⭐⭐⭐ | 200K | Coding, analysis |
| GPT-4o | ⚡⚡ | 💰💰 | ⭐⭐⭐⭐ | 128K | General purpose |
Pricing (Approximate)
Gemini 2.5 Flash:- Input: $0.075 per 1M tokens
- Output: $0.30 per 1M tokens
- ~4x cheaper than GPT-4o
- ~3x cheaper than Claude 3.5
- Input: $1.25 per 1M tokens
- Output: $5.00 per 1M tokens
- Comparable to GPT-4o pricing
Fallback Configuration
The default configuration includes automatic fallback:- Rate limits
- API errors
- Timeouts
- Model unavailability
Advanced Configuration
Increase Context Window
Adjust Temperature
Timeout Settings
Multimodal Capabilities
Gemini 2.5 natively supports:- ✅ Text - Natural language
- ✅ Images - Image understanding and generation
- ✅ Video - Video analysis
- ✅ Audio - Speech and audio processing
- ✅ Code - Programming languages
Example: Image Analysis
Rate Limits
Free Tier:- 15 requests per minute
- 1 million tokens per day
- 1,500 requests per day
- 360 requests per minute
- 4 million tokens per minute
- No daily limits
Troubleshooting
API Key Not Working
Rate Limit Errors
Model Not Found
Slow Responses
Monitoring
Gemini usage is automatically tracked:Switching to Other Providers
Switch to Anthropic
Switch to OpenAI
Switch to Local (Ollama)
Resources
- API Documentation: https://ai.google.dev/docs
- Get API Key: https://aistudio.google.com/apikey
- Pricing: https://ai.google.dev/pricing
- Model Info: https://ai.google.dev/models/gemini
- AI Studio: https://aistudio.google.com
Support
- Google AI Forum: https://discuss.ai.google.dev
- GitHub Issues: Report issues with the agent
- Documentation: See integrations/litellm.md for all providers
Default Configuration Summary:
- Provider: Google
- Model: gemini-2.5-flash
- Fallback: gemini-2.5-pro, claude-sonnet-4-5, gpt-5.1
- Cost: ~75% cheaper than alternatives
- Speed: Fastest available