Google Gemini provides state-of-the-art multimodal AI models through two access methods: Google AI Studio (direct API) and Vertex AI (enterprise platform). This guide covers both approaches with the MCP Server.
Gemini 3.0 Pro (Nov 2025) is the latest model, with a 1M-token context window and advanced reasoning. For production workloads, Gemini 2.5 Flash and Gemini 2.5 Pro remain the stable, production-grade options.
New: Anthropic Claude models are also available via Vertex AI! See the Vertex AI Setup Guide for unified access to both Claude and Gemini models.
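Both access methods expose the same chat interface through LangChain. As a rough illustration (the project's actual wiring goes through `LLMFactory`, shown below), the underlying integrations are `langchain-google-genai` for the AI Studio API-key path and `langchain-google-vertexai` for the Vertex AI path; project and region values here are placeholders:

```python
# Sketch: the two access paths side by side. Package names are the public
# LangChain integrations; this project's own setup may wrap them differently.

# Google AI Studio: authenticates with an API key
from langchain_google_genai import ChatGoogleGenerativeAI
studio_llm = ChatGoogleGenerativeAI(model="gemini-2.5-flash", google_api_key="AIzaSy...")

# Vertex AI: authenticates with Google Cloud credentials (ADC)
from langchain_google_vertexai import ChatVertexAI
vertex_llm = ChatVertexAI(model="gemini-2.5-flash", project="my-gcp-project", location="us-central1")
```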
```bash
# Add to Infisical
GOOGLE_API_KEY=AIzaSy...your-key

# In .env, reference Infisical
INFISICAL_PROJECT_ID=your-project-id
```
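Later examples reference a `settings.google_api_key` attribute. A minimal sketch of how such a settings object could load the key from the environment (assuming `pydantic-settings`; the project's real settings module may differ):

```python
# Minimal sketch: load GOOGLE_API_KEY from the environment / .env file.
# Assumes pydantic-settings; the project's actual Settings class may differ.
from pydantic_settings import BaseSettings, SettingsConfigDict

class Settings(BaseSettings):
    model_config = SettingsConfigDict(env_file=".env")
    google_api_key: str = ""  # populated from GOOGLE_API_KEY

settings = Settings()
```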
3. Test
```python
from mcp_server_langgraph.llm.factory import LLMFactory

llm = LLMFactory(
    provider="google",
    model_name="gemini-2.5-flash"
)

response = await llm.ainvoke("What is the capital of France?")
print(response.content)
# Output: "The capital of France is Paris."
```
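The same chat model also supports token streaming through LangChain's standard `astream` interface, which is useful for responsive UIs. A short sketch:

```python
# Stream the response token-by-token instead of waiting for the full reply.
# Uses LangChain's standard astream() interface on chat models.
async for chunk in llm.astream("Write a haiku about Paris."):
    print(chunk.content, end="", flush=True)
```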
```python
from langchain_core.tools import tool

@tool
def get_weather(location: str) -> str:
    """Get current weather for a location"""
    # Implementation
    return f"Weather in {location}: 72°F, Sunny"

@tool
def search_web(query: str) -> str:
    """Search the web for information"""
    # Implementation
    return f"Search results for: {query}"

# Bind tools to LLM
llm_with_tools = llm.bind_tools([get_weather, search_web])

# Use with agent
from langgraph.prebuilt import create_react_agent

agent = create_react_agent(
    llm_with_tools,
    tools=[get_weather, search_web]
)

# Run agent
response = await agent.ainvoke({
    "messages": [("user", "What's the weather in Paris?")]
})
print(response["messages"][-1].content)
```
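If you want to see what the model decided to call before handing control to the agent loop, the tool-bound model returns structured tool calls on the message (standard LangChain behavior; shown here as a sketch):

```python
# Inspect the raw tool calls the model emits, without running the agent loop.
msg = await llm_with_tools.ainvoke("What's the weather in Paris?")
for call in msg.tool_calls:  # list of {"name", "args", "id"} dicts
    print(call["name"], call["args"])
# e.g. get_weather {'location': 'Paris'}
```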
```python
# Enable prompt caching (Gemini 2.5 Pro only)
from langchain_google_genai import ChatGoogleGenerativeAI

llm = ChatGoogleGenerativeAI(
    model="gemini-2.5-pro",
    google_api_key=settings.google_api_key,
    cache_content=True  # Enable caching
)

# First call - full cost
response1 = await llm.ainvoke("Analyze this long document...")

# Second call with similar prompt - cached, cheaper
response2 = await llm.ainvoke("What are the key points in the document?")
```
```python
from langchain_google_genai import HarmBlockThreshold, HarmCategory

# Configure safety settings
llm = ChatGoogleGenerativeAI(
    model="gemini-2.5-flash",
    safety_settings={
        HarmCategory.HARM_CATEGORY_HARASSMENT: HarmBlockThreshold.BLOCK_MEDIUM_AND_ABOVE,
        HarmCategory.HARM_CATEGORY_HATE_SPEECH: HarmBlockThreshold.BLOCK_MEDIUM_AND_ABOVE,
        HarmCategory.HARM_CATEGORY_SEXUALLY_EXPLICIT: HarmBlockThreshold.BLOCK_MEDIUM_AND_ABOVE,
        HarmCategory.HARM_CATEGORY_DANGEROUS_CONTENT: HarmBlockThreshold.BLOCK_MEDIUM_AND_ABOVE,
    }
)

# Handle safety blocks
try:
    response = await llm.ainvoke(query)
except Exception as e:
    if "SAFETY" in str(e):
        logger.warning(f"Content blocked by safety filters: {query}")
        return "I cannot provide a response to that query due to safety concerns."
    raise
```
Error: google.api_core.exceptions.Unauthenticated: 401 API key not valid

Solutions:
```bash
# Verify API key
echo $GOOGLE_API_KEY | head -c 10
# Should start with: AIza

# Test key
curl "https://generativelanguage.googleapis.com/v1beta/models?key=$GOOGLE_API_KEY"

# Regenerate key
# 1. Go to https://aistudio.google.com/app/apikey
# 2. Delete old key
# 3. Create new key
# 4. Update in Infisical/config
```
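If you would rather verify the key from Python than curl, the `google-generativeai` SDK can list available models with the same credentials. A sketch, assuming the SDK is installed (`pip install google-generativeai`):

```python
# Sketch: verify the API key by listing models with the google-generativeai SDK.
import os
import google.generativeai as genai

genai.configure(api_key=os.environ["GOOGLE_API_KEY"])
for m in genai.list_models():
    print(m.name)  # raises an auth error on an invalid key instead of printing
```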