26. Lazy Observability Initialization
Date: 2025-01-17
Status: Accepted
Category: Core Architecture
Context
Prior to v2.8.0, the observability system (OpenTelemetry tracing, metrics, and logging) was initialized automatically at module import time in `src/mcp_server_langgraph/observability/telemetry.py:264-285`.
This approach caused several operational and ergonomic issues:
1. Circular Import Dependency
Problem: The import chain (config → secrets → telemetry) created a circular dependency:
- Module initialization order was non-deterministic
- `settings.langsmith_tracing` and `settings.log_format` were often unavailable during telemetry init
- Telemetry silently fell back to default values, ignoring user configuration
2. Filesystem Operations at Import Time
Problem: Importing the module triggered:
- Creation of the `logs/` directory (line 166)
- Creation of 4 log file handlers (lines 197-219)
- Writes to log files even when just importing for inspection

Impact:
- Broke read-only containers: Serverless environments and read-only root filesystems failed on import
- Broke library usage: Packages importing mcp-server-langgraph as a dependency couldn’t control initialization
- Noise in non-production environments: CI, tests, and development all created unnecessary log files
3. Configuration Race Condition
Problem: Telemetry initialized before configuration was fully loaded:
- Users couldn’t reliably configure `langsmith_tracing` or `log_format`
- Environment variables weren’t consistently honored
- Debugging configuration issues was difficult
4. Inflexible for Embedding
Problem: No way to customize observability for library consumers.

Impact:
- Other packages couldn’t reuse mcp-server-langgraph without accepting our telemetry configuration
- Testing frameworks couldn’t easily mock observability
- Multi-tenant applications couldn’t isolate telemetry per tenant
Decision
We will refactor observability to use lazy initialization with an explicit `init_observability()` function that entry points must call after loading configuration.
Design Principles
- Explicit Over Implicit: Entry points must explicitly initialize observability
- Fail-Fast: Accessing observability before initialization raises `RuntimeError`
- Configuration Respect: Settings are read at initialization time, not import time
- Backward Compatible (Runtime): Existing code using logger/tracer continues to work after init
- Opt-In File Logging: File-based rotation is opt-in to support containerized deployments
Architecture
Module-Level Lazy Proxies
- Import statements remain unchanged: `from ... import logger`
- Type checkers see correct types
- Runtime error if accessed before init (fail-fast)
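The three properties above can be illustrated with a minimal sketch. It is not the project's actual code: the `LazyProxy` class, the `_state` dictionary, and the body of `init_observability()` are assumptions made for illustration. The idea is that the module exports a proxy object at import time, and the proxy only looks up the real logger when one of its attributes is used.

```python
import logging
from typing import Any, Callable, Optional, cast

# Module-level state; stays empty until init_observability() runs.
_state: dict = {"logger": None}


class LazyProxy:
    """Forwards attribute access to a target that is looked up on each use."""

    def __init__(self, name: str, resolver: Callable[[], Optional[Any]]) -> None:
        self._name = name
        self._resolver = resolver

    def __getattr__(self, attr: str) -> Any:
        # __getattr__ runs only when normal lookup fails, so the proxy's own
        # private fields never reach this point once __init__ has finished.
        if attr.startswith("_"):
            raise AttributeError(attr)
        target = self._resolver()
        if target is None:
            raise RuntimeError(
                f"{self._name} used before init_observability() was called"
            )
        return getattr(target, attr)


def init_observability() -> None:
    """Hypothetical initializer: creates the real logger exactly once."""
    if _state["logger"] is None:
        _state["logger"] = logging.getLogger("mcp_server_langgraph")


# cast() tells type checkers this is a Logger; at runtime it is the proxy.
logger = cast(logging.Logger, LazyProxy("logger", lambda: _state["logger"]))
```

Importing this module has no side effects. Calling `logger.info(...)` before `init_observability()` raises `RuntimeError`; the same call succeeds afterwards, and callers never need to change their import line.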
Secrets Manager Decoupling
- Breaks circular dependency
- Works during early initialization
- Gracefully degrades to stdlib logging
File Logging Opt-In
- Works in read-only filesystems by default
- Users opt in via `enable_file_logging=True`
- Reduces startup noise in development/CI
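A minimal sketch of the opt-in behavior follows, using only the standard library. The `build_handlers()` helper, the `app.log` file name, and the rotation limits are hypothetical; only the `enable_file_logging` flag comes from this ADR.

```python
import logging
import sys
from logging.handlers import RotatingFileHandler
from pathlib import Path
from typing import List


def build_handlers(enable_file_logging: bool = False,
                   log_dir: str = "logs") -> List[logging.Handler]:
    """Return console-only handlers unless file logging is requested."""
    handlers: List[logging.Handler] = [logging.StreamHandler(sys.stderr)]
    if enable_file_logging:
        # The filesystem is touched only on this branch.
        directory = Path(log_dir)
        directory.mkdir(parents=True, exist_ok=True)
        handlers.append(
            RotatingFileHandler(
                directory / "app.log",
                maxBytes=10 * 1024 * 1024,  # illustrative rotation size
                backupCount=5,              # illustrative number of backups
            )
        )
    return handlers
```

With the default arguments no directory or file is created, which is what lets the package start on a read-only root filesystem.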
Consequences
Positive
- ✅ No Circular Import: Config → secrets → telemetry chain is broken
- ✅ Configuration Honored: Settings are read at init time, not import time
- ✅ Container-Friendly: No filesystem writes unless explicitly requested
- ✅ Library-Reusable: Can be imported without side effects
- ✅ Testable: Easy to mock/customize observability per test
- ✅ Explicit Dependencies: Clear when observability is needed
Negative
- ⚠️ Breaking Change: Entry points must add an `init_observability()` call
- ⚠️ Migration Required: All existing entry points need updating
- ⚠️ Runtime Errors: Forgetting to initialize causes runtime failures (but fail-fast is good!)
- ⚠️ Documentation Burden: Need clear migration guide and examples
Neutral
- ℹ️ Test Fixtures Updated: `conftest.py` adds a `pytest_configure()` hook
- ℹ️ File Logging Default: Changed from always-on to opt-in
- ℹ️ Import Statements Unchanged: Lazy proxies preserve existing import syntax
Implementation
Files Modified
| File | Changes | Purpose |
|---|---|---|
| src/mcp_server_langgraph/observability/telemetry.py | +168, -21 | Add lazy init, remove module-level init |
| src/mcp_server_langgraph/secrets/manager.py | +50, -3 | Break circular import, add lazy logging |
| src/mcp_server_langgraph/mcp/server_stdio.py | +10 | Add init_observability() call |
| src/mcp_server_langgraph/core/config.py | +1 | Add enable_file_logging setting |
| tests/conftest.py | +13 | Add pytest_configure() hook |
API Surface
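The exact signatures are not reproduced in this record, so the following is only a sketch of what such a surface could look like. The `Settings` dataclass, the `is_initialized()` helper, and the parameter list of `init_observability()` are assumptions; the ADR itself only confirms the function name, the `enable_file_logging` option, the `log_format` and `langsmith_tracing` settings, and that initialization is idempotent.

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class Settings:
    """Hypothetical subset of the real settings object."""
    log_format: str = "text"
    langsmith_tracing: bool = False
    enable_file_logging: bool = False


_active: Optional[Settings] = None  # None means "not initialized"


def is_initialized() -> bool:
    return _active is not None


def init_observability(settings: Optional[Settings] = None,
                       enable_file_logging: Optional[bool] = None) -> Settings:
    """Initialize once; later calls return the configuration already in effect."""
    global _active
    if _active is not None:
        return _active
    cfg = settings if settings is not None else Settings()
    if enable_file_logging is not None:
        # An explicit keyword argument overrides the value from settings.
        cfg = replace(cfg, enable_file_logging=enable_file_logging)
    # Real tracing, metrics, and logging setup would happen here.
    _active = cfg
    return cfg
```

Settings are read when the function is called, not when the module is imported, and a second call from another entry point is a no-op that returns the first configuration.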
Usage Pattern
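The ordering an entry point is expected to follow (load configuration, then initialize, then log) can be sketched as below. Everything here is a stand-in written for illustration: `load_settings()`, the `LOG_FORMAT` environment variable, and this simplified `init_observability()` are assumptions, not the package's real API.

```python
import logging
import os


def load_settings() -> dict:
    """Hypothetical config loader: reads the environment when called."""
    return {"log_format": os.environ.get("LOG_FORMAT", "text")}


def init_observability(settings: dict) -> logging.Logger:
    """Stand-in initializer: configures a logger from already-loaded settings."""
    if settings["log_format"] == "json":
        fmt = '{"level": "%(levelname)s", "message": "%(message)s"}'
    else:
        fmt = "%(levelname)s %(message)s"
    logger = logging.getLogger("entrypoint_demo")
    logger.handlers.clear()  # keep repeated calls from stacking handlers
    handler = logging.StreamHandler()
    handler.setFormatter(logging.Formatter(fmt))
    logger.addHandler(handler)
    logger.setLevel(logging.INFO)
    return logger


def main() -> logging.Logger:
    settings = load_settings()             # 1. configuration first
    logger = init_observability(settings)  # 2. then observability
    logger.info("server starting")         # 3. only now emit telemetry
    return logger
```

Because the format is chosen inside `init_observability()`, a value set in the environment before `main()` runs is honored, which is the configuration race the explicit call is meant to remove.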
Alternatives Considered
Alternative 1: Keep Auto-Initialization, Fix Circular Import
Approach: Break the circular import by moving config access into function calls.

Pros:
- No breaking changes
- Existing code continues to work

Cons:
- Doesn’t solve the filesystem operations problem
- Doesn’t solve the configuration race condition
- Doesn’t make the library embeddable
- Band-aid solution that doesn’t address the root cause
Alternative 2: Dependency Injection
Approach: Pass observability instances as parameters throughout the codebase.

Pros:
- Maximum flexibility
- Testability
- No globals

Cons:
- Massive refactor (1000s of lines)
- Breaking change to ALL modules
- Complex for users
- Overkill for current needs
Alternative 3: Singleton with Lazy Evaluation
Approach: Use the singleton pattern with lazy evaluation on first access.

Pros:
- No explicit initialization needed
- More implicit than current solution

Cons:
- Same circular import issues
- Same filesystem operations at first use
- Less explicit (harder to debug)
- Doesn’t solve the configuration race condition
References
- Issue: ultrathink analysis issue #1 (Decouple observability bootstrapping from import time)
- Related ADRs:
  - ADR-0023: Anthropic Tool Design Best Practices
  - ADR-0024: Agentic Loop Implementation
- Anthropic Best Practices: https://docs.anthropic.com/en/docs/build-with-claude/tool-use
Validation
Success Criteria
- ✅ Package can be imported without filesystem writes
- ✅ `settings.log_format` and `settings.langsmith_tracing` are honored
- ✅ No circular import errors
- ✅ Works in read-only containers
- ✅ All tests pass (42/42 new tests + existing suite)
- ✅ CI build hygiene check passes (no .pyc files committed)
Test Coverage
`tests/unit/test_observability_lazy_init.py` (13 tests):
- Import without filesystem ops
- Lazy accessors raise before init
- Init with defaults
- Init with settings
- Idempotent initialization
- File logging opt-in
- Settings values honored
- Secrets manager works before init
- Context propagation requires init
- Multiple entry points can init
Rollback Plan
If critical issues are discovered:
- Revert to v2.7.0: `pip install mcp-server-langgraph==2.7.0`
- Remove `init_observability()` calls from entry points
- File logging will be auto-enabled again
Notes
- Timeline: Implemented 2025-01-17
- Breaking Change Version: v2.8.0
- Backward Compatibility: Runtime API is backward compatible after calling `init_observability()`
- Migration Effort: Low (~5 lines per entry point)
- Monitoring: No special monitoring needed - standard logging/tracing works
Decision Makers: @vishnu2kmohan
Implementation: @claude-sonnet-4-5
Review: Pending
Date: 2025-01-17