System Architecture

This document explains the architectural design of AutoDocs MCP Server, covering the layered architecture, component responsibilities, and design principles that guide the implementation.

Architectural Overview

AutoDocs MCP Server follows a modular, layered architecture with comprehensive separation of concerns. The system is designed for production reliability, maintainability, and extensibility.

Architecture Principles

Layered Design: Clear separation between infrastructure, core services, and application layers
Dependency Inversion: Higher layers depend on abstractions, not concrete implementations
Single Responsibility: Each module has a focused, well-defined purpose
Graceful Degradation: System continues operating with partial failures
Async-First: Full asynchronous support for optimal performance

System Layers

1. Infrastructure Layer (`src/autodocs_mcp/`)

The infrastructure layer provides foundational services and system-level functionality.

Main Application (`main.py`)

FastMCP Server: MCP protocol compliance with 8 exposed tools
Lifecycle Management: Graceful startup, shutdown, and async resource management
Service Integration: Coordinates all core services and manages their lifecycles

Configuration Management (`config.py`)

Environment-Aware: Supports development, testing, and production configurations
Validation: Comprehensive Pydantic-based validation with clear error messages
Defaults: Sensible defaults for all configuration parameters

Security Layer (`security.py`)

Input Validation: Sanitization of all user inputs and parameters
Path Security: Secure handling of file paths and cache keys
URL Validation: Validation of external URLs and endpoints
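
As a rough illustration of the path-security idea, the sketch below builds a cache filename from a package name and version and checks that the resolved path stays inside the cache directory. The function names `sanitize_cache_key` and `safe_cache_path` and the allowed-character rule are assumptions made for this example; they are not taken from `security.py`.

```python
import re
from pathlib import Path

# Hypothetical rule: allow only characters commonly seen in package names and versions.
_SAFE_PART = re.compile(r"[A-Za-z0-9][A-Za-z0-9._+!-]*")


def sanitize_cache_key(package_name: str, version: str) -> str:
    """Build a '{package_name}-{version}' key, rejecting unsafe input."""
    for part in (package_name, version):
        if not _SAFE_PART.fullmatch(part) or ".." in part:
            raise ValueError(f"unsafe cache key component: {part!r}")
    return f"{package_name}-{version}"


def safe_cache_path(cache_dir: Path, package_name: str, version: str) -> Path:
    """Return the cache file path, ensuring it stays inside cache_dir."""
    base = cache_dir.resolve()
    key = sanitize_cache_key(package_name, version)
    candidate = (base / f"{key}.json").resolve()
    # Defence in depth: reject anything that resolves outside the cache directory.
    if base not in candidate.parents:
        raise ValueError("resolved path escapes the cache directory")
    return candidate
```

Validating the key first and then re-checking the resolved path gives two independent guards against path traversal.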

Observability (`observability.py`)

Metrics Collection: Performance tracking and system monitoring
Structured Logging: Production-ready logging with context preservation
Performance Monitoring: Request timing, cache hit rates, and error tracking

Health Management (`health.py`)

Health Checks: Comprehensive system health status for load balancers
Readiness Checks: Kubernetes-style readiness probe for deployment orchestration
Dependency Monitoring: Validation of external service availability

Data Models (`models.py`)

Type Safety: Pydantic v2 models for all data structures
Serialization: JSON serialization with proper error handling
Validation: Runtime validation of all data structures

Exception Handling (`exceptions.py`)

Custom Hierarchy: Structured exception types with context information
Error Context: Detailed error information for debugging and user feedback
Recovery Guidance: Actionable error messages with suggested solutions
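
One way such a hierarchy could look is sketched below: a base exception that carries a context dictionary and a recovery suggestion, with specific error types derived from it. The class names and the `to_dict` helper are hypothetical and are not claimed to match `exceptions.py`.

```python
from __future__ import annotations


class AutoDocsError(Exception):
    """Hypothetical base class carrying context and a recovery hint."""

    def __init__(
        self,
        message: str,
        *,
        context: dict | None = None,
        suggestion: str | None = None,
    ) -> None:
        super().__init__(message)
        self.message = message
        self.context = context or {}
        self.suggestion = suggestion

    def to_dict(self) -> dict:
        """Structured form suitable for a tool response or a log record."""
        return {
            "error": type(self).__name__,
            "message": self.message,
            "context": self.context,
            "suggestion": self.suggestion,
        }


class PackageNotFoundError(AutoDocsError):
    """Hypothetical: the requested package does not exist upstream."""


class NetworkError(AutoDocsError):
    """Hypothetical: an external request failed."""
```

Callers can catch `AutoDocsError` to handle every domain error uniformly, or a subclass when a specific recovery path exists.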

2. Core Services Layer (`src/autodocs_mcp/core/`)

The core services layer implements business logic and domain-specific functionality.

Dependency Management

Dependency Parser (`dependency_parser.py`)
- pyproject.toml Parsing: Robust parsing with graceful degradation for malformed files
- Dependency Extraction: Parsing of `[project]` dependencies with version constraints
- Error Recovery: Continues processing despite partial parsing failures

Dependency Resolver (`dependency_resolver.py`)
- Enhanced Resolution: Dependency resolution with conflict detection
- Version Constraint Handling: Complex version constraint parsing and validation
- Transitive Dependencies: Future support for dependency tree analysis

Version Resolver (`version_resolver.py`)
- PyPI Integration: Version constraint resolution using PyPI API
- Caching: Efficient caching of version resolution results
- Fallback Strategies: Graceful handling of version resolution failures

Documentation Services

Documentation Fetcher (`doc_fetcher.py`)
- PyPI API Integration: Fetching package metadata and documentation from PyPI
- Concurrent Processing: Parallel fetching of multiple packages with rate limiting
- Content Processing: Extraction and formatting of relevant documentation sections
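
The concurrent-processing idea can be sketched with an `asyncio.Semaphore` that caps the number of in-flight fetches. Note that this is a concurrency cap used as a simple stand-in for rate limiting; the names `fetch_all`, `fetch_one`, and `max_concurrent` are assumptions for illustration, and the real fetcher may limit requests differently.

```python
import asyncio
from collections.abc import Awaitable, Callable, Iterable


async def fetch_all(
    names: Iterable[str],
    fetch_one: Callable[[str], Awaitable[str]],
    max_concurrent: int = 5,
) -> dict[str, object]:
    """Fetch docs for many packages in parallel, at most max_concurrent at once.

    Failures are returned as exception objects so one bad package
    does not abort the whole batch.
    """
    semaphore = asyncio.Semaphore(max_concurrent)

    async def guarded(name: str) -> tuple[str, object]:
        async with semaphore:
            try:
                return name, await fetch_one(name)
            except Exception as exc:  # degrade gracefully per package
                return name, exc

    pairs = await asyncio.gather(*(guarded(name) for name in names))
    return dict(pairs)
```

Passing `fetch_one` in as a parameter keeps the sketch independent of any particular HTTP client.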

Context Fetcher (`context_fetcher.py`)
- Phase 4 Feature: Comprehensive context fetching with dependency analysis
- Smart Scoping: Intelligent selection of relevant dependencies for AI context
- Token Management: Budget-aware context assembly for AI model limits

Context Formatter (`context_formatter.py`)
- AI Optimization: Documentation formatting optimized for AI consumption
- Token Budget Management: Automatic context truncation and prioritization
- Query Filtering: Targeted documentation section selection
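
Budget-aware assembly with prioritization and truncation might look like the sketch below. The four-characters-per-token estimate, the `(priority, title, text)` tuple shape, and the function names are all assumptions chosen for illustration; the actual formatter may count tokens and rank sections differently.

```python
def estimate_tokens(text: str) -> int:
    """Crude assumption: roughly four characters per token."""
    return max(1, len(text) // 4)


def assemble_context(sections: list[tuple[int, str, str]], budget: int) -> list[str]:
    """Pick sections in priority order until the token budget is spent.

    Each section is (priority, title, text); a lower priority number
    means more important. Only the section text counts toward the budget.
    """
    chosen: list[str] = []
    remaining = budget
    for _, title, text in sorted(sections, key=lambda section: section[0]):
        if remaining <= 0:
            break
        cost = estimate_tokens(text)
        if cost <= remaining:
            chosen.append(f"{title}\n{text}")
            remaining -= cost
        else:
            # Truncate the first section that does not fit, then stop.
            chosen.append(f"{title}\n{text[: remaining * 4]}...")
            break
    return chosen
```

For example, with a budget of 20 tokens, a 40-character high-priority section is kept whole and a 400-character lower-priority section is cut to its first 40 characters.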

Infrastructure Services

Cache Manager (`cache_manager.py`)
- High-Performance Caching: JSON file-based caching with version-specific keys
- Immutable Keys: Version-based cache keys (`{package_name}-{version}`) with no expiration
- Corruption Recovery: Automatic detection and recovery from corrupted cache entries

Network Client (`network_client.py`)
- HTTP Abstraction: Clean HTTP client interface with retry logic
- Connection Pooling: Efficient connection reuse with automatic cleanup
- Request Management: Timeout handling, request queuing, and resource limits

Network Resilience (`network_resilience.py`)
- Circuit Breakers: Advanced network reliability with failure detection
- Exponential Backoff: Smart retry strategies for transient failures
- Connection Pool Management: Automatic resource cleanup and health monitoring

Error Formatter (`error_formatter.py`)
- User-Friendly Messages: Structured error handling with clear, actionable messages
- Error Context: Detailed error information for debugging and troubleshooting
- Recovery Suggestions: Guidance for resolving common error conditions

Component Interactions

Request Flow Architecture

MCP Client Request
       ↓
FastMCP Server (main.py)
       ↓
Security Validation (security.py)
       ↓
Core Service Layer
   ├── Dependency Parser → Version Resolver
   ├── Context Fetcher → Doc Fetcher
   └── Cache Manager ← Network Client
       ↓
Network Resilience Layer
       ↓
External APIs (PyPI)

Data Flow Patterns

Dependency Scanning Flow:
pyproject.toml → Dependency Parser → Dependency Resolver → Structured Response

Documentation Fetching Flow:
Package Request → Cache Check → Network Fetch → Format → Cache Store → Response

Context Assembly Flow:
Dependencies → Context Fetcher → Smart Scoping → Token Budget → Formatted Context

Architectural Decisions

Async-First Design

Decision: Full asynchronous architecture throughout the system

Rationale:
- I/O-bound operations (network requests, file operations) benefit significantly from async
- Better resource utilization for concurrent operations
- Scalability for handling multiple simultaneous requests

Implementation:
- All service methods are async
- Proper async context managers for resources
- asyncio-compatible libraries (httpx, aiofiles where applicable)

Layered Architecture

Decision: Strict separation between infrastructure, core services, and application layers

Rationale:
- Clear separation of concerns improves maintainability
- Dependency inversion enables better testing and mocking
- Modular design supports future architectural evolution

Implementation:
- Infrastructure layer handles system concerns (config, logging, health)
- Core services implement business logic without infrastructure knowledge
- Clear interfaces between layers
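
Dependency inversion between layers can be illustrated with a `typing.Protocol`: the core service depends only on an interface, and a test double can stand in for the network-backed implementation. `DocSource`, `DocService`, and `InMemorySource` are hypothetical names invented for this sketch.

```python
from typing import Protocol


class DocSource(Protocol):
    """Abstraction the core layer depends on (hypothetical name)."""

    async def fetch(self, package_name: str, version: str) -> str: ...


class DocService:
    """Core-layer logic that knows only the DocSource interface."""

    def __init__(self, source: DocSource) -> None:
        self._source = source

    async def summary(self, package_name: str, version: str, max_chars: int = 200) -> str:
        text = await self._source.fetch(package_name, version)
        return text[:max_chars]


class InMemorySource:
    """Test double; a real implementation might wrap an HTTP client."""

    def __init__(self, docs: dict[tuple[str, str], str]) -> None:
        self._docs = docs

    async def fetch(self, package_name: str, version: str) -> str:
        return self._docs[(package_name, version)]
```

Because `InMemorySource` satisfies the protocol structurally, `DocService` can be tested without any network access.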

Version-Based Caching

Decision: Use immutable cache keys based on package name and version

Rationale:
- Package versions are immutable, so documentation won't change for a given version
- Eliminates cache invalidation complexity
- Optimal performance for repeated requests

Implementation:
- Cache keys: `{package_name}-{version}.json`
- No time-based expiration
- Automatic cache validation and corruption recovery

Circuit Breaker Pattern

Decision: Implement circuit breakers for external API calls

Rationale:
- Prevents cascading failures when PyPI or other services are unavailable
- Improves system resilience and user experience
- Enables graceful degradation

Implementation:
- Configurable failure thresholds
- Exponential backoff with jitter
- Automatic circuit recovery
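
The three implementation points can be sketched as a small circuit breaker plus a backoff helper. This is a generic illustration of the pattern, not the code in `network_resilience.py`; the class and parameter names, the default values, and the choice of "full jitter" backoff are assumptions.

```python
import random
import time
from collections.abc import Callable


class CircuitBreaker:
    """Opens after repeated failures, then allows a trial request after a timeout."""

    def __init__(
        self,
        failure_threshold: int = 3,
        recovery_timeout: float = 30.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.failure_threshold = failure_threshold
        self.recovery_timeout = recovery_timeout
        self._clock = clock
        self._failures = 0
        self._opened_at: float | None = None

    def allow_request(self) -> bool:
        if self._opened_at is None:
            return True  # closed: requests flow normally
        # Open: block until the recovery timeout elapses, then allow a trial.
        return self._clock() - self._opened_at >= self.recovery_timeout

    def record_success(self) -> None:
        self._failures = 0
        self._opened_at = None  # automatic recovery: close the circuit

    def record_failure(self) -> None:
        self._failures += 1
        if self._failures >= self.failure_threshold:
            self._opened_at = self._clock()  # open (or re-open) the circuit


def backoff_delay(attempt: int, base: float = 0.5, cap: float = 30.0) -> float:
    """Full-jitter exponential backoff: uniform in [0, min(cap, base * 2**attempt)]."""
    return random.uniform(0.0, min(cap, base * (2 ** attempt)))
```

A caller would check `allow_request()` before each external call, sleep for `backoff_delay(attempt)` between retries, and report the outcome with `record_success()` or `record_failure()`.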

Performance Considerations

Concurrent Processing

Parallel dependency fetching with configurable limits
Connection pooling with automatic cleanup
Request queuing and rate limiting

Memory Management

Streaming JSON processing for large responses
Bounded cache sizes with LRU eviction
Efficient string handling and memory cleanup
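
A bounded cache with LRU eviction can be sketched in a few lines with `collections.OrderedDict`. The class below is a generic illustration with assumed names; the document does not say which caches in the system are bounded this way or how their limits are configured.

```python
from collections import OrderedDict
from typing import Any, Hashable


class LRUCache:
    """In-memory cache that evicts the least recently used entry when full."""

    def __init__(self, max_entries: int) -> None:
        if max_entries < 1:
            raise ValueError("max_entries must be at least 1")
        self._max_entries = max_entries
        self._data: OrderedDict[Hashable, Any] = OrderedDict()

    def get(self, key: Hashable, default: Any = None) -> Any:
        if key not in self._data:
            return default
        self._data.move_to_end(key)  # mark as most recently used
        return self._data[key]

    def put(self, key: Hashable, value: Any) -> None:
        self._data[key] = value
        self._data.move_to_end(key)
        if len(self._data) > self._max_entries:
            self._data.popitem(last=False)  # drop the least recently used entry

    def __len__(self) -> int:
        return len(self._data)
```

Reading an entry refreshes its position, so frequently requested items survive while stale ones are evicted first.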

Resource Limits

Configurable timeouts for all external requests
Maximum context size limits for AI compatibility
Connection pool size management

Security Architecture

Input Validation

All user inputs validated and sanitized
Path traversal prevention for cache operations
URL validation for external requests

Resource Protection

Rate limiting for external API calls
Memory usage monitoring and limits
Secure temporary file handling

Error Information Security

Sanitized error messages to prevent information disclosure
Secure logging practices (no sensitive data in logs)
Controlled error context in responses

Evolution and Extensibility

Plugin Architecture Readiness

The current architecture supports future plugin-based extensions:
- Clear service interfaces suitable for plugin implementation
- Configuration system designed for plugin parameters
- Event system foundation for plugin lifecycle management

Service Container Pattern

The infrastructure supports evolution toward a service container:
- Dependency injection patterns already established
- Service lifecycle management in place
- Configuration-driven service instantiation

Microservice Decomposition

The layered architecture supports future microservice extraction:
- Clear service boundaries
- Network-aware interfaces
- Independent service configuration

This architectural foundation meets current requirements while leaving room for future evolution and scaling.