Linter Core Framework Design¶
This document specifies the core linter framework design implementing the validated hybrid modular architecture from ADR-001. The design provides the foundational abstractions for rule implementation, violation collection, and single-pass CST analysis orchestration.
Core Framework Architecture¶
Framework Components¶
The linter core consists of five primary components forming the analysis pipeline:
- Violation Data Structures
Immutable data classes representing rule violations with precise location information and context extraction capabilities.
- BaseRule Framework
Abstract base class providing the collection-then-analysis pattern with LibCST metadata integration and violation reporting utilities.
- Rule Registry System
Bidirectional mapping between VBL codes and rule implementations supporting both code-based and descriptive-name-based configuration.
- Engine Orchestration
Central coordinator managing the single-pass analysis pipeline with rule execution, violation collection, and deduplication.
- Context Extraction Utilities
Validated patterns for enhancing violation reports with source code context display.
Data Structure Design¶
Violation Representation¶
The violation data structures follow immutable design patterns with precise positioning and context extraction:
from . import __

# Core violation data structure
class Violation( __.immut.DataclassObject ):
    ''' Represents a rule violation with precise location and context information. '''
    rule_id: str
    filename: __.typx.Annotated[
        str, __.ddoc.Doc( 'Path to source file containing violation.' ) ]
    line: __.typx.Annotated[
        int, __.ddoc.Doc( 'One-indexed line number of violation.' ) ]
    column: __.typx.Annotated[
        int, __.ddoc.Doc( 'Zero-indexed column position of violation.' ) ]
    message: __.typx.Annotated[
        str, __.ddoc.Doc( 'Human-readable description of violation.' ) ]
    severity: str = 'error'

# Context extraction for enhanced error reporting
class ViolationContext( __.immut.DataclassObject ):
    ''' Represents source code context surrounding a violation for enhanced error reporting. '''
    violation: Violation
    context_lines: __.typx.Annotated[
        tuple[ str, ... ], __.ddoc.Doc( 'Source lines surrounding violation.' ) ]
    context_start_line: __.typx.Annotated[
        int, __.ddoc.Doc( 'One-indexed starting line of context display.' ) ]

# Type aliases for rule framework contracts
ViolationSequence: __.typx.TypeAlias = __.cabc.Sequence[ Violation ]
ViolationContextSequence: __.typx.TypeAlias = __.cabc.Sequence[ ViolationContext ]
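As a rough illustration of what the immutable design implies for callers, the sketch below uses a frozen standard-library dataclass as a stand-in for `__.immut.DataclassObject` (an assumption made only so the example runs on its own). The field names follow the design above; the sample values are hypothetical.

```python
import dataclasses


@dataclasses.dataclass( frozen = True )
class Violation:
    ''' Stand-in for the framework class; not the real implementation. '''
    rule_id: str
    filename: str
    line: int        # one-indexed
    column: int      # zero-indexed
    message: str
    severity: str = 'error'


# Hypothetical sample values.
violation = Violation(
    rule_id = 'VBL101', filename = 'example.py',
    line = 12, column = 4, message = 'Blank line in function body.' )

# Frozen instances reject attribute assignment after construction.
try:
    violation.line = 13
except dataclasses.FrozenInstanceError:
    mutation_rejected = True
else:
    mutation_rejected = False

# Value equality and hashability allow set-based deduplication later on.
duplicate = dataclasses.replace( violation )
unique = { violation, duplicate }
```

Because equal violations hash identically, an engine can deduplicate them with a plain set, which is one reason to prefer immutable records here.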
Exception Hierarchy¶
Package-specific exceptions following established hierarchy patterns:
from . import __

class Omniexception( __.immut.Object, BaseException ):
    ''' Base for all exceptions raised by linter framework. '''

class Omnierror( Omniexception, Exception ):
    ''' Base for error exceptions raised by linter framework. '''

# Rule execution exceptions
class RuleExecuteFailure( Omnierror ):
    ''' Raised when rule execution encounters unrecoverable error. '''

class MetadataProvideFailure( Omnierror ):
    ''' Raised when LibCST metadata provider initialization fails. '''

# Configuration exceptions
class RuleRegistryInvalidity( Omnierror ):
    ''' Raised when rule registry contains invalid mappings. '''

class RuleConfigureFailure( Omnierror ):
    ''' Raised when rule configuration parameters are invalid. '''
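The practical benefit of the hierarchy is that callers can catch every framework error with a single `except Omnierror` clause. The following sketch omits the `__.immut.Object` base so that it is self-contained; the `resolve` helper and the `VBL999` identifier are hypothetical.

```python
class Omniexception( BaseException ):
    ''' Base for all exceptions raised by linter framework. '''


class Omnierror( Omniexception, Exception ):
    ''' Base for error exceptions raised by linter framework. '''


class RuleExecuteFailure( Omnierror ):
    ''' Raised when rule execution encounters unrecoverable error. '''


class RuleRegistryInvalidity( Omnierror ):
    ''' Raised when rule registry contains invalid mappings. '''


def resolve( identifier: str ) -> str:
    ''' Hypothetical helper which rejects unknown identifiers. '''
    known = { 'VBL101' }
    if identifier not in known:
        raise RuleRegistryInvalidity(
            f"Unknown rule identifier: {identifier!r}" )
    return identifier


# One handler covers every error derived from Omnierror.
try:
    resolve( 'VBL999' )
except Omnierror as exc:
    caught = type( exc ).__name__
```

Since `Omnierror` also derives from `Exception`, generic `except Exception` handlers in calling code still see these errors.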
BaseRule Framework Design¶
Abstract Base Class Interface¶
The BaseRule framework implements the validated collection-then-analysis pattern with LibCST integration:
from . import __

class BaseRule( __.abc.ABC, __.libcst.CSTVisitor ):
    ''' Abstract base class for linting rules implementing collection-then-analysis pattern.

        Rules collect data during CST traversal and perform analysis in leave_Module to
        generate violations. This pattern supports complex rules requiring complete
        information before analysis can occur.
    '''

    METADATA_DEPENDENCIES = (
        __.libcst.metadata.PositionProvider,
        __.libcst.metadata.ScopeProvider,
        __.libcst.metadata.QualifiedNameProvider,
    )

    def __init__(
        self,
        filename: __.typx.Annotated[
            str, __.ddoc.Doc( 'Path to source file being analyzed.' ) ],
        wrapper: __.typx.Annotated[
            __.libcst.MetadataWrapper,
            __.ddoc.Doc( 'LibCST metadata wrapper providing position and scope information.' ) ],
        source_lines: __.typx.Annotated[
            tuple[ str, ... ], __.ddoc.Doc( 'Source file lines for context extraction.' ) ],
    ) -> None: ...

    @property
    @__.abc.abstractmethod
    def rule_id( self ) -> __.typx.Annotated[
        str, __.ddoc.Doc( 'Unique identifier for rule (VBL code).' ) ]: ...

    @property
    def violations( self ) -> tuple[ Violation, ... ]:
        ''' Returns violations generated by rule analysis. '''

    def leave_Module(
        self, node: __.libcst.Module
    ) -> __.typx.Optional[ __.libcst.Module ]:
        ''' Performs collection analysis after CST traversal completes.

            Subclasses must override _analyze_collections to implement rule-specific
            analysis logic using collected data.
        '''

    @__.abc.abstractmethod
    def _analyze_collections( self ) -> None:
        ''' Analyzes collected data and generates violations.

            Called by leave_Module after traversal completes. Implementations
            should examine collected data and call _produce_violation for
            any violations discovered.
        '''

    def _produce_violation(
        self,
        node: __.libcst.CSTNode,
        message: __.typx.Annotated[
            str, __.ddoc.Doc( 'Human-readable violation description.' ) ],
        severity: str = 'error',
    ) -> None:
        ''' Creates violation from CST node with precise positioning. '''

    def _extract_context(
        self,
        line: __.typx.Annotated[ int, __.ddoc.Doc( 'One-indexed line number.' ) ],
        context_size: int = 2,
    ) -> ViolationContext:
        ''' Extracts source code context around violation for enhanced reporting. '''

    def _position_from_node(
        self, node: __.libcst.CSTNode
    ) -> tuple[ int, int ]:
        ''' Extracts (line, column) position from CST node using metadata. '''
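The collection-then-analysis pattern can be sketched without LibCST. The example below substitutes the standard-library `ast.NodeVisitor` for `libcst.CSTVisitor` purely so that it runs without third-party dependencies; the rule itself, its `VBL999` code, and the tuple-shaped violations are hypothetical. The method names `_analyze_collections` and `_produce_violation` mirror the interface above.

```python
import ast


class DuplicateFunctionRule( ast.NodeVisitor ):
    ''' Hypothetical rule: reports function names defined more than once. '''

    rule_id = 'VBL999'  # hypothetical code, not part of the design

    def __init__( self, filename: str ) -> None:
        self.filename = filename
        self._definitions: dict[ str, list[ ast.FunctionDef ] ] = { }
        self._violations: list[ tuple[ str, str, int, int, str ] ] = [ ]

    @property
    def violations( self ) -> tuple[ tuple[ str, str, int, int, str ], ... ]:
        return tuple( self._violations )

    def visit_FunctionDef( self, node: ast.FunctionDef ) -> None:
        # Collection phase: record the node and defer any judgment.
        self._definitions.setdefault( node.name, [ ] ).append( node )
        self.generic_visit( node )

    def visit_Module( self, node: ast.Module ) -> None:
        self.generic_visit( node )
        # Analysis phase: plays the role of leave_Module in the design.
        self._analyze_collections( )

    def _analyze_collections( self ) -> None:
        for name, nodes in self._definitions.items( ):
            for node in nodes[ 1 : ]:
                self._produce_violation(
                    node, f"Function '{name}' is redefined." )

    def _produce_violation( self, node: ast.AST, message: str ) -> None:
        # ast reports one-indexed lines and zero-indexed columns.
        self._violations.append( (
            self.rule_id, self.filename,
            node.lineno, node.col_offset, message ) )


source = 'def f(): pass\ndef g(): pass\ndef f(): pass\n'
rule = DuplicateFunctionRule( 'example.py' )
rule.visit( ast.parse( source ) )
```

The rule cannot know that `f` is redefined until it has seen every definition, which is the kind of case the deferred analysis step is meant to handle.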
Rule Registry Design¶
VBL Code Mapping System¶
The registry provides bidirectional mapping between VBL codes and rule implementations:
from . import __

# Rule registry data structures
class RuleDescriptor( __.immut.DataclassObject ):
    ''' Describes rule metadata for registry and configuration systems. '''
    odr_code: __.typx.Annotated[
        str, __.ddoc.Doc( 'VBL code identifier (e.g., "VBL101").' ) ]
    descriptive_name: __.typx.Annotated[
        str, __.ddoc.Doc( 'Hyphen-separated descriptive name (e.g., "blank-line-elimination").' ) ]
    description: __.typx.Annotated[
        str, __.ddoc.Doc( 'Human-readable rule description.' ) ]
    category: __.typx.Annotated[
        str, __.ddoc.Doc( 'Rule category (readability, discoverability, robustness).' ) ]
    subcategory: __.typx.Annotated[
        str, __.ddoc.Doc( 'Rule subcategory (compactness, nomenclature, navigation, etc.).' ) ]
    rule_class: __.typx.Annotated[
        str, __.ddoc.Doc( 'Fully qualified class name for rule implementation.' ) ]

RuleRegistry: __.typx.TypeAlias = __.immut.Dictionary[ str, RuleDescriptor ]

RuleClassFactory: __.typx.TypeAlias = __.cabc.Callable[
    [ str, __.libcst.MetadataWrapper, tuple[ str, ... ] ], BaseRule ]
''' Factory function type for creating rule instances.

    Used by the rule registry system to instantiate rules dynamically
    based on VBL codes. Takes filename, LibCST metadata wrapper, and
    source lines as parameters, returns configured rule instance ready
    for analysis.

    This enables the registry to support rules with different constructor
    signatures while maintaining a consistent instantiation interface.
    The registry can map VBL codes to factory functions that handle any
    rule-specific initialization requirements while presenting a uniform
    API to the engine.
'''

# Registry interface
class RuleRegistryManager:
    ''' Manages bidirectional mapping between VBL codes and rule implementations. '''

    def __init__(
        self, registry: __.cabc.Mapping[ str, RuleDescriptor ]
    ) -> None: ...

    def resolve_rule_identifier(
        self, identifier: __.typx.Annotated[
            str, __.ddoc.Doc( 'VBL code or descriptive name to resolve.' ) ]
    ) -> __.typx.Annotated[
        str, __.ddoc.Doc( 'Canonical VBL code for identifier.' ),
        __.ddoc.Raises( RuleRegistryInvalidity, 'If identifier is not registered.' ),
    ]: ...

    def produce_rule_instance(
        self,
        odr_code: __.typx.Annotated[ str, __.ddoc.Doc( 'VBL code for rule.' ) ],
        filename: str,
        wrapper: __.libcst.MetadataWrapper,
        source_lines: tuple[ str, ... ],
    ) -> __.typx.Annotated[
        BaseRule,
        __.ddoc.Doc( 'Instantiated rule ready for analysis.' ),
        __.ddoc.Raises( RuleRegistryInvalidity, 'If VBL code is not registered.' ),
    ]: ...

    def survey_available_rules( self ) -> tuple[ RuleDescriptor, ... ]:
        ''' Returns all registered rule descriptors sorted by VBL code. '''

    def filter_rules_by_category(
        self,
        category: __.typx.Annotated[ str, __.ddoc.Doc( 'Category to filter by.' ) ]
    ) -> tuple[ RuleDescriptor, ... ]:
        ''' Returns rule descriptors matching specified category. '''
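One plausible way to satisfy the `resolve_rule_identifier` contract is to keep a second index from descriptive names back to VBL codes. The sketch below assumes that approach; it uses a trimmed-down descriptor, a plain `Exception` subclass in place of the framework exception, and a hypothetical second rule (`VBL201`) for illustration.

```python
import dataclasses
from collections.abc import Mapping


class RuleRegistryInvalidity( Exception ):
    ''' Stand-in for the framework exception of the same name. '''


@dataclasses.dataclass( frozen = True )
class RuleDescriptor:
    ''' Trimmed-down descriptor; the design carries more fields. '''
    odr_code: str
    descriptive_name: str
    category: str


class RuleRegistryManager:
    ''' Sketch of name-or-code resolution; not the real implementation. '''

    def __init__( self, registry: Mapping[ str, RuleDescriptor ] ) -> None:
        self._by_code = dict( registry )
        # Reverse index: descriptive name -> VBL code.
        self._code_by_name = {
            descriptor.descriptive_name: code
            for code, descriptor in self._by_code.items( ) }

    def resolve_rule_identifier( self, identifier: str ) -> str:
        if identifier in self._by_code:
            return identifier
        if identifier in self._code_by_name:
            return self._code_by_name[ identifier ]
        raise RuleRegistryInvalidity(
            f"Unregistered rule identifier: {identifier!r}" )

    def survey_available_rules( self ) -> tuple[ RuleDescriptor, ... ]:
        return tuple( self._by_code[ code ] for code in sorted( self._by_code ) )

    def filter_rules_by_category(
        self, category: str
    ) -> tuple[ RuleDescriptor, ... ]:
        return tuple(
            descriptor for descriptor in self.survey_available_rules( )
            if descriptor.category == category )


manager = RuleRegistryManager( {
    'VBL201': RuleDescriptor( 'VBL201', 'hypothetical-rule', 'discoverability' ),
    'VBL101': RuleDescriptor( 'VBL101', 'blank-line-elimination', 'readability' ),
} )
canonical = manager.resolve_rule_identifier( 'blank-line-elimination' )
```

With this arrangement, configuration files may mix codes and descriptive names freely, and the engine only ever sees canonical VBL codes.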
Engine Design¶
Orchestration Interface¶
The Engine coordinates single-pass analysis with rule execution and violation collection:
from . import __

# Engine configuration
class EngineConfiguration( __.immut.DataclassObject ):
    ''' Configuration for linter engine behavior and rule selection. '''
    enabled_rules: __.typx.Annotated[
        frozenset[ str ], __.ddoc.Doc( 'VBL codes of rules to execute.' ) ]
    rule_parameters: __.typx.Annotated[
        __.immut.Dictionary[ str, __.immut.Dictionary[ str, __.typx.Any ] ],
        __.ddoc.Doc( 'Rule-specific configuration parameters indexed by VBL code.' ) ]
    context_size: __.typx.Annotated[
        int, __.ddoc.Doc( 'Number of context lines to extract around violations.' ) ] = 2
    include_context: __.typx.Annotated[
        bool, __.ddoc.Doc( 'Whether to extract source context for violations.' ) ] = True

# Analysis results
class Report( __.immut.DataclassObject ):
    ''' Results of linting analysis including violations and metadata. '''
    violations: tuple[ Violation, ... ]
    contexts: __.typx.Annotated[
        tuple[ ViolationContext, ... ],
        __.ddoc.Doc( 'Violation contexts when context extraction enabled.' ) ]
    filename: str
    rule_count: __.typx.Annotated[
        int, __.ddoc.Doc( 'Number of rules executed during analysis.' ) ]
    analysis_duration_ms: __.typx.Annotated[
        float, __.ddoc.Doc( 'Time spent in analysis phase excluding parsing.' ) ]

# Engine orchestration interface
class Engine:
    ''' Central orchestrator for linting analysis implementing single-pass CST traversal. '''

    def __init__(
        self,
        registry_manager: __.typx.Annotated[
            RuleRegistryManager, __.ddoc.Doc( 'Rule registry for instantiating rules.' ) ],
        configuration: __.typx.Annotated[
            EngineConfiguration, __.ddoc.Doc( 'Engine configuration and rule selection.' ) ],
    ) -> None: ...

    def lint_file(
        self,
        file_path: __.typx.Annotated[
            __.pathlib.Path, __.ddoc.Doc( 'Path to Python source file to analyze.' ) ]
    ) -> __.typx.Annotated[
        Report,
        __.ddoc.Doc( 'Analysis results including violations and metadata.' ),
        __.ddoc.Raises( RuleExecuteFailure, 'If rule execution fails unrecoverably.' ),
        __.ddoc.Raises( MetadataProvideFailure, 'If LibCST metadata initialization fails.' ),
    ]: ...

    def lint_source(
        self,
        source_code: __.typx.Annotated[
            str, __.ddoc.Doc( 'Python source code to analyze.' ) ],
        filename: __.typx.Annotated[
            str, __.ddoc.Doc( 'Logical filename for source code.' ) ] = '<string>',
    ) -> __.typx.Annotated[
        Report,
        __.ddoc.Doc( 'Analysis results including violations and metadata.' ),
        __.ddoc.Raises( RuleExecuteFailure, 'If rule execution fails unrecoverably.' ),
        __.ddoc.Raises( MetadataProvideFailure, 'If LibCST metadata initialization fails.' ),
    ]: ...

    def lint_files(
        self,
        file_paths: __.cabc.Sequence[ __.pathlib.Path ]
    ) -> __.typx.Annotated[
        tuple[ Report, ... ],
        __.ddoc.Doc( 'Analysis results for all files.' ),
    ]: ...
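The orchestration steps (parse once, run the enabled rules, collect, deduplicate, time the analysis phase) might look roughly like the sketch below. It uses the standard-library `ast` module rather than LibCST, models rules as plain callables, and returns a bare tuple instead of a `Report`; the `VBL901` and `VBL902` codes and the `find_bare_excepts` checker are hypothetical. How the real engine combines rule traversals into one pass is not shown here.

```python
import ast
import time
from collections.abc import Callable, Mapping
from typing import NamedTuple


class Violation( NamedTuple ):
    rule_id: str
    filename: str
    line: int
    column: int
    message: str


# Hypothetical rule shape: parsed tree and filename in, violations out.
Rule = Callable[ [ ast.Module, str ], list[ Violation ] ]


def find_bare_excepts( tree: ast.Module, filename: str ) -> list[ Violation ]:
    return [
        Violation(
            'VBL901', filename, node.lineno, node.col_offset,
            'Bare except clause.' )
        for node in ast.walk( tree )
        if isinstance( node, ast.ExceptHandler ) and node.type is None ]


def lint_source(
    source_code: str,
    rules: Mapping[ str, Rule ],
    enabled_rules: frozenset[ str ],
    filename: str = '<string>',
) -> tuple[ tuple[ Violation, ... ], int, float ]:
    tree = ast.parse( source_code )  # parsed once and shared by all rules
    start = time.perf_counter( )     # timing excludes parsing
    collected: list[ Violation ] = [ ]
    for code in sorted( enabled_rules ):
        collected.extend( rules[ code ]( tree, filename ) )
    # Deduplicate identical reports, then order them by position.
    violations = tuple( sorted(
        set( collected ),
        key = lambda v: ( v.line, v.column, v.rule_id ) ) )
    duration_ms = ( time.perf_counter( ) - start ) * 1000.0
    return violations, len( enabled_rules ), duration_ms


source = 'try:\n    pass\nexcept:\n    pass\n'
# The same checker is registered twice purely to exercise deduplication.
rules = { 'VBL901': find_bare_excepts, 'VBL902': find_bare_excepts }
violations, rule_count, duration_ms = lint_source(
    source, rules, frozenset( rules ) )
```

Sorting by line and column before returning gives reporters a stable order regardless of which rule produced each violation.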
Context Extraction Utilities¶
Enhanced Error Reporting¶
Context extraction provides validated patterns for displaying source code around violations:
from . import __

class ContextExtractor:
    ''' Utilities for extracting source code context around violations for enhanced reporting. '''

    def __init__(
        self, source_lines: tuple[ str, ... ]
    ) -> None: ...

    def extract_violation_context(
        self,
        violation: Violation,
        context_size: __.typx.Annotated[
            int, __.ddoc.Doc( 'Number of lines to show before and after violation.' ) ] = 2,
    ) -> __.typx.Annotated[
        ViolationContext,
        __.ddoc.Doc( 'Violation with surrounding source context.' ),
    ]: ...

    def format_context_display(
        self,
        context: ViolationContext,
        highlight_line: __.typx.Annotated[
            bool, __.ddoc.Doc( 'Whether to highlight the violation line.' ) ] = True,
    ) -> __.typx.Annotated[
        tuple[ str, ... ],
        __.ddoc.Doc( 'Formatted context lines with line numbers and highlighting.' ),
    ]: ...

# Context extraction utilities
def extract_contexts_for_violations(
    violations: ViolationSequence,
    source_lines: __.cabc.Sequence[ str ],
    context_size: int = 2,
) -> ViolationContextSequence:
    ''' Extracts contexts for multiple violations efficiently. '''
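The core of context extraction is converting a one-indexed violation line into a clamped slice of the source lines. The sketch below shows one way to do that, together with a simple display format; the function names and the `>` marker format are assumptions for illustration, not the framework's actual output.

```python
from collections.abc import Sequence


def extract_context(
    source_lines: Sequence[ str ], line: int, context_size: int = 2
) -> tuple[ tuple[ str, ... ], int ]:
    ''' Returns (context_lines, context_start_line) for a one-indexed line. '''
    index = line - 1  # convert to zero-indexed position
    start = max( 0, index - context_size )
    stop = min( len( source_lines ), index + context_size + 1 )
    return tuple( source_lines[ start : stop ] ), start + 1


def format_context(
    context_lines: Sequence[ str ],
    context_start_line: int,
    violation_line: int,
) -> tuple[ str, ... ]:
    ''' Prefixes line numbers and marks the violation line with '>'. '''
    last_number = context_start_line + len( context_lines ) - 1
    width = len( str( last_number ) )
    return tuple(
        '{marker} {number:>{width}} | {text}'.format(
            marker = '>' if number == violation_line else ' ',
            number = number, width = width, text = text )
        for number, text in enumerate(
            context_lines, start = context_start_line ) )


lines = ( 'a', 'b', 'c', 'd', 'e', 'f' )
# Violation on line 2: the window is clamped at the top of the file.
context_lines, start_line = extract_context( lines, line = 2 )
display = format_context( context_lines, start_line, violation_line = 2 )
```

Clamping with `max` and `min` keeps violations near the first or last line from producing negative indices or out-of-range slices.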
Module Organization Design¶
Framework Module Structure¶
The linter core modules follow established filesystem organization patterns:
sources/vibelinter/
├── __/ # Centralized import hub
│ ├── __init__.py # Core framework imports
│ ├── imports.py # External dependencies (libcst)
│ └── nomina.py # Framework naming constants
├── engine.py # Engine orchestration
├── rules/ # Rule framework and implementations
│ ├── __.py # Rule-specific imports
│ ├── __init__.py # Rule package entry point
│ ├── base.py # BaseRule abstract class
│ ├── registry.py # RuleRegistryManager implementation
│ ├── context.py # ContextExtractor utilities
│ └── violations.py # Violation data structures
Import Hub Design¶
The framework imports are organized through the centralized __ pattern:
# sources/vibelinter/__/imports.py - External dependencies
import libcst
import libcst.metadata
# sources/vibelinter/__/__init__.py - Framework exports
from . import imports
from .. import exceptions
from ..rules import violations, base, registry, context
# sources/vibelinter/rules/__.py - Rule-specific imports
from ..__ import *
from . import violations, base, registry, context
Type Organization¶
Type aliases are organized by usage domain and dependency relationships:
# sources/vibelinter/rules/violations.py
from . import __
# Core violation types defined here
class Violation( __.immut.DataclassObject ): ...
class ViolationContext( __.immut.DataclassObject ): ...
# Type aliases for framework contracts
ViolationSequence: __.typx.TypeAlias = __.cabc.Sequence[ Violation ]
ViolationContextSequence: __.typx.TypeAlias = __.cabc.Sequence[ ViolationContext ]
# sources/vibelinter/rules/base.py
from . import __
# BaseRule depends on violation types
class BaseRule( __.abc.ABC, __.libcst.CSTVisitor ): ...
Design Validation¶
Framework Compliance Verification¶
The design adheres to established practices and patterns:
Practices Compliance:
- Wide parameter types (__.cabc.Sequence, __.cabc.Mapping) for public interfaces
- Narrow return types (tuple, __.immut.Dictionary) for concrete results
- Immutable data structures using __.immut.DataclassObject patterns
- Exception hierarchy following Omniexception → Omnierror patterns
- Type annotations with __.typx.TypeAlias for complex types
Style Compliance:
- Function signatures follow spacing and bracket conventions
- Docstrings use narrative mood with triple single-quotes
- Type annotations use proper __.typx.Annotated patterns with __.ddoc documentation
Nomenclature Compliance:
- Class names follow established patterns (Manager, Extractor, Engine suffixes)
- Function names use verb_noun patterns (extract_context, produce_violation)
- Module organization follows established filesystem patterns
- VBL codes maintain semantic series organization
Architecture Compliance:
- Single-pass CST traversal with metadata providers
- Collection-then-analysis pattern for complex rule implementation
- Context extraction for enhanced error reporting
- Rule registry system supporting both VBL codes and descriptive names
Implementation Readiness¶
The design provides complete interface specifications for:
- Validated LibCST integration patterns from proof-of-concept
- Performance-optimized single-pass analysis achieving 600 ms target
- Collection-then-analysis pattern supporting complex rules
- Context extraction enhancing developer experience
- Rule registry supporting configuration flexibility
- Exception handling with proper error propagation
Configuration vs. Registry Architectural Separation¶
Design Decision Rationale¶
The linter core framework maintains deliberate separation between Configuration and Registry systems:
Registry System Responsibilities:
- Provides metadata about available rules (VBL codes, descriptive names, categories)
- Manages rule instantiation patterns and factory functions
- Handles bidirectional mapping between VBL codes and rule implementations
- Maintains rule capabilities and documentation references
Configuration System Responsibilities:
- Manages user preferences for rule enablement and parameters
- Handles command-line overrides and project-specific settings
- Applies precedence rules across configuration sources
- Validates user-provided configuration values
Separation Rationale:
- Command-Query Separation: Registry queries rule information, Configuration commands execution behavior
- Independent Evolution: Registry changes when rules are added/modified, Configuration changes when user requirements evolve
- Single Responsibility: Registry focuses on “what rules exist and how to create them,” Configuration focuses on “which rules to run and with what settings”
- Testability: Registry can be tested with static rule metadata, Configuration tested with dynamic user preferences
- Extensibility: Registry supports rule discovery and introspection, Configuration supports customization and overrides
This architectural separation enables the rule framework to evolve independently from user preference management while maintaining clear interfaces between rule metadata and execution configuration.
The framework design enables immediate implementation of the core linting engine following the validated architectural decisions and proven performance characteristics.