.. vim: set fileencoding=utf-8: .. -*- coding: utf-8 -*- .. +--------------------------------------------------------------------------+ | | | Licensed under the Apache License, Version 2.0 (the "License"); | | you may not use this file except in compliance with the License. | | You may obtain a copy of the License at | | | | http://www.apache.org/licenses/LICENSE-2.0 | | | | Unless required by applicable law or agreed to in writing, software | | distributed under the License is distributed on an "AS IS" BASIS, | | WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. | | See the License for the specific language governing permissions and | | limitations under the License. | | | +--------------------------------------------------------------------------+ ******************************************************************************* MkDocs Themes HTML Structure Analysis ******************************************************************************* **Objective**: Analyze MkDocs theme HTML structure to improve librovore content extraction and enable custom markdownify extensions for code blocks. **Method**: Direct HTML download with curl + BeautifulSoup analysis to extract precise CSS selectors and structural patterns. **Status**: ✅ **ANALYSIS COMPLETE** | All 3 MkDocs themes analyzed with comprehensive patterns discovered! =============================================================================== Analyzed Themes =============================================================================== Successfully analyzed with 10 representative documentation sites: 1. **Material for MkDocs** - Most popular theme (5 sites analyzed) - Material docs, code blocks page, reference docs - Pydantic API docs, HTTPX API docs (with mkdocstrings) 2. **ReadTheDocs** - Built-in default theme (3 sites analyzed) - MkDocs main site, configuration docs, writing guide 3. **MkDocs (default)** - Built-in Bootstrap-based theme (2 sites analyzed) - FastAPI docs, MkDocs getting started =============================================================================== Analysis Scripts =============================================================================== **Location**: ``.auxiliary/scripts/mkdocs-analysis/`` Core Analysis Script: ``analyze_mkdocs_html.py`` Main analysis script using BeautifulSoup. Analyzes code blocks, API documentation, and section structure. Outputs ``analysis_results.json`` with detailed findings. Helper Scripts - ``extract_code_patterns.py`` - Displays code block patterns from analysis results - ``section_analysis.py`` - Shows section structure and navigation patterns - ``comprehensive_summary.py`` - ✅ **COMPLETE ANALYSIS** of all 3 themes - ``download_mkdocs_themes.sh`` - Downloads all theme samples systematically Usage .. code-block:: bash # Download samples .auxiliary/scripts/mkdocs-analysis/download_mkdocs_themes.sh # Run analysis hatch --env develop run python .auxiliary/scripts/mkdocs-analysis/analyze_mkdocs_html.py # View specific patterns hatch --env develop run python .auxiliary/scripts/mkdocs-analysis/extract_code_patterns.py # Full comprehensive summary hatch --env develop run python .auxiliary/scripts/mkdocs-analysis/comprehensive_summary.py =============================================================================== Key Findings =============================================================================== 1. Code Block Language Detection ✅ **SOLVED** =============================================================================== **Key Discovery**: MkDocs uses **language-{lang} CSS classes** directly on code elements! HTML Pattern (confirmed across all themes) ------------------------------------------------------------------------------- .. code-block:: html
Language Identification ------------------------------------------------------------------------------- Pattern: ``element.class includes 'language-{lang}'`` Languages found: - ``language-python`` - Python code blocks - ``language-yaml`` - YAML configuration - ``language-json`` - JSON examples - ``language-bash`` - Shell commands - ``language-html`` - HTML markup - ``language-css`` - CSS styling - ``language-javascript`` - JavaScript code Universal container class: ``highlight`` (Same as Sphinx!) 2. API Documentation Structure ✅ **SOLVED** - SIGNATURE PATTERNS DISCOVERED! =============================================================================== 🔥 **CRITICAL DISCOVERY**: mkdocstrings provides **precise signature identification patterns** similar to Sphinx's ``dt.sig.sig-object.py``! mkdocstrings Signature Pattern (from HTTPX API docs) ------------------------------------------------------------------------------- .. code-block:: html
httpx.request ( method , url , ...

Sends an HTTP request.

Complete mkdocstrings Structure ------------------------------------------------------------------------------- Primary containers: - Signature container: ``div.autodoc`` - Signature element: ``div.autodoc-signature`` - Docstring element: ``div.autodoc-docstring`` Signature components: - Function name: ``code > strong`` - Parameters: ``em.autodoc-param`` - Punctuation: ``span.autodoc-punctuation`` Object type indicators: - Class indicator: ``em:contains("class")`` - Classes prefixed with "class" - Function signature: ``code`` - Function signatures in code tags 🎯 SPHINX INVENTORY INTEGRATION ------------------------------------------------------------------------------- When mapping Sphinx ``objects.inv`` entries to MkDocs URLs, you can now **precisely locate function and class signatures** using ``div.autodoc-signature`` patterns - just as reliable as Sphinx's ``dt.sig.sig-object.py``! 3. Section Structure for Query Results ✅ **CONFIRMED** =============================================================================== Actual Patterns (confirmed through analysis) ------------------------------------------------------------------------------- .. code-block:: python mkdocs_content_patterns = { 'material': [ 'main.md-main', # Primary container 'article.md-content__inner', # Main content article 'div.md-content' # Content wrapper ], 'readthedocs': [ 'div.col-md-9[role="main"]', # Bootstrap main column 'div.container' # Bootstrap container ], 'mkdocs_default': [ 'div.col-md-9[role="main"]', # Bootstrap main column (same as RTD) 'div.container' # Bootstrap container ] } =============================================================================== Comprehensive Analysis Results - All 3 Themes =============================================================================== ✅ **CLEAR PATTERNS DISCOVERED!** **Themes Analyzed**: Material for MkDocs, ReadTheDocs, MkDocs Default (10 documentation sites total) Code Block Patterns - **CONSISTENT CONTAINER, VARIABLE LANGUAGE DETECTION** =============================================================================== **Universal Container Class**: ``.highlight`` (identical to Sphinx!) Language Detection Patterns ------------------------------------------------------------------------------- .. code-block:: python mkdocs_code_patterns = { # Material Theme: Most comprehensive language support 'material': [ 'language-python', 'language-yaml', 'language-json', 'language-html', 'language-css', 'language-javascript' ], # ReadTheDocs Theme: Standard language support 'readthedocs': [ 'language-python', 'language-yaml', 'language-bash' ], # Default Theme: Basic language support 'mkdocs_default': [ 'language-bash', 'language-yaml' ], # Universal pattern: Check element classes for 'language-{lang}' 'detection_method': 'element.class includes language-{lang}' } Main Content Containers - **THEME-SPECIFIC BUT PREDICTABLE** =============================================================================== Content Extraction Patterns ------------------------------------------------------------------------------- .. code-block:: python mkdocs_content_extraction = { 'material': { 'primary': 'main.md-main', 'content_article': 'article.md-content__inner', 'content_wrapper': 'div.md-content' }, 'readthedocs': { 'primary': 'div.col-md-9[role="main"]', 'container': 'div.container' }, 'mkdocs_default': { 'primary': 'div.col-md-9[role="main"]', # Same as ReadTheDocs 'container': 'div.container' } } Navigation Cleanup - **THEME-AWARE PATTERNS** =============================================================================== Navigation Elements to Remove ------------------------------------------------------------------------------- .. code-block:: python mkdocs_navigation_cleanup = { 'material': [ 'nav.md-nav', # Main navigation 'div.md-sidebar', # Primary/secondary sidebars 'nav.md-header__inner' # Header navigation ], 'readthedocs': [ 'div.navbar', # Bootstrap navbar 'ul.nav.navbar-nav' # Navigation lists ], 'mkdocs_default': [ 'div.navbar', # Bootstrap navbar (same as RTD) 'ul.nav.navbar-nav' # Navigation lists ] } mkdocstrings API Documentation - **CUSTOM PATTERN** =============================================================================== API Documentation Structure (different from Sphinx dt/dd) ------------------------------------------------------------------------------- .. code-block:: python mkdocstrings_patterns = { 'container_classes': ['autodoc', 'autodoc-docstring', 'autodoc-members'], 'code_examples': '.highlight', # Uses same code container 'content_wrapper': '.md-content__inner.md-typeset' } **Key Difference**: mkdocstrings uses custom div containers, not dt/dd like Sphinx. =============================================================================== Universal MkDocs Patterns Summary =============================================================================== Code Blocks - **Selector**: ``.highlight`` - **Language detection**: ``element_class_prefix:language-`` - **Supported languages**: ``['python', 'yaml', 'json', 'bash', 'html', 'css', 'javascript']`` - **Fallback detection**: Check parent classes if no explicit language class API Documentation (mkdocstrings) - **Signature container**: ``div.autodoc`` - **Signature element**: ``div.autodoc-signature`` - **Docstring element**: ``div.autodoc-docstring`` - **Function name selector**: ``code > strong`` - **Parameters selector**: ``em.autodoc-param`` - **Pattern type**: div_containers (Not dt/dd like Sphinx) - **Sphinx inventory compatible**: True (Can pinpoint specific functions/classes!) Content Containers (Theme-specific) .. code-block:: python content_containers = { 'material': ['main.md-main', 'article.md-content__inner', 'div.md-content'], 'readthedocs': ['div.col-md-9[role="main"]', 'div.container'], 'mkdocs_default': ['div.col-md-9[role="main"]', 'div.container'], 'generic_fallback': [ 'main', 'article', '[role="main"]', '.md-content', '.container' ] } Navigation Cleanup (Theme-specific) .. code-block:: python navigation_cleanup = { 'material': ['nav.md-nav', 'div.md-sidebar', 'nav.md-header__inner'], 'readthedocs': ['div.navbar', 'ul.nav.navbar-nav'], 'mkdocs_default': ['div.navbar', 'ul.nav.navbar-nav'], 'generic': ['nav', '.navbar', '.navigation', '.sidebar'] } =============================================================================== MkDocs vs Sphinx Comparison =============================================================================== 📊 Key Similarities 1. **Code Container**: Both use ``.highlight`` class ✅ 2. **Theme-Specific Content**: Both require theme-aware extraction ✅ 3. **Navigation Cleanup**: Both need theme-specific navigation removal ✅ 📊 Key Differences 1. **Language Detection**: - **MkDocs**: ``language-{lang}`` directly on code element - **Sphinx**: ``highlight-{lang}`` on parent container 2. **API Documentation Signatures**: - **MkDocs/mkdocstrings**: ``div.autodoc > div.autodoc-signature`` ✅ **Precise function/class targeting** - **Sphinx**: ``dt.sig.sig-object.py`` ✅ **Precise function/class targeting** - **VERDICT**: ✅ **Both provide excellent signature identification for Sphinx inventory mapping!** 3. **Theme Consistency**: - **MkDocs**: More variation between themes (Material vs Bootstrap) - **Sphinx**: More standardized patterns across themes =============================================================================== Final Summary =============================================================================== ✅ **MISSION ACCOMPLISHED!** 🔥 Clear Patterns Discovered 1. **Code Block Language Detection**: ``element.class includes 'language-{lang}'`` - consistent across themes 2. **Content Container Selection**: Clear theme-specific patterns with predictable fallbacks 3. **Navigation Cleanup**: Theme-aware patterns identified and documented 4. 🎯 **mkdocstrings Signature Targeting**: ``div.autodoc-signature`` provides **precise function/class identification** for Sphinx inventory mapping! Analysis Completeness - **Themes Analyzed**: 3/3 (100%) - **Documentation Sites**: 10 representative sites - **Code Block Consistency**: High (universal ``.highlight`` container) - **Language Detection**: Theme-dependent but predictable - **Pattern Reliability**: High - **Implementation Readiness**: Complete Session Handoff Information :Context: COMPLETE analysis of all 3 major MkDocs themes :Status: ✅ **ANALYSIS COMPLETE** - All patterns discovered and documented :Scripts: Comprehensive analysis toolchain in ``.auxiliary/scripts/mkdocs-analysis/`` :Key Achievement: Discovered clear patterns despite theme variation :Next Phase: Implementation in librovore structure extractors alongside Sphinx patterns