# Python Development Guide
This guide covers **comprehensive Python development guidance including
code organization, patterns, architectural decisions, and formatting
standards**. For general guidance applicable to all languages, see the main
[practices guide](practices.md). For shared workflow guidance, use the core
[environment guide](environment.md), [validation guide](validation.md),
[testing guide](tests.md), and [release guide](releases.md), then layer
Python overlays where needed.
## Comprehensive Example: Data Processing
The following before/after example demonstrates multiple Python best practices in one cohesive implementation. Individual sections throughout this document reference specific aspects of this example.
**❌ Before - Multiple violations:**
``` python
from typing import Dict, List, Any, Optional
from pathlib import Path
import json

class DataProcessor:
    def __init__(self, config_file):
        # Missing type annotations
        with open(config_file) as f:
            self.config = json.load(f)  # Mutable, exposed

    def process_user_data(self, users, filters=None, options={}):
        # Narrow types, mutable defaults, missing annotations
        try:
            # Overly broad try block
            results = {}
            for user in users:
                if filters:
                    if self.validate_user(user, filters):
                        processed = self.transform_user(user, options)
                        results[user['id']] = processed
                else:
                    processed = self.transform_user(user, options)
                    results[user['id']] = processed
            return results  # Mutable return
        except:
            # Generic exception handling
            return {}
```
**✅ After - Best practices applied:**
``` python
from json import loads as _json_loads

from . import __


# Type aliases for reused complex types
Location: __.typx.TypeAlias = str | __.pathlib.Path
ProcessorOptions: __.typx.TypeAlias = __.immut.Dictionary[ str, __.typx.Any ]
UserRecord: __.typx.TypeAlias = __.cabc.Mapping[ str, __.typx.Any ]
FilterFunction: __.typx.TypeAlias = __.cabc.Callable[ [ UserRecord ], bool ]


# Exception hierarchy (would typically be in mypackage.exceptions module)
class UserValidationInvalidity( __.Omnierror, ValueError ):
    ''' User data validation invalidity. '''

class ConfigurationAbsence( __.Omnierror, FileNotFoundError ):
    ''' Configuration file absence. '''


# Private constants and defaults
_OPTIONS_DEFAULT = __.immut.Dictionary( )


class DataProcessor( __.immut.DataclassObject ):
    ''' Processes user data with configurable validation and transformation. '''

    configuration: ProcessorOptions

    @classmethod
    def from_configuration_file( cls, location: Location ) -> __.typx.Self:
        ''' Creates processor from configuration file. '''
        file = __.pathlib.Path( location )
        try: content = file.read_text( )
        except ( OSError, IOError ) as exception:
            raise ConfigurationAbsence(
                f"Cannot read configuration: {location}." ) from exception
        try: configuration = _json_loads( content )
        except ValueError as exception:
            raise ConfigurationAbsence(
                f"Invalid JSON configuration: {location}." ) from exception
        return cls( configuration = __.immut.Dictionary( configuration ) )

    def process_user_data(
        self,
        users: __.cabc.Sequence[ UserRecord ],
        filters: __.Absential[ __.cabc.Sequence[ FilterFunction ] ] = __.absent,
        options: ProcessorOptions = _OPTIONS_DEFAULT,
    ) -> __.immut.Dictionary[ str, UserRecord ]:
        ''' Processes user data with optional filtering and custom options. '''
        filters = ( ) if __.is_absent( filters ) else filters
        results = { }
        for user in users:
            try: identifier = user[ 'identifier' ]
            except KeyError as exception:
                raise UserValidationInvalidity(
                    "User missing requisite 'identifier' field." ) from exception
            if not all( filter( user ) for filter in filters ):
                continue
            processed = self._transform_user( user, options )
            results[ identifier ] = processed
        return __.immut.Dictionary( results )

    def _transform_user(
        self, user: UserRecord, options: ProcessorOptions
    ) -> UserRecord:
        ''' Transforms user record according to configuration and options. '''
        return __.immut.Dictionary( user )
```
## Module Organization
- Organize module contents in the following order to improve readability and maintainability:
  1. **Imports**: See import organization section below.
  2. **Common type aliases**: `TypeAlias` declarations used throughout the module.
  3. **Private variables and functions**:
     1. **Private constants**: Configuration defaults, validation constants
     2. **Private functions**: Used as defaults for public functions or to initialize caches/registries
     3. **Private caches and registries**: Module-level mutable containers

     Group each subcategory semantically, sort lexicographically within groups.
  4. **Public interfaces**:
     1. **Public classes**: Sorted lexicographically
     2. **Public functions**: Sorted lexicographically
  5. **All other private functions**: Implementation helpers, sorted lexicographically.

  The DataProcessor example demonstrates proper module organization: imports
  first, then type aliases (`Location`, `UserRecord`, etc.), followed by
  exception classes, private constants (`_OPTIONS_DEFAULT`), and finally the
  public `DataProcessor` class with its methods properly ordered.
- Group private constants and initialization functions semantically (configuration, validation, formatting, etc.) but sort within each semantic group lexicographically.
- Type aliases which depend on a class defined in the module should appear immediately after the class on which they depend.
## Imports
### Import Organization
- Follow PEP 8 import grouping conventions:
  1. `__future__` imports
  2. Standard library imports
  3. Third-party imports
  4. First-party (relative) imports
### Visual Formatting
- For import sequences that will not fit on one line, use parentheses with hanging indent.
**✅ Prefer:**
``` python
from third_party.submodule import (
    FirstClass, SecondClass, ThirdClass )
```
- For import sequences that will not fit on two lines, list them one per line with a trailing comma after each one and the closing parentheses dedented on a separate line.
**✅ Prefer:**
``` python
from third_party.other import (
    ALongClassName,
    AnotherLongClassName,
    YetAnotherLongClassName,
)
```
- Imports within a sequence should be sorted lexicographically with uppercase letters coming before lowercase ones (i.e., classes and type aliases before functions). Import aliases are relevant to this ordering rather than the imports which they alias.
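The ordering rule matches Python's default string comparison, which compares by code point and therefore places `A`-`Z` before `a`-`z`. A small sketch using names from the standard `json` module (the `names` list is only an illustrative input):

```python
from json import JSONDecodeError, JSONDecoder, dumps, loads

# Hypothetical unsorted sequence of names to import from `json`.
names = [ 'loads', 'JSONDecoder', 'dumps', 'JSONDecodeError' ]

# Default string comparison is by code point, so uppercase sorts first.
ordered = sorted( names )
```

Here `ordered` lists the two classes before the two functions, which is the order used in the import statement on the first line.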
### Namespace Management
- Avoid ancillary imports into a module namespace. Use the `__` subpackage for common imports or private module-level aliases for specialized imports.
- Never use `__all__` to advertise the public API of a module. Give anything which should not be part of this API a private name starting with `_`.
The DataProcessor
example demonstrates clean namespace management: `_json_loads` as a
private alias for performance-critical imports and `from . import __` for
accessing common project utilities without namespace pollution.
## Type Annotations
- Add type annotations for all function arguments, class attributes, and return values. Use Python 3.10+ union syntax with `|` for simple unions, `__.typx.Union` for complex multi-line unions, and `TypeAlias` for reused complex types.
See the comprehensive
DataProcessor example
above, which demonstrates proper type annotation patterns including
`TypeAlias` declarations for reused types like `Location`, `UserRecord`, and
`FilterFunction`.
### Visual Formatting
- Type annotations follow the same spacing rules as other code constructs - one space after opening delimiters and one space before closing delimiters.
**✅ Prefer:**
``` python
def process(
    items: __.cabc.Sequence[ __.cabc.Mapping[ str, int ] ]
) -> dict[ str, bool ]: pass

ComplexType: __.typx.TypeAlias = __.typx.Union[
    dict[ str, __.typx.Any ],
    list[ str ],
]
```
**❌ Avoid:**
``` python
# Wrong: inconsistent bracket spacing in type annotations
def process( items: __.cabc.Sequence[__.cabc.Mapping[str, int]] ) -> dict[str, bool]: pass
```
### Semantic Usage
- Prefer `__.Absential` over `__.typx.Optional` for optional function arguments when `None` has semantic meaning distinct from "not provided". This is especially valuable for update operations where `None` means "remove/clear" and absence means "leave unchanged".
**❌ Standard approach:**
``` python
def update_user_profile(
    user_id: int,
    display_name: __.typx.Optional[ str ] = None,
    avatar_url: __.typx.Optional[ str ] = None
) -> None:
    # Problem: Cannot distinguish "don't change" from "clear field"
    if display_name is not None:
        # Only "set name" reaches here; None has to mean either
        # "clear name" or "leave unchanged", but cannot mean both.
        database.update( user_id, display_name = display_name )
```
**✅ Better with Absence package:**
``` python
def update_user_profile(
    user_id: int,
    display_name: __.Absential[ __.typx.Optional[ str ] ] = __.absent,
    avatar_url: __.Absential[ __.typx.Optional[ str ] ] = __.absent,
) -> None:
    # Clear distinction between three states
    if not __.is_absent( display_name ):
        if display_name is None:
            database.clear_field( user_id, 'display_name' )  # Remove field
        else:
            database.update( user_id, display_name = display_name )  # Set value
    # If absent: leave field unchanged
```
- Use PEP 593 `Annotated` to encapsulate parameter and return value documentation via `dynadoc` annotations: `__.ddoc.Doc`, `__.ddoc.Raises`.
**❌ Avoid:**
``` python
# Parameter documentation in docstring
def calculate_distance( lat1, lon1, lat2, lon2 ):
"""Calculate distance between two points.
Args:
lat1: Latitude of first point in degrees
lon1: Longitude of first point in degrees
lat2: Latitude of second point in degrees
lon2: Longitude of second point in degrees
Returns:
Distance in kilometers
"""
pass
```
**✅ Prefer:**
``` python
from . import __

def calculate_distance(
    lat1: __.typx.Annotated[
        float, __.ddoc.Doc( "Latitude of first point in degrees." ) ],
    long1: __.typx.Annotated[
        float, __.ddoc.Doc( "Longitude of first point in degrees." ) ],
    lat2: __.typx.Annotated[
        float, __.ddoc.Doc( "Latitude of second point in degrees." ) ],
    long2: __.typx.Annotated[
        float, __.ddoc.Doc( "Longitude of second point in degrees." ) ],
) -> __.typx.Annotated[
    float,
    __.ddoc.Doc( "Distance in kilometers." ),
    __.ddoc.Raises( ValueError, "If coordinates are invalid." ),
]:
    ''' Calculates distance between two geographic points. '''
    pass
```
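One reason to keep parameter documentation in `Annotated` metadata is that tools can read it back at runtime. The sketch below shows the mechanism with the standard library only. The `Doc` dataclass is a hypothetical stand-in for `__.ddoc.Doc`, and `collect_documentation` is an illustrative helper, not part of `dynadoc`.

```python
from collections.abc import Callable
from dataclasses import dataclass
from typing import Annotated, Any, get_type_hints


@dataclass( frozen = True )
class Doc:
    ''' Hypothetical stand-in for ``__.ddoc.Doc``. '''

    text: str


def scale(
    value: Annotated[ float, Doc( "Value to scale." ) ],
    factor: Annotated[ float, Doc( "Multiplier." ) ] = 2.0,
) -> Annotated[ float, Doc( "Scaled value." ) ]:
    ''' Scales value by factor. '''
    return value * factor


def collect_documentation( function: Callable[ ..., Any ] ) -> dict[ str, str ]:
    ''' Maps argument names (and 'return') to their documentation text. '''
    # include_extras = True preserves the Annotated metadata.
    hints = get_type_hints( function, include_extras = True )
    documentation: dict[ str, str ] = { }
    for name, hint in hints.items( ):
        for metadata in getattr( hint, '__metadata__', ( ) ):
            if isinstance( metadata, Doc ):
                documentation[ name ] = metadata.text
    return documentation


documentation = collect_documentation( scale )
```

The resulting dictionary has one entry per annotated parameter plus a `'return'` entry, so a documentation generator can render them without parsing docstrings.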
## Function Signatures
### Type Principles
- Accept wide types (abstract base classes) for public function parameters; return narrow types (concrete types) from all functions.
The DataProcessor
example demonstrates this principle: `process_user_data` accepts wide
parameter types (`__.cabc.Sequence[ UserRecord ]` for maximum caller
flexibility) while returning a narrow, specific type
(`__.immut.Dictionary[ str, UserRecord ]`) that provides clear guarantees.
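The same principle can be sketched with standard-library types alone. The function name `normalize_names` is hypothetical; the point is that the parameter accepts any `Iterable` while the return type is a concrete `tuple`.

```python
from collections.abc import Iterable


def normalize_names( names: Iterable[ str ] ) -> tuple[ str, ... ]:
    ''' Normalizes names from any iterable into a sorted, deduplicated tuple. '''
    return tuple( sorted( { name.strip( ).lower( ) for name in names } ) )


# Callers may pass a list, a set, or a generator without conversion.
from_list = normalize_names( [ ' Ada', 'grace ' ] )
from_generator = normalize_names( name for name in ( 'B', 'a', 'b' ) )
```

Callers get a value they can index, hash, and compare, regardless of which kind of iterable they supplied.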
### Visual Formatting
- Keep all arguments on one line if they fit within the line limit.
**✅ Prefer:**
``` python
def simple_function( arg1: int, arg2: str = 'default' ) -> bool:
    return True
```
- Use spaces around `=` for keyword/nominative arguments.
**✅ Prefer:**
``` python
def some_function( magic: int = 42 ) -> int: pass
result = process( data, timeout = 30 )
```
**❌ Avoid:**
``` python
def some_function(magic=42): pass
result = process(data, timeout=30)
```
- When arguments must be split across lines, prefer to group positional and keyword arguments.
**✅ Prefer:**
``` python
def medium_function(
    first_pos: str, second_pos: int, third_pos: bool,
    first_named: str = 'default', second_named: str = 'other'
) -> None: pass
```
- When grouping would overflow a line, place each argument on its own line.
**✅ Prefer:**
``` python
def complex_function(
    first_very_long_positional_argument: __.cabc.Mapping[ str, int ],
    second_very_long_positional_argument: __.cabc.Sequence[ str ],
    first_named_arg: str = 'some very long default value',
    second_named_arg: str = 'another long default value',
) -> None: pass
```
- For multi-line return type annotations using `Annotated`, place the closing bracket and colon on the final line.
**✅ Prefer:**
``` python
def complex_function(
    data: UserData
) -> __.typx.Annotated[
    ProcessedData,
    __.ddoc.Doc( "Processed user data with validation." ),
    __.ddoc.Raises( ValueError, "If data validation fails." ),
]:
    ''' Process user data with comprehensive validation. '''
    pass
```
- When a single-line form would overflow, always go to a three-or-more-line form with the arguments on indented lines between the first and last lines. There is no two-line form.
**✅ Prefer:**
``` python
def semicomplex_function(
    argument_1: int, argument_2: int, argument_3: str
) -> bool: return True
```
**❌ Avoid:**
``` python
def semicomplex_function( argument_1: int, argument_2: int, argument_3: str
) -> bool: return True
```
## Immutability
- Prefer immutable data structures over mutable ones when internal mutability is not required. Use `tuple` instead of `list`, `frozenset` instead of `set`, and immutable classes from `__.immut` (frigid) and `__.accret` (accretive) libraries.
The DataProcessor
example demonstrates immutability principles: the class inherits from
`__.immut.DataclassObject`, uses `_OPTIONS_DEFAULT` as an immutable default,
and returns `__.immut.Dictionary` objects to prevent accidental mutation of
results.
- When mutable data structures are genuinely needed (e.g., performance-critical code, interfacing with mutable APIs), clearly document the mutability requirement and consider using the `Mutable` variants of `__.accret` and `__.immut` classes.
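Where the `frigid` and `accretive` libraries are not available, the standard library offers rough equivalents for experimentation. The sketch below is under that assumption: `RetryPolicy`, `is_reassignable`, and `headers` are hypothetical names, and `MappingProxyType` is only an approximate stand-in for `__.immut.Dictionary`.

```python
from dataclasses import FrozenInstanceError, dataclass
from types import MappingProxyType


@dataclass( frozen = True )
class RetryPolicy:
    ''' Immutable retry policy (hypothetical example class). '''

    delays: tuple[ float, ... ] = ( 0.5, 1.0, 2.0 )
    retryable_codes: frozenset[ int ] = frozenset( { 502, 503 } )


def is_reassignable( policy: RetryPolicy ) -> bool:
    ''' Reports whether attribute reassignment on policy succeeds. '''
    try: setattr( policy, 'delays', ( ) )
    except FrozenInstanceError: return False
    return True


policy = RetryPolicy( )
# Read-only view over a dictionary; item assignment raises TypeError.
headers = MappingProxyType( { 'Accept': 'application/json' } )
```

Note that `MappingProxyType` is a read-only view: the underlying dictionary can still be mutated by whoever holds a reference to it, which is one reason a dedicated immutable dictionary class is preferable in real code.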
## Exceptional Conditions
- Create a package exception hierarchy by subclassing from `Omniexception` and `Omnierror` base classes. This allows callers to catch all package-specific exceptions generically if desired.
The DataProcessor
example demonstrates proper exception hierarchy with
`UserValidationInvalidity` and `ConfigurationAbsence` inheriting from
`__.Omnierror` and appropriate built-in exception types.
- Follow established exception naming conventions from the
[nomenclature guide](nomenclature.md). Use patterns like
`Failure`, `Absence`, `Invalidity`, and
`Empty`.
- Limit `try` block scope to contain only the statement(s) that can raise exceptions. In rare cases, a `with` suite may be included. Avoid wrapping entire loop bodies or function bodies in `try` blocks when possible.
The DataProcessor
example demonstrates narrow try blocks: each `try` statement isolates
only the specific operation that can fail (`user['identifier']`,
`file.read_text()`, `_json_loads(content)`), enabling precise error handling
and proper exception chaining.
- Never swallow exceptions. Either chain the original exception as the `__cause__` of a new exception via `raise ... from`, raise a new exception which carries the original as its `__context__`, or handle the exception properly.
**❌ Avoid:**
These examples show two common anti-patterns: completely swallowing exceptions (which loses debugging information) and raising new exceptions without chaining (which loses the original context).
``` python
def risky_operation( ):
    try: dangerous_call( )
    except Exception:
        pass

def risky_operation( ):
    try: dangerous_call( )
    except ValueError:
        raise RuntimeError( "Operation failed." )
```
**✅ Prefer:**
These examples show proper exception handling: explicit chaining preserves the original exception context, while proper handling provides fallback behavior without losing debugging information.
``` python
def risky_operation( ):
    try: dangerous_call( )
    except ValueError as exc:
        raise OperateFailure( operation_context ) from exc

def risky_operation( ):
    try: dangerous_call( )
    except ValueError as exc:
        logger.warning( f"Dangerous call failed: {exc}." )
        return fallback_result( )
```
```
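The points in this section (a package hierarchy, a narrow `try` block, and explicit chaining) can be combined in one runnable sketch. The base class definitions here are assumptions made for illustration; the real `Omniexception` and `Omnierror` come from the project's `__` namespace, and `EntryAbsence`, `look_up`, and `describe_failure` are hypothetical names.

```python
from collections.abc import Mapping


class Omniexception( BaseException ):
    ''' Base for all exceptions raised by the (hypothetical) package. '''


class Omnierror( Omniexception, Exception ):
    ''' Base for error exceptions raised by the (hypothetical) package. '''


class EntryAbsence( Omnierror, LookupError ):
    ''' Entry absence in a table. '''


def look_up( table: Mapping[ str, int ], name: str ) -> int:
    ''' Returns entry from table or raises a package-specific exception. '''
    # Narrow try block: only the subscript can raise KeyError.
    try: value = table[ name ]
    except KeyError as exc:
        raise EntryAbsence( f"No entry for name: {name}." ) from exc
    return value


def describe_failure( table: Mapping[ str, int ], name: str ) -> str:
    ''' Names the original cause when a lookup fails. '''
    # Catching the package base class covers every package error.
    try: look_up( table, name )
    except Omnierror as exc:
        return type( exc.__cause__ ).__name__
    return 'none'


missing = describe_failure( { }, 'x' )
present = describe_failure( { 'x': 1 }, 'x' )
```

Because `EntryAbsence` also inherits from `LookupError`, callers who only know the built-in hierarchy can still catch it, while the `from exc` clause keeps the original `KeyError` available as `__cause__`.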
## Documentation
### Content Standards
- Documentation must be written as Sphinx reStructuredText. Function docstrings must not include parameter or return type documentation; that documentation is handled via PEP 727 annotations. Pull requests which include Markdown documentation or which provide function docstrings in the style of Google, NumPy, Sphinx, etc. will be rejected.
- Function docstrings should use narrative mood (third person) rather than imperative mood (second person). The docstring describes what the function does, not what the caller should do.
### Visual Formatting
- Use triple single-quotes for all docstrings with proper spacing.
- For single-line docstrings, include one space after the opening quotes and before the closing quotes.
**✅ Prefer:**
``` python
def example_function( ) -> str:
    ''' An example function. '''
```
**❌ Avoid:**
``` python
def example_function( ):
    """An example function."""

def example_function( ):
    '''An example function.'''
```
- For multi-line docstrings, include a newline after the heading and before the closing quotes. Indent continuation lines to match the opening quotes. Place the closing triple quotes on their own line for multi-line docstrings, indented to match the opening quotes.
**✅ Prefer:**
``` python
class ExampleClass:
    ''' An example class.

    This class demonstrates proper docstring formatting
    with multiple lines of documentation.
    '''
```
**❌ Avoid:**
``` python
class ExampleClass:
    """An example class.

    This class demonstrates proper docstring formatting
    with multiple lines of documentation.
    """

class ExampleClass:
    """An example class.

    This class demonstrates proper docstring formatting
    with multiple lines of documentation."""
```
**❌ Avoid:**
``` python
def validate_config( config: __.cabc.Mapping[ str, __.typx.Any ] ) -> __.cabc.Mapping[ str, __.typx.Any ]:
    ''' Validate the configuration dictionary. '''  # Imperative mood

def process_data( data: __.cabc.Sequence[ __.typx.Any ] ) -> dict[ str, __.typx.Any ]:
    ''' Process the input data and return results. '''  # Mixed - starts imperative
```
**✅ Prefer:**
``` python
def validate_config(
    config: __.cabc.Mapping[ str, __.typx.Any ]
) -> __.cabc.Mapping[ str, __.typx.Any ]:
    ''' Validates the configuration dictionary. '''  # Narrative mood

def process_data(
    data: __.cabc.Sequence[ __.typx.Any ]
) -> dict[ str, __.typx.Any ]:
    ''' Processes input data and returns results. '''  # Narrative mood
```
## Formatting Standards
For approximate autoformatter profiles aligned with these conventions,
see [Python automatic formatting](python-autoformat.md).
### Lines and Spaces
- One space after opening delimiters ( `(`, `[`, `{` ) and one space before closing delimiters ( `)`, `]`, `}` ), *except* inside of f-strings and strings to which `.format` is applied.
**✅ Prefer:**
``` python
func( arg1, arg2 )
data = [ 1, 2, 3 ]
config = { 'key': 'value' }
# Exception: f-strings and .format
message = f"Hello {name}."
template = "Value: {value}".format( value = 42 )
```
**❌ Avoid:**
``` python
func(arg1, arg2)
data = [1, 2, 3]
config = {"key": "value"}
# Wrong: spaces in f-strings
message = f"Hello { name }."
```
- Empty collection literals have a single space between delimiters, `( )`, `[ ]`, `{ }`. This includes function definitions and invocations with no arguments.
**✅ Prefer:**
``` python
empty_list = [ ]
empty_dict = { }
def no_args_function( ) -> None: pass
result = some_function( )
```
**❌ Avoid:**
``` python
empty_list = []
empty_dict = {}
def no_args_function(): pass
result = some_function()
```
### Collections
- For short collections, keep them on one line.
**✅ Prefer:**
``` python
points = [ ( 1, 2 ), ( 3, 4 ), ( 5, 6 ) ]
config = { 'name': 'example', 'value': 42 }
```
- For longer collections, split elements one per line with a trailing comma after the last element.
**✅ Prefer:**
``` python
matrix = [
    [ 1, 2, 3, 4 ],
    [ 5, 6, 7, 8 ],
    [ 9, 10, 11, 12 ],
]
settings = {
    'name': 'example',
    'description': 'A longer example that needs multiple lines',
    'values': [ 1, 2, 3, 4, 5 ],
    'nested': {
        'key1': 'value1',
        'key2': 'value2',
    },
}
```
### Line Continuation
- Use parentheses for line continuation. Split at natural points such as dots, operators, or after commas. Keep the closing parenthesis on the same line as the last element unless the collection has a trailing comma.
- For operator splits, place the operator at the beginning of the split-off line, not at the end of the line being split.
**✅ Prefer:**
``` python
# Dot operator splits
result = (
    very_long_object_name.first_method_call( )
    .second_method_call( )
    .final_method_call( ) )

# Operator splits - operators at beginning of continuation lines
total = (
    first_long_value * second_long_value
    + third_long_value * fourth_long_value
    - adjustment_factor )
```
**❌ Avoid:**
``` python
# Using backslash continuation
result = very_long_object_name.first_method_call( ) \
    .second_method_call( ) \
    .final_method_call( )

# Operators at end of line being split
total = (
    first_long_value * second_long_value +
    third_long_value * fourth_long_value )
```
### Function Invocations
- For function invocations, generally omit trailing commas after the final argument, keeping the closing parenthesis on the same line as the final argument.
**✅ Prefer:**
``` python
# Single line invocations
result = process_data( input_file, output_file, strict = True )
# Multi-line invocations without trailing comma
result = complex_processing_function(
    very_long_input_parameter,
    another_long_parameter,
    final_parameter = computed_value )
```
**❌ Avoid:**
``` python
# Unnecessary trailing comma in function call
result = process_data(
    input_file,
    output_file,
    strict = True, )
```
### Strings
- Use single quotes for plain data strings unless they contain single quotes. Use double quotes for f-strings, `.format` strings, exception messages, and log messages.
- Exception messages and log messages should end with periods for consistency and proper sentence structure.
**✅ Prefer:**
``` python
name = 'example'
path = 'C:\\Program Files\\Example'
message = f"Processing {name} at {path}."
count = "Number of items: {count}".format( count = len( items ) )
raise ValueError( "Invalid configuration value." )
logger.error( "Failed to process item." )
```
**❌ Avoid:**
``` python
name = "example"
path = "C:\\Program Files\\Example"
message = f'Processing {name} at {path}'
count = "Number of items: {len(items)}"
raise ValueError( 'Invalid configuration value' )
logger.error( 'Failed to process item' )
```
- Do not use function calls or subscripts inside of f-string expressions. These can be opaque to some linters and syntax highlighters. Instead, use strings with the `.format` method for these cases.
**✅ Prefer:**
``` python
"Values: {values}".format( values = ', '.join( values ) )
```
**❌ Avoid:**
``` python
f"Values: {', '.join(values)}"
```
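A short runnable sketch of both acceptable forms, using a hypothetical `values` tuple: either bind the computed value to a name before the f-string, or pass the call and subscript results as `.format` arguments.

```python
# Hypothetical data for illustration.
values = ( 'alpha', 'beta', 'gamma' )

# Option 1: compute first, then interpolate a plain name.
joined = ', '.join( values )
message = f"Values: {joined}."

# Option 2: keep calls and subscripts in the .format arguments.
summary = "Count: {count}; first: {first}.".format(
    count = len( values ), first = values[ 0 ] )
```

In both cases the replacement fields contain only simple names, which keeps the string literal readable for tools that do not parse expressions inside f-strings.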
## Quality Assurance
- Run project-specific quality commands to ensure code meets standards. Use the provided hatch environments for consistency.
``` shell
hatch --env develop run linters # Runs all configured linters
hatch --env develop run testers # Runs full test suite
hatch --env develop run docsgen # Generates documentation
```
- Linter suppressions must be reviewed critically. Address underlying design problems rather than masking them with suppressions.
**Acceptable Suppressions:**
- `noqa: PLR0913` may be used for CLI or service APIs with many parameters, but data transfer objects should be considered in most other cases.
- `noqa: S*` may be used for properly constrained and vetted subprocess executions or Internet content retrievals.
**Unacceptable Suppressions (require investigation):**
- `type: ignore` must not be used except in extremely rare circumstances. Such suppressions usually indicate missing third-party dependencies or type stubs, inappropriate type variables, or bad inheritance patterns.
- `__.typx.cast` should not be used except in extremely rare circumstances. Such casts suppress normal type checking and usually indicate the same problems as `type: ignore`.
- Tryceratops complaints must not be suppressed with `noqa` pragmas.
- Most other `noqa` suppressions require compelling justification.
- If third-party typing stubs are missing, then ensure that the third-party package has been included in `pyproject.toml` and rebuild the Hatch environment with `hatch env prune`. If they are still missing after that, then generate them with:
``` shell
hatch --env develop run pyright --createstub somepackage
```
Then, fill out the stubs you need to satisfy Pyright and comment out or discard the remainder.