Architecture overview¶

This page explains the high-level architecture of typing-graph, how its components work together, and the reasoning behind the design decisions. Understanding the architecture helps you work effectively with the library and appreciate why certain choices were made.

Why typing-graph exists¶

Python's typing module provides rich type annotation capabilities, but working with these annotations programmatically is surprisingly difficult. The standard library offers get_origin(), get_args(), and get_type_hints(), but these low-level primitives leave significant work for library authors:

Handling the many edge cases across Python versions
Traversing nested types recursively
Extracting metadata from Annotated wrappers
Managing forward reference resolution
Dealing with the subtle differences between typing.Union and types.UnionType

typing-graph addresses these challenges by providing a unified, graph-based representation of type annotations. Rather than working with raw type objects and their quirks, you work with a consistent node hierarchy that handles the complexity internally.

Core concepts¶

typing-graph transforms Python type annotations into a traversable graph of nodes. Each node represents a component of a type annotation, from simple types like int to complex nested generics like dict[str, list[Annotated[int, Gt(0)]]].

flowchart LR
    A[Type Annotation] --> B[inspect_type]
    B --> C[TypeNode Graph]
    C --> D[Traverse with children]
    C --> E[Access metadata]
    C --> F[Query structure]
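
The idea of turning an annotation into a traversable graph can be sketched with the standard library alone. This is a toy model under stated assumptions: `ToyNode`, `build`, and `walk` are hypothetical names, and the real node classes carry far more information.

```python
from dataclasses import dataclass
from typing import Any, Iterator, get_args, get_origin


@dataclass(frozen=True)
class ToyNode:
    label: str
    children: tuple["ToyNode", ...] = ()


def build(tp: Any) -> ToyNode:
    """Recursively convert an annotation into a tree of ToyNode objects."""
    origin = get_origin(tp)
    if origin is None:  # plain type such as int or str: a leaf
        return ToyNode(getattr(tp, "__name__", repr(tp)))
    args = tuple(build(arg) for arg in get_args(tp))
    return ToyNode(getattr(origin, "__name__", repr(origin)), args)


def walk(node: ToyNode, depth: int = 0) -> Iterator[tuple[int, str]]:
    """Yield (depth, label) pairs in depth-first order."""
    yield depth, node.label
    for child in node.children:
        yield from walk(child, depth + 1)


graph = build(dict[str, list[int]])
for depth, label in walk(graph):
    print("  " * depth + label)
```

The walk visits `dict`, `str`, `list`, and `int` in that order, which is the traversal shape the diagram above describes.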

The library has three main layers:

Inspection layer - Functions that analyze types and produce nodes
Node layer - Dataclasses representing type structure
Configuration layer - Options controlling inspection behavior

The inspection layer¶

Entry points¶

The inspection layer provides focused functions for different inspection tasks:

Function	Purpose
`inspect_type()`	Inspect any type annotation
`inspect_class()`	Auto-detect and inspect a class
`inspect_dataclass()`	Inspect dataclass specifically
`inspect_typed_dict()`	Inspect TypedDict specifically
`inspect_function()`	Inspect function signature
`inspect_module()`	Discover types in a module

Each function accepts an optional InspectConfig to customize behavior.

The inspection process¶

When you call inspect_type(), the library:

Checks the cache - Returns cached result if available
Classifies the type - Determines which node type to create
Extracts qualifiers - Identifies ClassVar, Final, Required, etc.
Hoists metadata - Moves Annotated metadata to the base type (if enabled)
Recursively inspects - Processes nested types to build child nodes
Caches the result - Stores for future lookups

flowchart TD
    A[inspect_type called] --> B{In cache?}
    B -->|Yes| C[Return cached node]
    B -->|No| D[Classify type]
    D --> E[Extract qualifiers]
    E --> F{Is Annotated?}
    F -->|Yes| G[Hoist metadata]
    F -->|No| H[Create node]
    G --> H
    H --> I[Inspect children recursively]
    I --> J[Cache result]
    J --> K[Return node]
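
The steps above can be sketched as a single recursive function. This is a simplified illustration, not the library's implementation: `ToyNode`, `toy_inspect`, and the qualifier labels are assumptions, only `ClassVar` and `Final` are handled, and the cache assumes hashable annotations.

```python
from dataclasses import dataclass
from typing import Annotated, Any, ClassVar, Final, get_args, get_origin


@dataclass(frozen=True)
class ToyNode:
    origin: Any
    args: tuple["ToyNode", ...] = ()
    metadata: tuple[Any, ...] = ()
    qualifiers: frozenset[str] = frozenset()


_CACHE: dict[Any, ToyNode] = {}


def toy_inspect(tp: Any) -> ToyNode:
    # 1. Check the cache.
    cached = _CACHE.get(tp)
    if cached is not None:
        return cached

    # 2-3. Classify and peel off qualifiers such as ClassVar[...] or Final[...].
    inner, qualifiers = tp, set()
    while get_origin(inner) in (ClassVar, Final):
        qualifiers.add("class_var" if get_origin(inner) is ClassVar else "final")
        (inner,) = get_args(inner)

    # 4. Hoist Annotated metadata onto the base type.
    metadata: tuple[Any, ...] = ()
    if get_origin(inner) is Annotated:
        inner, *extra = get_args(inner)
        metadata = tuple(extra)

    # 5. Recurse into type arguments, then 6. cache the finished node.
    origin = get_origin(inner) or inner
    args = tuple(toy_inspect(arg) for arg in get_args(inner))
    node = ToyNode(origin, args, metadata, frozenset(qualifiers))
    _CACHE[tp] = node
    return node


node = toy_inspect(ClassVar[Annotated[list[int], "positive"]])
print(node.origin, node.metadata, sorted(node.qualifiers))
```

Calling `toy_inspect` again with the same annotation returns the identical `ToyNode` object from the cache.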

Metadata hoisting¶

When the library encounters Annotated[T, meta1, meta2], it can either:

Hoist metadata (default): Create a node for T with metadata=(meta1, meta2)
Preserve wrapper: Create an AnnotatedNode containing the base and annotations

Hoisting simplifies working with annotated types since you get the underlying type directly with metadata attached.

# snippet - illustrative pseudocode
# Assumption: the typing-graph package is importable as typing_graph
from typing import Annotated

from annotated_types import MaxLen
from typing_graph import InspectConfig, inspect_type

# With hoisting (default)
node = inspect_type(Annotated[str, MaxLen(100)])
# Returns: ConcreteNode(cls=str, metadata=(MaxLen(100),))

# Without hoisting
config = InspectConfig(hoist_metadata=False)
node = inspect_type(Annotated[str, MaxLen(100)], config=config)
# Returns: AnnotatedNode(base=ConcreteNode(cls=str), annotations=(MaxLen(100),))

Design trade-off: hoisting by default

Hoisting is the default because most use cases want to work with the underlying type while having convenient access to metadata. The alternative (always preserving AnnotatedNode wrappers) would require consumers to constantly unwrap types to get to the actual type structure.

However, some use cases genuinely need to distinguish "a string with metadata" from "a string." Round-trip serialization and type reconstruction are examples where preserving the exact annotation structure matters. That's why hoist_metadata=False exists.

The node layer¶

Node hierarchy¶

All type representations inherit from TypeNode, which provides the common interface:

classDiagram
    TypeNode <|-- ConcreteNode
    TypeNode <|-- GenericTypeNode
    TypeNode <|-- SubscriptedGenericNode
    TypeNode <|-- UnionNode
    TypeNode <|-- TupleNode
    TypeNode <|-- CallableNode
    TypeNode <|-- TypeVarNode
    TypeNode <|-- ForwardRefNode
    TypeNode <|-- AnyNode
    TypeNode <|-- NeverNode
    TypeNode <|-- LiteralNode

    class TypeNode {
        +source: SourceLocation
        +metadata: tuple
        +qualifiers: frozenset
        +children() Sequence
    }

    class ConcreteNode {
        +cls: type
    }

    class SubscriptedGenericNode {
        +origin: GenericTypeNode
        +args: tuple
    }

    class UnionNode {
        +members: tuple
    }

Node categories¶

Nodes fall into these categories based on what they represent:

Concrete types represent non-generic nominal types:

ConcreteNode - Simple types like int, str, custom classes

Generic types represent parameterized types:

GenericTypeNode - Unsubscripted generics like list, Dict
SubscriptedGenericNode - Applied generics like list[int]
GenericAliasNode - Generic class aliases

Composite types combine other types:

UnionNode - Union types (A | B, Union[A, B])
TupleNode - Tuple types (heterogeneous and homogeneous)
CallableNode - Callable signatures

Special forms represent typing system constructs:

AnyNode - typing.Any
NeverNode - typing.Never
SelfNode - typing.Self
LiteralNode - Literal[...] values

Type parameters represent generic placeholders:

TypeVarNode - Type variables
ParamSpecNode - Parameter specifications
TypeVarTupleNode - Variadic type variables

Structured types represent classes with fields:

DataclassNode - Dataclasses
TypedDictNode - TypedDict classes
NamedTupleNode - NamedTuple classes
ProtocolNode - Protocol definitions
EnumNode - Enum classes

Memory efficiency and immutability¶

The library implements all nodes as frozen dataclasses with slots=True for memory efficiency and immutability. This design:

Reduces memory footprint via __slots__
Guarantees immutability after construction via frozen=True
Enables safe caching (code cannot mutate nodes)
Makes nodes hashable (usable as dictionary keys and in sets)
Ensures thread safety for concurrent read access

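Classes with computed fields (like _children) call object.__setattr__ in __post_init__ to set derived values during initialization. A frozen dataclass blocks ordinary attribute assignment from the moment it is created, so object.__setattr__ is the standard way to bypass that check while the instance is still being constructed.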

The children method¶

Every node implements children() to enable graph traversal:

# snippet - simplified internal implementation
@dataclass(slots=True, frozen=True)
class SubscriptedGenericNode(TypeNode):
    origin: GenericTypeNode
    args: tuple[TypeNode, ...]

    def children(self) -> Sequence[TypeNode]:
        return self.args  # Type arguments are the children

What counts as "children" depends on the node type:

SubscriptedGenericNode → type arguments
UnionNode → union members
DataclassNode → field types
ConcreteNode → empty (leaf node)

Graph traversal¶

The children() method provides structural traversal, but many use cases require semantic context: knowing whether a child is a dictionary key versus value, or a function parameter versus return type. The edges() method complements children() by providing this relationship metadata through TypeEdgeConnection objects.
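
The difference between structural and semantic traversal can be illustrated with a toy model. `ToyEdge`, its `kind` labels, and the helper functions are assumptions made for this sketch; the actual fields of TypeEdgeConnection are not shown on this page.

```python
from dataclasses import dataclass
from typing import Any, get_args, get_origin


@dataclass(frozen=True)
class ToyEdge:
    kind: str    # hypothetical role label, e.g. "key" or "value"
    target: Any  # the child type this edge points at


def toy_edges(tp: Any) -> tuple[ToyEdge, ...]:
    """Return children together with the role each one plays."""
    args = get_args(tp)
    if get_origin(tp) is dict and len(args) == 2:
        return (ToyEdge("key", args[0]), ToyEdge("value", args[1]))
    return tuple(ToyEdge("arg", arg) for arg in args)


def toy_children(tp: Any) -> tuple[Any, ...]:
    """Return children only, discarding the role information."""
    return tuple(edge.target for edge in toy_edges(tp))


print(toy_children(dict[str, int]))
print([edge.kind for edge in toy_edges(dict[str, int])])
```

With children alone, `str` and `int` are indistinguishable positions; the edge labels are what let a consumer treat keys and values differently.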

Edges caching¶

Both children() and edges() return pre-computed results with no allocation overhead at call time. The library computes these values once during node construction in __post_init__, storing them as tuples:

# snippet - simplified internal pattern
@dataclass(slots=True, frozen=True)
class SubscriptedGenericNode(TypeNode):
    origin: GenericTypeNode
    args: tuple[TypeNode, ...]
    _children: tuple[TypeNode, ...] = field(init=False, repr=False)
    _edges: tuple[TypeEdgeConnection, ...] = field(init=False, repr=False)

    def __post_init__(self) -> None:
        # Compute once at construction; object.__setattr__ bypasses the frozen check
        object.__setattr__(self, "_children", self.args)
        object.__setattr__(self, "_edges", self._build_edges())

    def children(self) -> Sequence[TypeNode]:
        return self._children  # Direct return, no computation

    def edges(self) -> Sequence[TypeEdgeConnection]:
        return self._edges  # Direct return, no computation

This design moves the cost to construction time: children() returns the lightweight tuple directly for simple traversal, while edges() provides the richer semantic context when needed. Neither method performs work at call time because both return pre-built results.

Why tuples instead of lists?

Tuples provide three benefits over lists. First, they are immutable, which aligns with the frozen dataclass design. Second, they are hashable as long as their elements are, so nodes that store them can be used in sets and as dictionary keys. Third, they consume slightly less memory than equivalent lists. For collections that never change after construction, tuples are the natural choice.
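
The hashability point is easy to verify with two stand-in classes. `TupleBacked` and `ListBacked` are hypothetical names used only for this demonstration of standard Python behavior.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TupleBacked:
    items: tuple[int, ...]


@dataclass(frozen=True)
class ListBacked:
    items: list[int]


# A frozen dataclass hashes its fields, so a tuple field works as a key.
lookup = {TupleBacked((1, 2)): "ok"}

# The same class shape with a list field fails as soon as it is hashed.
try:
    hash(ListBacked([1, 2]))
    list_hashable = True
except TypeError:
    list_hashable = False

print(lookup[TupleBacked((1, 2))], list_hashable)
```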

Thread safety¶

The combination of frozen dataclasses and immutable tuples makes type nodes safe for concurrent read access without synchronization. Multiple threads can call children() and edges() simultaneously on the same node with no risk of data races or inconsistent views.

This thread safety emerges from the design rather than explicit locking. Because nodes cannot change after construction and the returned tuples are also immutable, there is no shared mutable state to protect. Each thread receives a reference to the same unchanging data.

The global inspection cache also benefits from this design. Once a node is cached, any thread can retrieve and traverse it safely. The only synchronization needed is in the cache insertion path, which the library handles internally.

For more on how edges encode semantic relationships like key/value and parameter/return, see Graph edges and semantic relationships.

MetadataCollection¶

Every TypeNode carries a metadata attribute containing any metadata attached via Annotated. The MetadataCollection class provides an immutable, type-safe container for this metadata.

Design principles¶

MetadataCollection follows the same design principles as the node layer:

Immutable: All transformation methods return new collections
Memory efficient: The EMPTY singleton avoids allocating empty collections
Thread-safe: Immutability enables safe concurrent access
Sequence protocol: Works naturally with standard Python patterns

Rich query API¶

The collection provides specialized methods for metadata operations:

Query: find(), find_all(), find_first(), get(), get_required()
Existence: has(), count(), is_empty
Filter: filter(), filter_by_type(), first(), any()
Protocol: find_protocol(), has_protocol(), count_protocol()
Transform: exclude(), unique(), sorted(), reversed(), partition(), map()
Introspection: types(), by_type()

See Metadata and Annotated for details on how metadata flows through the inspection process.

The configuration layer¶

InspectConfig¶

InspectConfig controls all aspects of inspection:

@dataclass(slots=True)
class InspectConfig:
    eval_mode: EvalMode = EvalMode.DEFERRED
    globalns: dict | None = None
    localns: dict | None = None
    max_depth: int | None = None
    hoist_metadata: bool = True
    include_source_locations: bool = False
    # ... class inspection options

Forward reference handling¶

The eval_mode option controls how the library handles forward references:

Mode	Behavior
`EAGER`	Resolve immediately, fail on errors
`DEFERRED`	Create `ForwardRefNode` nodes for unresolvable references
`STRINGIFIED`	Keep as strings, resolve lazily

DEFERRED (the default) provides the best balance: resolution when possible, graceful handling when not.
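
The difference between eager and deferred handling can be sketched as follows. `ToyEvalMode`, `ToyForwardRef`, and `toy_resolve` are hypothetical names, the STRINGIFIED mode is left out, and `eval` is used only to keep the example short.

```python
from dataclasses import dataclass
from enum import Enum, auto
from typing import Any, Mapping


class ToyEvalMode(Enum):
    EAGER = auto()
    DEFERRED = auto()


@dataclass(frozen=True)
class ToyForwardRef:
    name: str  # the unresolved reference, kept for later resolution


def toy_resolve(ref: str, namespace: Mapping[str, Any], mode: ToyEvalMode) -> Any:
    """Resolve a string annotation against a namespace."""
    try:
        return eval(ref, dict(namespace))
    except NameError:
        if mode is ToyEvalMode.EAGER:
            raise  # eager mode: surface the failure immediately
        return ToyForwardRef(ref)  # deferred mode: placeholder node


class User:
    pass


resolved = toy_resolve("User", {"User": User}, ToyEvalMode.DEFERRED)
placeholder = toy_resolve("Missing", {}, ToyEvalMode.DEFERRED)
print(resolved is User, placeholder)
```

In this sketch the deferred mode resolves whatever it can and returns a placeholder for the rest, while the eager mode lets the `NameError` propagate.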

InspectContext¶

Internally, the library uses InspectContext to track state during inspection:

depth - Current recursion level for depth limiting
seen - Map of visited types for cycle detection
resolving - Forward references currently being resolved

The library passes this context through recursive calls but hides it from the public API.

Caching¶

The library maintains a global cache of inspection results keyed by (type, config). This provides:

Performance - Repeated inspection of the same type returns the cached node without re-analysis
Consistency - Same input always produces same node instance
Memory efficiency - Complex type graphs are only built once

Cache management functions:

cache_clear() - Clear all cached results
cache_info() - Get cache statistics (hits, misses, size)
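
The behavior described here resembles what `functools.lru_cache` provides, so it can be illustrated with that decorator. This is an analogy only: the page does not say how the library's cache is implemented, and `ToyConfig`, `ToyNode`, and `toy_inspect` are hypothetical.

```python
from dataclasses import dataclass
from functools import lru_cache
from typing import Any


@dataclass(frozen=True)  # frozen so the config is hashable and usable in a cache key
class ToyConfig:
    hoist_metadata: bool = True


@dataclass(frozen=True)
class ToyNode:
    tp: Any
    hoisted: bool


@lru_cache(maxsize=None)
def toy_inspect(tp: Any, config: ToyConfig = ToyConfig()) -> ToyNode:
    """Build a node; results are cached per (tp, config) argument pair."""
    return ToyNode(tp, config.hoist_metadata)


first = toy_inspect(int)
second = toy_inspect(int)  # cache hit: same object as first
other = toy_inspect(int, ToyConfig(hoist_metadata=False))  # different key
info = toy_inspect.cache_info()
print(first is second, other is first, info)
```

The same type inspected under a different configuration produces a separate cache entry, which is why the configuration is part of the key.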

Integration with typing-inspection¶

typing-graph builds on Pydantic's typing-inspection library for low-level type introspection. This provides:

Robust handling of Python version differences
Correct qualifier extraction
Forward reference evaluation utilities

The Qualifier Enum is re-exported from typing-inspection for convenience.

Design principles¶

The architecture reflects several deliberate design choices. Understanding these principles helps you predict how the library behaves and why.

Standard library first¶

The library minimizes dependencies, relying primarily on the standard library. External dependencies are carefully chosen:

typing-inspection - Battle-tested type introspection from Pydantic
typing-extensions - Backports of modern typing features
annotated-types (optional) - Standard constraint vocabulary

Why limit dependencies?

Dependencies create maintenance burden, compatibility constraints, and security surface area. For a library that may be used transitively by many projects, each dependency multiplies the risk of version conflicts. The standard library is always available and evolves with Python itself.

The dependencies typing-graph does include are chosen for their stability and widespread adoption. typing-inspection comes from the Pydantic ecosystem and handles the gnarliest version-specific edge cases. typing-extensions is effectively part of the standard library with a different release cadence.

Immutable by design¶

The library freezes all TypeNode dataclasses, guaranteeing immutability after construction. Tuples store metadata (not lists), and qualifier sets use frozensets. This immutability enables nodes to be hashable, cached safely, and accessed concurrently without synchronization.

The case for immutability

Immutability might seem like an unnecessary constraint. Why not let consumers modify nodes if they want to? The answer lies in the caching system. typing-graph caches inspection results to avoid repeatedly analyzing the same types. If nodes were mutable, cached nodes could be modified unexpectedly by any code that received them, breaking the cache's integrity.

Immutability also enables safe sharing in concurrent code. Multiple threads can hold references to the same node without synchronization because no thread can modify the shared state.

The trade-off is that creating modified versions of nodes requires constructing new instances. In practice, this rarely matters because most consumers read from the type graph rather than transforming it.

Lazy evaluation¶

The library uses lazy evaluation where possible:

The library builds type graphs on-demand
Forward references can defer resolution
Children are computed once per node, and only for types you actually inspect

This approach keeps memory usage proportional to what you actually inspect, not to the theoretical size of the complete type graph.

Type safety¶

The library is fully typed and passes strict basedpyright checking. Type guard functions (is_concrete_node(), is_union_type_node(), etc.) enable type-safe node discrimination.

Relationship to the Python ecosystem¶

typing-graph builds on Pydantic's typing-inspection library, which provides battle-tested utilities for low-level type introspection. This relationship is deliberate: rather than reimplementing complex version-specific logic, typing-graph uses typing-inspection for the foundational layer and focuses on providing the graph abstraction.

The library also integrates with annotated-types for standard constraint types like Gt, Le, and MaxLen. This integration is optional (typing-graph works with any metadata objects) but provides convenient support for the emerging standard vocabulary of type constraints.

Practical application¶

Now that you understand the architecture, apply this knowledge:

Traverse the node hierarchy with Walking the type graph
Configure inspection behavior with Configuration options
Work with metadata using Metadata queries
