Architecture overview¶
This page explains the high-level architecture of typing-graph, how its components work together, and the reasoning behind the design decisions. Understanding the architecture helps you work effectively with the library and appreciate why certain choices were made.
Why typing-graph exists¶
Python's typing module provides rich type annotation capabilities, but working with these annotations programmatically is surprisingly difficult. The standard library offers get_origin(), get_args(), and get_type_hints(), but these low-level primitives leave significant work for library authors:
- Handling the many edge cases across Python versions
- Traversing nested types recursively
- Extracting metadata from Annotated wrappers
- Managing forward reference resolution
- Dealing with the subtle differences between typing.Union and types.UnionType
typing-graph addresses these challenges by providing a unified, graph-based representation of type annotations. Rather than working with raw type objects and their quirks, you work with a consistent node hierarchy that handles the complexity internally.
Core concepts¶
typing-graph transforms Python type annotations into a traversable graph of nodes. Each node represents a component of a type annotation, from simple types like int to complex nested generics like dict[str, list[Annotated[int, Gt(0)]]].
flowchart LR
A[Type Annotation] --> B[inspect_type]
B --> C[TypeNode Graph]
C --> D[Traverse with children]
C --> E[Access metadata]
C --> F[Query structure]
The library has three main layers:
- Inspection layer - Functions that analyze types and produce nodes
- Node layer - Dataclasses representing type structure
- Configuration layer - Options controlling inspection behavior
The inspection layer¶
Entry points¶
The inspection layer provides focused functions for different inspection tasks:
| Function | Purpose |
|---|---|
| inspect_type() | Inspect any type annotation |
| inspect_class() | Auto-detect and inspect a class |
| inspect_dataclass() | Inspect dataclass specifically |
| inspect_typed_dict() | Inspect TypedDict specifically |
| inspect_function() | Inspect function signature |
| inspect_module() | Discover types in a module |
Each function accepts an optional InspectConfig to customize behavior.
The inspection process¶
When you call inspect_type(), the library:
- Checks the cache - Returns cached result if available
- Classifies the type - Determines which node type to create
- Extracts qualifiers - Identifies ClassVar, Final, Required, etc.
- Hoists metadata - Moves Annotated metadata to the base type (if enabled)
- Recursively inspects - Processes nested types to build child nodes
- Caches the result - Stores for future lookups
flowchart TD
A[inspect_type called] --> B{In cache?}
B -->|Yes| C[Return cached node]
B -->|No| D[Classify type]
D --> E[Extract qualifiers]
E --> F{Is Annotated?}
F -->|Yes| G[Hoist metadata]
F -->|No| H[Create node]
G --> H
H --> I[Inspect children recursively]
I --> J[Cache result]
J --> K[Return node]
Metadata hoisting¶
When the library encounters Annotated[T, meta1, meta2], it can either:
- Hoist metadata (default): Create a node for T with metadata=(meta1, meta2)
- Preserve wrapper: Create an AnnotatedNode node containing the base and annotations
Hoisting simplifies working with annotated types since you get the underlying type directly with metadata attached.
# snippet - illustrative pseudocode
# With hoisting (default)
node = inspect_type(Annotated[str, MaxLen(100)])
# Returns: ConcreteNode(cls=str, metadata=(MaxLen(100),))
# Without hoisting
config = InspectConfig(hoist_metadata=False)
node = inspect_type(Annotated[str, MaxLen(100)], config=config)
# Returns: AnnotatedNode(base=ConcreteNode(cls=str), annotations=(MaxLen(100),))
Design trade-off: hoisting by default
Hoisting is the default because most use cases want to work with the underlying type while having convenient access to metadata. The alternative (always preserving AnnotatedNode wrappers) would require consumers to constantly unwrap types to get to the actual type structure.
However, some use cases genuinely need to distinguish "a string with metadata" from "a string." Round-trip serialization and type reconstruction are examples where preserving the exact annotation structure matters. That's why hoist_metadata=False exists.
The node layer¶
Node hierarchy¶
All type representations inherit from TypeNode, which provides the common interface:
classDiagram
TypeNode <|-- ConcreteNode
TypeNode <|-- GenericTypeNode
TypeNode <|-- SubscriptedGenericNode
TypeNode <|-- UnionNode
TypeNode <|-- TupleNode
TypeNode <|-- CallableNode
TypeNode <|-- TypeVarNode
TypeNode <|-- ForwardRefNode
TypeNode <|-- AnyNode
TypeNode <|-- NeverNode
TypeNode <|-- LiteralNode
class TypeNode {
+source: SourceLocation
+metadata: tuple
+qualifiers: frozenset
+children() Sequence
}
class ConcreteNode {
+cls: type
}
class SubscriptedGenericNode {
+origin: GenericTypeNode
+args: tuple
}
class UnionNode {
+members: tuple
}
Node categories¶
Nodes fall into these categories based on what they represent:
Concrete types represent non-generic nominal types:
- ConcreteNode - Simple types like int, str, custom classes
Generic types represent parameterized types:
- GenericTypeNode - Unsubscripted generics like list, Dict
- SubscriptedGenericNode - Applied generics like list[int]
- GenericAliasNode - Generic class aliases
Composite types combine other types:
- UnionNode - Union types (A | B, Union[A, B])
- TupleNode - Tuple types (heterogeneous and homogeneous)
- CallableNode - Callable signatures
Special forms represent typing system constructs:
- AnyNode - typing.Any
- NeverNode - typing.Never
- SelfNode - typing.Self
- LiteralNode - Literal[...] values
Type parameters represent generic placeholders:
- TypeVarNode - Type variables
- ParamSpecNode - Parameter specifications
- TypeVarTupleNode - Variadic type variables
Structured types represent classes with fields:
- DataclassNode - Dataclasses
- TypedDictNode - TypedDict classes
- NamedTupleNode - NamedTuple classes
- ProtocolNode - Protocol definitions
- EnumNode - Enum classes
Memory efficiency and immutability¶
The library implements all nodes as frozen dataclasses with slots=True for memory efficiency and immutability. This design:
- Reduces memory footprint via __slots__
- Guarantees immutability after construction via frozen=True
- Enables safe caching (code cannot mutate nodes)
- Makes nodes hashable (usable as dictionary keys and in sets)
- Ensures thread safety for concurrent read access
Classes with computed fields (like _children) set their derived values in __post_init__ through object.__setattr__, which bypasses the frozen dataclass's __setattr__ guard during initialization. After construction, ordinary attribute assignment raises an error.
The children method¶
Every node implements children() to enable graph traversal:
# snippet - simplified internal implementation
@dataclass(slots=True, frozen=True)
class SubscriptedGenericNode(TypeNode):
    origin: GenericTypeNode
    args: tuple[TypeNode, ...]

    def children(self) -> Sequence[TypeNode]:
        return self.args  # Type arguments are the children
What counts as "children" depends on the node type:
- SubscriptedGenericNode → type arguments
- UnionNode → union members
- DataclassNode → field types
- ConcreteNode → empty (leaf node)
Graph traversal¶
The children() method provides structural traversal, but many use cases require semantic context: knowing whether a child is a dictionary key versus value, or a function parameter versus return type. The edges() method complements children() by providing this relationship metadata through TypeEdgeConnection objects.
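The difference between the two views can be illustrated with a toy model. The Node class, its links field, and the walk helper below are hypothetical. In particular, the real TypeEdgeConnection objects are represented here as plain (label, child) pairs, which is an assumption made for brevity.

```python
from dataclasses import dataclass
from typing import Iterator


@dataclass(frozen=True)
class Node:
    """Hypothetical node: links holds (role label, child) pairs."""

    name: str
    links: tuple[tuple[str, "Node"], ...] = ()

    def children(self) -> tuple["Node", ...]:
        # Structural view: just the child nodes, in order.
        return tuple(child for _, child in self.links)

    def edges(self) -> tuple[tuple[str, "Node"], ...]:
        # Semantic view: each child together with its role.
        return self.links


def walk(node: Node, depth: int = 0) -> Iterator[tuple[int, Node]]:
    """Depth-first, pre-order traversal that relies only on children()."""
    yield depth, node
    for child in node.children():
        yield from walk(child, depth + 1)


# A hand-built model of dict[str, list[int]].
mapping = Node(
    "dict",
    (
        ("key", Node("str")),
        ("value", Node("list", (("element", Node("int")),))),
    ),
)

visited = [n.name for _, n in walk(mapping)]
roles = [label for label, _ in mapping.edges()]
```

The traversal only needs children(), while code that must tell the key type from the value type reads the labels from edges().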
Edges caching¶
Both children() and edges() return pre-computed results with no allocation overhead at call time. The library computes these values once during node construction in __post_init__, storing them as tuples:
# snippet - simplified internal pattern
@dataclass(slots=True, frozen=True)
class SubscriptedGenericNode(TypeNode):
    origin: GenericTypeNode
    args: tuple[TypeNode, ...]
    _children: tuple[TypeNode, ...] = field(init=False, repr=False)
    _edges: tuple[TypeEdgeConnection, ...] = field(init=False, repr=False)

    def __post_init__(self) -> None:
        # Compute once at construction; object.__setattr__ bypasses the frozen guard
        object.__setattr__(self, "_children", self.args)
        object.__setattr__(self, "_edges", self._build_edges())

    def children(self) -> Sequence[TypeNode]:
        return self._children  # Direct return, no computation

    def edges(self) -> Sequence[TypeEdgeConnection]:
        return self._edges  # Direct return, no computation
This design follows the principle of paying for what you use: children() returns the lightweight tuple directly for simple traversal, while edges() provides the richer semantic context when needed. Neither method performs work at call time because both return pre-built results.
Why tuples instead of lists?
Tuples provide three benefits over lists. First, they are immutable, which aligns with the frozen dataclass design. Second, they are hashable as long as their elements are, so nodes that hold them can be used in sets and as dictionary keys. Third, they consume slightly less memory than equivalent lists. For collections that never change after construction, tuples are the natural choice.
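These properties are easy to check directly. The sketch below uses only the standard library. The size comparison reflects CPython's object layout and is an implementation detail rather than a language guarantee.

```python
import sys


def is_hashable(obj: object) -> bool:
    """Return True if hash() accepts the object."""
    try:
        hash(obj)
    except TypeError:
        return False
    return True


as_tuple = (int, str, bytes)
as_list = [int, str, bytes]

tuple_hashable = is_hashable(as_tuple)  # tuples of hashable items are hashable
list_hashable = is_hashable(as_list)    # lists never are

# In CPython a tuple has a smaller fixed header than a list of the same length.
tuple_size = sys.getsizeof(as_tuple)
list_size = sys.getsizeof(as_list)
```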
Thread safety¶
The combination of frozen dataclasses and immutable tuples makes type nodes safe for concurrent read access without synchronization. Multiple threads can call children() and edges() simultaneously on the same node with no risk of data races or inconsistent views.
This thread safety emerges from the design rather than explicit locking. Because nodes cannot change after construction and the returned tuples are also immutable, there is no shared mutable state to protect. Each thread receives a reference to the same unchanging data.
The global inspection cache also benefits from this design. Once a node is cached, any thread can retrieve and traverse it safely. The only synchronization needed is in the cache insertion path, which the library handles internally.
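The text does not describe how the library synchronizes cache insertion, so the following is only one plausible shape for it: a lock around the check-and-insert step, with lock-free reads of the immutable result afterwards. The names get_or_build, _cache, and _lock are hypothetical.

```python
import threading
from concurrent.futures import ThreadPoolExecutor

_cache: dict[object, tuple[str, ...]] = {}
_lock = threading.Lock()


def get_or_build(key, build):
    """Return the cached value for key, building it at most once."""
    with _lock:
        node = _cache.get(key)
        if node is None:
            node = build(key)
            _cache[key] = node
        return node


def build(key: type) -> tuple[str, ...]:
    # Stand-in for an expensive inspection; returns an immutable result.
    return (key.__name__,)


# Many threads request the same key; all of them receive the same object.
with ThreadPoolExecutor(max_workers=8) as pool:
    results = list(pool.map(lambda k: get_or_build(k, build), [int] * 100))
```

Once the value is in the cache, readers never need the lock to traverse it, because nothing they can reach is mutable.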
For more on how edges encode semantic relationships like key/value and parameter/return, see Graph edges and semantic relationships.
MetadataCollection¶
Every TypeNode carries a metadata attribute containing any metadata attached via Annotated. The MetadataCollection class provides an immutable, type-safe container for this metadata.
Design principles¶
MetadataCollection follows the same design principles as the node layer:
- Immutable: All transformation methods return new collections
- Memory efficient: The EMPTY singleton avoids allocating empty collections
- Thread-safe: Immutability enables safe concurrent access
- Sequence protocol: Works naturally with standard Python patterns
Rich query API¶
The collection provides specialized methods for metadata operations:
- Query: find(), find_all(), find_first(), get(), get_required()
- Existence: has(), count(), is_empty
- Filter: filter(), filter_by_type(), first(), any()
- Protocol: find_protocol(), has_protocol(), count_protocol()
- Transform: exclude(), unique(), sorted(), reversed(), partition(), map()
- Introspection: types(), by_type()
See Metadata and Annotated for details on how metadata flows through the inspection process.
The configuration layer¶
InspectConfig¶
InspectConfig controls all aspects of inspection:
@dataclass(slots=True)
class InspectConfig:
    eval_mode: EvalMode = EvalMode.DEFERRED
    globalns: dict | None = None
    localns: dict | None = None
    max_depth: int | None = None
    hoist_metadata: bool = True
    include_source_locations: bool = False
    # ... class inspection options
Forward reference handling¶
The eval_mode option controls how the library handles forward references:
| Mode | Behavior |
|---|---|
| EAGER | Resolve immediately, fail on errors |
| DEFERRED | Create ForwardRefNode nodes for unresolvable references |
| STRINGIFIED | Keep as strings, resolve lazily |
DEFERRED (the default) provides the best balance: resolution when possible, graceful handling when not.
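The three behaviors in the table can be mimicked in a few lines. This sketch is an assumption about the general idea, not the library's resolver: Mode and resolve are hypothetical names, a plain typing.ForwardRef stands in for ForwardRefNode, and names are evaluated with eval, which is only appropriate for trusted annotation strings.

```python
import enum
from typing import Any, ForwardRef


class Mode(enum.Enum):
    """Hypothetical mirror of the three evaluation modes."""

    EAGER = "eager"
    DEFERRED = "deferred"
    STRINGIFIED = "stringified"


def resolve(name: str, namespace: dict[str, Any], mode: Mode) -> Any:
    if mode is Mode.STRINGIFIED:
        return name  # keep the string; resolution happens later, if ever
    try:
        # Copy the namespace because eval() inserts __builtins__ into it.
        return eval(name, dict(namespace))
    except NameError:
        if mode is Mode.EAGER:
            raise  # fail immediately on an unresolvable name
        return ForwardRef(name)  # placeholder that can be resolved later


resolved = resolve("int", {}, Mode.DEFERRED)
placeholder = resolve("NotDefinedYet", {}, Mode.DEFERRED)
kept_as_text = resolve("NotDefinedYet", {}, Mode.STRINGIFIED)
```

With Mode.EAGER the same unresolvable name raises NameError instead of producing a placeholder.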
InspectContext¶
Internally, the library uses InspectContext to track state during inspection:
- depth - Current recursion level for depth limiting
- seen - Map of visited types for cycle detection
- resolving - Forward references the library currently resolves
The library passes this context through recursive calls but hides it from the public API.
Caching¶
The library maintains a global cache of inspection results keyed by (type, config). This provides:
- Performance - Repeated inspection of the same type returns the cached node instead of rebuilding it
- Consistency - Same input always produces same node instance
- Memory efficiency - Complex type graphs are only built once
Cache management functions:
- cache_clear() - Clear all cached results
- cache_info() - Get cache statistics (hits, misses, size)
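The standard library's functools.lru_cache exposes functions with the same names, which makes it a convenient way to sketch the behavior. Whether typing-graph is built on lru_cache is not stated here. ToyConfig and cached_inspect are hypothetical, and the config is frozen in this sketch so that it is hashable and can take part in the (type, config) key.

```python
from dataclasses import dataclass
from functools import lru_cache


@dataclass(frozen=True)
class ToyConfig:
    """Hypothetical hashable config so it can be part of the cache key."""

    hoist_metadata: bool = True


@lru_cache(maxsize=None)
def cached_inspect(annotation: type, config: ToyConfig) -> tuple:
    # Stand-in for real inspection; the returned tuple plays the node's role.
    return (annotation, config.hoist_metadata)


first = cached_inspect(int, ToyConfig())
second = cached_inspect(int, ToyConfig())  # equal key, so this is a cache hit
same_instance = first is second

info = cached_inspect.cache_info()  # CacheInfo(hits=..., misses=..., ...)
```

Calling cached_inspect.cache_clear() empties the cache, after which the next call rebuilds the result.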
Integration with typing-inspection¶
typing-graph builds on Pydantic's typing-inspection library for low-level type introspection. This provides:
- Robust handling of Python version differences
- Correct qualifier extraction
- Forward reference evaluation utilities
The Qualifier enum is re-exported from typing-inspection for convenience.
Design principles¶
The architecture reflects several deliberate design choices. Understanding these principles helps you predict how the library behaves and why.
Standard library first¶
The library minimizes dependencies, relying primarily on the standard library. External dependencies are carefully chosen:
- typing-inspection - Battle-tested type introspection from Pydantic
- typing-extensions - Backports of modern typing features
- annotated-types (optional) - Standard constraint vocabulary
Why limit dependencies?
Dependencies create maintenance burden, compatibility constraints, and security surface area. For a library that may be used transitively by many projects, each dependency multiplies the risk of version conflicts. The standard library is always available and evolves with Python itself.
The dependencies typing-graph does include are chosen for their stability and widespread adoption. typing-inspection comes from the Pydantic ecosystem and handles the gnarliest version-specific edge cases. typing-extensions is effectively part of the standard library with a different release cadence.
Immutable by design¶
The library freezes all TypeNode dataclasses, guaranteeing immutability after construction. Tuples store metadata (not lists), and qualifier sets use frozensets. This immutability enables nodes to be hashable, cached safely, and accessed concurrently without synchronization.
The case for immutability
Immutability might seem like an unnecessary constraint. Why not let consumers modify nodes if they want to? The answer lies in the caching system. typing-graph caches inspection results to avoid repeatedly analyzing the same types. If nodes were mutable, cached nodes could be modified unexpectedly by any code that received them, breaking the cache's integrity.
Immutability also enables safe sharing in concurrent code. Multiple threads can hold references to the same node without synchronization because no thread can modify the shared state.
The trade-off is that creating modified versions of nodes requires constructing new instances. In practice, this rarely matters because most consumers read from the type graph rather than transforming it.
Lazy evaluation¶
The library uses lazy evaluation where possible:
- The library builds type graphs on-demand
- Forward references can defer resolution
- Each node's children are computed only when that node is built, not for the whole theoretical graph up front
This approach keeps memory usage proportional to what you actually inspect, not to the theoretical size of the complete type graph.
Type safety¶
The library is fully typed and passes strict basedpyright checking. Type guard functions (is_concrete_node(), is_union_type_node(), etc.) enable type-safe node discrimination.
Relationship to the Python ecosystem¶
typing-graph builds on Pydantic's typing-inspection library, which provides battle-tested utilities for low-level type introspection. This relationship is deliberate: rather than reimplementing complex version-specific logic, typing-graph uses typing-inspection for the foundational layer and focuses on providing the graph abstraction.
The library also integrates with annotated-types for standard constraint types like Gt, Le, and MaxLen. This integration is optional (typing-graph works with any metadata objects) but provides convenient support for the emerging standard vocabulary of type constraints.
Practical application¶
Now that you understand the architecture, apply this knowledge:
- Traverse the node hierarchy with Walking the type graph
- Configure inspection behavior with Configuration options
- Work with metadata using Metadata queries
See also¶
- Type node - Glossary definition
- Type graph - Glossary definition
- Metadata hoisting - Glossary definition