Skip to content

How to combine and transform metadata

This guide shows you how to transform MetadataCollection instances through combining, sorting, deduplicating, and mapping operations. You'll learn to merge collections, remove duplicates, sort by custom criteria, and extract values.

Quick reference

Goal Method Returns
Merge two collections + / \| MetadataCollection
Exclude specific types exclude() MetadataCollection
Expand grouped constraints flatten() MetadataCollection
Recursively expand all groups flatten_deep() MetadataCollection
Remove duplicates unique() MetadataCollection
Sort items sorted() MetadataCollection
Reverse order reversed() MetadataCollection
Extract values map() tuple
Split by condition partition() tuple[MetadataCollection, MetadataCollection]
Get unique types types() frozenset[type]
Group by type by_type() Mapping[type, MetadataCollection]

Merging metadata from multiple sources

add (+) operator

Use + to combine two collections:

from typing_graph import MetadataCollection

a = MetadataCollection(_items=(1, 2))
b = MetadataCollection(_items=(3, 4))

combined = a + b
print(list(combined))  # [1, 2, 3, 4]

or (|) operator

The | operator works identically to +:

from typing_graph import MetadataCollection

a = MetadataCollection(_items=(1, 2))
b = MetadataCollection(_items=(3, 4))

combined = a | b
print(list(combined))  # [1, 2, 3, 4]

Operator equivalence

The + and | operators are completely equivalent, so use whichever reads better in your context. The | operator is often more intuitive when thinking of collections as sets being combined.

Preserving order

Both operators preserve the order of items from left to right:

from typing_graph import MetadataCollection

first = MetadataCollection(_items=("a", "b"))
second = MetadataCollection(_items=("c", "d"))
third = MetadataCollection(_items=("e", "f"))

# Chain multiple combinations
result = first + second + third
print(list(result))  # ['a', 'b', 'c', 'd', 'e', 'f']

Excluding specific metadata types

exclude() by type

Use exclude() to remove items of specific types:

from typing_graph import MetadataCollection

coll = MetadataCollection(_items=("a", 1, "b", 2))
no_strings = coll.exclude(str)
print(list(no_strings))  # [1, 2]

Chaining exclusions

Exclude multiple types by chaining calls or passing multiple types:

from typing_graph import MetadataCollection

coll = MetadataCollection(_items=("a", 1, True, 2.5, "b"))

# Chain exclusions
result = coll.exclude(str).exclude(bool)
print(list(result))  # [1, 2.5]

# Or exclude multiple types at once
result = coll.exclude(str, bool)
print(list(result))  # [1, 2.5]

Expanding grouped constraints

GroupedMetadata is a protocol from the annotated-types library for metadata that contains other metadata. For example, Interval(ge=0, le=100) groups Ge(0) and Le(100) into a single constraint. See GroupedMetadata automatic flattening for more background.

flatten() single level

Use flatten() to expand GroupedMetadata items one level:

from annotated_types import Ge, Interval, Le
from typing_graph import MetadataCollection

interval = Interval(ge=5, le=15)
coll = MetadataCollection.of([interval], auto_flatten=False)
flattened = coll.flatten()
print(list(flattened))  # [Ge(ge=5), Le(le=15)]

flatten_deep() recursive

Use flatten_deep() to recursively expand nested GroupedMetadata:

from typing_graph import MetadataCollection

# For deeply nested GroupedMetadata structures
coll = MetadataCollection(_items=(1, 2, 3))
deep_flat = coll.flatten_deep()
print(list(deep_flat))  # [1, 2, 3]
When to use flatten() vs flatten_deep()
Method Behavior Use when
flatten() Expands one level You only want immediate children expanded
flatten_deep() Recursively expands all levels You want all nested GroupedMetadata fully unwrapped

In practice, flatten() is sufficient for most cases since annotated-types constraints don't nest GroupedMetadata deeply. Use flatten_deep() only if you have custom GroupedMetadata implementations that might be nested.

auto_flatten parameter

Both flatten methods return self if no GroupedMetadata items exist, avoiding unnecessary allocations.

When creating collections, auto_flatten=True (the default) automatically flattens:

from annotated_types import Interval
from typing_graph import MetadataCollection

interval = Interval(ge=0, le=100)

# Default: auto_flatten=True
coll = MetadataCollection.of([interval])
print(len(coll))  # 2 (Ge and Le)

# Disable flattening
coll = MetadataCollection.of([interval], auto_flatten=False)
print(len(coll))  # 1 (the Interval itself)

Removing duplicates and sorting

unique() for duplicates

Use unique() to remove duplicate items:

from typing_graph import MetadataCollection

coll = MetadataCollection(_items=(1, 2, 1, 3, 2))
unique = coll.unique()
print(list(unique))  # [1, 2, 3]

The method preserves first occurrence order and handles unhashable items:

from typing_graph import MetadataCollection

# Works with unhashable items too
coll = MetadataCollection(_items=([1, 2], [3, 4], [1, 2]))
unique = coll.unique()
print(list(unique))  # [[1, 2], [3, 4]]

Performance with unhashable items

unique() is O(n) for hashable items but O(n^2) for unhashable items. For collections with unhashable items and many duplicates, consider filtering before calling unique().

sorted() with default key

Use sorted() to sort items:

from typing_graph import MetadataCollection

coll = MetadataCollection(_items=(3, 1, 2))
sorted_coll = coll.sorted()
print(list(sorted_coll))  # [1, 2, 3]

The default sort key groups items by type name first, then by value:

from typing_graph import MetadataCollection

coll = MetadataCollection(_items=("b", 2, "a", 1))
sorted_coll = coll.sorted()
# Integers before strings (alphabetically by type name)
print(list(sorted_coll))  # [1, 2, 'a', 'b']

sorted() with custom key

Provide a custom key function for specific ordering:

from typing_graph import MetadataCollection

coll = MetadataCollection(_items=("bb", "a", "ccc"))
by_len = coll.sorted(key=len)
print(list(by_len))  # ['a', 'bb', 'ccc']
from dataclasses import dataclass
from typing_graph import MetadataCollection

@dataclass(frozen=True)
class Constraint:
    value: int

coll = MetadataCollection(_items=(
    Constraint(10),
    Constraint(5),
    Constraint(20)
))
by_value = coll.sorted(key=lambda c: c.value)
print(list(by_value))  # [Constraint(value=5), Constraint(value=10), Constraint(value=20)]

reversed() ordering

Use reversed() to reverse the order:

from typing_graph import MetadataCollection

coll = MetadataCollection(_items=(1, 2, 3))
rev = coll.reversed()
print(list(rev))  # [3, 2, 1]

Combine with sorted() for descending order:

from typing_graph import MetadataCollection

coll = MetadataCollection(_items=(3, 1, 2))
descending = coll.sorted().reversed()
print(list(descending))  # [3, 2, 1]

Extracting values from metadata

map() returns tuple

Use map() to transform items. This is a terminal operation that returns a tuple, not a new collection:

Terminal operation

map() returns a tuple, not a MetadataCollection. This is intentional because the transformed values may not be valid metadata items. If you need to chain operations, apply map() last.

from typing_graph import MetadataCollection

coll = MetadataCollection(_items=(1, 2, 3))
doubled = coll.map(lambda x: x * 2)
print(doubled)       # (2, 4, 6)
print(type(doubled)) # <class 'tuple'>

Extract specific attributes:

from dataclasses import dataclass
from typing_graph import MetadataCollection

@dataclass(frozen=True)
class Constraint:
    value: int

coll = MetadataCollection(_items=(Constraint(10), Constraint(5), Constraint(20)))
values = coll.map(lambda c: c.value)
print(values)  # (10, 5, 20)

partition() for splitting

Use partition() to split a collection by a predicate:

from typing_graph import MetadataCollection

coll = MetadataCollection(_items=(1, 2, 3, 4, 5))
evens, odds = coll.partition(lambda x: x % 2 == 0)
print(list(evens))  # [2, 4]
print(list(odds))   # [1, 3, 5]

The first collection contains items where the predicate is True, the second where it's False:

from dataclasses import dataclass
from typing_graph import MetadataCollection

@dataclass(frozen=True)
class Constraint:
    value: int
    strict: bool = False

coll = MetadataCollection(_items=(
    Constraint(0, strict=True),
    Constraint(10, strict=False),
    Constraint(5, strict=True)
))
strict, lenient = coll.partition(lambda c: c.strict)
print(list(strict))   # [Constraint(value=0, strict=True), Constraint(value=5, strict=True)]
print(list(lenient))  # [Constraint(value=10, strict=False)]

Analyzing metadata composition

types() for unique types

Use types() to get the unique types in a collection:

from typing_graph import MetadataCollection

coll = MetadataCollection(_items=("a", 1, "b", 2.0))
types = coll.types()
print(sorted(t.__name__ for t in types))  # ['float', 'int', 'str']

Returns a frozenset of types:

from typing_graph import MetadataCollection

coll = MetadataCollection(_items=(1, 2, 3))
types = coll.types()
print(type(types))  # <class 'frozenset'>
print(int in types)  # True

by_type() for grouping

Use by_type() to group items by their type:

from typing_graph import MetadataCollection

coll = MetadataCollection(_items=("a", 1, "b", 2))
grouped = coll.by_type()
print(list(grouped[str]))  # ['a', 'b']
print(list(grouped[int]))  # [1, 2]

The returned mapping is immutable:

from typing_graph import MetadataCollection

coll = MetadataCollection(_items=(1, 2, "a"))
grouped = coll.by_type()

# Access groups
for type_, items in grouped.items():
    print(f"{type_.__name__}: {list(items)}")
# int: [1, 2]
# str: ['a']

Result

You can now merge collections with + or |, exclude types with exclude(), expand grouped constraints with flatten() and flatten_deep(), remove duplicates with unique(), sort with sorted(), extract values with map(), split with partition(), and analyze composition with types() and by_type().

See also