← Back

When Does Homomorphism Expressivity Exist? A Meta-Theorem for Stabilizing Refinement GNNs

Metadata

Table of Contents

  1. 1. Introduction: why WL-style qualitative expressiveness is insufficient; recap homomorphism expressivity as a complete invariant; goal of a meta-theorem and ‘expressivity certificate’ conditions.
  2. 2. Preliminaries: graphs, rooted graphs, homomorphisms and spasms; refinement procedures; categorical product; overview of Dell–Grohe–Rattan-style linear relations used in completeness proofs.
  3. 3. The RWA Architecture Class: formal definition of tuple-state refinement with a neighbor relation template; invariant vs equivariant outputs; examples (MPNN, Subgraph, Local k, folklore).
  4. 4. Axioms for Existence and Uniqueness: (A1) equivariance, (A2) stabilization, (A3) locality schema, (A4) compositionality, (A5) unfolding identifiability; discussion, necessity, and a taxonomy of failure modes.
  5. 5. Upper Bound / Completeness: unfolding objects and bag-isomorphism counting; show representations are determined by hom counts on a derived pattern basis; matrix factorization argument (bag-isomorphism ↔︎ hom).
  6. 6. Lower Bound / Maximality: generalized pebble games parameterized by the neighbor relation template; construction of generalized F"urer gadgets and twists; rooted (clique-augmented) gadgets for node/edge-level; prove that every forbidden pattern yields a separating pair.
  7. 7. Recipe to Derive F𝒜: how to compute/describe the candidate family from an architecture spec; conversion of unfolding-basis descriptions into structural decompositions (treewidth/NED-like) when possible; guidance for new architectures.
  8. 8. Instantiations and Corollaries: recover known FM for standard GNNs; formalize the pathological counterexample and show exactly which axiom it violates; additional examples (distance-guarded variants, fixed sparse-attention variants).
  9. 9. Complexity and Constructivity: size bounds for gadget counterexamples; decidability/complexity of membership tests under common structural characterizations; limits of full automation.
  10. 10. Discussion and 2026 Outlook: how the meta-theorem can underwrite expressivity claims for modern models; open directions (learned neighborhoods, approximate variants, transformers).

Content

1. Introduction: why WL-style qualitative expressiveness is insufficient; recap homomorphism expressivity as a complete invariant; goal of a meta-theorem and ‘expressivity certificate’ conditions.

The contemporary theory of message-passing and Weisfeiler–Leman (WL) style graph representations has largely developed around a notion of expressiveness: an architecture is deemed stronger if it distinguishes strictly more non-isomorphic graphs, or if it simulates a higher-dimensional refinement. While such comparisons are useful as a first taxonomy, they do not by themselves constitute a complete description of what a representation . In particular, (i) qualitative distinguishability is insensitive to the of the separating signal (one witness substructure versus exponentially many), (ii) it fails to capture the many settings where the downstream task is inherently -like (motifs, partition functions, kernel features), and (iii) it does not directly yield a tight ``if and only if’’ characterization of the information preserved by a stable representation.

A canonical illustration is provided by pairs of graphs that are indistinguishable by a given refinement (say, 1-WL), but that differ drastically in the number of occurrences of a fixed pattern. In such a case, any invariant extracted solely from the stable colors cannot recover that pattern statistic; yet qualitative language—``the architecture cannot distinguish the graphs’’—does not specify statistics are lost, nor does it provide a systematic recipe to certify loss of information for pattern outside the architecture’s range. Conversely, when an architecture distinguish two graphs, qualitative statements do not identify a canonical family of witnesses that explains the distinction in a uniform algebraic way.

For refinement-based architectures, there is a well-established quantitative alternative: . For a fixed pattern graph F, the number hom(F, G) of homomorphisms from F to an input graph G is a robust, compositional statistic. First, it linearizes naturally: many classical graph invariants can be expressed as (finite) linear combinations of homomorphism counts, and conversely, subgraph counts can be recovered from homomorphism counts through inclusion–exclusion over quotients (the ``spasm’’ formalism). Second, homomorphism counts behave well under the standard algebraic operations on graphs (in particular, disjoint union and categorical product), which makes them compatible with closure properties that arise implicitly when one analyzes stable refinement procedures. Third, and crucially for our purposes, homomorphism counts admit relative to refinement: for several canonical variants of WL and local k-tuple methods, equality of stable representations is equivalent to equality of hom(F, ⋅) over an explicitly characterized family of patterns F.

Our objective is to lift these completeness phenomena to a single meta-theorem that treats a broad class of architectures uniformly. We do not fix a particular GNN or a particular WL dimension a priori. Instead, we consider an abstract refinement-with-aggregation (RWA) architecture 𝒜 specified by (a) an initialization of colors on k-tuples, (b) an equivariant round update that aggregates colors over a fixed neighbor-relation template R on k-tuples, and (c) an invariant (graph-level) or equivariant (node-/edge-level) readout of the stabilized colors. Under mild axioms ensuring that 𝒜 behaves like a genuine refinement procedure rather than an ad hoc encoding, we aim to associate to 𝒜 a family of patterns F𝒜 that plays the role of a complete quantitative invariant: the stable representation χ𝒜(G) should be determined exactly by the vector (hom(F, G))F ∈ F𝒜, and conversely, if two graphs have identical stable representations, then their homomorphism counts agree for all F ∈ F𝒜. The maximality requirement is essential: it is not merely that there exists family that is complete, but that there is a unique such family, so that membership F ∈ F𝒜 is the sharp dividing line between what the architecture can and cannot determine from its stable output.

To make such a statement meaningful across architectures, we must articulate what constitutes an ``expressivity certificate.’’ Informally, a certificate for 𝒜 should provide three ingredients.

We require an unfolding semantics that explains stabilized colors as isomorphism types of local combinatorial objects generated by iterating R. For 1-WL this basis is the family of rooted unfolding trees; for higher-dimensional methods one obtains hypertree-like objects or canonical decomposition fragments. The meta-theorem demands that the architecture admit such a basis , so that the stable colors do not hide information unrelated to the refinement dynamics. This is the role of an unfolding identifiability condition: distinct stable colors must correspond to non-isomorphic unfoldings.

Counting unfolding types inside an input graph is not yet the same as counting homomorphisms from ordinary patterns, because unfoldings are typically rooted, layered, or bagged objects rather than plain graphs. The certificate therefore includes a reduction showing that the relevant counts (bag-isomorphisms, typed embeddings, or their analogues) are expressible as a finite triangular linear system in homomorphism counts of an induced family of pattern graphs. This is the mechanism behind completeness: once the stable representation is determined by counts of unfolding types, and those are in turn determined by homomorphism counts of induced patterns, we obtain a homomorphism characterization of representation equality.

Maximality requires more than an upper bound. For every pattern F that is compatible with the unfolding basis, we need an explicit pair of graphs GF, HF such that 𝒜 produces identical stable representations on GF and HF, yet hom(F, GF) ≠ hom(F, HF). Such pairs witness that hom(F, ⋅) is determined by χ𝒜(⋅) and therefore that F ∉ F𝒜. The meta-theorem we develop demands a uniform construction of these separators from F, typically via a generalized F"urer-type gadget together with a ``twist’’ operation that preserves the architecture’s indistinguishability while flipping the target homomorphism count.

The axioms imposed on 𝒜 should be read precisely as conditions ensuring that the above three ingredients can be made canonical and architecture-independent. Equivariance (or invariance) of initialization and updates is necessary so that χ𝒜 respects isomorphism and the unfolding semantics is well-defined. Stabilization ensures that the refinement reaches a fixed point on every finite graph, so that the representation is a finite object and is amenable to counting arguments. The fixed neighbor-relation template R is the locus of locality: it is what permits the inductive definition of unfoldings and the extraction of a grammar of local witnesses. Compositional closure under disjoint union and categorical product rules out special-purpose behavior that would break the homomorphism algebra and would obstruct maximality; it is also the structural reason why homomorphism counts, rather than arbitrary statistics, emerge as the appropriate complete invariants. Finally, unfolding identifiability ensures that the architecture does not conflate non-isomorphic unfoldings into the same stable color in an uncontrolled way; without this, the unfolding basis would cease to be an invariant semantics of the representation, and the resulting ``expressivity family’’ could fail to exist.

We emphasize that our aim is not to re-prove isolated results for particular architectures, but to isolate a reusable proof pattern: from R and the refinement update, we derive an induced family of unfolding objects and hence an induced family of patterns, yielding a completeness direction; then, from violations of the induced structural constraints, we derive explicit gadget separators, yielding maximality. The outcome is a certificate that can be instantiated to recover known homomorphism characterizations (forests for message passing, bounded treewidth for suitable k-tuple models, and so on) while also applying to less standard neighbor relations, provided they satisfy the axioms.

The remainder of the paper is organized to support this program. In the next section we fix notation for graphs (including rooted variants), homomorphisms and their relation to subgraph counts, and we formalize refinement procedures in the RWA model. We also recall the categorical product operation and summarize the Dell–Grohe–Rattan style linear relations that convert between bagged counts and homomorphism counts, which will serve as the algebraic backbone of the completeness proofs. Subsequent sections then implement the certificate: we derive the unfolding basis associated with R, prove completeness via the resulting linear system, and establish maximality by constructing gadget pairs that witness the failure of any pattern outside the maximal family.


2. Preliminaries: graphs, rooted graphs, homomorphisms and spasms; refinement procedures; categorical product; overview of Dell–Grohe–Rattan-style linear relations used in completeness proofs.

We collect the basic objects and algebraic identities used throughout. All graphs are finite, simple, undirected, and vertex-labeled. We write 𝒢 for the class of such graphs.

A (labeled) graph is a triple G = (VG, EG, G), where VG is a finite set, $E_G\subseteq \binom{V_G}{2}$ is an undirected edge set, and G : VG → Σ is a label function into a fixed countable label set Σ. An isomorphism φ : G → H is a bijection φ : VG → VH such that {u, v} ∈ EG ⇔ {φ(u), φ(v)} ∈ EH and G(u) = H(φ(u)) for all u ∈ VG. We write G ≅ H if an isomorphism exists.

We frequently use rooted and multi-rooted variants. For u ∈ VG we write Gu for the rooted graph obtained by designating u as a distinguished vertex. For u, v ∈ VG we analogously write Guv for the graph rooted at an ordered pair (u, v); the order is important when we later treat k-tuples. More generally, for a tuple u = (u1, …, uk) ∈ VGk we write Gu for the k-rooted graph. Root-preserving isomorphisms are defined in the obvious way: φ : Gu → Hv must satisfy φ(ui) = vi for all i ∈ [k].

Let F, G ∈ 𝒢. A graph homomorphism h : F → G is a map h : VF → VG such that
{x, y} ∈ EF ⇒ {h(x), h(y)} ∈ EG,   and   F(x) = G(h(x))  ∀x ∈ VF.
We denote by Hom(F, G) the set of all such homomorphisms, and by hom(F, G) = |Hom(F, G)| its cardinality. For rooted graphs we enforce root preservation: for a ∈ VFk and u ∈ VGk we write Hom(Fa, Gu) for the homomorphisms h : F → G satisfying h(ai) = ui for all i ∈ [k], and hom(Fa, Gu) for their number.

We also use the related notion of injective embeddings and subgraph counts. Writing Emb(F, G) for the set of label-preserving injective homomorphisms, we define emb(F, G) = |Emb(F, G)|. If F is unlabeled or labels are rigid, one may further divide by |Aut(F)| to obtain the number sub(F, G) of (not-necessarily-induced) subgraphs isomorphic to F. For our purposes the fundamental statistic is hom(F, G), because it interacts cleanly with the graph operations that arise from refinement dynamics.

Homomorphisms conflate vertices of F, and it is therefore convenient to organize this conflation explicitly. A partition π of VF is if all vertices within a block have the same label. From such a partition we form a quotient graph F/π by contracting each block to a single vertex and placing an edge between two blocks if there exists at least one edge of F between the corresponding blocks. (Since our graphs are simple, loops produced by contracting an edge within a block are discarded; equivalently, we may restrict to partitions π in which no edge lies within a block.) The of F, denoted Spasm(F), is the set of isomorphism types of all such quotients F/π that can occur as homomorphic images of F.

The relevance is the standard fact that homomorphism counts are linear combinations of injective counts over Spasm(F), and conversely. Concretely, if we let surj(F, H) be the number of surjective homomorphisms from F onto H (up to equality on vertices, not modulo automorphisms), then for every G we have
hom(F, G) = ∑H ∈ Spasm(F)surj(F, H) ⋅ emb(H, G),
and by Möbius inversion on the partition lattice one obtains emb(F, G) as an integer linear combination of hom(H, G) over H ∈ Spasm(F). We will use this perspective only as a bookkeeping device: it explains why the passage from homomorphisms to substructures and back is always mediated by finite triangular linear systems indexed by quotients.

We next fix the minimal language for refinement-based representations. Let k ≥ 1. For a graph G, a k-tuple is u = (u1, …, uk) ∈ VGk. A (discrete) refinement state at round t is a coloring
cGt : VGk → Γt,
into some finite or countable set of colors Γt. A refinement round replaces cGt by cGt + 1 via an isomorphism-equivariant update rule that aggregates information from a specified neighborhood relation on k-tuples. We will formalize the update schema and the allowed neighbor relations in Section~3; for the present discussion we only require the following two generic properties.

First, refinement is : if φ : G → H is an isomorphism and u ∈ VGk, then cGt(u) = cHt(φ(u)) for all rounds t, where φ(u) is applied componentwise. Second, refinement : for each finite G there exists T = T(G) such that cGT = cGT + 1, hence cGt = cGT for all t ≥ T. The stabilized coloring induces a partition of VGk into color classes, and any invariant readout of this partition (e.g., the multiset of colors, or color histograms) yields an isomorphism-invariant graph representation. Rooted and equivariant outputs are obtained by restricting attention to tuples containing specified vertices; again, the precise readout interface will be fixed later.

Two graph operations play a structural role in our completeness and maximality arguments.

For disjoint union, given graphs G, H with disjoint vertex sets we write G ⊔ H for their union (labels and edges are inherited componentwise). Homomorphism counts interact with by component factorization: if F has connected components F1, …, Fm, then
$$ \mathrm{hom}(F,G)=\prod_{i=1}^m \mathrm{hom}(F_i,G), \qquad\text{and}\qquad \mathrm{hom}(F,G\sqcup H)=\prod_{i=1}^m \bigl(\mathrm{hom}(F_i,G)+\mathrm{hom}(F_i,H)\bigr). $$
In particular, for connected F we have hom(F, G ⊔ H) = hom(F, G) + hom(F, H).

For categorical product, the product G × H is defined by
VG × H := VG × VH,   {(g, h), (g, h)} ∈ EG × H ⇔ {g, g} ∈ EG ∧ {h, h} ∈ EH,
with vertex labels G × H(g, h) := (G(g), H(h)) (or any fixed encoding of pairs into Σ). The key identity is multiplicativity of homomorphisms:
hom(F, G × H) = hom(F, G) ⋅ hom(F, H)   for all F, G, H ∈ 𝒢.
This follows by the bijection Hom(F, G × H) ≅ Hom(F, G) × Hom(F, H) obtained by projecting a homomorphism h : VF → VG × VH to its two coordinate maps. Rooted variants satisfy the same identity when roots are taken as tuples: hom(Fa, (G × H)(u, v)) = hom(Fa, Gu)hom(Fa, Hv).

These two operations are the algebraic reason homomorphism counts form the appropriate coordinate system for refinement-based representations: the invariants preserved by stable refinement are compatible with and × under mild closure hypotheses, and therefore live naturally in the semiring generated by {hom(F, ⋅)}F ∈ 𝒢.

In completeness proofs we must bridge the gap between what refinement ``sees’’ locally and plain homomorphism counts. The refinement dynamics will be interpreted via (rooted trees, hypertrees, or bagged fragments) generated by iterating the neighbor relation on tuples. Counting occurrences of such objects in G typically yields quantities that are not immediately of the form hom(F, G), because unfoldings come with additional structure: layers, bags, or explicit equality constraints between repeated vertices.

The standard mechanism, originating in work of Dell–Grohe–Rattan and subsequently reused in several WL completeness results, is that these structured counts are linearly interreducible with homomorphism counts of finitely many ordinary patterns obtained from the unfolding object by ``forgetting structure’’ and taking quotients. We state this at the level of a schematic principle, which will be instantiated once U𝒜 is defined.

Fix a finite structured pattern P (think: a finite unfolding object with a set of bags and specified overlaps). Let biso(P, G) denote the number of of P into G: assignments of vertices of P to vertices of G that preserve labels and edges and that respect the intended identifications within bag overlaps. Such assignments can be viewed as homomorphisms from an auxiliary relational structure encoding the bag incidences; equivalently, they are homomorphisms from a graph equipped with extra equality constraints.

The conversion proceeds in two steps. First, equality constraints are eliminated by expanding over partitions of the vertex set of P, producing a finite set Quot(P) of ordinary graphs (quotients of the underlying graph of P) such that
biso(P, G) = ∑F ∈ Quot(P)αP, F hom(F, G)
for explicit integers αP, F. Second, by ordering quotients by (say) number of vertices and refining by isomorphism type, the coefficient matrix becomes triangular with nonzero diagonal; hence the system is invertible over , and we also have
hom(F, G) = ∑P ∈ UnfTypes( ≤ |VF|)βF, P biso(P, G)
for suitable coefficients βF, P. In particular, vectors of bag-isomorphism counts and vectors of homomorphism counts over the induced pattern family determine one another.

We will use this linear-algebraic bridge as follows: stabilization of refinement colors will be shown to be equivalent to counting unfolding types P inside G; the above relations then imply that χ𝒜(G) is determined exactly by hom(F, G) over the induced family of patterns F. The maximality direction will rely on explicit separators rather than linear algebra, but the same quotient-and-triangularity viewpoint clarifies why the resulting family is canonical.


3. The RWA Architecture Class: formal definition of tuple-state refinement with a neighbor relation template; invariant vs equivariant outputs; examples (MPNN, Subgraph, Local k, folklore).

We now formalize the class of refinement-based architectures covered by our meta-theorems. The common interface is that an architecture 𝒜 maintains discrete states (``colors’’) on k-tuples of vertices, updates these states for finitely many rounds by aggregating over a fixed tuple-neighborhood relation, and finally applies an invariant or equivariant readout. We emphasize that we treat the update as an information-theoretic refinement operator: whenever we write Hash below, it stands for an injective encoding of the input data into a fresh color alphabet (equivalently, a canonical naming of isomorphism types of the local view presented to the update).

Fix k ≥ 1. For each input graph G, the architecture maintains a sequence of colorings
cGt : VGk → Γt,   t = 0, 1, 2, …,
where Γt is a (finite or countable) color set that may grow with t. We allow repeated vertices in tuples, i.e., VGk rather than $\binom{V_G}{k}$, since equality patterns among coordinates often carry information in higher-order refinement.

The locality of the update is specified by a R. Concretely, R is a rule that, given a graph G and a tuple u ∈ VGk, returns a finite multiset (or set) of tuples RG(u) ⊆ VGk. The interpretation is that v ∈ RG(u) is a ``message source’’ for u. We do not impose a specific syntactic form on R here; later axioms will restrict R to be fixed across inputs and isomorphism-equivariant. For the present section it suffices that RG(u) is defined uniformly and is finite for each (G, u).

We optionally allow a single updated alongside tuple colors. This is a sequence gGt ∈ Δt that is intended to model the effect of graph-level tokens or global pooling used within each round.

An RWA round computes for each u ∈ VGk an aggregated neighborhood summary from current colors and then applies a deterministic refinement map. We write this schematically as

where Hash is injective in the tuple of its arguments (so two tuples receive the same new color if and only if their previous colors, aggregated neighbor-multisets, and global values coincide). If no global channel is present, one omits gGt. The aggregation operator is thus fixed to be multiset collection; the architectural variability is encoded in R (which determines what is collected) and in the initialization/readout interfaces described next. This abstraction matches the standard viewpoint that sufficiently expressive MLPs together with multiset pooling simulate injective hashing of neighbor-multisets.

When a global channel is present, we allow it to be updated by an invariant aggregation over tuple colors, e.g.,
$$ g_G^{t+1}:=\mathsf{Hash}_\mathrm{glob}\Bigl(g_G^t,\; \multiset{\,c_G^t(\mathbf{u}) : \mathbf{u}\in V_G^k\,}\Bigr), $$
again with Hashglob injective. This is a convenient way to include common ``global context’’ mechanisms while keeping the refinement semantics explicit.

Initialization fixes cG0 (and possibly gG0). In all examples of interest, cG0(u) depends only on the labeled equality/adjacency type induced by the tuple, for instance:
cG0(u) = G(u)  for k = 1,
and for k > 1 one may include (i) the label tuple (G(u1), …, G(uk)), (ii) the equality pattern on coordinates (ui = uj or not), and (iii) the adjacency pattern among coordinates ({ui, uj} ∈ EG or not). We keep initialization abstract, since later axioms only require that it respects isomorphisms.

Given an input graph G, we run the refinement until it stabilizes, i.e., until some T = T(G) satisfies cGT = cGT + 1 (and gGT = gGT + 1 if present). We denote the stabilized tuple-coloring by cG := cGT.

A graph-level representation is obtained by an isomorphism-invariant function of the stabilized coloring, typically the multiset histogram of tuple-colors:
$$ \chi_{\mathcal{A}}(G)\;:=\;\mathsf{Read}^{\mathrm{inv}}\Bigl(\multiset{\,c_G^\infty(\mathbf{u}) : \mathbf{u}\in V_G^k\,},\; g_G^\infty\Bigr). $$
The simplest choice is to take Readinv to be the identity on the multiset (or its multiplicity vector over ΓT), but we allow any invariant post-processing since our results depend only on the stable partition induced by cG.

Equivariant outputs are defined by restricting the stabilized coloring to tuples that contain designated vertices and then applying an invariant aggregation over the remaining coordinates. For example, a node-level representation may be defined by
$$ \chi_{\mathcal{A}}(u)\;:=\;\mathsf{Read}^{\mathrm{n}}\Bigl(\multiset{\,c_G^\infty(\mathbf{w}) : \mathbf{w}\in V_G^k,\; \exists i\in[k]\; \mathbf{w}_i=u\,}\Bigr), $$
and an ordered-pair (edge-level) representation analogously by conditioning on (u, v) appearing in specified coordinates:
$$ \chi_{\mathcal{A}}(u,v)\;:=\;\mathsf{Read}^{\mathrm{e}}\Bigl(\multiset{\,c_G^\infty(\mathbf{w}) : \mathbf{w}\in V_G^k,\; \mathbf{w}_1=u,\;\mathbf{w}_2=v\,}\Bigr), $$
with the obvious generalization to k-tuple outputs χ𝒜(u). This interface covers the usual situation where an architecture is permutation-equivariant at the tuple level and then applies coordinatewise pooling to obtain node/edge embeddings.

We record how several common GNN refinement schemes fit into the above template. These examples only serve to anchor the abstraction; later sections will characterize their induced homomorphism families.

Take k = 1 and define RG(u) := {v ∈ VG : {u, v} ∈ EG}. With initialization cG0(u) = G(u), the update
$$ c_G^{t+1}(u)=\mathsf{Hash}\Bigl(c_G^t(u),\multiset{\,c_G^t(v):v\in N_G(u)\,}\Bigr) $$
is exactly the standard color refinement / 1-WL update when Hash is injective. Many message-passing architectures are continuous relaxations of this scheme; at our level of abstraction they are treated as computing a stable partition coarser than or equal to the one produced by injective hashing.

Let u = (u1, …, uk). The classical k-WL neighborhood replaces one coordinate by an arbitrary vertex:
$$ \mathsf{R}_G^{\mathrm{WL}}(\mathbf{u}) :=\bigcup_{i=1}^k \bigl\{(u_1,\dots,u_{i-1},w,u_{i+1},\dots,u_k): w\in V_G\bigr\}. $$
This relation is global (it ranges over all vertices) and yields the usual k-dimensional refinement on tuples. Local variants are obtained by restricting w to a graph-theoretic neighborhood of ui, which we discuss next.

A common restriction is to only modify a coordinate along an edge. Define
$$ \mathsf{R}_G^{\mathrm{loc}}(\mathbf{u}) :=\bigcup_{i=1}^k \bigl\{(u_1,\dots,u_{i-1},w,u_{i+1},\dots,u_k): w\in N_G(u_i)\bigr\}. $$
This is a purely local schema: messages to u come from tuples obtained by a single-step move in exactly one coordinate. The resulting refinement is strictly weaker than k-WL in general but admits a clean unfolding semantics in terms of locally generated tuple-neighborhood trees.

For k = 2, writing (u, v) for a tuple, the local neighborhood becomes
RGpair(u, v) := {(u, v) : u ∈ NG(u)} ∪ {(u, v) : v ∈ NG(v)},
optionally augmented by the swapped pair (v, u) to model symmetric pair features. Such pair refinements underlie many ``2-GNN’’ and edge-aware architectures; the difference between variants is largely in which additional relations are included (e.g., conditioning on whether {u, v} ∈ EG in initialization, or adding special handling for diagonal pairs u = v).

Architectures that explicitly encode small rooted subgraphs can also be represented in the tuple framework by taking k = s for a fixed subgraph size s and interpreting a tuple u ∈ VGs as an ordered candidate vertex set for a pattern instance. One then chooses R so that neighboring tuples correspond to simple edits of the candidate set (e.g., replace one coordinate by a nearby vertex, swap coordinates to enforce permutation symmetry, or constrain replacements to maintain a connected induced subgraph around a designated root coordinate). The refinement colors on VGs can be read out to node-level summaries by pooling over tuples that contain a given node. This viewpoint captures a wide range of ``Subgraph-GNN’’ constructions in which the internal message passing over extracted subgraphs is abstracted into an injective refinement on the space of tuples representing those subgraphs.

Finally, several standard modifications are subsumed by the optional global channel and by allowing R to be a union of finitely many basic relations. For instance, adding a graph-level token that aggregates from all nodes each round corresponds to including gGt as above; incorporating multiple edge types or directions corresponds to defining RG(u) as a disjoint union of neighbor multisets indexed by relation type and including the index in the hashed summary. These extensions do not change the refinement nature of the computation; they only enrich the local view presented to Hash.

The remainder of the paper isolates the precise axioms under which such an RWA specification admits a canonical homomorphism-count characterization and a maximal separating family.


4. Axioms for Existence and Uniqueness: (A1) equivariance, (A2) stabilization, (A3) locality schema, (A4) compositionality, (A5) unfolding identifiability; discussion, necessity, and a taxonomy of failure modes.

The RWA interface of Section~ isolates a large class of refinement procedures, but it does not by itself guarantee that the resulting stable representations admit a canonical characterization by homomorphism counts, nor that the induced family of ``test patterns’’ is uniquely determined. We therefore impose five axioms (A1–A5). Each axiom formalizes a closure or uniformity principle that is implicitly satisfied by the standard refinement schemes (WL and its local variants) and is explicitly used in our completeness and maximality arguments.

We require that initialization, each round update, and the readout commute with graph isomorphisms. Concretely, for an isomorphism π : VG → VH between labeled graphs G and H, we extend π coordinatewise to tuples u = (u1, …, uk) ∈ VGk by π(u) = (π(u1), …, π(uk)). Then (A1) demands that for all t ≥ 0,
cHt(π(u)) = cGt(u),   gHt = gGt,
and that the readout is invariant/equivariant in the standard sense:
χ𝒜(H) = χ𝒜(G),   χ𝒜H(π(u)) = χ𝒜G(u),   χ𝒜H(π(u), π(v)) = χ𝒜G(u, v),
for graph-/node-/edge-level outputs (and analogously for general tuple outputs).
Axiom (A1) is not merely cosmetic: it is the minimal condition under which χ𝒜 can be compared to homomorphism counts, since hom(F, ⋅) is itself isomorphism-invariant (and the rooted variants are equivariant).

We assume that for every finite input graph G, the refinement dynamics reaches a fixed point in finitely many rounds:
∃ T(G) < ∞:  cGT(G) = cGT(G) + 1  (and gGT(G) = gGT(G) + 1 if present).
This axiom separates refinement procedures from more general recurrent computations. In the information-theoretic view where Hash is injective, standard WL-style refinements are monotone in the sense that they refine partitions of VGk and hence must stabilize after at most |VG|k proper refinement steps. However, the abstract RWA interface permits global channels and post-processing, and it allows color alphabets Γt to vary with t. Axiom (A2) asserts that, regardless of such details, the architecture admits a well-defined stable coloring cG that underlies χ𝒜. Without stabilization, there is no canonical limit partition'' to which one could attach a basis of local witnesses, and the equivalence relationχ𝒜(G) = χ𝒜(H)’’ becomes ill-behaved with respect to the algebraic manipulations used later.

We require that the neighbor relation is specified by a template R that is instantiated on each input graph and is isomorphism-equivariant. Formally, for every isomorphism π : G → H and every tuple u ∈ VGk, we demand
π(RG(u)) = RH(π(u))
as multisets of tuples. In particular, R may depend on k and on the labeled adjacency/equality type of u in G, but it may not depend on non-isomorphism-invariant features (such as vertex indices) nor on graph-specific parameters chosen adversarially per input. We also assume that RG(u) is finite for each (G, u), ensuring that one refinement round only inspects a finite ``local view’’ around u.

Axiom (A3) is the uniformity principle that makes an ``unfolding semantics’’ possible. The unfolding objects Unf𝒜(D)(G, u) used in Section~5 are defined by repeatedly expanding according to RG; if R were allowed to vary arbitrarily with G, there would be no single object family U𝒜 whose isomorphism types enumerate the possible local views across all graphs.

The meta-theorems rely on standard amplification and linear-independence arguments that operate by forming composite graphs from smaller ones. We therefore assume that the architecture is with respect to two operations.

For disjoint graphs G and H, the refinement on G ⊎ H should decompose into independent refinements on the components, up to a canonical combination at the readout stage. In its weakest form, we require that χ𝒜(G ⊎ H) is determined by χ𝒜(G) and χ𝒜(H) (and analogously for rooted/equivariant outputs when the root is contained in a specified component). This prevents architectures from encoding nonlocal ``cross-component’’ information not justified by the locality template R.

We also assume closure under the categorical product G × H, which is the standard device for turning local indistinguishability into global lower bounds and for enforcing linear independence of count vectors. Concretely, we require that the stable representation of G × H is a function of the stable representations of G and H, in a way compatible with the product structure (and similarly for rooted products). At the level of refinement, the intended reading is that the refinement operator respects product symmetries and does not contain ad hoc rules that ``detect’’ that a graph is a product and branch on that event.

Axiom (A4) rules out special-casing that would otherwise destroy the existence of a homomorphism family: maximality is proved by showing that indistinguishability is preserved under such compositions, while certain homomorphism counts multiply or combine predictably. If 𝒜 violates these closure properties, then indistinguishability may fail to be a congruence with respect to or ×, and the usual algebraic completion argument that produces a unique maximal family F𝒜 can break down (cf. Proposition~5).

Finally, we require that stable colors correspond faithfully to isomorphism types of unfolding objects induced by R. Informally, (A5) says that 𝒜 does not collapse two genuinely different local views into the same stable color unless those views are already indistinguishable by R-unfolding isomorphism. Formally, for each depth D we consider the rooted unfolding object Unf𝒜(D)(G, u) obtained by expanding from u according to RG and recording at each node the relevant initial type information (labels, equality patterns, adjacency within tuples, and, if present, the global-channel input viewed as a root annotation). Axiom (A5) asserts:

The first direction is typically proved by induction from (A1) and the definition of the update; the second direction is an additional identifiability requirement that excludes degenerate architectures whose update or readout intentionally merges distinct unfolding types. It is precisely this axiom that allows us to treat the stabilized colors as local witness objects and to define U𝒜 as the set of all unfoldings that can occur.

Each axiom corresponds to a distinct class of pathologies.
Violating (A1) allows the architecture to depend on vertex identifiers or arbitrary tie-breaking; then χ𝒜 need not be expressible in terms of any isomorphism-invariant counting logic, and homomorphism characterizations are inapplicable.
Violating (A2) permits oscillations or unbounded state growth; then there is no stable partition and no finite-depth unfolding basis that determines the output.
Violating (A3) leads to non-uniform locality where the neighborhood rule depends on global properties of the input graph; in that case there is no single induced unfolding family, and any purported F𝒜 would have to encode graph-dependent behavior.
Violating (A4) admits ``if–then’’ constructions that break congruence under or ×; such architectures can implement functions for which no unique maximal separating homomorphism family exists, as the closure steps in the maximality proof fail.
Violating (A5) allows intentional information loss that collapses distinct unfoldings; then U𝒜 cannot be recovered from the stable coloring, and the completeness direction in Section~5 no longer yields an characterization.

In the remainder, we work exclusively under (A1–A5). These axioms are strong enough to force a canonical unfolding semantics and weak enough to include the refinement schemes of interest; the subsequent completeness and maximality theorems should be read as consequences of this constrained, but still expressive, design space.


5. Upper Bound / Completeness: unfolding objects and bag-isomorphism counting; show representations are determined by hom counts on a derived pattern basis; matrix factorization argument (bag-isomorphism ↔︎ hom).

We establish the upper bound direction: the stable representation produced by 𝒜 is determined by homomorphism counts over an explicit pattern basis induced by the neighbor template R. Throughout we fix the tuple dimension k and write cGt : VGk → Γt for the tuple-coloring after t rounds.

For D ≥ 0, a graph G, and a tuple u ∈ VGk, we define the depth-D unfolding object Unf𝒜(D)(G, u) by iteratively expanding from u according to the neighbor relation RG. Concretely, nodes of the unfolding are R-walks (u = u0, u1, …, ud) of length d ≤ D such that ui + 1 ∈ RG(ui), with the root being the empty walk at u. Each node is annotated by the of the corresponding tuple in G, namely: the vertex labels (G(u1), …, G(uk)), the equality pattern on coordinates, and the adjacency pattern between coordinates (as induced by EG); if 𝒜 uses a global channel, we include its initial value as an additional root annotation. Children of a node corresponding to x are indexed by the multiset RG(x), so that multiplicities in RG(x) are represented as multiplicities of isomorphic subtrees. Isomorphisms between unfoldings are root-preserving, type-preserving, and respect these multiset child structures.

The relevance of unfoldings is that one refinement round is precisely one expansion step together with multiset aggregation. By induction on D, using (A1) and the fact that R is instantiated equivariantly on each input (A3), we obtain the following correctness statement.

If Unf𝒜(D)(G, u) ≅ Unf𝒜(D)(H, v), then cGD(u) = cHD(v).

By (A5), stable colors do not identify genuinely different unfolding types. In particular, for each pair (G, u) and (H, v), equality cG(u) = cH(v) implies that their unfoldings are isomorphic for all sufficiently large depths D (depending only on the stabilization depths of G and H). Thus stable tuple-colors may be treated as of unfolding isomorphism types.

Let U𝒜 be the set of all finite unfolding objects that arise as Unf𝒜(D)(G, u) for some G ∈ 𝒢, u ∈ VGk, and D ≥ 0, taken up to isomorphism. For each τ ∈ U𝒜, we write Nτ(G) for the number of tuples u ∈ VGk whose stabilized color corresponds to τ, equivalently whose unfolding type is τ at a depth beyond stabilization. Since χ𝒜(G) is (by definition of the RWA interface) an invariant of the multiset of stabilized tuple-colors (possibly post-processed by an invariant readout), we conclude:

For every G, the stable representation χ𝒜(G) is determined by the vector (Nτ(G))τ ∈ U𝒜, with finite support on each fixed G.

It remains to express these unfolding-type counts in terms of homomorphism counts of ordinary pattern graphs.

A depth-D unfolding object can be encoded by a finite capturing the same local constraints while remembering the tuple structure. Formally, with each unfolding type τ ∈ U𝒜 we associate a finite structure Bag(τ) consisting of (i) a finite graph-shaped incidence skeleton given by the unfolding tree truncated at depth D (for some representative depth where τ is realized), (ii) a distinguished of k vertices for each node of the skeleton, together with explicit equalities inside the bag, and (iii) edge and label constraints that enforce the adjacency/equality/label types recorded by τ. A from Bag(τ) into G is a mapping of all vertices of Bag(τ) to VG that is a graph homomorphism and is injective on each bag after quotienting by the enforced equalities, while preserving the marked root bag. We write biso(Bag(τ), G) for the number of such bag-isomorphisms.

The construction is arranged so that choosing a tuple u in G together with compatible witnesses for each expansion step of τ is equivalent to choosing a bag-isomorphism of Bag(τ) into G. Consequently, for each τ we have an identity of the form
Nτ(G) = biso(Bag(τ), G),
up to a fixed normalization that only depends on automorphisms of the bagged pattern (and may be absorbed by considering rooted variants of Bag(τ)). Hence χ𝒜(G) is determined by the family of numbers biso(Bag(τ), G) as τ ranges over U𝒜.

We now pass from bag-isomorphism counts to ordinary homomorphism counts. The key point is that bag-injectivity constraints can be removed by a finite inclusion–exclusion (equivalently, a Möbius inversion over partitions), yielding a linear combination of homomorphism numbers.

Fix a finite bagged pattern B (in our application, B = Bag(τ)). Consider all quotients of B obtained by identifying vertices in all possible ways consistent with the forced equalities. Each such quotient is an ordinary finite graph F together with induced labels; we denote by Spasm(B) the finite set of all graphs F obtainable in this way (up to isomorphism). For each F ∈ Spasm(B), there is a natural surjection q : B → F. Any homomorphism h : F → G composes with q to a homomorphism B → G that violates bag-injectivity exactly on those identifications represented by q. Standard Möbius inversion over the lattice of partitions inside each bag yields integer coefficients μB, F such that for all G,

Equation~ is finite because Spasm(B) is finite.

Conversely, for each finite graph F we may pick a canonical bagged expansion BF whose bags remember the tuple incidences that arise in our unfolding grammar (for instance, by taking a tree-shaped covering of F consistent with R and placing a k-bag at each node). When these expansions are ordered by refinement (coarser identifications first), the coefficient system μB, F becomes triangular with respect to this order, and the diagonal coefficients are 1. Therefore the linear transformation from the vector (hom(F, G))F ∈ Pat to (biso(B, G))B ∈ BagPat is invertible over on each finite truncation of the index sets. In particular, for any finite subcollection of bagged patterns, there exists a finite graph family and an integer matrix M with det (M) = ±1 such that
(biso(B, G))B ∈ ℬ = M ⋅ (hom(F, G))F ∈ ℱ
for all G. Hence the bag-isomorphism counts and the corresponding homomorphism counts carry exactly the same information.

Applying this to ℬ = {Bag(τ) : τ ∈ U𝒜( ≤ D)}, the set of unfolding types realizable up to some fixed depth D, we obtain a finite induced pattern set Pat(U𝒜( ≤ D)) of ordinary graphs such that, for all G, the depth-D unfolding statistics (and thus the depth-D colors) are determined by (hom(F, G))F ∈ Pat(U𝒜( ≤ D)). Letting D exceed stabilization depths yields the desired completeness statement.

Define Pat(U𝒜) := ⋃D ≥ 0Pat(U𝒜( ≤ D)). For each fixed input graph G, only finitely many depths are relevant (beyond stabilization no new tuple-colors appear), so the homomorphism vector (hom(F, G))F ∈ Pat(U𝒜) determines χ𝒜(G). Moreover, if χ𝒜(G) = χ𝒜(H), then the stabilized tuple-color multisets agree, hence all unfolding-type counts agree, hence all corresponding bag-isomorphism counts agree, and by invertibility of the linear system, hom(F, G) = hom(F, H) for all F ∈ Pat(U𝒜). We thus obtain the equivalence
χ𝒜(G) = χ𝒜(H)   ⇔   hom(F, G) = hom(F, H)  ∀ F ∈ Pat(U𝒜),
which is the promised completeness (upper bound) in homomorphism-counting terms.

The rooted/node- and edge-level variants are obtained by the same argument with rooted unfoldings. For node-level outputs, we work with tuples u in which one coordinate is designated as the output node (or, equivalently, with rooted graphs Gu) and define rooted unfolding objects Unf𝒜(D)(Gu, u) whose root annotation records the mark. The associated bagged patterns are rooted, and the linear reduction is replaced by its rooted analogue, expressing rooted bag-isomorphism counts as integer combinations of rooted homomorphism counts hom(F, Gu). The edge-level case is identical with two marked vertices. In each case, stabilization and (A5) identify equivariant stable representations with rooted unfolding types, and the bag-isomorphism factorization transfers this information to rooted homomorphism counts over the induced rooted pattern basis.


6. Lower Bound / Maximality: generalized pebble games parameterized by the neighbor relation template; construction of generalized F"urer gadgets and twists; rooted (clique-augmented) gadgets for node/edge-level; prove that every forbidden pattern yields a separating pair.

We now establish the lower bound direction: the homomorphism basis extracted from the unfolding semantics is . Concretely, for every pattern graph F ∉ F𝒜 we construct an explicit separating pair (GF, HF) such that χ𝒜(GF) = χ𝒜(HF) while hom(F, GF) ≠ hom(F, HF); we also give rooted variants yielding separations at node- and edge-level.

Fix the neighbor template R on k-tuples. We define a two-structure game PGRD(G, H) of length D that mirrors the refinement-with-aggregation interface. A position is a pair of tuples (u, v) ∈ VGk × VHk. We say that (u, v) is if the initial types of u in G and v in H agree (labels, equality pattern on coordinates, and adjacency pattern between coordinates; and, if present, the global channel annotation agrees at the root).

In round i, Spoiler chooses one of the two graphs, say G, and selects a neighbor tuple u ∈ RG(u). Duplicator must respond with a tuple v ∈ RH(v). The game proceeds from (u, v). Duplicator loses immediately if the resulting position is illegal. Duplicator wins if she can survive D rounds.

This game is the natural abstraction of one-step multiset aggregation: R determines which tuple-colors can influence the next color, and legality enforces that any matched tuples have compatible intrinsic types. The standard induction, adapted to the present R-parametric setting, yields:

For all D ≥ 0, if Duplicator wins PGRD(G, H) from (u, v), then cGD(u) = cHD(v). Conversely, by (A5), if cGD(u) = cHD(v), then the depth-D unfoldings are isomorphic and Duplicator wins from (u, v).

Passing to stabilization, we obtain: if for every u ∈ VGk there exists v ∈ VHk with Duplicator winning for all D (and symmetrically), then the multisets of stable tuple-colors coincide and hence χ𝒜(G) = χ𝒜(H). Thus, to prove indistinguishability by 𝒜, it suffices to provide a Duplicator strategy in this R-game.

By definition of F𝒜, a pattern F ∉ F𝒜 is one whose homomorphism count is determined by the unfolding-type statistics encoded in χ𝒜. Equivalently (and this is the viewpoint we use for gadget design), F fails to admit the decomposition discipline implicit in the unfolding grammar U𝒜: there is no way to represent F as a consistent gluing of local k-tuple constraints generated by repeated R-expansion. In familiar instantiations this failure specializes to, e.g., having a cycle (for k = 1 MPNNs), or exceeding a prescribed NED/treewidth bound (for higher-order local architectures). We keep the discussion abstract and only use the existence of such a failure mode, witnessed by the R-pebble game characterization below.

We consider the CGR(F), in which Spoiler tries to force a configuration that violates any putative R-consistent unfolding decomposition of F. One may formalize CGR(F) as a finite obstruction search over partial R-walks of k-tuples in F, with legality given by the same initial-type constraints as above, and with moves constrained to follow RF. The key property is:

Alice (Spoiler) wins CGR(F) if and only if F ∉ F𝒜.

We use this equivalence only as an interface: given F ∉ F𝒜, we fix a finite winning transcript for Spoiler in CGR(F) and compile it into a gadget in which R-local views are symmetric while hom(F, ⋅) detects the asymmetry.

We now describe a concrete gadget family Gad(F) together with a operation that preserves all R-local unfolding types but changes hom(F, ⋅). The construction generalizes the Cai–F"urer–Immerman (CFI) paradigm to arbitrary degree patterns and to the present R-parametric locality.

Fix a connected pattern F = (VF, EF, F), and orient each edge arbitrarily to obtain an incidence list at every vertex. For each vertex x ∈ VF of degree d(x), we create a Cx consisting of 2d(x) − 1 vertices, indexed by even-parity bitstrings α ∈ {0, 1}d(x) (or equivalently even subsets of incident edges). Intuitively, α encodes a parity constraint on how a homomorphism traverses the incident edges of x. Labels are lifted by setting all vertices of Cx to carry label F(x) (or a refined label if we need to break symmetry between different degrees).

For each edge e = {x, y} ∈ EF, we connect clouds Cx and Cy by a perfect matching defined as follows. Let the coordinate corresponding to e in x’s incidence list be i, and the coordinate corresponding to e in y’s incidence list be j. For α ∈ Cx and β ∈ Cy, we add an edge between the vertices (x, α) and (y, β) if and only if αi = βj. This yields a graph Gad(F) whose local neighborhoods are highly regular: any vertex in a cloud sees, for each incident edge of F, exactly half of the opposite cloud, and the parity restriction is internal to the cloud.

To define the twist, fix a designated edge e = {x, y} ∈ EF. We obtain Twist(Gad(F), e) by modifying only the matching on e: instead of connecting (x, α) to (y, β) when αi = βj, we connect them when αi ≠ βj. All other adjacencies and labels are unchanged. We write
GF := Gad(F),   HF := Twist(Gad(F), e).

The reason GF and HF are hard for refinement procedures is that the twist is globally defined but locally invisible to R-bounded exploration. Formally, one shows that Duplicator wins PGRD(GF, HF) for every D, from an appropriate correspondence between tuples in GF and tuples in HF. The correspondence is induced by an 𝔽2-affine symmetry: within each cloud Cx, flipping all indices by adding a fixed even-parity vector is an automorphism; moreover, along each non-twisted edge matching, equal-bit constraints are preserved under synchronized flips of the relevant coordinates. The only discrepancy occurs at the twisted edge e, and Duplicator compensates by switching between two affine cosets when the play crosses e. Since the game only exposes neighborhoods through R-moves, and R is fixed and equivariant, this compensation can be maintained uniformly for any finite number of rounds.

Consequently, stabilized tuple-colors match, and therefore
χ𝒜(GF) = χ𝒜(HF).
This is the point at which (A1)–(A3) are used: equivariance ensures that the Duplicator strategy, defined in terms of the gadget symmetries, aligns with the refinement operator; and the fixed template R prevents the architecture from probing non-R-local global structure.

It remains to show that the twist changes hom(F, ⋅) whenever F ∉ F𝒜. A homomorphism h : F → GF chooses, for each x ∈ VF, a vertex in the corresponding cloud Cx, i.e. an even-parity index αx. The edge constraints ensure that for each edge {x, y}, the selected indices satisfy (αx)xy = (αy)yx in GF, whereas in HF the constraint is flipped on e. Thus homomorphisms correspond to solutions of a system of 𝔽2-constraints of the form
(αx)xy ⊕ (αy)yx = bxy,
with bxy = 0 for all edges in GF and bxy = 1 (and b = 0 elsewhere) for HF, together with the requirement that each αx has even parity.

If F admitted an R-consistent unfolding decomposition, then these constraints would be satisfiable in a way that is invariant under flipping be, and the solution counts (hence homomorphism counts) would coincide. Precisely the failure F ∉ F𝒜 yields an obstruction cycle (or, in general, an R-certified parity witness) forcing different solution counts when a single edge constraint is flipped. Therefore
hom(F, GF) ≠ hom(F, HF).
This establishes the required separation at graph level.

For node- and edge-level statements, we require separations that persist under equivariant readouts, i.e. we must exhibit roots r and s in GF and HF whose stable equivariant representations agree while rooted homomorphism counts differ. We enforce root rigidity by a standard clique-augmentation.

We form F from GF by adjoining a labeled clique K of size k + 2 whose vertices carry pairwise distinct labels (or, equivalently, are made distinguishable by attaching uniquely labeled leaves). We connect a designated vertex of K to a designated anchor vertex of GF (chosen in a cloud corresponding to a fixed vertex of F), and do the same for HF, obtaining F. The unique labeling forces every homomorphism of a rooted pattern to map the root(s) into prescribed vertices of K, and forces any 𝒜-equivariant representation at those vertices to be canonical. Since the same rigid gadget is attached in both graphs and the attachment is R-locally identical, the previous Duplicator strategy extends, yielding equality of node-/edge-level stable representations at the chosen roots. On the other hand, rooted homomorphisms hom(F, Fu) and hom(F, Fu) still encode the same parity system on the underlying CFI part and hence remain separated.

Combining indistinguishability in the R-pebble game with the parity separation of homomorphism counts, we obtain the promised maximality principle: for every forbidden pattern F ∉ F𝒜 we can explicitly produce (GF, HF) (and rooted analogues) witnessing that no invariant computable from χ𝒜 can recover hom(F, ⋅). This completes the lower bound direction and, together with the completeness statement proved earlier, pins down F𝒜 as the unique maximal homomorphism expressivity family associated with 𝒜 under axioms (A1)–(A5).


7. Recipe to Derive F𝒜: how to compute/describe the candidate family from an architecture spec; conversion of unfolding-basis descriptions into structural decompositions (treewidth/NED-like) when possible; guidance for new architectures.

Having proved that every refinement-with-aggregation architecture 𝒜 satisfying (A1–A5) induces a unique maximal homomorphism expressivity family F𝒜, we now make explicit how one (and, in favorable cases, ) this family from the architecture-level specification. The point is not that one can always list F𝒜 explicitly, but rather that one can (i) compute an unfolding basis U𝒜 mechanically from the neighbor template R and the initialization types, (ii) read off from U𝒜 a candidate pattern family Pat(U𝒜) that is complete for χ𝒜, and (iii) translate U𝒜 into a structural decomposition discipline on patterns, yielding a concise characterization of F𝒜 (treewidth/NED-like) when such a translation exists.

We first rewrite the forward pass of 𝒜 in the RWA normal form: a tuple dimension k, an initial type map on k-tuples (encoding labels and equality/adjacency patterns among coordinates, and any extra designated channels), and a fixed neighbor template R that is instantiated as RG(u) on each input graph G. In this normalization we ignore the particular hash function and aggregator implementation; only the interface matters: the next-round color of a tuple is a function of its current color and the multiset of colors in RG(u). If the architecture uses several aggregators (e.g. multiple radii or multiple relation types), we take R to be the disjoint union of the corresponding neighbor relations, with colors recording the relation identifier; this is without loss of generality for expressivity.

Given R and the initial tuple types, we define the depth-D unfolding object Unf𝒜(D)(G, u) inductively:

We then quotient by isomorphism of unfoldings (root-preserving, type-preserving, and respecting relation tags). The family U𝒜 is the set of all finite unfolding types obtainable as Unf𝒜(D)(G, u) over all finite graphs G, tuples u ∈ VGk, and depths D. By (A5), distinct stable colors correspond to distinct unfolding types for sufficiently large D; thus U𝒜 is the canonical ``witness vocabulary’’ of 𝒜.

In practice, this step is a grammar extraction: from the syntactic description of R we write an inductive datatype whose constructors correspond to (i) initial tuple-types and (ii) R-expansions that collect multisets of previously constructed objects, decorated by relation tags. When R has bounded arity and finitely many clauses, this grammar is finite-state in the usual sense (though it generates infinitely many objects across depths).

To obtain a homomorphism-complete family, we associate to each unfolding object U ∈ U𝒜 a finite set of pattern graphs Pat(U) whose homomorphism counts into G recover the number of occurrences of U as an unfolding type in G. Concretely, one proceeds in two substeps.

This compilation is canonical up to homomorphic equivalence, and it is designed so that counting homomorphisms from P into G is (via a triangular linear system) equivalent to counting bag-isomorphisms from Bag(U) into G, hence equivalent to counting tuples in G whose unfolding type is U. We then set
Pat(U𝒜) := ⋃U ∈ U𝒜Pat(U).
By the completeness direction proved earlier, equality of χ𝒜(G) and χ𝒜(H) is equivalent to equality of hom(P, ⋅) over all P ∈ Pat(U𝒜) (and rooted analogues if the readout is equivariant).

While Pat(U𝒜) is already a complete basis, it is often redundant and not structurally transparent. The structurally meaningful object is the on patterns induced by U𝒜: a pattern F is said to be if it can be obtained by a finite construction that mirrors the unfolding grammar, where each constructor adds a constant-size piece (a bag) and glues it to previously built pieces according to the overlap constraints specified by the corresponding R-tag. Equivalently, F admits a decomposition whose ``bags’’ are k-tuple-shaped (or (k + 1)-tuple-shaped, depending on whether R replaces one coordinate at a time) and whose adhesion sets are exactly those coordinate overlaps permitted by R.

Operationally, we proceed as follows. We consider the hypergraph F whose hyperedges are the candidate bags, and we attempt to orient and arrange these bags into a tree-like or hypertree-like object such that:

When this succeeds, we typically obtain a known width measure: for k-tuple replacement neighborhoods one recovers tree decompositions of width  ≤ k; for pair-based neighborhoods with endpoint-sharing constraints one recovers NED-like decompositions. When it does not match a known width measure, we obtain a new decomposition notion canonically attached to R; in that case, F𝒜 is best stated as the class of patterns admitting this R-decomposition.

From the preceding step, we propose the candidate family
𝒜 := {F ∈ 𝒢 : F is U𝒜-decomposable},
with the corresponding rooted families defined by requiring a decomposition that respects the designated root tuple(s) as a distinguished initial bag. The meta-theorem then supplies the tightness criterion: to identify 𝒜 with the unique maximal family F𝒜, it suffices to verify that the decomposition discipline exactly matches the R-pebble-game characterization of indistinguishability on patterns. In practice this is done by proving two implications:

Once these are shown, maximality and uniqueness promote 𝒜 to F𝒜 automatically.

For a newly proposed 𝒜, we recommend the following workflow.

In summary, the derivation of F𝒜 is a compilation pipeline
(𝒜, R) ⇒ U𝒜 ⇒ Pat(U𝒜) ⇒ decomposition discipline ⇒ F𝒜,
where the last implication is certified by the game/gadget maximality interface. This is the mechanism we will instantiate next to recover the known families for standard architectures and to isolate, in a formal way, which axiom failures lead to pathological behavior.


8. Instantiations and Corollaries: recover known FM for standard GNNs; formalize the pathological counterexample and show exactly which axiom it violates; additional examples (distance-guarded variants, fixed sparse-attention variants).

We now instantiate the derivation pipeline of on a number of standard neighborhood templates, and we record the resulting maximal families in the form most useful for subsequent arguments. In each case the identification follows the same logic: from the syntactic description of R we read off the unfolding grammar U𝒜, from U𝒜 we extract the permitted overlap types between constant-size bags, and we then recognize the induced decomposition discipline as a known width notion. The soundness direction is then a dynamic program over the induced decomposition, while completeness follows from the R-pebble-game/gadget interface developed earlier.

Consider first the archetypal message-passing architecture with tuple dimension k = 1 and neighbor template
RG(u) = {u ∈ VG : {u, u} ∈ EG},
possibly with multiple relation types corresponding to edge labels. The depth-D unfolding Unf𝒜(D)(G, u) is then a rooted, vertex-labeled tree of depth D, where each child corresponds to a neighbor and relation tags record the edge type. Consequently, the induced bag structures are precisely rooted trees, and the compilation Pat(U𝒜) yields (rooted) tree patterns. Since homomorphism counts of trees determine the stable color multiset (and conversely, tree unfoldings are the only local witnesses that survive stabilization), we recover the well-known characterization: the maximal family F𝒜 consists of forests. More precisely, χ𝒜(G) = χ𝒜(H) if and only if hom(T, G) = hom(T, H) for all finite trees T (and similarly for node-rooted trees in the equivariant setting). This is the k = 1 instance of Corollary~4.

We next consider architectures whose refinement state lives on ordered pairs (u, v) ∈ VG2 (so k = 2) and whose neighbor template updates one endpoint at a time. A minimal such template is
RG(u, v) = {(u, v) : u ∼ u} ∪ {(u, v) : v ∼ v},
with relation tags recording whether the first or second coordinate moved. The corresponding unfolding objects are rooted trees of pair-states, but crucially each expansion step carries an overlap constraint: along an edge of the unfolding, one coordinate is preserved. When we translate this to a decomposition discipline on a pattern F, we obtain decompositions into constant-size bags shaped like ordered pairs (or small augmentations thereof) whose adhesion sets are forced to include one endpoint. This is the abstract form of an ``endpoint-sharing’’ constraint.

Different pair-based architectures impose different additional constraints on R, and these translate into different permitted adhesions, hence different decomposition notions. For example:

In each case, the reason such NED-like notions appear is that the only overlap information that can be transported through the refinement is the overlap already hard-coded in the relation tags of R: if the template never permits an expansion that simultaneously changes both endpoints, then no decomposition assembled from the corresponding unfolding vocabulary can require larger adhesions than those endpoints.

We emphasize that the above statements are not merely upper bounds. The maximality theorem supplies the converse: whenever a pattern F fails the corresponding endpoint-discipline, we obtain explicit gadget graphs GF, HF that are indistinguishable by the pair-refinement (hence yield identical χ𝒜) but satisfy hom(F, GF) ≠ hom(F, HF).

For k-tuple models (including k-WL-like and folklore k-GNN variants) the most common neighbor template replaces one coordinate at a time, typically along adjacency:
$$ \mathsf{R}_G(u_1,\dots,u_k)\;=\;\bigcup_{i=1}^k \{(u_1,\dots,u_{i-1},u_i',u_{i+1},\dots,u_k): u_i'\sim u_i\}, $$
together with intrinsic coordinate relations (equalities and adjacencies among coordinates) included in the tuple type. The unfolding grammar then describes how k-tuples can be expanded by changing a single coordinate while keeping the remaining k − 1 coordinates fixed, and the relation tags record which coordinate was replaced. Translating this into a pattern discipline yields a tree decomposition whose bags correspond to (k + 1)-vertex neighborhoods obtained by introducing one new vertex adjacent to a preserved k-tuple context. The adhesion sets are exactly the preserved k-tuples, hence have size at most k, and the running-intersection property follows from the tree-like nature of unfoldings.

This is precisely the standard treewidth mechanism: the induced family is the class of patterns of treewidth at most k. Concretely, we recover the known equivalence (again as in Corollary~4): for such an architecture 𝒜, the stable representation χ𝒜(G) determines hom(F, G) for every F of treewidth  ≤ k, and for every pattern F with tw(F) > k there exist indistinguishable gadget graphs separating hom(F, ⋅). Rooted analogues correspond to rooted tree decompositions in which the root tuple sits inside a designated initial bag.

We next formalize two simple ``if–then’’ constructions illustrating why the axioms (A4) and (A5) are not technical conveniences but structural necessities for the existence of a maximal homomorphism family.

Let 𝒜 be the architecture that outputs a single bit
χ𝒜(G) := |VG| mod  2,
computed via a global channel that increments a counter during refinement and then reduces mod 2 at readout. This is isomorphism-invariant and stabilizes immediately, and it can be phrased as an RWA-style procedure with a global accumulator. However, it violates compositionality under disjoint union: we have χ𝒜(G ⊔ H) = χ𝒜(G) ⊕ χ𝒜(H) rather than a homomorphism-compatible additive behavior.

Assume for contradiction that there were a family F such that χ𝒜(G) = χ𝒜(H) if and only if hom(F, G) = hom(F, H) for all F ∈ F. For every fixed pattern F, the function G ↦ hom(F, G) is additive under disjoint union:
hom(F, G ⊔ H) = hom(F, G) + hom(F, H).
Hence any graph invariant determined solely by the collection {hom(F, ⋅) : F ∈ F} must respect the congruence induced by these additive coordinates. The parity function does not define such a congruence: for example, there exist G, H with the same parity but such that G ⊔ G and H ⊔ H have opposite parity. This contradicts determinability from hom counts, and therefore no maximal family F𝒜 exists. This is an explicit instance of Proposition~5 with failure of (A4).

Fix a finite graph H0. Define an architecture 𝒜H0 that first runs an otherwise standard refinement procedure (say, the k = 1 neighborhood template) to stabilization, and then applies the readout
$$ \chi_{\mathcal{A}_{H_0}}(G)\;:=\; \begin{cases} 1 & \text{if } G\cong H_0,\\ 0 & \text{otherwise.} \end{cases} $$
This is isomorphism-invariant and stabilizing, but it is not ``explainable’’ by unfolding objects: the difference between H0 and a graph G with identical stabilized color multiset (which can be arranged by choosing G among known WL-indistinguishable families) is introduced entirely at readout by an external test not reflected in the unfolding vocabulary. In particular, there cannot exist a family F such that all non-isomorphic graphs outside H0 become homomorphism-indistinguishable while H0 is separated, since for any nontrivial pattern F the map G ↦ hom(F, G) is not constant across all G ≆ H0. Thus the induced equivalence relation of χ𝒜H0 cannot be represented as equality on any set of homomorphism counts. This is a canonical instance of Proposition~5 tied to failure of (A5) (and typically also (A4), since such special-casing does not behave well under products/unions).

Finally, we note two template families that fall between the standard cases above and illustrate how the recipe produces new, architecture-specific decompositions.

Suppose R permits a coordinate replacement not only along edges but along paths of length at most r, with relation tags recording the chosen radius class (or, equivalently, working in the r-th power graph G(r) while keeping the original labels). The unfolding grammar then expands tuples by ``guarded’’ moves within r-balls, and the induced bags must cover edges of F while maintaining adhesions that are realizable within distance  ≤ r in the partially constructed pattern. The resulting pattern class can be stated as admitting an R-decomposition with radius-r guards: each new bag introduces vertices whose attachment to the existing structure is witnessed by a bounded-radius connector. This yields expressivity strictly stronger than purely edge-local templates but still limited to patterns compatible with the guarded adhesion discipline.

Assume the architecture aggregates not over all of RG(u) but over a fixed subrelation RG(u) ⊆ RG(u) chosen by an isomorphism-equivariant rule (e.g. restricting to neighbors of specified labels, or to a designated edge type). Then the unfolding vocabulary is computed with R in place of R, and F𝒜 aligns with decompositions that only use the corresponding restricted attachment types. In particular, such sparsification can only decrease the maximal family: any pattern requiring attachments outside the permitted relation tags is excluded and admits gadget separations. This provides a formal way to certify expressivity loss (and, conversely, to justify safe sparsification) directly from the template.

These instantiations will be the input to the next section, where we quantify the constructivity implicit in the maximality theorem by bounding gadget sizes and discussing the complexity of recognizing membership in the structurally characterized cases.


9. Complexity and Constructivity: size bounds for gadget counterexamples; decidability/complexity of membership tests under common structural characterizations; limits of full automation.

The meta-theorem in is information-theoretic: it asserts the existence of a unique maximal homomorphism family and, for every excluded pattern F ∉ F𝒜, the existence of explicit counterexamples (GF, HF) with χ𝒜(GF) = χ𝒜(HF) but hom (F, GF) ≠ hom (F, HF) (and rooted analogues). In this section we make precise what ``explicit’’ entails in the present setting, we record size bounds for the gadgets arising from our construction, and we delineate which membership and separation questions admit effective procedures under the structural characterizations identified in . We also isolate the points at which full automation is provably limited.

Fix an architecture 𝒜 satisfying (A1)–(A5), with tuple dimension k and neighbor template R. For a pattern F ∉ F𝒜, Theorem~3 produces graphs GF, HF that are indistinguishable by the induced R-pebble game (hence by 𝒜), but are separated by F-homomorphism counts. Operationally, our construction proceeds in two steps: (i) we build a base gadget Gad(F) that locally encodes the incidence structure of F in a way compatible with the moves of the R-game, and (ii) we apply a along a chosen attachment edge (or, more generally, along a designated relation instance) to obtain Twist(Gad(F), e), which preserves R-indistinguishability but flips the parity of a controlled family of F-homomorphisms. We then set
GF := Gad(F),   HF := Twist(Gad(F), e).
The specific form of Gad(F) depends on the decomposition obstruction exhibited by F relative to the unfolding basis U𝒜; however, in all standard instances the gadget is a bounded-depth, bounded-adhesion blow-up of F in which each vertex v ∈ VF is replaced by a local cloud whose internal wiring supports a exploitable by Duplicator.

The resulting size bounds are most conveniently stated in terms of degree parameters of F and the local signature induced by R. In the common case where the gadget step replaces each vertex v by a cloud of size on the order of 2degF(v) − 1 (the familiar phenomenon from CFI/F"urer constructions), we obtain
|VGF|,|VHF| = Θ (∑v ∈ VF2degF(v) − 1),   |EGF|,|EHF| = poly(∑v ∈ VF2degF(v) − 1),
where the implicit polynomial degree is uniform over F and depends only on the arities and tags present in R (and on whether we enforce vertex labels). In particular, for bounded-degree patterns this yields gadgets of size polynomial in |VF|, whereas for unbounded degree an exponential blow-up is inherent to the symmetry needed to maintain R-indistinguishability while enabling a controlled twist.

For node- or edge-rooted separations (equivariant readouts), we additionally require that the distinguished root tuple(s) be anchored in a rigid structure that cannot be permuted away by an automorphism of the gadget. A convenient method is : we adjoin a labeled rigid container (typically a (k + 1)-clique with unique label pattern) and attach the root vertex (or root endpoints) in a prescribed way. This increases the gadget size by an additive constant depending only on k and the label alphabet, leaving the asymptotic bound above unchanged:
|VGFroot| = |VGF|+Ok(1),   |VHFroot| = |VHF|+Ok(1).
We stress that these are bounds for a generic obstruction. In concrete instantiations, smaller separating pairs often exist; our point is that the meta-theorem supplies a uniform construction with explicit resource dependence on F.

The equivalence χ𝒜(G) = χ𝒜(H) ⇔ hom (F, G) = hom (F, H) for all F ∈ F𝒜 is only as useful as our ability to recognize when a given pattern F lies in F𝒜. In general, F𝒜 is defined as a maximal family extracted from the unfolding basis U𝒜, and without additional structure one should not expect a simple graph-theoretic description. Nevertheless, in the instantiations of the family aligns with standard width notions, and membership becomes decidable by known algorithms.

For k = 1 message passing, we have F𝒜= forests. Membership is immediate: F ∈ F𝒜 iff F is acyclic, decidable in time O(|VF|+|EF|) by a DFS.

For k-tuple refinement with the standard one-coordinate replacement template, we have $F_{\mathcal{A}}=\{F:\tw(F)\le k\}$. For fixed k, membership is decidable in linear time in |VF| via Bodlaender’s algorithm (with constants depending superexponentially on k, as usual). If k is part of the input, deciding $\tw(F)\le k$ is NP-hard, and thus no fully general polynomial-time membership test should be expected unless P = NP. This delineates a practical regime: one fixes the architecture parameter k and then performs membership tests for patterns F of interest.

For the pair-based local models characterized via NED-like decompositions, the situation is analogous. Once the endpoint discipline is fixed (endpoint-shared, strong, almost-strong), membership reduces to the existence of a decomposition of F into constant-size bags with prescribed adhesion patterns. Such existence can be checked by dynamic programming over candidate decomposition trees, or, equivalently, by a finite-state search over partial decompositions, yielding a procedure polynomial in |VF| for each fixed bag size/adhesion template. Here again, the difficulty is not decidability but parameter dependence: if we enlarge the permitted bag vocabulary (or allow it to be part of the input), the recognition problem can encode hard constraint satisfaction tasks.

Given an explicit membership test for the structural characterization of F𝒜, we may also extract the for non-membership and feed it into the gadget construction. Concretely, for width notions the failure of membership can be witnessed by an obstruction: for forests, a cycle; for treewidth  ≤ k, a minor-like certificate of $\tw(F)>k$ or, operationally, failure of a k-tree decomposition search; for NED variants, a minimal subpattern violating the endpoint adhesion rule. In our framework, these obstructions correspond to a Spoiler win in the R-pebble game on F against the unfolding vocabulary. Once such a certificate is available, the gadget Gad(F) is constructed by composing local widgets prescribed by the certificate and then applying the twist along a distinguished interface.

The resulting overall runtime is polynomial in the output size of the gadget graphs:
$$ \mathrm{time}(\textsc{ConstructSeparator}(F)) \;=\; \mathrm{poly}(|V_{G_F}|+|E_{G_F}|), $$
with the polynomial uniform over F for each fixed architecture 𝒜. The dependence on F is therefore dominated by the blow-up intrinsic to |VGF|, as discussed above. This separation between and is useful: it clarifies that the main barrier to small counterexamples is not implementability but the need for sufficiently symmetric gadgets.

Even when F ∈ F𝒜, and thus hom (F, ⋅) is determined by χ𝒜(⋅), this does not imply that we can efficiently hom (F, G) from χ𝒜(G) in general. Theorem~2 proceeds via a linear reduction between counts of unfolding-object types and homomorphism counts of induced patterns, but the relevant linear system is conceptual: it is finite for each fixed radius and vocabulary, yet its size grows quickly with the unfolding depth required for stabilization and with the number of unfolding types.

In regimes where F𝒜 is a bounded-width class, there are nevertheless efficient counting algorithms independent of χ𝒜. For instance, for each fixed k and each fixed pattern F with $\tw(F)\le k$, the homomorphism number hom (F, G) can be computed in time O(|VG|k + 1) by standard tree-decomposition dynamic programming. Thus the content of the meta-theorem should be read as a : it characterizes exactly which patterns are in principle recoverable from χ𝒜, without asserting that the recovery map is computationally efficient when given χ𝒜(G) alone.

Finally, we record three barriers that delimit what one can hope to automate uniformly across architectures.

First, if an architecture were universal up to isomorphism, i.e. χ𝒜(G) = χ𝒜(H) ⇔ G ≅ H, then deciding representation equality would solve Graph Isomorphism. Any attempt at a fully general, efficiently checkable characterization of F𝒜 for such architectures would therefore amount to progress on GI-hard instances. This motivates restricting attention to refinement-based architectures whose neighbor template enforces a bounded-width unfolding vocabulary.

Second, even within refinement-based families, parameters that are treated as architecture hyperparameters (such as tuple dimension k) are not benign from the standpoint of meta-automation. As noted above, allowing k to vary turns membership in the induced family into the treewidth decision problem, NP-hard in k. Thus any general tool that takes (𝒜, F) as input and outputs whether F ∈ F𝒜 must either fix such parameters in advance or accept worst-case hardness.

Third, the underlying counting problems remain hard: for general patterns F, computing hom (F, G) is #P-hard. Our equivalence statements therefore cannot be interpreted as providing efficient counting algorithms from representations; rather, they certify that no method observing only χ𝒜(G) can, in the worst case, recover counts for patterns outside F𝒜, while for patterns inside F𝒜 recovery is information-theoretically possible (and algorithmically possible under additional bounded-width assumptions).

These complexity considerations will guide our discussion in the next section: they indicate which kinds of expressivity claims can be made robustly for modern architectures, and which extensions (learned neighbor selection, approximate aggregation, and hybrid attention/refinement mechanisms) require revisiting the axioms and the unfolding-based proof strategy.


10. Discussion and 2026 Outlook: how the meta-theorem can underwrite expressivity claims for modern models; open directions (learned neighborhoods, approximate variants, transformers).

We close by articulating what the meta-theorem buys us as a for modern graph models, and by isolating the main points at which contemporary architectures fall outside the present axiomatic envelope. Our aim is not to proclaim universality, but to make explicit which claims can be justified model-specific ad hoc arguments and which claims require new technical ideas.

A recurrent difficulty in the expressivity literature is that statements are often phrased at the level of ``model class = some refinement power’’ without clarifying which implementation choices are essential (hashing vs. continuous MLPs, sum vs. multiset normalization, finite vs. infinite precision, etc.). The present framework promotes a different view: once a forward pass can be cast as a stabilizing refinement-with-aggregation procedure with a fixed neighbor template, the architecture inherits a homomorphism family F𝒜 (and rooted analogues) that exactly characterizes the information content of its stable representation.

Operationally, this gives a clean rule for expressivity claims of the form
χ𝒜(G) = χ𝒜(H) ⇒ hom(F, G) = hom(F, H)  for all F ∈ 𝒞.
Such a statement is true if and only if 𝒞 ⊆ F𝒜, and it is in the sense that every F ∉ F𝒜 admits a uniform counterexample pair indistinguishable to 𝒜 but separated by hom(F, ⋅). In particular, one does not need to guess the ``right’’ test family by analogy with 1-WL or k-WL; the unfolding basis induced by R dictates it.

From the standpoint of model design, we may read F𝒜 as the set of patterns for which the architecture can, in principle, act as a sufficient statistic (at the level of stable representations). If an application demands sensitivity to a pattern outside F𝒜, then no amount of training data or parameter tuning can overcome the obstruction while keeping the same R-template and refinement semantics.

Many architectures deployed in 2024–2026 fit the axioms once we separate from . For example, message-passing GNNs with any continuous update (MLPs, residuals, layer normalization) still instantiate a refinement operator provided that (i) aggregation is permutation-invariant, (ii) the update is equivariant, and (iii) we interpret ``colors’’ as stable equivalence classes under the induced operator (i.e. we work information-theoretically, ignoring numerical collisions). Under this interpretation, the neighbor template is the sole driver of the unfolding vocabulary.

Hybrid models that mix local message passing with a global channel (graph token, virtual node, pooled context) also fall into the framework if the global channel is treated as an additional, isomorphism-invariant register whose update depends only on multisets of tuple-states. In that case, the unfolding objects become trees (or hypertrees) decorated with global annotations; the induced pattern family typically remains structurally bounded, but can strictly enlarge compared to the purely local case.

Equivariant readouts (node- or edge-level) are particularly clean in this language: rooted homomorphism families F𝒜n, F𝒜e provide an exact account of which rooted patterns are determined by node/edge representations. Thus, claims such as ``the model can represent all functions of neighborhoods of radius r’’ are only meaningful once one specifies whether the neighborhood is defined by R on tuples (and hence by the unfolding), or by an extrinsic metric (e.g. shortest-path balls) that may not be preserved by the refinement dynamics.

A dominant trend is to the notion of neighborhood—via attention, learned edge weights, top-m selection, or structure learning modules that add/remove edges. This is precisely where axiom (A3) (fixed, isomorphism-equivariant neighbor template) becomes delicate.

There are two qualitatively different regimes.

We view the first regime as a promising direction for 2026: one can seek an axiom system where the template is not fixed but is drawn from a equivariant grammar of neighborhood rules, so that the induced guarded unfoldings remain finitely describable and yield a maximal homomorphism family.

Modern models often depart from exact multiset aggregation in three ways: (i) neighbor sampling for scalability, (ii) truncation to a bounded number of rounds, and (iii) real-valued computations with finite precision. Each interacts differently with our axioms.

Neighbor sampling breaks (A2) and (A5) if interpreted literally: the refinement state becomes random, stabilization is not guaranteed on a fixed instance, and representation equality becomes a probabilistic statement. A natural extension is therefore to replace determinacy by distributional invariants, studying when
ℒ(χ𝒜(G)) = ℒ(χ𝒜(H))
implies equality of certain homomorphism counts or of low-degree graph statistics. One expects the induced ``pattern family’’ to become a family of patterns weighted by sampling probabilities, akin to unbiased estimators for subgraph counts. Establishing maximality in this stochastic setting would require a gadget theory that equalizes the distribution of sampled unfoldings while perturbing targeted hom counts in expectation.

Truncation to T rounds yields a well-defined, finitely bounded unfolding family U𝒜( ≤ T). In this case we can recover a clean theorem: representation equality after T rounds is equivalent to equality of homomorphism counts over the patterns induced by depth-T unfoldings. The maximal family is then explicitly finite (up to isomorphism types of bounded depth), and the separation gadgets can be chosen of bounded radius. Thus, for truncated models, the theory becomes constructive, at the cost of a weaker invariance.

Finite precision and continuous embeddings are subtler. Our meta-theorem is intentionally information-theoretic; it identifies what is determined by the refinement operator, not what is numerically stable. For practitioners, this suggests a two-step interpretive discipline: first identify F𝒜 at the level of the induced symbolic refinement; then separately analyze whether the chosen parameterization approximates that refinement with acceptable error. Bridging these two steps is an open interface between expressivity theory and numerical analysis.

Graph Transformers with full self-attention do not fit the local neighbor-template paradigm unless one treats the neighborhood of each tuple as ``all tuples,’’ which collapses the unfolding perspective into a global computation. Even then, two features of Transformers complicate matters: (i) attention scores induce a (weighted) aggregation rather than a pure multiset, and (ii) positional/structural encodings (Laplacian eigenvectors, random features, shortest-path biases) inject additional information that may not be definable from the graph structure alone.

There is nevertheless a plausible path forward. One may define an RWA-like model over a richer signature in which the neighbor relation includes global edges annotated by equivalence classes of relative positions, and in which aggregation carries weights drawn from a finite palette (or from an abstract commutative semiring). The corresponding unfolding objects would be hypertrees or, in the fully global case, canonical decompositions akin to color refinement on complete relational structures. The question for 2026 is then: can we recover a maximal family of test structures that plays the role of F𝒜 when the underlying logic is no longer purely homomorphism logic over graphs but over enriched signatures?

We emphasize a limitation: if positional encodings leak vertex identities (even approximately), then the induced representation can become as hard to characterize as graph isomorphism itself. In that regime, the correct conclusion is not that the framework fails, but that any maximal-family characterization will inherit GI-level barriers.

We conclude with a short list of technically precise problems suggested by the above discussion.

Our position is that the unfolding-based viewpoint remains the correct organizing principle: as architectures evolve, the relevant question is which witness objects they canonically induce, and which pattern counts those witnesses linearly generate. The meta-theorem provides the stable core; the outlook is to enlarge the space of admissible unfoldings without sacrificing maximality and separability.