A constraint lemma: if self-indexing works, it is essentially unique.
Under branching with global evaluation, any two self-indexing schemes that support stable credit assignment must be equivalent up to relabeling.
Equivalently:
If “ownership pointers” work at all, there is (almost) only one right way to assign them:
the only freedom is renaming the labels.
This turns self-indexing from “one possible trick” into a canonical structural requirement.
Let the system generate at time $t$ a finite set of internal candidate branches:
$$B_t = \{b_t^{(1)}, \dots, b_t^{(m)}\}.$$
Each branch corresponds to a distinct internal trajectory suffix (plan, thought continuation, action proposal, memory write candidate, etc.).
Let $b_t^{*} \in B_t$ be the branch that is actually enacted/committed into downstream control.
At time $t$, a global evaluative signal $E_t$ is produced and broadcast to multiple operators/modules:
$$E_t \in \mathcal{E}.$$
This evaluation must be assigned as credit/blame to some branch(es) to drive learning or internal policy revision.
A self-indexing scheme is a rule that maps the system’s internal evidence to an “ownership pointer”:
$$s_t : (\text{internal state}, \text{trace}) \to \{1, \dots, m\},$$
interpreted as “the branch that is mine / the branch that owns $E_t$.”
Thus $s_t$ induces a credited branch:
$$\hat{b}_t(s) := b_t^{(s_t)}.$$
Let $\Delta_t$ denote the update to the shared substrate $\xi$ (weights, routing priors, memory gates, etc.) driven by evaluation and attribution:
$$\xi_{t+1} = U(\xi_t, E_t, \hat{b}_t).$$
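The definitions above can be made concrete with a small sketch. Everything here is an illustrative assumption rather than part of the formal setup: the substrate is one weight per branch, the evaluation is a scalar, and the names `credited_branch`, `update`, and `scheme_enacted` are hypothetical.

```python
from typing import Dict, Sequence

# Hypothetical toy types: a branch is a label, the substrate holds one weight per branch.
Branch = str
Substrate = Dict[Branch, float]


def credited_branch(branches: Sequence[Branch], pointer: int) -> Branch:
    """Return the branch selected by a 1-based ownership pointer s_t."""
    if not 1 <= pointer <= len(branches):
        raise ValueError("pointer must lie in {1, ..., m}")
    return branches[pointer - 1]


def update(xi: Substrate, evaluation: float, credited: Branch, lr: float = 0.1) -> Substrate:
    """Toy stand-in for U: shift only the credited branch's weight by lr * evaluation."""
    new_xi = dict(xi)  # leave the previous substrate untouched
    new_xi[credited] = new_xi.get(credited, 0.0) + lr * evaluation
    return new_xi


def scheme_enacted(trace: Dict[str, int]) -> int:
    """Hypothetical scheme: point at the branch the trace records as enacted."""
    return trace["enacted_index"]


branches = ["plan_a", "plan_b", "plan_c"]
xi0 = {b: 0.0 for b in branches}
trace = {"enacted_index": 2}

b_hat = credited_branch(branches, scheme_enacted(trace))
xi1 = update(xi0, evaluation=1.0, credited=b_hat)
```

In this sketch the pointer selects `plan_b`, so only that branch's weight moves; a scheme that returned a different pointer for the same trace would move a different weight.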
A self-indexing scheme s is stable if, across time and tasks, it yields bounded oscillation and improves or maintains performance; informally, it does not produce persistent “credit thrash.”
We write: $s \in \mathcal{S}_{\text{stable}}$.
Lemma (Uniqueness up to Relabeling).
Consider a branching system with global evaluation $E_t$ and shared substrate updates $\xi_{t+1} = U(\xi_t, E_t, \hat{b}_t)$.
Suppose two self-indexing schemes $s$ and $s'$ are both stable:
$$s, s' \in \mathcal{S}_{\text{stable}}.$$
Then s and s′ are equivalent up to relabeling of branch identifiers:
$$\exists\, \pi \in \mathfrak{S}_m \ \text{(a permutation) s.t.}\ s'_t = \pi(s_t) \ \text{for all relevant } t,$$
i.e., they induce the same ownership partition of trajectories, differing only by name.
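Equivalence up to relabeling can be checked mechanically on finite pointer sequences. The sketch below is a minimal illustration, assuming each scheme is recorded as a list of pointers in $\{1, \dots, m\}$ over the same time steps; the function name `find_relabeling` is hypothetical. It looks for a consistent, injective map on the labels that actually occur. Any such partial map extends to a full permutation of $\{1, \dots, m\}$, so finding one is enough.

```python
from typing import Dict, Optional, Sequence


def find_relabeling(s: Sequence[int], s_prime: Sequence[int]) -> Optional[Dict[int, int]]:
    """Return a map pi with s_prime[t] == pi[s[t]] for every t, or None if none exists."""
    if len(s) != len(s_prime):
        raise ValueError("pointer sequences must cover the same time steps")
    forward: Dict[int, int] = {}   # label under s  -> label under s'
    backward: Dict[int, int] = {}  # label under s' -> label under s
    for a, b in zip(s, s_prime):
        if forward.setdefault(a, b) != b:
            return None  # one label would need two different images
        if backward.setdefault(b, a) != a:
            return None  # two labels would collapse onto the same image
    return forward


# The second scheme only renames labels: 1 -> 2, 2 -> 3, 3 -> 1.
renamed = find_relabeling([1, 2, 1, 3], [2, 3, 2, 1])

# No renaming works: label 1 would have to map to both 2 and 1.
substantive = find_relabeling([1, 2, 1], [2, 1, 1])
```

Here `renamed` is a dictionary describing the permutation on the labels that occur, while `substantive` is `None`, which is the situation the proof treats as a genuine disagreement about ownership.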
Proof. Assume the hypotheses and suppose, for contradiction, that s and s′ are not equivalent up to relabeling.
If there is no permutation $\pi$ with $s'_t = \pi(s_t)$ for all $t$, then there exists at least one time $t_0$ such that:
$$\hat{b}_{t_0}(s) \neq \hat{b}_{t_0}(s').$$
That is: the two schemes assign ownership of the same evaluation signal $E_{t_0}$ to different branches in a way that cannot be repaired by consistently renaming labels.
Intuitively: the schemes disagree on “which branch is mine” in a substantive way.
Because $E_t$ is global, it is used by multiple operators and enters a shared update:
$$\xi_{t+1} = U(\xi_t, E_t, \hat{b}_t).$$
Therefore the update induced by scheme $s$ at $t_0$ differs from the update induced by $s'$:
$$U(\xi_{t_0}, E_{t_0}, \hat{b}_{t_0}(s)) \neq U(\xi_{t_0}, E_{t_0}, \hat{b}_{t_0}(s')).$$
In particular, at least one shared parameter (routing prior, policy weight, memory gate, etc.) receives conflicting credit direction across the two attributions.
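To see when this step goes through, it may help to assume a specific form for $U$. The text does not fix one, so the following is only an illustrative sketch: treat $E$ as a scalar, give each branch $b$ a hypothetical credit direction $e_b$ in the substrate, and let $\eta > 0$ be a hypothetical step size, with

$$U(\xi, E, b) = \xi + \eta\, E\, e_b.$$

Then the two attributions differ by

$$U(\xi_{t_0}, E_{t_0}, \hat{b}_{t_0}(s)) - U(\xi_{t_0}, E_{t_0}, \hat{b}_{t_0}(s')) = \eta\, E_{t_0} \left( e_{\hat{b}_{t_0}(s)} - e_{\hat{b}_{t_0}(s')} \right),$$

which is nonzero exactly when $E_{t_0} \neq 0$ and the two credited branches have distinct credit directions. Under this assumed form, a zero evaluation or two branches sharing a credit direction would leave the updates identical.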
Now consider repeated exposure to a recurring task family in which
the same branch-role structure reappears
(e.g., the system repeatedly generates similar alternative plans and
receives similar evaluation).
Because the two schemes disagree on ownership at such events, they push the shared substrate in incompatible directions over time:
Since these policies/routes are mutually exclusive competitors within
the same choice set, the system cannot converge.
Instead it experiences one of:
Any of these contradicts the assumption that both s and s′ yield stable credit assignment.
Thus, for stability to obtain, any apparent difference between s and s′ must be superficial—i.e., a consistent relabeling.
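The instability argument can be illustrated with a toy simulation. This is a sketch under strong assumptions, not a model of any particular system: two branches `"A"` and `"B"` compete in one choice set, the shared substrate is a single scalar preference, and both schemes' attributions are applied to it in turn. The sequences hold credited branches (after any relabeling has been resolved), and the names `run_shared_updates` and `count_reversals` are hypothetical.

```python
from typing import List, Sequence

# Assumption: crediting "A" raises the shared preference, crediting "B" lowers it.
DIRECTION = {"A": +1.0, "B": -1.0}


def run_shared_updates(credited_by_s: Sequence[str], credited_by_s_prime: Sequence[str],
                       evaluation: float = 1.0, lr: float = 0.5) -> List[float]:
    """Apply both schemes' attributions to one scalar preference; return its trajectory."""
    pref = 0.0
    trajectory = [pref]
    for branch_s, branch_s_prime in zip(credited_by_s, credited_by_s_prime):
        for credited in (branch_s, branch_s_prime):
            pref += lr * evaluation * DIRECTION[credited]
            trajectory.append(pref)
    return trajectory


def count_reversals(trajectory: Sequence[float]) -> int:
    """Count how often two consecutive steps move in opposite directions."""
    steps = [after - before for before, after in zip(trajectory, trajectory[1:])]
    return sum(1 for x, y in zip(steps, steps[1:]) if x * y < 0)


rounds = 5
agree = run_shared_updates(["A"] * rounds, ["A"] * rounds)
disagree = run_shared_updates(["A"] * rounds, ["B"] * rounds)

print(agree[-1], count_reversals(agree))        # steady drift toward branch A
print(disagree[-1], count_reversals(disagree))  # preference returns to 0 after every round
```

When the attributions agree, the preference moves steadily in one direction. When they disagree, every step undoes the previous one and the preference never settles on a branch, which is one simple picture of “credit thrash.”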
We conclude that if both schemes are stable, then they must induce the same ownership mapping on branch structure, differing only by a permutation $\pi$:
$$s'_t = \pi(s_t)$$
for all relevant times.
▫
This lemma does not claim there is a single implementation.
It claims there is a single equivalence class of working implementations: those related to one another by a permutation of branch labels.
If evaluation were purely local (no shared substrate, no broadcast), disagreement between indexing schemes could remain compartmentalized.
Global evaluation forces a single coherent ownership binding; otherwise modules fight.
Uniqueness up to relabeling may fail if:
- branches are truly symmetric and interchangeable under the task distribution,
- evaluation does not enter any shared update,
- or the system never revisits structurally similar branching situations (no recurrence).
These are degenerate cases where “credit assignment” is not well-posed or not required.
Corollary.
In any non-degenerate branching system with global evaluation and learning/control adaptation,
self-indexing is not an arbitrary narrative layer: it is a canonical binding constraint demanded by stable credit assignment.
This justifies treating the “ownership pointer” as a forced architectural primitive rather than a convenient story.
If self-indexing works under branching and global evaluation, it’s essentially unique—the only freedom is renaming which label points to “mine.”