A standalone necessity result: under branching control with limited capacity and global evaluative feedback, a competent system must represent a self-indexing variable (“ownership”) that binds evaluation to the system’s own internal trajectory.
This result is not an identity claim about consciousness. It is a control/learning claim: without self-indexing, credit assignment becomes ill-posed in the regimes where selection and integration are required.
We consider an agent operating in discrete time steps t = 0, 1, 2, ....
At each time step, the agent performs capacity allocation and control.
The agent also computes an evaluative signal Et summarizing performance-relevant information.
We do not assume any particular learning algorithm. We only assume that in a generally competent agent, evaluation must eventually influence future selection/control decisions.
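The setup above can be sketched as a generic loop. This is a toy schematic under assumed names (`select`, `act`, `evaluate` are illustrative, not part of the formal model): per step, a selection/control decision is made, possibly informed by the previous evaluative signal, and a new evaluative signal Et is computed.

```python
# Toy schematic of the agent loop: per step, a selection/control decision is
# made (possibly informed by the previous evaluative signal), an outcome is
# produced, and an evaluative signal E_t is computed from it.
def agent_loop(select, act, evaluate, steps=3):
    E = None  # no evaluative signal before the first step
    for t in range(steps):
        choice = select(t, E)   # capacity allocation / control decision
        outcome = act(choice)   # interaction with the environment
        E = evaluate(outcome)   # evaluative signal E_t for the next step
    return E

# Dummy components: select ignores feedback, act doubles, evaluate adds one.
final_E = agent_loop(select=lambda t, E: t, act=lambda c: 2 * c,
                     evaluate=lambda o: o + 1)
assert final_E == 5  # last step: t = 2, outcome = 4, E = 5
```

Note that nothing here commits to a learning algorithm; the only structural assumption is that Et is fed back into future selection.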
A branch is a distinct internal candidate trajectory the system could pursue. Formally, a branch is an element b ∈ ℬ.
We assume that at time t there are nt viable branches, and that often nt > 1.
Let Ct be the selection/control state that determines which subset of branches is actively represented/processed.
We assume a capacity bound: there exists k such that at any time step the agent can actively process at most k branches (or k effective degrees of freedom), with
∃ k s.t. #{branches actively represented at t} ≤ k.
This means that for nt > k the system must choose what to represent.
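The forced choice can be illustrated with a minimal top-k selection sketch. The relevance-score ranking is an assumption for illustration; the formal model does not fix how branches are ranked, only that at most k can be active.

```python
def select_branches(scores, k):
    """Keep at most k of the n_t candidate branches: here, the top-k by an
    (assumed) relevance score. For n_t > k, some branches must be dropped."""
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return set(ranked[:k])

# n_t = 4 viable branches, capacity k = 2: the system must choose.
active = select_branches([0.9, 0.2, 0.7, 0.4], k=2)
assert active == {0, 2}
```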
We say evaluation is globally available if Et can be read by multiple internal update mechanisms at the next step.
Formally, this means there exist at least two distinct update functions that each take Et as an input.
A self-index is a variable st that identifies the internal trajectory to which evaluation applies.
The minimal requirement is that st supports a binding relation of the form:
Bind(Et, st) ⇒ Update targets the responsible internal causes.
Informally: “this error/value pertains to my branch/decision/state.”
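As a minimal sketch (the data layout is illustrative), the binding relation amounts to recording evaluation together with the self-index, so that downstream updates can be routed to the responsible internal cause:

```python
def bind(evaluation, self_index):
    """Bind(E_t, s_t): tag the evaluative signal with the self-index that
    identifies the responsible internal trajectory."""
    return {"evaluation": evaluation, "responsible_branch": self_index}

record = bind("bad outcome", self_index="b")
# An update mechanism reading this record can target the correct cause.
assert record["responsible_branch"] == "b"
```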
It is tempting to think that once Et is globally broadcast, the system can correct itself.
However, global broadcast alone does not solve credit assignment when the system must select among competing internal branches under a capacity bound and the correct branch varies unpredictably.
In such settings, evaluation must be attributed to the internal choice that caused it.
If the system cannot represent which internal branch is “mine / active / responsible,” then evaluation cannot reliably update the correct cause.
We use only the following assumptions:
A1 (Branching under capacity limits). There exist times t with nt > k, i.e. the agent must select among competing internal branches.
A2 (Novelty). There exist environments where the correct branch varies across time/episodes and cannot be perfectly predicted by a fixed schedule.
A3 (Feedback-driven improvement). To maintain performance under novelty, the agent must improve its future selection/control decisions using feedback.
This last assumption is very weak: an agent that never improves selection using feedback will, in novelty regimes, repeatedly allocate capacity to irrelevant branches and fail.
Assume nt > 1, and suppose the agent broadcasts an evaluative signal Et but has no self-index variable capable of representing which branch was active/responsible.
Then there exist two distinct internal trajectories b and b′ such that the evaluation Et the agent receives is identical whether b or b′ was responsible, even though the two cases demand different corrections.
Consider a time t at which two branches b and b′ compete, and assume the agent’s capacity limitation forces it to commit to one of them (it cannot fully represent both).
Now construct a novelty regime in which sometimes b and sometimes b′ is the correct branch, in a way no fixed schedule can predict.
Let the agent receive negative evaluation Et = "bad outcome" whenever it selects the wrong branch.
Without a self-index, the agent possesses only the global fact “bad outcome occurred,” not “bad outcome occurred because branch b (rather than b′) was selected.”
Thus the same evaluation Et is consistent with two different internal responsibility assignments that demand different corrections.
Therefore evaluation does not uniquely determine an update target. ∎
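The ambiguity in Lemma 1 can be made concrete with a toy construction (an illustration under assumed names, not the formal proof): without a self-index, the evidence available after an error is identical in two episodes that require opposite corrections.

```python
def run_step(selected, correct):
    """One decision step. Only the evaluative signal is retained; which
    branch was selected is not recorded anywhere (no self-index)."""
    return "good" if selected == correct else "bad outcome"

# Episode 1: branch b selected when b' was correct.
e1 = run_step(selected="b", correct="b_prime")
# Episode 2: branch b' selected when b was correct.
e2 = run_step(selected="b_prime", correct="b")

# Identical evidence, yet episode 1 should punish b and episode 2 should
# punish b': the update target is underdetermined.
assert e1 == e2 == "bad outcome"
```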
If the update target is not uniquely determined by evaluation, then any generic update rule driven by Et either (i) yields no reliable learning improvement, (ii) destabilizes performance, or (iii) implicitly reconstructs a responsibility tag, i.e. a self-index.
Let the agent apply some update rule
(Ct+1, θt+1) = U(Ct, θt, Et, …)
where θ denotes any internal parameters governing selection/control/policy.
By Lemma 1, the same Et must sometimes mean “punish branch b” and sometimes mean “punish branch b′,” depending on which branch was active.
If the agent cannot encode which branch was responsible, then U cannot condition updates on responsibility.
Hence, under novelty and branching, learning is unstable or impotent without self-indexing. ∎
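Lemma 2 can be illustrated with a toy simulation (an assumed setup for illustration: two observable contexts, two branches, and the correct branch equal to the context). A learner whose update can condition on the responsible branch converges; a learner that must update all branches identically never changes its preference ordering and never improves.

```python
import random

def run(tagged, steps=200, seed=0):
    """Toy learner: per-context weights over two branches; pick the argmax.
    On a bad outcome, the tagged learner punishes the branch it selected;
    the untagged learner, lacking a self-index, must punish all branches
    identically, so its preference ordering can never change."""
    rng = random.Random(seed)
    w = [[1.0, 0.0], [1.0, 0.0]]  # initial weights favor branch 0 everywhere
    correct_count = 0
    for _ in range(steps):
        ctx = rng.randrange(2)
        sel = 0 if w[ctx][0] >= w[ctx][1] else 1
        if sel == ctx:
            correct_count += 1
        else:  # "bad outcome": evaluation is globally broadcast
            if tagged:
                w[ctx][sel] -= 1.0   # credit assigned to the responsible branch
            else:
                w[ctx][0] -= 1.0     # no self-index: the update cannot
                w[ctx][1] -= 1.0     # single out the responsible branch
    return correct_count / steps

assert run(tagged=True) > 0.9   # learns the context -> branch mapping
assert run(tagged=False) < 0.7  # stuck near chance: never fixes context 1
```

The design choice to punish all branches equally in the untagged case is the most charitable responsibility-blind update; any other fixed attribution scheme fails in a symmetric way.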
Assume A1–A3. Suppose evaluation Et is globally broadcast and is required to stabilize competence under novelty. Then any generally competent agent must implement a self-indexing variable st sufficient to bind evaluation to the internal trajectory responsible for it.
Equivalently:
In branching, capacity-limited systems, global evaluation without ownership cannot support stable competence.
By A1, there exist decision points with nt > 1 competing branches under capacity limitation.
By A2, which branch is correct varies over time/episodes unpredictably.
By A3, the agent must use feedback to improve selection/control.
Suppose for contradiction that the agent has no self-index st.
Then by Lemma 1, evaluative feedback Et is ambiguous with respect to internal responsibility and does not uniquely specify which internal cause/branch should be updated.
By Lemma 2, ambiguous evaluation yields either no reliable learning improvement, performance instability, or implicit reconstruction of a responsibility tag.
Since sustained competence requires stable improvement of selection/control under novelty (A3), the first two outcomes are incompatible with competence.
Therefore the system must implement the third option: an internal variable that tracks “which trajectory/branch is mine/responsible,” i.e. self-indexing.
This contradicts the supposition that the agent has no self-index. Hence, self-indexing is necessary. ∎
If evaluation is broadcast to multiple subsystems (planning, memory, action selection), then self-indexing must be stable across those subsystems; otherwise different modules will attribute evaluation to different internal causes and thrash.
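A minimal sketch of this stability requirement (the module names are hypothetical): when every subsystem reads the same broadcast record, they necessarily agree on which internal cause is being updated.

```python
# One broadcast record carries the evaluation and a single shared self-index.
broadcast = {"evaluation": "bad outcome", "responsible_branch": "b"}

def planning_update(record):
    # Planning deprioritizes the branch the record holds responsible.
    return ("deprioritize", record["responsible_branch"])

def memory_update(record):
    # Memory flags the trace of the same responsible branch.
    return ("flag_trace", record["responsible_branch"])

# Both modules attribute the evaluation to the same internal cause, so their
# updates cannot thrash against each other.
assert planning_update(broadcast)[1] == memory_update(broadcast)[1] == "b"
```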
The theorem does not require metaphysical selfhood. It only requires a functional tag that binds evaluation to the agent’s internal causal chain.
A system may log evaluation (“I did badly”) without self-indexing, but such logging does not support systematic correction under branching. Logging is not ownership.
The theorem claims a conditional necessity:
If the system must maintain competence under novelty in a branching, capacity-limited regime using evaluative feedback, then it must self-index evaluation.
In open-loop control, evaluation is “heat” that does not bite.
In closed-loop control, evaluation steers resource allocation.
But when the system’s own internal branches compete for capacity, evaluation must not only steer control; it must also be attributable to the internal trajectory it is steering.
That attribution is precisely self-indexing.