A standalone necessity result: under branching control with limited capacity and global evaluative feedback, a competent system must represent a self-indexing variable (“ownership”) that binds evaluation to the system’s own internal trajectory.
This result is not an identity claim about consciousness. It is a control/learning claim: without self-indexing, credit assignment becomes ill-posed in the regimes where selection and integration are required.
We consider an agent operating in discrete time steps t = 0, 1, 2, ....
At each time step, the agent performs capacity allocation and control.
The agent also computes an evaluative signal Et summarizing performance-relevant information.
We do not assume any particular learning algorithm. We only assume that in a generally competent agent, evaluation must eventually influence future selection/control decisions.
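The setup above can be sketched as a generic loop. This is a toy schematic under assumed names (`select`, `act`, `evaluate` are illustrative, not part of the formal model): per step, a selection/control decision is made, possibly informed by the previous evaluative signal, and a new evaluative signal Et is computed.

```python
# Toy schematic of the agent loop: per step, a selection/control decision is
# made (possibly informed by the previous evaluative signal), an outcome is
# produced, and an evaluative signal E_t is computed from it.
def agent_loop(select, act, evaluate, steps=3):
    E = None  # no evaluative signal before the first step
    for t in range(steps):
        choice = select(t, E)   # capacity allocation / control decision
        outcome = act(choice)   # interaction with the environment
        E = evaluate(outcome)   # evaluative signal E_t for the next step
    return E

# Dummy components: select ignores feedback, act doubles, evaluate adds one.
final_E = agent_loop(select=lambda t, E: t, act=lambda c: 2 * c,
                     evaluate=lambda o: o + 1)
assert final_E == 5  # last step: t = 2, outcome = 4, E = 5
```

Note that nothing here commits to a learning algorithm; the only structural assumption is that Et is fed back into future selection.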
A branch is a distinct internal candidate trajectory the system could pursue. Formally, a branch is an element b ∈ ℬ.
We assume that at time t there are nt viable branches, and that often nt > 1.
Let Ct be the selection/control state that determines which subset of branches is actively represented/processed.
We assume a capacity bound: there exists k such that at any time step the agent can actively process at most k branches (or k effective degrees of freedom), with
∃ k s.t. #{branches actively represented at t} ≤ k.
This means that for nt > k the system must choose what to represent.
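The forced choice can be illustrated with a minimal top-k selection sketch. The relevance-score ranking is an assumption for illustration; the formal model does not fix how branches are ranked, only that at most k can be active.

```python
def select_branches(scores, k):
    """Keep at most k of the n_t candidate branches: here, the top-k by an
    (assumed) relevance score. For n_t > k, some branches must be dropped."""
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return set(ranked[:k])

# n_t = 4 viable branches, capacity k = 2: the system must choose.
active = select_branches([0.9, 0.2, 0.7, 0.4], k=2)
assert active == {0, 2}
```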
We say evaluation is globally available if Et can be read by multiple internal update mechanisms at the next step.
Formally, this means there exist at least two distinct update functions that each take Et as an input.
A self-index is a variable st that identifies the internal trajectory to which evaluation applies.
The minimal requirement is that st supports a binding relation of the form:
Bind(Et, st) ⇒ Update targets the responsible internal causes.
Informally: “this error/value pertains to my branch/decision/state.”
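As a minimal sketch (the data layout is illustrative), the binding relation amounts to recording evaluation together with the self-index, so that downstream updates can be routed to the responsible internal cause:

```python
def bind(evaluation, self_index):
    """Bind(E_t, s_t): tag the evaluative signal with the self-index that
    identifies the responsible internal trajectory."""
    return {"evaluation": evaluation, "responsible_branch": self_index}

record = bind("bad outcome", self_index="b")
# An update mechanism reading this record can target the correct cause.
assert record["responsible_branch"] == "b"
```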
It is tempting to think that once Et is globally broadcast, the system can correct itself.
However, global broadcast alone does not solve credit assignment when the system must select among competing internal branches under a capacity bound and the correct branch varies unpredictably.
In such settings, evaluation must be attributed to the internal choice that caused it.
If the system cannot represent which internal branch is “mine / active / responsible,” then evaluation cannot reliably update the correct cause.
We use only the following assumptions:
A1 (Branching under capacity limits). There exist times t with nt > k, i.e. the agent must select among competing internal branches.
A2 (Novelty). There exist environments where the correct branch varies across time/episodes and cannot be perfectly predicted by a fixed schedule.
A3 (Feedback-driven improvement). To maintain performance under novelty, the agent must improve its future selection/control decisions using feedback.
This last assumption is very weak: an agent that never improves selection using feedback will, in novelty regimes, repeatedly allocate capacity to irrelevant branches and fail.
Assume nt > 1, and suppose the agent broadcasts an evaluative signal Et but has no self-index variable capable of representing which branch was active/responsible.
Then there exist two distinct internal trajectories b and b′ such that the evaluation Et the agent receives is identical whether b or b′ was responsible, even though the two cases demand different corrections.
Consider a time t at which two branches b and b′ compete, and assume the agent’s capacity limitation forces it to commit to one of them (it cannot fully represent both).
Now construct a novelty regime in which sometimes b and sometimes b′ is the correct branch, in a way no fixed schedule can predict.
Let the agent receive negative evaluation Et = "bad outcome" whenever it selects the wrong branch.
Without a self-index, the agent possesses only the global fact “bad outcome occurred,” not “bad outcome occurred because branch b (rather than b′) was selected.”
Thus the same evaluation Et is consistent with two different internal responsibility assignments that demand different corrections.
Therefore evaluation does not uniquely determine an update target. ∎
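The ambiguity in Lemma 1 can be made concrete with a toy construction (an illustration under assumed names, not the formal proof): without a self-index, the evidence available after an error is identical in two episodes that require opposite corrections.

```python
def run_step(selected, correct):
    """One decision step. Only the evaluative signal is retained; which
    branch was selected is not recorded anywhere (no self-index)."""
    return "good" if selected == correct else "bad outcome"

# Episode 1: branch b selected when b' was correct.
e1 = run_step(selected="b", correct="b_prime")
# Episode 2: branch b' selected when b was correct.
e2 = run_step(selected="b_prime", correct="b")

# Identical evidence, yet episode 1 should punish b and episode 2 should
# punish b': the update target is underdetermined.
assert e1 == e2 == "bad outcome"
```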
If the update target is not uniquely determined by evaluation, then any generic update rule driven by Et either (i) yields no reliable learning improvement, (ii) destabilizes performance, or (iii) implicitly reconstructs a responsibility tag, i.e. a self-index.
Let the agent apply some update rule
(Ct+1, θt+1) = U(Ct, θt, Et, …)
where θ denotes any internal parameters governing selection/control/policy.
By Lemma 1, the same Et must sometimes mean “punish branch b” and sometimes mean “punish branch b′,” depending on which branch was active.
If the agent cannot encode which branch was responsible, then U cannot condition updates on responsibility.
Hence, under novelty and branching, learning is unstable or impotent without self-indexing. ∎
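Lemma 2 can be illustrated with a toy simulation (an assumed setup for illustration: two observable contexts, two branches, and the correct branch equal to the context). A learner whose update can condition on the responsible branch converges; a learner that must update all branches identically never changes its preference ordering and never improves.

```python
import random

def run(tagged, steps=200, seed=0):
    """Toy learner: per-context weights over two branches; pick the argmax.
    On a bad outcome, the tagged learner punishes the branch it selected;
    the untagged learner, lacking a self-index, must punish all branches
    identically, so its preference ordering can never change."""
    rng = random.Random(seed)
    w = [[1.0, 0.0], [1.0, 0.0]]  # initial weights favor branch 0 everywhere
    correct_count = 0
    for _ in range(steps):
        ctx = rng.randrange(2)
        sel = 0 if w[ctx][0] >= w[ctx][1] else 1
        if sel == ctx:
            correct_count += 1
        else:  # "bad outcome": evaluation is globally broadcast
            if tagged:
                w[ctx][sel] -= 1.0   # credit assigned to the responsible branch
            else:
                w[ctx][0] -= 1.0     # no self-index: the update cannot
                w[ctx][1] -= 1.0     # single out the responsible branch
    return correct_count / steps

assert run(tagged=True) > 0.9   # learns the context -> branch mapping
assert run(tagged=False) < 0.7  # stuck near chance: never fixes context 1
```

The design choice to punish all branches equally in the untagged case is the most charitable responsibility-blind update; any other fixed attribution scheme fails in a symmetric way.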
Assume A1–A3. Suppose evaluation Et is globally broadcast and is required to stabilize competence under novelty. Then any generally competent agent must implement a self-indexing variable st sufficient to bind evaluation to the internal trajectory responsible for it.
Equivalently:
In branching, capacity-limited systems, global evaluation without ownership cannot support stable competence.
By A1, there exist decision points with nt > 1 competing branches under capacity limitation.
By A2, which branch is correct varies over time/episodes unpredictably.
By A3, the agent must use feedback to improve selection/control.
Suppose for contradiction that the agent has no self-index st.
Then by Lemma 1, evaluative feedback Et is ambiguous with respect to internal responsibility and does not uniquely specify which internal cause/branch should be updated.
By Lemma 2, ambiguous evaluation yields either no reliable learning improvement, performance instability, or implicit reconstruction of a responsibility tag.
Since sustained competence requires stable improvement of selection/control under novelty (A3), the first two outcomes are incompatible with competence.
Therefore the system must implement the third option: an internal variable that tracks “which trajectory/branch is mine/responsible,” i.e. self-indexing.
This contradicts the supposition that the agent has no self-index. Hence, self-indexing is necessary. ∎
If evaluation is broadcast to multiple subsystems (planning, memory, action selection), then self-indexing must be stable across those subsystems; otherwise different modules will attribute evaluation to different internal causes and thrash.
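A minimal sketch of this stability requirement (the module names are hypothetical): when every subsystem reads the same broadcast record, they necessarily agree on which internal cause is being updated.

```python
# One broadcast record carries the evaluation and a single shared self-index.
broadcast = {"evaluation": "bad outcome", "responsible_branch": "b"}

def planning_update(record):
    # Planning deprioritizes the branch the record holds responsible.
    return ("deprioritize", record["responsible_branch"])

def memory_update(record):
    # Memory flags the trace of the same responsible branch.
    return ("flag_trace", record["responsible_branch"])

# Both modules attribute the evaluation to the same internal cause, so their
# updates cannot thrash against each other.
assert planning_update(broadcast)[1] == memory_update(broadcast)[1] == "b"
```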
The theorem does not require metaphysical selfhood. It only requires a functional tag that binds evaluation to the agent’s internal causal chain.
A system may log evaluation (“I did badly”) without self-indexing, but such logging does not support systematic correction under branching. Logging is not ownership.
The theorem claims a conditional necessity:
If the system must maintain competence under novelty in a branching, capacity-limited regime using evaluative feedback, then it must self-index evaluation.
In open-loop control, evaluation is “heat” that does not bite.
In closed-loop control, evaluation steers resource allocation.
But when the system’s own internal branches compete for capacity, evaluation must not only steer control; it must also be attributable to the internal trajectory it is steering.
That attribution is precisely self-indexing.