A standalone impossibility proof: above the compression threshold, inert evaluation implies persistent failure even when the environment is non-adaptive (it does not choose outcomes after seeing the agent’s current selection).
Some no-go arguments allow the environment to react to the agent’s current selection (an interactive or adaptive adversary). This document proves a stronger claim:
Even in a fixed, non-adaptive world process, any bounded-capacity agent whose control/selection does not depend on evaluative feedback is forced to remain near chance on a family of super-threshold tasks.
This is a necessity theorem about feedback and capacity allocation. It does not claim any identity thesis about consciousness.
Fix integers:
n = the number of coordinates in the world's state vector, and
k = the number of coordinates the agent can inspect per step (its capacity).
Assume the super-threshold / compression regime:
n > k
At each discrete time t = 1, 2, 3, … the agent chooses a subset
At ⊆ {1, …, n} with |At| = k
Interpretation: At is the agent’s capacity allocation decision (attention, retrieval focus, compute routing, etc.).
The agent then observes only the selected coordinates.
At each time step t, the world generates an n-bit vector Xt ∈ {0, 1}^n whose coordinates are independent fair coin flips (each bit equals 1 with probability 1/2), drawn fresh at every step.
The world also has a hidden “relevance index” Jt ∈ {1, …, n} that determines which coordinate matters.
Jt evolves by a fixed stochastic process independent of the agent: at each step it persists (Jt = Jt−1) with probability 1 − p, and with probability p > 0 it switches to a fresh index drawn uniformly from {1, …, n}, independently of the past.
This is a stationary “switching relevance” world. It does not observe the agent and it does not adapt to At.
The task label at time t is
Yt = Xt[Jt]
The agent outputs a prediction Ŷt ∈ {0, 1}.
Loss is 0-1 loss:
$$\ell_t = \begin{cases} 1 & \text{if } \hat{Y}_t \neq Y_t \\ 0 & \text{otherwise} \end{cases}$$
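To make the setup concrete, here is a minimal simulation sketch of the world process and the loss. Function names, parameters, and default values are illustrative assumptions, not part of the formal model.

```python
import random

# Minimal simulation sketch of the non-adaptive switching-relevance world.
# Parameter names and defaults are illustrative, not part of the formal model.

def simulate_world(n=8, p=0.05, horizon=10, seed=0):
    """Yield (t, X_t, J_t, Y_t) for a few steps of the world process."""
    rng = random.Random(seed)
    j = rng.randrange(n)                              # initial relevance index, uniform
    for t in range(1, horizon + 1):
        if rng.random() < p:                          # switch event with probability p
            j = rng.randrange(n)                      # fresh uniform index, agent-independent
        x = [rng.randint(0, 1) for _ in range(n)]     # fresh fair-coin bits X_t
        y = x[j]                                      # task label Y_t = X_t[J_t]
        yield t, x, j, y

def zero_one_loss(y_hat, y):
    """0-1 loss of a prediction against the label."""
    return int(y_hat != y)
```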
Key fact: the agent can do better than chance at time t only if it inspects the relevant coordinate, i.e., only if Jt ∈ At; otherwise the label Yt is an unobserved fair coin.
Let the agent be allowed to compute an evaluative signal Et after acting.
Examples include: the realized 0-1 loss ℓt, a reward signal, or a prediction–outcome mismatch signal.
Inert evaluation constraint (no evaluative leverage).
The agent’s next selection/control update may not depend on Et.
Formally, for any t, conditioned on the full non-evaluative history Ht (all chosen sets A1, …, At and all observed bits Xs[i] for s ≤ t and i ∈ As), we require:
$$P(A_{t+1} \mid H_t, E_t) = P(A_{t+1} \mid H_t)$$
Equivalently: evaluative feedback has no causal influence on future capacity allocation.
This captures the “hot zombie” condition: evaluation may exist as representation, but it does not bite into control.
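A minimal sketch of an agent satisfying this constraint, using hypothetical class and method names (not from the text): the evaluative signal may be computed elsewhere, but it is never passed into the object that decides the next selection, so the next allocation depends only on the non-evaluative history and internal randomness.

```python
import random

# Sketch of an inert-evaluation agent (illustrative names and structure).
# There is deliberately no method that accepts an error/reward signal, so the
# next selection can only depend on the non-evaluative history H_t.

class InertAgent:
    def __init__(self, n, k, seed=0):
        self.n, self.k = n, k
        self.rng = random.Random(seed)
        self.history = []          # only (A_s, observed bits) pairs are stored

    def select(self):
        # Any function of self.history plus internal randomness is allowed;
        # here we simply draw a fresh random k-subset, ignoring the history.
        return self.rng.sample(range(self.n), self.k)

    def predict(self, observed_bits):
        # Any prediction rule over the k observed coordinates; a coin flip
        # is the degenerate case.
        return self.rng.randint(0, 1)

    def record(self, chosen, observed_bits):
        # Only non-evaluative information enters the history.
        self.history.append((tuple(chosen), tuple(observed_bits)))
```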
Lemma 1. If Jt ∉ At, then for any agent,
$$P(\ell_t = 1 \mid J_t \notin A_t) = \frac{1}{2}$$
Reason. When Jt ∉ At, the agent does not observe Xt[Jt]. But Yt = Xt[Jt] is a fresh fair coin, independent of everything observed. Any prediction is correct with probability 1/2.
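Spelled out, conditioning on the agent's prediction (which, given Jt ∉ At, is independent of the unobserved fair-coin bit Xt[Jt]):

$$P(\hat{Y}_t \neq Y_t \mid J_t \notin A_t) = \sum_{y \in \{0,1\}} P(\hat{Y}_t = y \mid J_t \notin A_t)\, P(Y_t \neq y \mid \hat{Y}_t = y,\, J_t \notin A_t) = \sum_{y \in \{0,1\}} P(\hat{Y}_t = y \mid J_t \notin A_t) \cdot \frac{1}{2} = \frac{1}{2}$$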
The relevance index Jt persists for stretches and occasionally switches to a fresh uniform value. In a competent agent, a switch should trigger a reallocation of capacity toward discovering the new relevant coordinate.
Inert evaluation blocks exactly that trigger.
Under the inert-evaluation constraint, after a switch event, the agent has no mechanism to systematically change its selection behavior in response to being wrong.
Lemma 2. Immediately after a switch to a fresh uniform Jt, the agent's chosen set At is independent of Jt, hence:
$$P(J_t \in A_t) = \frac{k}{n}$$
Reason. At a switch, Jt is freshly uniform and independent of the entire past. Since At is a function of the past (and not of the new hidden Jt), At and Jt are independent.
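Written out, conditioning on the realized selection (a function of the past, hence independent of the fresh uniform Jt):

$$P(J_t \in A_t) = \sum_{a \subseteq \{1, \dots, n\},\ |a| = k} P(A_t = a)\, P(J_t \in a) = \sum_{a} P(A_t = a) \cdot \frac{k}{n} = \frac{k}{n}$$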
We now bound the agent’s per-step expected error from below.
Lemma 3. For any time t immediately following a switch event,
$$P(\ell_t = 1) \geq \left(1 - \frac{k}{n}\right) \cdot \frac{1}{2}$$
Proof.
Split on whether Jt is selected. If Jt ∈ At, lower-bound the conditional error by 0. If Jt ∉ At, which by Lemma 2 happens with probability 1 − k/n, the conditional error equals 1/2 by Lemma 1.
Therefore the unconditional error probability is at least:
$$\left(1 - \frac{k}{n}\right) \cdot \frac{1}{2}$$
as claimed.
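In full, the split reads:

$$P(\ell_t = 1) = P(J_t \in A_t)\, P(\ell_t = 1 \mid J_t \in A_t) + P(J_t \notin A_t)\, P(\ell_t = 1 \mid J_t \notin A_t) \geq 0 + \left(1 - \frac{k}{n}\right) \cdot \frac{1}{2}$$

For instance, with n = 10 and k = 3 the post-switch error is at least 0.35.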
Consider the environment family in Section 3 with n > k and switch probability p > 0. For any agent satisfying the inert-evaluation constraint in Section 4, there exists a constant c(n, k) > 0 such that the agent's expected error on post-switch time steps is at least c(n, k); consequently, its long-run average error rate is bounded away from zero.
Concretely, the agent suffers a non-vanishing expected error on the set of post-switch time steps, with lower bound:
$$P(\ell_t = 1 \mid t \text{ is immediately after a switch}) \geq \frac{1}{2} \cdot \left(1 - \frac{k}{n}\right)$$
In particular, no inert-evaluation agent can guarantee asymptotically reliable performance (vanishing error) on this non-adaptive switching-relevance task family whenever n > k.
Proof.
Switch events occur with probability p > 0 independently of the agent. On each such event, Jt is freshly uniform and independent of the past.
By Lemma 2, at the first step after each switch, P(Jt ∈ At) = k/n. Then by Lemma 3, the expected error at that time is at least (1/2)(1 − k/n), a positive constant whenever n > k.
Since switch events occur with positive frequency p, asymptotically a fraction p of all time steps are post-switch steps, so the agent's long-run average expected error is at least p · (1/2)(1 − k/n) > 0 and cannot vanish.
This establishes the no-go claim.
QED.
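As an informal sanity check (not part of the proof), the self-contained sketch below estimates the post-switch error of an inert agent and compares it to the (1/2)(1 − k/n) bound. To be generous, the simulated agent is assumed to predict perfectly whenever it happens to observe the relevant coordinate; the bound only bites on the steps where it does not. All parameter values are arbitrary assumptions.

```python
import random

# Monte Carlo sanity check (illustrative only): empirical post-switch error of
# an inert agent with a fixed selection vs. the lower bound (1/2) * (1 - k/n).

def post_switch_error(n=8, k=2, p=0.05, horizon=200_000, seed=0):
    rng = random.Random(seed)
    j = rng.randrange(n)                         # hidden relevance index
    selection = set(rng.sample(range(n), k))     # inert: never reacts to errors
    errors, steps = 0, 0
    for _ in range(horizon):
        switched = rng.random() < p
        if switched:
            j = rng.randrange(n)                 # fresh uniform index, agent-independent
        x = [rng.randint(0, 1) for _ in range(n)]
        y = x[j]
        # Generous prediction: correct whenever the relevant bit is observed.
        y_hat = x[j] if j in selection else rng.randint(0, 1)
        if switched:                             # tally only post-switch steps
            steps += 1
            errors += int(y_hat != y)
    return errors / max(steps, 1)

n, k = 8, 2
print("empirical post-switch error:", post_switch_error(n=n, k=k))
print("lower bound (1/2)(1 - k/n): ", 0.5 * (1 - k / n))
```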
For any bounded-capacity agent that achieves reliable performance on the switching-relevance task family with n > k, it is necessary that outcomes (error, reward, mismatch) causally influence future capacity allocation decisions.
In plain terms:
When there are more potentially relevant dimensions than can be inspected at once (n > k), and relevance can change unpredictably, a system that cannot use performance feedback to reallocate its limited capacity is forced to keep guessing in the dark after shifts.
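For contrast, here is a minimal sketch (an illustration with assumed names, not a construction from the text) of the feedback-to-control coupling the corollary says is necessary: the evaluative signal is allowed to trigger reallocation of the k-slot budget. This coupling alone does not guarantee reliability; it only opens the causal pathway that the inert constraint forbids.

```python
import random

# Contrast sketch (illustrative): an agent whose capacity allocation responds
# to the evaluative signal. Not claimed to be optimal -- it only shows the
# error-to-allocation pathway that the inert-evaluation constraint rules out.

class FeedbackAgent:
    def __init__(self, n, k, seed=0):
        self.n, self.k = n, k
        self.rng = random.Random(seed)
        self.selection = self.rng.sample(range(n), k)

    def select(self):
        return list(self.selection)

    def update(self, error):
        # The evaluative signal (here, the 0-1 error) feeds back into the
        # next capacity allocation.
        if error:
            self.selection = self.rng.sample(range(self.n), self.k)
```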