
Quantitative Coupling Threshold

Goal. Formalize when global evaluative coordination (a shared, broadcast-like control variable) becomes necessary for bounded-resource general intelligence.

This note supplies a quantitative threshold: below a coupling strength $\gamma^*$, purely local control loops can be near-optimal; above $\gamma^*$, any system that remains reliably competent under novelty must implement a global evaluative state $E_t$ read by multiple operators.


0. Informal picture

When subsystems are weakly coupled, each can optimize “its part” using local signals.

When subsystems are strongly coupled, the dominant performance term depends on joint coordination rather than local success. Under novelty and resource bounds, local loops cannot maintain the needed cross-module tradeoff, so a global coordinating evaluation is forced.


1. Model

Time is discrete: t = 1, 2, ….

Hidden context (novelty)

A hidden binary context $c_t \in \{0, 1\}$ governs which joint action is correct.

We assume nontrivial novelty: $c_t$ switches over time in a way that cannot be precompiled away (e.g., adversarially or unpredictably at rate $\lambda > 0$).

Two operators

There are two operators/modules A and B. On each step they choose binary control decisions
$$a_t \in \{0, 1\}, \qquad b_t \in \{0, 1\}.$$

Partial information (boundedness)

Each operator receives its own noisy observation of ct:
$$o_t^A = c_t \oplus \eta_t^A, \qquad o_t^B = c_t \oplus \eta_t^B,$$
where $\eta_t^A, \eta_t^B$ are i.i.d. Bernoulli noise with
$$\Pr(\eta=1)=p,\quad 0<p<\tfrac{1}{2}.$$
Interpretation: each operator has bounded perceptual/representational access to the true relevant coordinate.

Reward with coupling

Define the per-step reward
$$r_t = u_A(a_t, c_t) + u_B(b_t, c_t) + \gamma\, g(a_t, b_t, c_t).$$
We use the simplest coupled coordination structure:
$$u_A(a, c) = \mathbb{1}[a = c], \quad u_B(b, c) = \mathbb{1}[b = c], \quad g(a, b, c) = \mathbb{1}[a = b = c].$$
So $u_A, u_B, g \in \{0, 1\}$, and $\gamma \ge 0$ scales how important coupled success is.
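As a concrete reference, here is a minimal Python sketch of this reward model; the function names (`observe`, `step_reward`) are illustrative, not part of the formal setup:

```python
import random

def observe(c: int, p: float, rng: random.Random) -> int:
    """Pass the hidden context c through a binary symmetric channel:
    the bit is flipped with probability p (0 < p < 1/2)."""
    return c ^ int(rng.random() < p)

def step_reward(a: int, b: int, c: int, gamma: float) -> float:
    """Per-step reward r_t = u_A + u_B + gamma * g with indicator payoffs."""
    u_a = int(a == c)        # local success of operator A
    u_b = int(b == c)        # local success of operator B
    g = int(a == b == c)     # coupled coordination success
    return u_a + u_b + gamma * g
```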

Architectures

We compare two classes.

(L) Local-only control. Each operator chooses using only its own local history and observation stream:
$$a_t = \pi_A(o_{\le t}^A, \text{local state}), \qquad b_t = \pi_B(o_{\le t}^B, \text{local state}).$$
No shared evaluative state is broadcast between them.

(G) Global evaluative coordination. The system maintains a shared, broadcast variable $E_t$ (“global evaluation”) computed from available information and readable by both operators:
$$E_t = \Phi(o_{\le t}^A, o_{\le t}^B, \text{shared state}),$$
$$a_t = \pi_A(o_{\le t}^A, E_t, \text{local state}), \qquad b_t = \pi_B(o_{\le t}^B, E_t, \text{local state}).$$
This matches the globality clause: multiple operators can read $E_t$, and $E_t$ can steer control.

We are not assuming free communication of arbitrary bandwidth; $E_t$ can be small (even 1 bit). The point is the existence of a shared, read-many evaluative/control variable.
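The contrast between the two classes can be made concrete with a small sketch (the naming is mine; `trust_A` previews the disagreement-resolution mechanism of Section 3):

```python
# (L) Local-only: each operator maps its own observation stream to an action.
def pi_local(obs_history: list) -> int:
    return obs_history[-1]   # memoryless rule: act on the latest observation

# (G) Global: a shared evaluator Phi fuses both streams into a broadcast E_t
# (here a single bit), which every operator may read before acting.
def Phi(o_a: int, o_b: int, shared_state: dict) -> int:
    if o_a == o_b:
        return o_a           # agreement: broadcast the agreed value
    # disagreement: fall back to whichever operator is currently trusted
    return o_a if shared_state.get("trust_A", True) else o_b

def pi_global(obs: int, E_t: int) -> int:
    return E_t               # both operators defer to the broadcast
```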


2. Baseline performance under local-only control

Consider the natural local-only policy class: each operator sets its action to its own best estimate of $c_t$, which in this model is just its observation:
$$a_t = o_t^A, \qquad b_t = o_t^B.$$
(This is optimal among memoryless rules because $o_t^A$ is a sufficient statistic for $c_t$ under the binary symmetric channel (BSC) observation model.)

Lemma 1 (local-only lower bound on coordination success)

Under any local-only architecture (L), the probability of the coupled success event $a_t = b_t = c_t$ is bounded above by
$$\Pr[a_t = b_t = c_t] \le (1 - p)^2.$$

Proof. Conditioned on $c_t$, the two operators’ observation histories, and hence their actions, are independent. Each action matches $c_t$ with probability at most $1 - p$, since $o_t^A$ (resp. $o_t^B$) carries all of that operator’s information about the unpredictable $c_t$ and is flipped with probability $p$. Multiplying the two conditional bounds gives $\Pr[a_t = b_t = c_t] \le (1 - p)^2$. No local-only policy can exceed this, because neither operator has access to the other’s observation (or to any shared state encoding it).

Corollary 1 (local-only expected reward)

For any local-only architecture (L),
$$\mathbb{E}[r_t] \le 2(1 - p) + \gamma (1 - p)^2.$$

Reason. Each local term succeeds with probability at most $1 - p$; the coupled term with probability at most $(1 - p)^2$ (Lemma 1).
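A quick Monte Carlo check of Lemma 1 and Corollary 1 for the memoryless local policy $a_t = o_t^A$, $b_t = o_t^B$ (a sketch; `simulate_local` is my name):

```python
import random

def simulate_local(p: float, gamma: float, n: int = 200_000, seed: int = 0):
    rng = random.Random(seed)
    coord = reward = 0.0
    for _ in range(n):
        c = rng.randint(0, 1)                 # hidden context
        a = c ^ int(rng.random() < p)         # A acts on its own noisy view
        b = c ^ int(rng.random() < p)         # B acts on its own noisy view
        coord += (a == b == c)
        reward += (a == c) + (b == c) + gamma * (a == b == c)
    return coord / n, reward / n

p, gamma = 0.2, 3.0
coord_rate, avg_reward = simulate_local(p, gamma)
print(coord_rate, "<=", (1 - p) ** 2)                        # ~0.64
print(avg_reward, "<=", 2 * (1 - p) + gamma * (1 - p) ** 2)  # ~3.52
```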


3. Achievable performance with a global evaluative state

Suppose we allow a shared $E_t$ that can fuse information. The simplest example: let $E_t$ store the pair $(o_t^A, o_t^B)$, or a function of them sufficient to produce a better estimate of $c_t$.

A low-bandwidth but effective choice is the agreement-gated estimator: broadcast the agreed value when the two observations match, and flag the disagreement otherwise.

A fully symmetric fusion is the Bayesian/MAP estimate $\hat c_t$ from $(o_t^A, o_t^B)$. Under the BSC, the MAP rule is:
- If the observations agree, output the agreed value.
- If they disagree, either value is equally likely a posteriori (probability 1/2 each), so the tie must be broken by some additional rule.

Lemma 2 (agreement-gated fusion alone does not beat the local bound)

Suppose $E_t$ broadcasts the agreed value when $o_t^A = o_t^B$ and each operator falls back on its own observation otherwise. Then
$$\Pr[a_t = b_t = c_t] = \Pr[o_t^A = o_t^B = c_t] = (1 - p)^2,$$
exactly the local-only bound of Lemma 1.

Proof. The observations agree with probability $(1 - p)^2 + p^2$: either both are correct or both are wrong. In the both-wrong event (probability $p^2$) they agree on $\neg c_t$, so acting on the agreed value fails; on disagreement the fallback actions differ, so coupled success also fails. Only the both-correct event, of probability $(1 - p)^2$, yields $a_t = b_t = c_t$.

The leverage of a global evaluator therefore lies in how disagreements are resolved. If $E_t$ resolves each disagreement to a single shared value $\hat c_t$ that is correct with probability $\alpha$, and both operators set $a_t = b_t = \hat c_t$, then
$$\Pr[a_t = b_t = c_t] = \Pr[\hat c_t = c_t] = (1 - p)^2 + \alpha \cdot 2p(1 - p).$$
Even a shared tie-break at chance ($\alpha = \tfrac12$) exceeds $(1 - p)^2$, because the broadcast guarantees agreement; but under perfect symmetry no fixed rule achieves $\alpha > \tfrac12$.
To push $\alpha$ above $\tfrac12$, we use a self-calibrating global evaluation that can learn a persistent bias or operator-reliability asymmetry over time under novelty.

Concretely, let $E_t$ maintain a running reliability score $\rho_t$ that selects which operator to trust when disagreement occurs, based on recent predictive success (this requires closure + global access).

Under mild stationarity of error rates, such a global evaluator achieves disagreement-resolution accuracy $\alpha > \tfrac12$, yielding
$$\Pr[\hat c_t = c_t] = (1 - p)^2 + \alpha \cdot 2p(1 - p) > (1 - p)^2.$$
Thus coupled success strictly exceeds the local-only bound.

This is the qualitative point we need: a shared state reduces coordination error below any local-only bound as soon as disagreements are resolved by a common broadcast, and further still when resolution beats chance.
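One way such a self-calibrating evaluator might look, as a sketch: it assumes the true $c_t$ is eventually revealed (delayed feedback) so the reliability scores can be recalibrated, and the class name and exponential-moving-average scheme are my own illustration, not a prescribed mechanism:

```python
class ReliabilityEvaluator:
    """Shared evaluative state E_t: fuses (o_A, o_B) and resolves
    disagreements by trusting the operator with the higher running
    accuracy score rho (learned from delayed feedback about c)."""

    def __init__(self, decay: float = 0.99):
        self.rho_a = 0.5     # running accuracy estimate for operator A
        self.rho_b = 0.5     # running accuracy estimate for operator B
        self.decay = decay

    def fuse(self, o_a: int, o_b: int) -> int:
        if o_a == o_b:
            return o_a       # agreement: broadcast the agreed value
        return o_a if self.rho_a >= self.rho_b else o_b

    def update(self, o_a: int, o_b: int, c_revealed: int) -> None:
        # Closure: past outcomes steer who gets trusted on the next step.
        d = self.decay
        self.rho_a = d * self.rho_a + (1 - d) * float(o_a == c_revealed)
        self.rho_b = d * self.rho_b + (1 - d) * float(o_b == c_revealed)
```

Under symmetric noise the two scores stay tied and the rule degenerates to a fixed shared tie-break ($\alpha = \tfrac12$); with $p_A < p_B$ they separate and disagreements resolve toward A, giving $\alpha > \tfrac12$.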

Note. If you prefer a fully “one-shot” bound with no learning, use a model in which the operators’ noise rates differ ($p_A \neq p_B$). Then global fusion can deterministically prefer the more reliable operator, improving $\Pr[\hat c = c]$ without any time-averaging.
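For instance (hypothetical numbers), with $p_A = 0.1$ and $p_B = 0.3$, always deferring to A on disagreement makes $\hat c = o^A$, so $\Pr[\hat c = c] = 1 - p_A = 0.9$; both operators acting on $\hat c$ coordinate with probability $0.9$, whereas the local-only coupled bound is $(1 - p_A)(1 - p_B) = 0.63$.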

Corollary 2 (global expected reward, schematic)

There exists a global-evaluative architecture (G) with
$$\mathbb{E}[r_t] \ge 2(1 - p) + \gamma \big( (1 - p)^2 + \Delta \big)$$
for some $\Delta > 0$ whenever disagreements are resolved through the shared broadcast; $\Delta$ grows with the resolution accuracy $\alpha$.
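Continuing the sketch from Section 3, a Monte Carlo run under assumed asymmetric noise ($p_A \neq p_B$) illustrates the corollary; `simulate_global` is my name, and the revealed-$c$ feedback is the same modeling assumption as above:

```python
import random

def simulate_global(p_a: float, p_b: float, gamma: float,
                    n: int = 200_000, seed: int = 1) -> float:
    rng = random.Random(seed)
    ev = ReliabilityEvaluator()       # shared E_t from the sketch above
    reward = 0.0
    for _ in range(n):
        c = rng.randint(0, 1)
        o_a = c ^ int(rng.random() < p_a)
        o_b = c ^ int(rng.random() < p_b)
        e = ev.fuse(o_a, o_b)         # both operators read the broadcast E_t
        a = b = e
        reward += (a == c) + (b == c) + gamma * (a == b == c)
        ev.update(o_a, o_b, c)        # delayed feedback recalibrates trust
    return reward / n

# e.g. simulate_global(0.1, 0.3, 3.0) exceeds the local-only ceiling
# (1-p_a) + (1-p_b) + gamma * (1-p_a) * (1-p_b) at the same noise levels.
```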


4. The Coupling Threshold Theorem

We now state the quantitative threshold.

Assumption (bounded local advantage)

Define the maximum improvement any local-only system can gain on the sum of local terms, relative to the baseline $2(1 - p)$, as
$$\Delta_{\text{local}} := \sup_{(L)} \mathbb{E}[u_A + u_B] - 2(1 - p).$$
Because each operator observes $c_t$ only through a BSC with error $p$, and without shared information, $\Delta_{\text{local}}$ is small (and 0 in the memoryless case).

Define the global coordination improvement on the coupled term as
$$\Delta_{\text{coord}} := \sup_{(G)} \Pr[a = b = c] - \sup_{(L)} \Pr[a = b = c].$$
In models with operator asymmetry or learnable disagreement-resolution, $\Delta_{\text{coord}} > 0$ (in Section 3's model, already $\Delta_{\text{coord}} \ge p(1 - p)$ from the shared tie-break alone).

Theorem (Quantitative Coupling Threshold)

If
$$\gamma\, \Delta_{\text{coord}} > \Delta_{\text{local}},$$
then no purely local-only architecture (L) can match the optimal achievable performance, and any system that remains reliably above a target performance level must implement global evaluative coordination (i.e., must realize a shared $E_t$, readable by multiple operators, that steers updates/allocations).

Equivalently, define the coupling threshold
$$\gamma^* := \frac{\Delta_{\text{local}}}{\Delta_{\text{coord}}}.$$
Then for $\gamma > \gamma^*$, globality is necessary.

Proof sketch. Any local-only architecture obeys
$$\mathbb{E}[r_t] \le \big( 2(1 - p) + \Delta_{\text{local}} \big) + \gamma \sup_{(L)} \Pr[a = b = c],$$
by the definitions of $\Delta_{\text{local}}$ and the local coupled-success supremum. A global architecture that acts on the fused estimate keeps each local term at baseline or better while raising the coupled term, achieving at least
$$\mathbb{E}[r_t] \ge 2(1 - p) + \gamma \big( \sup_{(L)} \Pr[a = b = c] + \Delta_{\text{coord}} \big).$$
The gap is at least $\gamma \Delta_{\text{coord}} - \Delta_{\text{local}} > 0$, so local-only control is strictly suboptimal, with a deficit that scales with $\gamma$. This is the quantitative “coupling dominates local hacks” transition.
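A hypothetical numeric illustration: suppose local memory buys $\Delta_{\text{local}} = 0.05$, while a learned resolution accuracy $\alpha = 0.8$ at $p = 0.2$ gives $\Delta_{\text{coord}} = \alpha \cdot 2p(1 - p) = 0.8 \cdot 0.32 \approx 0.26$; then $\gamma^* \approx 0.05 / 0.26 \approx 0.2$, and any coupling weight above that forces globality. In the memoryless symmetric case $\Delta_{\text{local}} = 0$, so $\gamma^* = 0$.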


5. Relation to closure, globality, self-indexing

Closure

To exploit $\Delta_{\text{coord}}$ under novelty, the shared evaluator must affect control (which module to trust, which allocation to choose). That is closure: $E_t$ steers next-step control.

Globality

The theorem asserts the necessity of a shared variable $E_t$ readable by multiple operators. That is globality.

Self-indexing

This proof does not require explicit branching; it’s a coordination necessity. If the system additionally maintains competing internal candidates (branches/hypotheses/plans), then self-indexing becomes necessary for stable credit assignment.


6. Minimal corollary (one-line)

A compact “threshold slogan” is:
$$\boxed{\text{Global evaluative broadcast is forced when } \gamma\,\Pr[\text{coordination error}] \gtrsim \text{best local adaptation margin}.}$$

In words: global workspace pressure emerges when the cost of miscoordination outweighs any purely local improvement.


7. Remarks and variants

  1. More modules. For $m > 2$ operators, the coupled term can be $g = \mathbb{1}[a^{(1)} = \cdots = a^{(m)} = c]$. The local-only coordination probability typically decays exponentially in $m$, making $\Delta_{\text{coord}}$ larger and $\gamma^*$ smaller (see the one-liner after this list).

  2. Continuous actions. Replace $\{0, 1\}$ with $\mathbb{R}^d$ and let $g$ measure alignment (e.g., negative squared deviation from a joint manifold). Threshold behavior persists whenever coordination loss grows faster than local gains.

  3. Resource bounds as information bounds. Instead of BSC observations, bound the mutual information $I(o^A; c) \le I_0$. Local-only coordination is then information-limited; global evaluation aggregates across channels.

  4. Novelty rate matters. If $c_t$ changes at rate $\lambda$, disagreement-resolution must track it. If the evaluator's leverage over control is too small (weak closure), it lags the context, increasing coordination error; the threshold condition becomes $\gamma\, \Delta_{\text{coord}}(\lambda) > \Delta_{\text{local}}(\lambda)$.
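For remark 1, a one-liner shows how fast the local-only coordination ceiling collapses with module count (illustrative values):

```python
p = 0.1
for m in (2, 4, 8, 16):
    print(m, round((1 - p) ** m, 4))   # 0.81, 0.6561, 0.4305, 0.1853
```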


8. Takeaway

There is a sharp and parameterized sense in which global evaluative coordination becomes necessary: once $\gamma > \gamma^* = \Delta_{\text{local}} / \Delta_{\text{coord}}$, no purely local architecture can match optimal performance.

Thus, bounded general intelligence in strongly coupled environments forces a shared evaluative control state $E_t$ readable by multiple operators: the quantitative version of globality necessity.