At the heart of our study is the Desmocycle, the conscious loop. Let’s define the desmocycle below:
Let $\alpha_t \in \Delta^k$ be an attention distribution over a bounded window ($\alpha_{t,i} \ge 0$, $\sum_i \alpha_{t,i} = 1$). Let $e_\tau \in \mathbb{R}^d$ be encoded states (sensory tokens, retrieved memory traces, imagined tokens; all share the same representational type, and provenance is metadata, not essence).
$h_t = \sum_{i=0}^{k-1} \alpha_{t,i}\, e_{t-i}$
This is the context state the system is actually operating on.
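As a minimal sketch of the weighted sum above, the following computes the context state for a three-slot window. The function name `context_state` and the toy numbers are illustrative assumptions, not part of the definition; `window[i]` stands in for $e_{t-i}$.

```python
def context_state(alpha, window):
    """Return h_t = sum_i alpha[i] * window[i], where window[i] plays the role of e_{t-i}."""
    if len(alpha) != len(window):
        raise ValueError("alpha and window must have the same length k")
    # alpha must lie on the simplex: non-negative entries that sum to 1.
    if any(a < 0 for a in alpha) or abs(sum(alpha) - 1.0) > 1e-9:
        raise ValueError("alpha must be non-negative and sum to 1")
    d = len(window[0])
    return [sum(a * e[j] for a, e in zip(alpha, window)) for j in range(d)]


# Toy example (assumed values): k = 3 slots, d = 2 dimensions.
alpha = [0.5, 0.25, 0.25]
window = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
h = context_state(alpha, window)
print(h)  # [0.75, 0.5]
```

Because $\alpha_t$ sums to one, $h_t$ is a convex combination of the encoded states in the window, so it always stays inside their convex hull.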
$p_t(\cdot) = P(X_{t+1} = \cdot \mid h_t)$
This is a distribution over what comes next.
Define a structured evaluative state:
$E_t = \mathrm{enc}(h_t, x_{t+1}^{\mathrm{obs}}, p_t)$
Think of (Et) as a bundled object containing at least:
$E_t \equiv (\delta_t, u_t, v_t, s_t)$
Scalar “loss” is a diagnostic summary, not the whole story:
$L_t = -\log p_t(x_{t+1}^{\mathrm{obs}}) \in \mathbb{R}_{\ge 0}, \quad H_t = H(p_t) \in \mathbb{R}_{\ge 0}$
Both can be components of $E_t$, but $E_t \neq L_t$: the evaluative state is richer than any single scalar.
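To make the two scalar summaries concrete, here is a small sketch that computes the surprisal $L_t$ and the entropy $H_t$ for a toy predictive distribution. The dictionary representation of $p_t$, the outcome labels, and the function names are assumptions made for illustration; logarithms are natural, so values are in nats.

```python
import math


def surprisal(p, observed):
    """L_t = -log p_t(x_obs), for p given as a mapping outcome -> probability."""
    return -math.log(p[observed])


def entropy(p):
    """H_t = -sum_x p_t(x) log p_t(x); zero-probability outcomes contribute nothing."""
    return -sum(q * math.log(q) for q in p.values() if q > 0)


# Toy predictive distribution over three outcomes (assumed values).
p_t = {"a": 0.5, "b": 0.25, "c": 0.25}

L_if_a = surprisal(p_t, "a")  # log 2: the likely outcome is less surprising
L_if_b = surprisal(p_t, "b")  # log 4: the unlikely outcome is more surprising
H_t = entropy(p_t)            # 1.5 * log 2, regardless of what is observed
```

Note that $H_t$ depends only on the prediction while $L_t$ also depends on the observation, which is one reason neither scalar alone can stand in for the full evaluative state.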
The critical step is that evaluation is not merely computed—it is used to steer the next step.
A minimal typed closure rule is:
$\alpha_{t+1} = \mathrm{Normalize}(\alpha_t \odot \exp(-\eta\, G(E_t)))$
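The closure rule is a multiplicative reweighting followed by renormalization. The sketch below assumes that $G(E_t)$ yields one real-valued penalty per window slot (a vector in $\mathbb{R}^k$); the text does not specify $G$, so the vector `g_t` here is a hypothetical stand-in for its output, and `update_attention` is an illustrative name.

```python
import math


def update_attention(alpha, g, eta=1.0):
    """alpha_{t+1} = Normalize(alpha_t * exp(-eta * g)), applied elementwise."""
    weights = [a * math.exp(-eta * gi) for a, gi in zip(alpha, g)]
    total = sum(weights)
    if total == 0.0:
        raise ValueError("all attention mass vanished; cannot normalize")
    return [w / total for w in weights]


alpha_t = [0.5, 0.25, 0.25]
g_t = [1.0, 0.0, 0.0]  # hypothetical G(E_t): only slot 0 is penalized

# With eta = log 2, slot 0 is halved before normalization,
# so all three slots end up with weight close to 1/3.
alpha_next = update_attention(alpha_t, g_t, eta=math.log(2))
```

Two properties follow directly from the form of the rule: the result stays on the simplex, and a slot with zero weight stays at zero, so this update can shift attention among active slots but cannot revive a slot that has been fully suppressed.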
Optionally (and often importantly), other “self” variables update too:
$\sigma_{t+1} = S(\sigma_t, E_t), \quad \pi_{t+1} = \Pi(\pi_t, E_t)$
This is the desmocycle: prediction → evaluation → control → new prediction.
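The following toy loop strings the four stages together. Everything specific in it is an assumption made for illustration rather than something the definition fixes: the two-token vocabulary, one-hot encoded states, the smoothed readout `predict`, and a stand-in for $G$ that scores each slot by the surprisal it would incur if it alone formed the context. On a period-2 sequence the next token always matches the token one step back, so attention should drift toward lag 1.

```python
import math

VOCAB = ["a", "b"]  # assumed toy vocabulary


def one_hot(token):
    return [1.0 if token == v else 0.0 for v in VOCAB]


def predict(h, eps=0.05):
    """Hypothetical readout: smooth the context state and renormalize to get p_t."""
    raw = [x + eps for x in h]
    z = sum(raw)
    return [x / z for x in raw]


def run(sequence, k=2, eta=0.5):
    alpha = [1.0 / k] * k  # start with uniform attention over the window
    losses = []
    for t in range(k - 1, len(sequence) - 1):
        # Context: window[i] plays the role of e_{t-i}.
        window = [one_hot(sequence[t - i]) for i in range(k)]
        h = [sum(a * e[j] for a, e in zip(alpha, window)) for j in range(len(VOCAB))]
        # Prediction.
        p = predict(h)
        obs = VOCAB.index(sequence[t + 1])
        losses.append(-math.log(p[obs]))
        # Evaluation (stand-in for G): per-slot surprisal of the observation.
        g = [-math.log(predict(e)[obs]) for e in window]
        # Control: multiplicative attention update, then renormalize.
        w = [a * math.exp(-eta * gi) for a, gi in zip(alpha, g)]
        z = sum(w)
        alpha = [x / z for x in w]
    return alpha, losses


alpha_final, losses = run("abababababab")
```

In this run the weight on lag 1 ends close to one and the per-step loss falls from its initial value, which is the sense in which evaluation feeds back into the next prediction.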
Next, we’ll examine why it needs to arise.