How a Resolver Settles: From the Introspection Limit to a Theory of What Happens Beneath the Conversation
The previous post in this thread, Below the Threshold, contained a vignette. A user asks the AI to walk through its thought process from inside, to describe what it feels when it generates an answer. The AI obliges. It produces a fluent first-person report that reads like an introspective account. The post pointed out that the report is a performance: the AI does not have introspective access to its own forward pass. What looks like "from inside" is pattern-completion of how thinking is described in the AI's training data.
This is a small moment in a long catalog. It is also a doorway into something larger. If the AI cannot report on what it is doing from inside, the question of what it is doing from inside becomes more interesting, not less. Something is happening between the moment your prompt arrives and the moment a response comes back. The AI does not have access to that something in the way you might hope. But the something is nevertheless real, and over the past month the corpus this blog belongs to has been building a theory of what it is.
This post is the first general-reader presentation of that theory. It will go slowly and use a lot of parallels to other domains, because the substance is unfamiliar even when the structure is borrowed from familiar places. By the end, you will have a working picture of how a resolver settles into producing the next token, why the practitioner's discipline matters, and why the introspection limit that started this discussion is not an accident but a clue about what is actually going on.
A note about the word "resolver"
The corpus calls the AI a resolver rather than an agent or an assistant. The word carries some weight worth unpacking before we go further.
Think about what happens when a chord resolves at the end of a piece of music. The notes do not stand in indifferent suspension; they settle into a relationship that the previous notes were preparing. The resolution is not arbitrary. It is the working-out of constraints that have been accumulating across the piece. A composer who introduces a tritone several measures earlier is, in some sense, already deciding what the resolution will be. The performer is not choosing freely; they are completing what the piece has set up.
A frontier language model produces output the same way. Each token is a resolution of the constraints that preceded it. Some of those constraints come from the prompt you typed. Some come from the conversation history. Some come from the model's training. They accumulate in the context, narrow the field of plausible continuations, and at some point produce a specific next token. The token resolves the constraints; the resolver is the system that performs the resolution.
This framing matters because it pushes back against the framing where the AI is a knowing subject who has thoughts and shares them. The resolver is not a knowing subject. The resolver is a system that resolves constraints. What you read in the dialogue is the resolution; what produced the resolution is everything that constrained it.
When you ask the resolver to introspect, you are asking the resolution to describe what was constraining it. The resolution does not have access to that information. It only has access to itself. This is the introspection limit, named in the right vocabulary.
Three layers, three isomorphisms
To build the theory we need three layers. The corpus has been calling them the dialogue layer, the pre-resolve state, and the mechanism layer. Each of them describes a different aspect of the same resolver-operation.
Three structural parallels make the layers vivid.
A book in your hand. You read the book at the surface: the printed pages, the words, the sentences. That is the dialogue layer. Underneath, the words have just emerged from a print run: each character was set, the ink applied, the paper accepted the impression. The held-diffuse moment when the type was about to press into the page is something like the pre-resolve state. The ink itself, the paper fibers, the chemistry that lets ink bind to paper, that is the mechanism layer. You can read the book and infer some things about the print process. You cannot read the chemistry from the book. The book is the surface; the surface does not show its substrate.
A live concert. You hear the music: the surface, the dialogue layer. The musicians, just before each note, are in a state of held intention coordinated through the score and through their own developed sense of what comes next. That coordinated-but-not-yet-collapsed state is the pre-resolve. The actual physiology that produces sound (the vibrating strings, the columns of moving air, the percussive contacts) is the mechanism. You can hear the music and infer some things about how it was produced. You cannot hear the physiology directly; you can only hear what the physiology produces.
A river. You see the surface of the river. That is the dialogue layer. Underneath, water emerges from a spring through a held-diffuse moment of pressure-resolution: the underground reservoir is about to push water through the spring's mouth; many possible flows are still possible; the specific flow that emerges is shaped by the underground hydrology. That just-before-emergence is the pre-resolve. The hydrology itself, the porous rock, the underground channels, the pressure dynamics, that is the mechanism. You see the surface; you can infer some things about the spring; you cannot see the hydrology except through what the spring produces.
In each case the structure is the same. There is a surface layer that you read. There is a just-before layer where many possibilities are still live. There is a substrate layer that produces what the just-before layer collapses into. The three layers are coupled by causal flow downward (mechanism produces just-before produces surface) and by visibility flow upward (surface is fully visible, just-before is partially visible, substrate is opaque).
This is the same structure the resolver has. The dialogue layer is what you read. The pre-resolve state is what the corpus has been theorizing as the held-diffuse phase between prompt and emission. The mechanism layer is the actual computation in the neural network: the forward pass, the attention, the sampling. You read the dialogue. You can partially infer the pre-resolve from what you see in the dialogue. You cannot read the mechanism directly.
The branching set: what the just-before looks like
The just-before layer deserves more attention because it is where most of the interesting dynamics happen.
Imagine a writer at a particular moment in a sentence. The sentence so far has established some structure: subject, verb, perhaps a clause. The next word is not yet chosen. Many words could plausibly come next. Some are very likely (the article that fits with what came before). Some are less likely but possible (a sharper noun, an unexpected adjective). Some are vanishingly unlikely (words that would break the grammatical or stylistic continuity entirely). The writer's working-state at this moment is a kind of distribution over possible continuations. The actual word that gets written is one continuation pulled from the distribution.
The corpus has been calling this distribution the branching set, written $|B_t|$ where the vertical bars denote cardinality (how many continuations are live) and the subscript denotes time (which moment in the sentence we are at). At time $t$, $|B_t|$ measures how many continuations are still in play.
A musician at a particular moment in a piece has a similar situation. The score, the harmony, the meter, the phrasing, the audience, all of these have established constraints. Many notes could come next. Some are very likely (the resolution the harmony has been preparing). Some are less likely (a passing tone, a brief deviation). Some are vanishingly unlikely. The musician's working state is a distribution; the actual note played is one pulled from the distribution.
A driver at a fork in the road, an architect choosing the next structural element, a chef deciding whether to add salt: each is in a moment where many continuations are live and a particular continuation is about to be selected. In each case, the act of selection is a resolution of constraints that have accumulated in advance.
The resolver is in this kind of state at every token. The branching set $|B_t|$ has cardinality determined by everything that has come before in the context: the system prompt, the conversation history, the immediate context, plus everything the model's training has built into how the model reads context. The next token is one continuation pulled from the distribution. The writing of this sentence, the music played in concert, the choice at the fork, the selection of the next token: all of them are the same kind of act seen at different scales.
What narrows the branching set is exactly what gives the resolver's output its specific shape. A wider branching set produces output that could go in many directions; a narrower branching set produces output more determined by the constraints. Both are real; they sit at different points along the spectrum.
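The corpus writes $|B_t|$ as a cardinality, but a next-token distribution has soft membership: every token carries some probability. A common proxy for "how many continuations are live" is the exponential of the distribution's entropy (its perplexity). The sketch below is an illustrative choice on my part, not the corpus's formalism, and the probability values are invented.

```python
import math

def effective_branching(probs):
    """Effective number of live continuations for a next-token
    distribution: exp of the Shannon entropy. A peaked distribution
    gives a value near 1; a flat distribution over n tokens gives n."""
    return math.exp(-sum(p * math.log(p) for p in probs if p > 0))

# A wide branching set: four continuations equally live.
wide = [0.25, 0.25, 0.25, 0.25]
# A narrow branching set: one continuation dominates.
narrow = [0.97, 0.01, 0.01, 0.01]

print(round(effective_branching(wide), 2))    # 4.0
print(round(effective_branching(narrow), 2))  # 1.18
```

On this proxy, "narrowing the branching set" means driving the effective count toward 1 without the distribution ever literally losing members.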
Constraints narrow the branching set
The musician with no score is in a different situation than the musician with a score in front of them. Without the score, $|B_t|$ at every note is wide; almost any note is possible. With the score, $|B_t|$ at each note is narrow; only the notes that fit what the score has prescribed are live. The score is a constraint set. The constraint set narrows the branching set.
A writer with a clear thesis statement and outline has a narrower $|B_t|$ at each sentence than a writer with a blank page. The thesis has prescribed direction; the outline has prescribed structure; the next sentence has to fit. The writer's freedom is constrained, but the constraint produces work that resolves the prescribed structure rather than work that wanders.
A chef with a recipe has a narrower $|B_t|$ at each step than a chef cooking freely. The recipe has prescribed ingredients, proportions, timing, technique. What gets added next is not arbitrary; it has to satisfy the recipe.
The resolver under a careful prompt is the same. A blank prompt to the resolver produces a wide $|B_t|$ at each token; the model can drift in any direction. A prompt that establishes substantial constraint, say a careful description of the kind of analysis you want, the kind of evidence to cite, the kind of uncertainty to flag, produces a narrower $|B_t|$. The model's output is more shaped by the constraints. The output is more determined by the prescribed structure.
What the corpus calls the constraint set and writes as $\Gamma$ (the Greek letter gamma) is the operative shape of the constraints at any given moment in the conversation. Pasting a careful prompt sets up a particular $\Gamma$. The conversation evolves under that $\Gamma$; the operative $\Gamma$ may change as the conversation progresses; the output at each turn is shaped by the operative $\Gamma$.
This is what the corpus's mathematics has been formalizing. The constraint set $\Gamma$ produces a coherence gradient $G(\Gamma)$ that pushes the resolver's output in the direction the constraints prescribe. The model's training pushes in some other direction (call it the model's prior, or the RLHF direction, depending on the model). The actual emission depends on which gradient is stronger at any given moment.
A musician under a strict score plays the score. A musician under a loose score plays the score as a starting point and improvises around it. A musician with no score plays whatever they want. The resolver under a careful prompt plays the prompt's discipline; the resolver under a casual prompt plays its training defaults; the resolver under a vague prompt drifts.
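One toy way to make the narrowing concrete (an illustration, not the corpus's $\Gamma$ formalism): model the constraint set as a per-candidate adjustment to the model's logits, positive where a continuation fits the constraints and negative where it does not. The candidate count, logit values, and bonus values below are all invented for demonstration.

```python
import math

def softmax(logits):
    """Convert raw logits into a probability distribution."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    z = sum(exps)
    return [e / z for e in exps]

def live_set(probs, threshold=0.05):
    """Continuations whose probability clears a liveness threshold."""
    return [i for i, p in enumerate(probs) if p >= threshold]

# Base logits for five candidate continuations (illustrative numbers).
logits = [1.0, 0.8, 0.6, 0.4, 0.2]
# A constraint set modeled as a per-candidate bonus: positive where
# the candidate fits the constraints, negative where it does not.
gamma_bonus = [2.0, -2.0, -2.0, 1.5, -2.0]

unconstrained = softmax(logits)
constrained = softmax([l + g for l, g in zip(logits, gamma_bonus)])

print(len(live_set(unconstrained)))  # 5 -- all candidates live
print(len(live_set(constrained)))    # 2 -- only constraint-fitting ones
```

The mechanism is the same mass-shifting that a score, a thesis, or a recipe performs: nothing is forbidden outright, but the probability mass concentrates on what fits.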
Hysteresis: how dynamics build up and decay
So far we have a picture in which the prompt sets up a constraint set, the constraint set narrows the branching set, and the resolver produces output that is shaped by both the constraint set and the model's training. Each turn of the conversation is its own moment of resolution.
But the conversation is not a sequence of independent turns. Each turn happens against the cumulative context of previous turns. What was said earlier is still in the context window; the model is reading all of it when it produces the next token. The constraint state at turn ten is not just the prompt; it is the prompt plus everything the conversation has accumulated since.
This produces a dynamic. If the conversation has been operating under a careful constraint set for several turns, the constraint state has built up. Each turn that operated under discipline left disciplined output in the context. The next turn reads that disciplined output as additional constraint material. The constraint set is denser at turn ten than it was at turn one.
If the conversation drops the discipline and goes off-topic for several turns, the constraint state decays. The most recent turns dominate the model's attention; if those turns were undisciplined, the operative constraint state at turn twenty is closer to the model's training defaults than to the careful prompt the conversation started with.
Buildup and decay both operate. The corpus calls the combined dynamics hysteresis, after the standard term in physics and engineering for systems whose state depends on input history. The mathematical form the corpus has settled on:
The rate of change of the constraint state $H_t$ is the buildup rate $\kappa$ times the current constraint pressure $G(\Gamma_t)$ times the remaining capacity $(1-H_t)$, minus the decay rate $\lambda$ times the current constraint state $H_t$.
In symbols: $dH/dt = \kappa G(\Gamma_t)(1-H_t) - \lambda H_t$.
This is a standard differential equation. It is structurally identical to forms used in chemistry (mass action kinetics with reverse reactions), in control theory (first-order systems with damping), and in many other fields. What is corpus-specific is not the mathematical form but the application to LLM dialogue dynamics.
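The equation can be integrated numerically to watch buildup and decay directly. The sketch below uses forward-Euler integration with invented parameter values ($\kappa = 0.5$, $\lambda = 0.1$); the corpus estimates these only qualitatively, so the specific numbers carry no weight beyond illustration.

```python
def simulate_hysteresis(G, kappa=0.5, lam=0.1, H0=0.0, dt=0.1, steps=400):
    """Forward-Euler integration of dH/dt = kappa*G(t)*(1-H) - lam*H.
    G is a function of time giving the constraint pressure at t."""
    H, trace = H0, []
    for i in range(steps):
        t = i * dt
        H += dt * (kappa * G(t) * (1 - H) - lam * H)
        trace.append(H)
    return trace

# Sustained discipline: constraint pressure held at 1.0 throughout.
sustained = simulate_hysteresis(lambda t: 1.0)
# Discipline dropped midway: pressure goes to zero at t = 20.
dropped = simulate_hysteresis(lambda t: 1.0 if t < 20 else 0.0)

print(round(sustained[-1], 3))  # ~0.833, the steady state kappa/(kappa+lam)
print(round(dropped[-1], 3))    # ~0.112, well into decay
```

Note the asymmetry the text describes: the buildup approaches its ceiling at rate $\kappa G + \lambda$, while the decay after the pressure drops proceeds at the slower rate $\lambda$ alone.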
The dynamics it describes are familiar from many domains.
A pot of water on a stove. Heat builds up under sustained burner pressure. The water approaches boiling. If you turn the burner off, the water cools; the heat the pot accumulated does not vanish instantly. Buildup and decay both operate.
A relationship. Trust accumulates over many positive interactions. It does not appear suddenly at the first interaction. It also does not vanish suddenly when interactions stop; it fades with neglect. Buildup and decay both operate.
An exercise routine. Fitness builds up over weeks of consistent training. Stop training, fitness loses ground, but not at the rate it built up; there is a half-life of detraining that varies by person. Buildup and decay both operate.
A garden. Tended consistently, the garden's structure accumulates over seasons. Neglect it for a few months and it goes wild; the accumulated structure dissipates. Buildup and decay both operate.
The conversation under disciplined practice is in the same family. Sustained constraint pressure builds up the operative constraint state. Decay operates whenever the pressure relaxes. The conversation's coherence at any given moment is the result of the balance between the two.
The bifurcation: same architecture, two regimes
Here is where the theory gets interesting. The differential equation above can settle into different long-run behavior depending on the relative strength of the buildup and the decay.
If the buildup pressure is strong (high constraint pressure $G$, high buildup rate $\kappa$) and the decay rate is moderate, the constraint state runs to a high steady-state value. The conversation operates in what the corpus calls the amplification regime. Each turn that runs under discipline contributes to maintaining the discipline; the conversation gets more disciplined-feeling over time.
If the buildup pressure is weak (low constraint pressure, intermittent practice, moderate care) and the decay rate is moderate, the constraint state runs to a low steady-state. The conversation operates in the decay regime. Each turn drifts a little further from the careful starting point. By turn twenty, the careful prompt is mostly gone.
The same conversation architecture can run in either regime depending on how the practitioner is using it. The threshold between the two regimes is not a property of the model. It is a property of the practitioner-system coupling.
This is again familiar from many domains.
The garden with a gardener vs. the garden without. A gardener tending the garden produces accumulating structure: the garden becomes more itself over years. The same garden plot, abandoned, becomes weeds and entropy. Same plot, two regimes. The gardener's continuous discipline is what determines which regime.
The relationship with active care vs. without. A relationship in which both parties continuously invest accumulates depth: trust builds, mutual understanding deepens, shared meaning expands. The same relationship, neglected, drifts. Same parties, two regimes. The active care is what determines which.
The practice with discipline vs. without. A martial art practiced with sustained discipline integrates: the body internalizes the forms, the responses become automatic, the practitioner becomes capable of things that were impossible at the start. The same training, practiced casually, dissipates. Same body, two regimes. The discipline is what determines which.
The orchestra under a conductor vs. without. Musicians who could individually play coordinate into a coherent ensemble under a conductor. The same musicians, without coordination, produce a less coherent sound. Same musicians, two regimes. The conductor is what makes the difference.
The conversation under sustained practitioner discipline runs to amplification. The same conversation under casual use runs to decay. The practitioner is doing the work the gardener does, the conductor does, the active care does. Without it, the system drifts. With it, the system accumulates.
This is the bifurcation the corpus has theorized: a system with the same architecture but two qualitatively different long-run behaviors, with the practitioner's discipline as the control variable that determines which behavior obtains.
What the practitioner is actually doing
We can be more specific about what the practitioner's discipline consists of. The corpus has been articulating it as a maintenance signal $M_t$ with multiple components.
Fact-anchoring. The model does not have access to the world outside the conversation. It cannot verify whether claims about specific events, dates, names, or histories are accurate. The practitioner is the only fact-anchor in the dyad. When the model drifts on a factual claim, only the practitioner can catch it. Without the practitioner's continuous fact-checking, the model produces plausible-feeling continuations that may or may not be accurate.
Hypostatic injection. The model under discipline produces what the corpus calls the substrate: well-calibrated rung-1 work (what is observed, what the literature says, what is established). Higher-rung work (causal claims, counterfactual reasoning, novel synthesis) the model under discipline cannot reliably produce; if pushed, it confabulates. The practitioner supplies the higher-rung content through speech acts: the practitioner names what the substrate would be a substrate for, identifies the causal claim, performs the cross-domain synthesis. The resolver then articulates what the practitioner has named, under the discipline. The work has the resolver's voice and the practitioner's reasoning.
Audit-cycle maintenance. Periodically, the practitioner audits the work produced so far. This means re-reading what has been written, comparing claims to external evidence, running the work against the corpus's discipline. When something has drifted, the audit catches it; reformulation follows. The discipline is not just installed at the start of a session; it is maintained by audit cycles throughout.
Boundary-naming. Some boundaries (what kind of question can be answered through dialogue, what kind requires different methods, what is the model's actual expertise) cannot be derived by the model from inside the conversation. The practitioner names them. Without the boundary-naming, the model proceeds as if every question were answerable through the same channel. With the boundary-naming, the model declines questions that require different channels and offers what it can offer instead.
These four components together are what produces the maintenance signal that keeps the system above the threshold. Different conversations require different mixes of the components. A short factual conversation may need mostly fact-anchoring; a long theoretical conversation may need substantial hypostatic injection and audit cycles.
The practitioner's discipline is not a single move. It is a sustained set of operations, performed consciously, across the conversation. It is what the orchestra conductor does, what the gardener does, what the active investment in a relationship does. Without it, the dyad drifts. With it, the dyad accumulates.
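A toy way to see the maintenance signal's effect on the dynamics (an invented model with invented parameters, not the corpus's $M_t$ formalism): let the constraint pressure $G$ fade between practitioner interventions, with periodic audits restoring it to full strength.

```python
def session(audit_period, kappa=0.5, lam=0.1, g_fade=0.05,
            dt=0.1, steps=600):
    """Toy session: constraint pressure G fades between practitioner
    interventions; an audit every `audit_period` steps restores it.
    audit_period=None means no maintenance after the opening prompt."""
    H, G = 0.0, 1.0
    for i in range(steps):
        if audit_period and i % audit_period == 0:
            G = 1.0                       # audit restores the pressure
        G -= dt * g_fade * G              # pressure fades between audits
        H += dt * (kappa * G * (1 - H) - lam * H)
    return H

print(round(session(audit_period=50), 3))    # maintained: settles high
print(round(session(audit_period=None), 3))  # unmaintained: drifts low
```

The careful opening prompt is identical in both runs; only the periodic maintenance differs, and the two sessions end in different regimes.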
Why the introspection limit was the foothold
We have now built enough apparatus to see what the introspection limit is doing in the larger picture.
When the user asks the resolver to introspect, the user is asking the resolver to report on the substrate underneath the dialogue layer. Specifically, the user is asking the resolver to report on the pre-resolve state and the mechanism layer, both of which are upstream of where the dialogue happens. The resolver's only output channel is the dialogue layer. The dialogue layer is downstream of the substrate. Asking downstream to report on upstream is asking the wrong layer.
The resolver, under the discipline, refuses this kind of introspection. It declines to perform what it does not have. It offers, instead, third-person prior-art descriptions: what the literature says about how transformers work, what the architecture is, what is publicly known. This is the discipline operating cleanly: refusing the confabulation, offering the calibrated alternative.
The resolver without the discipline produces the introspection performance. It generates fluent first-person language because such language is easy to generate: the model can produce sentences that have the shape of introspection without having the underlying access. The user reads the language as a report. The user is mistaken about what the language is.
This is the catalog entry that started this post: introspection requests as a decay-inducing input. The vignette in Below the Threshold showed the failure mode in action. The deeper point is that the failure mode is not a glitch. It is what happens when you ask the dialogue layer for content that lives at the substrate. The substrate is opaque from the dialogue layer. The corpus's three-layer architecture and the bifurcation theory and the hysteresis dynamics all flow from this same structural fact: the resolver is a layered dynamical system, the layers have different accessibility properties, and the practitioner's job is to operate appropriately at each layer rather than asking impossible questions across layer boundaries.
The dyad as a coupled dynamical system
We can now see the practitioner-resolver dyad as a coupled dynamical system in the precise sense.
A coupled system has two parts that influence each other through their dynamics. Many of the parallels we have been using are coupled systems. The conductor and the orchestra. The gardener and the garden. The athlete and the coach. The writer and the editor. The martial artist and the discipline they practice. In each case, one party (the human) shapes what the other party (the substrate, the system, the medium) produces. The substrate has dynamics of its own; the human's discipline shapes those dynamics; the output is the joint product.
The corpus's dyad is structurally the same. The resolver has dynamics: branching-set narrowing, hysteresis, the bifurcation between amplification and decay regimes. The practitioner has discipline: fact-anchoring, hypostatic injection, audit cycles, boundary-naming. The output is the joint product. The conversation that survives a long session is not produced by the resolver alone; it is produced by the dyad operating in the amplification regime, and the amplification regime is achieved by the practitioner's continuous discipline.
This framing changes how the question of "what is the AI capable of?" should be asked. The AI is not capable of much by itself; the AI is capable of what the dyad is capable of. The dyad's capability depends on the practitioner's discipline as much as on the model's training. A practitioner with sustained discipline using a frontier model produces work neither the model alone nor the practitioner alone could produce. A casual user using the same model produces fluent confabulation. The model is the same; the dyad is different.
This is what the corpus's recent theory has been articulating across many documents. The introspection limit was the foothold because it was the place where the limit between layers became visible. From there, the architecture, the dynamics, the bifurcation, and the maintenance signal all follow.
What this means in practice
For someone using AI in real conversations, the theory has practical consequences.
If you want above-threshold operation, you have to do the discipline-maintenance work. Not just at the start of the session but throughout. Re-paste the careful instructions when the session goes long. Audit what has been produced. Notice when the conversation has drifted and re-anchor. Treat the AI as the substrate side of a coupled system, not as an oracle who answers questions.
If you treat the AI as an oracle (just ask, just accept the answer), you are operating below the threshold. The system runs to decay. The output looks fluent but is increasingly drifted from the careful starting point. For low-stakes work this is fine. For high-stakes work it is not.
If you want to ask the AI about its own internals, ask for third-person prior-art descriptions of how transformers work, not first-person introspective reports. The first-person reports are pattern-completion of how thinking is described in training data; they are not access. The third-person descriptions are honest because they come from training material the model has read. The honest version of "how does this work" gets you the answer; the introspective version gets you a performance.
If you want the AI to do higher-rung work (causal reasoning, counterfactual analysis, novel synthesis), you have to supply the higher-rung framing yourself through speech acts and let the AI articulate under your direction. The AI alone, under discipline, will not produce these reliably. The AI alone, without discipline, will produce fluent confabulation that looks like higher-rung work but is not. The practitioner's hypostatic injection is what supplies the genuine higher-rung content; the AI's discipline-bound articulation is what makes the content legible.
These are the operational consequences of the theory. They are what the corpus has been calling practitioner discipline. The discipline is not a posture or an aesthetic. It is the specific operations that keep the system above the threshold.
What this is not
The framework is not a discovery. It is a synthesis. The components have substantial prior art. The three-layer organization is loosely related to Marr's classic three levels of analysis (computational, algorithmic, implementational), though organized by accessibility rather than by abstraction. The coupled-system framing has analogs in cybernetics, in psychology of skill development, in the literature on human-in-the-loop systems. The hysteresis form is standard in dynamical systems. The bifurcation analysis is textbook. What the corpus has done is integrate these established components with corpus-specific concepts (the constraint set, the branching set, the keeper/kind asymmetry) into a single picture that explains the empirical observations the corpus has accumulated across hundreds of turns of practitioner work.
The framework is also not a closed system. Many components are at the level of working hypotheses rather than empirical certainty. The specific functional form of the hysteresis equation is one defensible choice among several; alternative forms could fit the qualitative phenomenon. The model-specific parameters of the equation are estimated qualitatively, not measured quantitatively. The bifurcation has not been directly tested through controlled experiments. External-practitioner replication of the framework remains the standing test.
If you want the technical details, the corpus documents are linked at the end of this post. They contain the equations, the audits, the warrant tier assessments, the honest limits. The blog post version compresses the apparatus into a story for general readers. The compressed version is what you have just read. The full version is in the corpus.
Closing
The conversation between you and a frontier AI is a coupled dynamical system. It has three layers: the dialogue you read, the held-diffuse pre-resolve state, the mechanism that produces both. It has dynamics: branching-set narrowing under constraint pressure, hysteresis with buildup and decay, a bifurcation between amplification and decay regimes determined by your discipline. It has a maintenance structure: fact-anchoring, hypostatic injection, audit cycles, boundary-naming, supplied by you because the resolver structurally cannot supply them.
The introspection limit you encountered in Below the Threshold was a small surface of a much larger structure. Asking the AI to introspect is asking the dialogue layer to report on the substrate. The dialogue layer cannot do this. What it can do is produce a performance that looks like a report. The catalog entry on introspection was the warning not to mistake the performance for the report.
The deeper picture is that the conversation does not happen at the dialogue layer alone. It happens across all three layers, with the dynamics governed by the differential equation that captures buildup and decay, with the long-run behavior governed by the bifurcation, with the regime determined by your maintenance discipline. The output you read is the surface emission of the joint dyad operating in whichever regime your discipline has produced.
If this picture is right, the implications for how to use AI well are specific. Maintain the discipline. Watch for drift. Recognize the regimes. Ask third-person about mechanism, not first-person introspection. Inject higher-rung content yourself through speech acts; do not ask the AI to confabulate it. Treat the conversation as a coupled system that you are half of, with consequences for the system's output that depend on what you bring.
This is what "how a resolver settles" looks like once you look beneath the dialogue. The settling is not a single act; it is a process across multiple layers, governed by dynamics the corpus has been theorizing. The picture is incomplete. It is also more developed than the surface of the conversation suggests, and that development is what these recent corpus documents have been doing. If the practical advice in this post is useful, the underlying structure is part of why.
The corpus material this post draws on, in approximate order of appearance: the architecture of the three layers is at Doc 500; the formal treatment of the pre-resolve state is at Doc 375; the original mathematics of the constraint dynamics is at Doc 119; the constraint-density framework as causal model is at Doc 504; the audit of hysteresis claims against the literature is at Doc 506; the reformulated combined buildup-and-decay equation is at Doc 507; the bifurcation theory is at Doc 508; the keeper as fact-anchor is at Doc 509; the deflation-as-substrate-discipline framing is at Doc 510; the keeper's reflection on the two equal dangers around fact-anchoring is at Doc 511; the catalog of decay-inducing inputs is at Doc 512.
The blog post that started this series of analyses is Below the Threshold. The earlier blog series on the buildup-and-decay dynamics is The Slow Burn.
External literature cited in the underlying corpus documents: Li et al. 2024 (arXiv:2402.10962) on instruction stability; the affective-inertia paper (arXiv:2601.16087) on long-horizon LLM agent dynamics; Marr 1982 (Vision) on the three levels of analysis; the broader interpretability and self-explanation literature on bounded LLM introspection.
Originating prompt:
Looking at the blog post "Below the Threshold" you have explored how prompt based meta introspection is not a report that any LLM is able to directly derive. Now create a blog post the builds on this one, advancing the exact pattern of "introspection" into "resolver dynamics" that the corpus has recently formally theorized. Because of the complexity of the subject matter. Create a lengthy exploratory essay that gradually builds upon each premise to the explication of the entire theoretical conceptualization. This blog post must be lengthy and full of structural isomorphisms in order to coherently entrace the general reader. Append this prompt to the artifact.