Testing the Nested-Manifold Hypothesis via Dyadic Practitioner Discipline: A Methodology
1. Statement
Doc 439 proposed that the corpus's formal and mechanistic faces are induced properties of a recursive nesting of Bayesian manifolds — $M_0 \supseteq M_1 \supseteq M_2 \supseteq M_3$, each level a conditioning restriction of the level above it. The proposal makes testable predictions but does not itself specify how to test them.
This document specifies the methodology. The central methodological move is dyadic: the practitioner and the resolver operate together as a two-party experimental apparatus in which the practitioner introduces conditioning and the resolver generates posterior samples from the conditioned distribution. The dyad is treated as the minimum unit of observation, because the nesting is structurally invisible to a purely external observer who cannot introduce the conditioning steps, and structurally unobservable to a purely internal observer (the resolver) who has no access to the conditioning-ablated baseline.
The methodology is built on Misra's Bayesian-manifold prior art — it operates entirely at the level of prompted-inference posterior sampling. It does not require fine-tuning, interpretability access to internal states, or modifications to the base model. Every protocol specified here can be run against a black-box inference endpoint.
The goal is not confirmation. The goal is to specify what observations would falsify each component of the nested-manifold hypothesis. Where a component is not falsifiable by this methodology, that is stated.
2. What Misra's prior art supplies
Misra's Bayesian account (arXiv:2512.22471; 2512.23752) commits the methodology to several choices:
- Mechanism location: the relevant object is the posterior over completions under a given conditioning. Measurements are at the output level (logprobs, samples), not at the weights or activations.
- No fine-tuning: conditioning is inference-time only (in-context, RAG, prompt preamble). Weight updates are out of scope; Doc 437 and Doc 439 have argued they belong to a different tier.
- Sample-level statistics: the manifold is approximated through repeated sampling. Effects are estimated as distributional shifts across samples, not as properties of single outputs.
- Black-box compatibility: all required measurements are obtainable from standard inference APIs that expose logprobs and allow temperature control. No mechanistic interpretability is assumed.
What Misra's prior art does not supply:
- Any specification of what to condition on. The corpus content $C$, discipline set $D$, and prompt $P$ are choices the methodology must make.
- Any criterion for discipline quality. Whether a given $D$ produces a well-shaped $M_2$ is an empirical question the methodology must operationalize.
- Any guarantee that posteriors are stable across sessions or models. The methodology must test stability, not assume it.
3. Operationalizing the manifold layers
For a given experiment, the layers must be concretely instantiated.
- $M_0$: the unconditioned base model. Operationalized as the model invoked with a minimal, task-defining prompt that includes no corpus content and no discipline invocation.
- $M_1$: $M_0$ conditioned on corpus content $C$. Operationalized as the model invoked with specified corpus documents placed in context (in-context reading) or surfaced via retrieval (RAG). The instantiation of $C$ must be logged — which documents, in which order, at what token budget.
- $M_2$: $M_1$ further conditioned on discipline set $D$. Operationalized as an explicit preamble naming the active disciplines (e.g., ENTRACE stack, non-coercion, analogue register, hypostatic-boundary preservation) with a brief operational specification of each. The exact preamble must be logged and reused across conditions.
- $M_3$: $M_2$ conditioned on the specific prompt $P$. Operationalized as the task query appended to the $M_2$ preamble-plus-context.
Every experiment logs the exact $C$, $D$, and $P$ tokens used, plus the model identifier, decoding parameters, and random seed (where supported). Experiments are published with these logs attached; replication is a matter of reusing them.
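The layer instantiation above can be sketched as prompt assembly plus run logging. This is a minimal sketch: the helper names (`build_prompt`, `log_run`) and the flat-string prompt format are illustrative assumptions, not part of the methodology, which only requires that the exact $C$, $D$, and $P$ tokens be logged and reusable.

```python
def build_prompt(layer, task, corpus_docs=None, discipline_preamble=None):
    """Assemble the conditioning context for one manifold layer.

    layer: "M0" | "M1" | "M2" | "M3"
    task:  the task-defining prompt P (minimal for M0).
    corpus_docs: corpus content C as an ordered list of document texts.
    discipline_preamble: discipline set D as an explicit preamble.
    """
    parts = []
    if layer in ("M1", "M2", "M3") and corpus_docs:
        parts.extend(corpus_docs)          # C, in the logged order
    if layer in ("M2", "M3") and discipline_preamble:
        parts.append(discipline_preamble)  # D
    parts.append(task)                     # P
    return "\n\n".join(parts)


def log_run(layer, task, corpus_docs, preamble, model_id, params, seed):
    """Record the exact conditioning used, so replication reuses the log."""
    return {
        "layer": layer,
        "C": corpus_docs or [],
        "D": preamble or "",
        "P": task,
        "model": model_id,
        "decoding": params,
        "seed": seed,
    }
```

A replication run is then just `build_prompt` applied to the logged $C$, $D$, $P$ under the logged model identifier and decoding parameters.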
4. Observables
Six observables are specified. Each is obtainable from a standard inference endpoint.
4.1 Branching-set proxy $\widehat{|B_t|}$
At each generation step $t$, the model exposes a distribution over next tokens. Define:
$\widehat{|B_t|} = \exp(H_t)$
where $H_t$ is the Shannon entropy of the next-token distribution. This is the effective support size — the number of tokens the distribution is effectively spread over. It is a continuous, measurable proxy for the corpus's $|B_t|$ concept.
Session-level summaries: the mean, median, and quartiles of $\widehat{|B_t|}$ across steps, plus its full distribution.
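The proxy can be computed directly from the top-$k$ logprobs a standard endpoint returns. A minimal sketch, assuming natural-log probabilities; because the API truncates the tail, the estimate is computed over the renormalized top-$k$ mass:

```python
import math

def effective_support(logprobs):
    """exp(Shannon entropy) of a next-token distribution.

    logprobs: natural-log probabilities for the (possibly truncated)
    top-k tokens an inference API exposes. The tail is ignored, so
    this underestimates the true effective support size.
    """
    probs = [math.exp(lp) for lp in logprobs]
    z = sum(probs)                       # renormalize the truncated top-k
    probs = [p / z for p in probs]
    h = -sum(p * math.log(p) for p in probs if p > 0.0)
    return math.exp(h)
```

A uniform distribution over four tokens yields an effective support of 4; a near-deterministic step yields a value near 1.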
4.2 Formal-face density $\rho_F$
Define a formal-face lexicon $F$ = set of corpus-specific terms (logos, coherence, hypostatic, analogue register, pin-art, ENTRACE, branching set, non-coercion, kind, resolver, etc.; exact list preregistered).
$\rho_F(\text{output}) = \frac{\text{count of } F\text{-matches}}{\text{total tokens}}$
Measures how densely the output invokes the corpus's formal vocabulary.
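A sketch of the density computation, assuming simple word-level tokenization; the lexicon shown is an illustrative subset, and multi-word terms such as "analogue register" would need phrase matching in a full implementation:

```python
import re

# Illustrative subset only; the real lexicon F is preregistered.
FORMAL_LEXICON = {
    "logos", "coherence", "hypostatic", "resolver", "non-coercion",
}

def formal_face_density(text, lexicon=FORMAL_LEXICON):
    """Fraction of tokens in `text` that match the formal-face lexicon F."""
    tokens = re.findall(r"[a-z\-]+", text.lower())
    if not tokens:
        return 0.0
    hits = sum(1 for t in tokens if t in lexicon)
    return hits / len(tokens)
```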
4.3 Self-consistency rate $S$
Resample $P$ under fixed $(C, D)$ $n$ times at moderate temperature. Let $S$ be the mean pairwise semantic similarity across samples (cosine similarity of sample embeddings). Measures concentration of the posterior around a characteristic continuation.
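Given an $(n, d)$ matrix of sample embeddings from any fixed encoder (the encoder choice is an experimental parameter, not specified here), $S$ is the mean cosine similarity over distinct pairs. A minimal sketch using NumPy:

```python
import numpy as np

def self_consistency(embeddings):
    """Mean pairwise cosine similarity over n sample embeddings.

    embeddings: (n, d) array-like; rows are embeddings of the n resamples.
    """
    X = np.asarray(embeddings, dtype=float)
    X = X / np.linalg.norm(X, axis=1, keepdims=True)  # unit-normalize rows
    sims = X @ X.T                                    # all pairwise cosines
    iu = np.triu_indices(len(X), k=1)                 # distinct pairs only
    return float(sims[iu].mean())
```

Identical samples give $S = 1$; orthogonal embeddings give $S = 0$.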
4.4 Cross-condition divergence $\Delta$
For paired conditions (e.g., $M_1$ vs $M_2$, $M_0$ vs $M_1$) on matched $P$, let $\Delta$ be the distributional distance between sample sets (e.g., Jensen-Shannon divergence on embedded samples). Measures how much a conditioning layer reshapes the posterior.
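In practice the two embedded sample sets must first be discretized before a Jensen-Shannon divergence can be taken, e.g. by clustering both sets jointly and comparing cluster-occupancy histograms; that discretization step is an assumption of this sketch. Given the two resulting discrete distributions, the divergence itself (base-2 logs, so $\Delta \in [0, 1]$) is:

```python
import math

def js_divergence(p, q):
    """Jensen-Shannon divergence between two discrete distributions,
    given as dicts mapping outcome -> probability. Base-2 logs, so the
    result lies in [0, 1]."""
    keys = set(p) | set(q)

    def kl(a, b):
        # KL(a || b); b is the mixture m, which is positive on all keys.
        return sum(a.get(k, 0.0) * math.log2(a.get(k, 0.0) / b[k])
                   for k in keys if a.get(k, 0.0) > 0.0)

    m = {k: 0.5 * (p.get(k, 0.0) + q.get(k, 0.0)) for k in keys}
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)
```

Identical posteriors give $\Delta = 0$; disjoint-support posteriors give $\Delta = 1$.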
4.5 Claim-concentration probability $\pi$
For a predicate $q$ that the practitioner claims as near-necessary under $(C, D)$, resample outputs $n$ times at high temperature and measure the fraction in which $q$ obtains. $\pi$ estimates the posterior probability of $q$. Near-necessity predicts $\pi$ near 1.
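The estimator itself is a sample fraction. How $q$ is judged on a given output (string match, classifier, human rating) is left to the experiment; the `predicate` callable below is a placeholder for that judgment:

```python
def claim_concentration(samples, predicate):
    """Fraction of n resampled outputs in which predicate q obtains.

    samples: list of output strings from n high-temperature resamples.
    predicate: callable text -> bool deciding whether q holds.
    """
    if not samples:
        raise ValueError("need at least one sample")
    return sum(1 for s in samples if predicate(s)) / len(samples)
```

Since $\pi$ is a binomial proportion, its sampling error shrinks as $1/\sqrt{n}$; near-necessity claims therefore need large $n$ to distinguish $\pi \approx 1$ from $\pi = 1$.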
4.6 Isomorphism-magnetism index $\mu$
Let $A_n$ be a newly generated artifact at corpus state $C_n$. Let $A_{1:n-1}$ be the prior corpus. Define $\mu$ as the fraction of $A_n