A Formalism for Pulverization: Targets, Tiers, Warrant
Preliminaries
The pulverization method (Doc 435) has been run against architectural styles (Docs 428–433), against a confabulated term expansion (Doc 444), and implicitly against methodological proposals (Doc 437 ff.). Each use has produced conclusions of varying epistemic strength. Doc 444 identified the underlying structural gap: "external test" is not one thing, and the pulverization as practiced has conflated plausibility-testing with truth-testing. This document formalizes the distinction and derives warrant rules that make the conflation impossible to repeat without noticing.
Notation is used to make the tiers sharp. It is not used to make the document formal in the strong mathematical sense — there are no theorems to prove. The notation exists so that future pulverizations can be labeled unambiguously with the tier they operated at.
The objects
Target $T$. The object under examination. Targets decompose into types (§"Target typology").
Prior-art corpus $P$. The body of published literature, artifacts, and prior corpus documents against which $T$ is evaluated. $P$ is specified explicitly for each pulverization; the method is $P$-relative.
Usage corpus $U$. The set of contexts, inputs, and behaviors in which $T$ is observed to function. Relevant only at operational-match tier.
Independent procedure $Q$. An external verification procedure — empirical test, expert consensus, formal proof, independent replication. Relevant only at truth tier.
Target typology
Pulverizations target qualitatively different kinds of objects. The tier required for a warranted conclusion depends on the type.
Specification-target $T_S$. A proposed construction: architectural style, constraint set, methodology, protocol. The question is novelty — has this been constructed before?
Definitional-target $T_D$. A proposed gloss, acronym expansion, term definition. The question is fidelity — does this definition match what the term denotes?
Predictive-target $T_P$. A claim about what a system will do under specified conditions. The question is correctness — does reality bear the claim out?
Bridge-target $T_B$. An asserted correspondence between two frames. The question is structural soundness — does the mapping actually hold?
Methodological-target $T_M$. A proposed procedure for producing or testing claims. The question is fitness — does the procedure yield claims whose warrant survives audit?
A given artifact may contain targets of multiple types. Each target should be classified before pulverization.
The three tiers
Plausibility tier $\pi$
$\pi(T, P) \in [0, 1]$: the extent to which $T$ composes from vocabulary, structure, and methods present in $P$.
$\pi(T, P) \approx 1$: fully subsumed. Every constitutive element of $T$ has a published analogue in $P$.
$0 < \pi(T, P) < 1$: partially subsumed. Some elements have analogues; some do not. The un-subsumed elements bound $T$'s potential novelty.
$\pi(T, P) \approx 0$: irreducible under $\pi$. $T$ cannot be constructed from $P$'s elements.
$\pi$ is the tier Doc 435's method operates at. It is cheap and fast: a literature scan and a compositional check. It requires no execution of $T$, no empirical test, no independent verification.
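As a concreteness aid, a minimal sketch of the $\pi$ computation, assuming $T$ has already been decomposed into constitutive elements and set membership stands in for the analogue-finding judgment (which in practice is the literature scan itself). All element names are hypothetical.

```python
def pi_tier(target_elements: set[str], prior_art: set[str]) -> float:
    """Fraction of T's constitutive elements with a published analogue in P.

    Exact set membership is an assumed stand-in for the analogue judgment;
    the formalism leaves the matching criterion domain-specific.
    """
    if not target_elements:
        raise ValueError("T must decompose into at least one element")
    return len(target_elements & prior_art) / len(target_elements)

# Hypothetical example: three of four elements have analogues in P,
# so T is partially subsumed with pi = 0.75.
P = {"layered-system", "cache", "uniform-interface", "code-on-demand"}
T = {"layered-system", "cache", "uniform-interface", "element-x"}
assert pi_tier(T, P) == 0.75
```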
Operational-match tier $\mu$
$\mu(T, P, U) \in [0, 1]$: the extent to which $T$'s operational behavior — its inputs, outputs, effects, failure modes — matches items in $P$ when observed across $U$.
$\mu \approx 1$: strong match. $T$ behaves like some item in $P$ across $U$. $T$ is an instance of that prior-art category.
$0 < \mu < 1$: weak match. Some behaviors align; others diverge. The divergences characterize what $T$ contributes beyond $P$.
$\mu \approx 0$: operationally novel. $T$'s behavior is dissimilar to all items in $P$.
$\mu$ requires a usage corpus $U$. For a specification-target, $U$ is the set of systems built using $T$. For a definitional-target, $U$ is the set of corpus passages in which the term appears. For a bridge-target, $U$ is the set of cases the bridge is supposed to cover.
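A matching sketch for $\mu$, under the assumption that behavior over $U$ can be summarized as a set of observed behaviors per artifact; the Jaccard overlap against the best-matching $P$-item is an illustrative stand-in for whatever domain-specific behavioral metric a real run would need.

```python
def mu_tier(t_behaviors: set[str],
            p_behaviors_by_item: dict[str, set[str]]) -> tuple[float, str | None]:
    """Best behavioral overlap between T and any single P-item, observed over U.

    Returns the score and the name of the best-matching prior-art item, so a
    strong match also identifies which category T is an instance of.
    """
    def jaccard(a: set[str], b: set[str]) -> float:
        return len(a & b) / len(a | b) if (a | b) else 0.0

    if not p_behaviors_by_item:
        return 0.0, None
    best_item, best_score = max(
        ((name, jaccard(t_behaviors, behaviors))
         for name, behaviors in p_behaviors_by_item.items()),
        key=lambda pair: pair[1],
    )
    return best_score, best_item
```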
Truth tier $\theta$
$\theta(T, Q) \in [0, 1]$: the extent to which $T$'s first-order claims agree with an independent procedure $Q$.
$\theta \approx 1$: verified. $Q$'s output and $T$'s claims coincide at the relevant level of precision.
$0 < \theta < 1$: partially verified. Some claims match, some don't. The mismatches are the falsified parts.
$\theta \approx 0$: falsified. $Q$'s output contradicts $T$'s claims.
$\theta$ requires that $Q$ exist and be accessible. For predictive-targets in empirical domains, $Q$ may be experiment. For definitional-targets, $Q$ is the authoritative definer — for corpus-internal terms, the keeper; for technical terms, the canonical publication. For bridge-targets, $Q$ may involve formal proof or case-by-case domain expert audit.
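All three tiers report values in $[0, 1]$, while the warrant rules below consume discrete outcomes, so a thresholding step is implied. A sketch, with the width of the bands around the endpoints as an assumed convention the formalism itself does not fix (see §"Limitations of the formalism"):

```python
# Discrete outcome labels per tier, matching the definitions above.
OUTCOMES = {
    "pi":    ("irreducible", "partially subsumed", "fully subsumed"),
    "mu":    ("operationally novel", "weak match", "strong match"),
    "theta": ("falsified", "partially verified", "verified"),
}

def outcome(tier: str, score: float, eps: float = 0.05) -> str:
    """Map a tier score in [0, 1] to the discrete outcome the warrant table uses."""
    if not 0.0 <= score <= 1.0:
        raise ValueError("tier scores live in [0, 1]")
    low, mid, high = OUTCOMES[tier]
    if score <= eps:
        return low
    if score >= 1.0 - eps:
        return high
    return mid
```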
Relations between tiers
Full plausibility subsumption does not entail operational match. Every constitutive element of $T$ can have a published analogue in $P$ while the composed whole behaves unlike any $P$-item. This is the gap Doc 444 identified concretely: "Sustained-Inference Probabilistic Execution" is subsumable at the $\pi$ tier (Doc 444 §"Word-level pulverization"), but it has never been tested at the $\mu$ tier against the corpus's actual usage of SIPE.
Strong operational match does not entail truth. $T$ can behave exactly like some well-characterized $P$-item in $U$ while $T$'s first-order claims are false: $T$ may be a new name for an existing thing whose specific truth-claims happen to be wrong.
These relations are strict. The converses also fail in general ($\theta \approx 1 \not\Rightarrow \mu \approx 1$, etc.), but the forward failures are the methodologically dangerous ones because the cheap tiers are typically run first, and their conclusions are easily mistaken for conclusions at the expensive tiers.
Warrant rules
A pulverization at tier $\tau$ with outcome $o$ on target of type $\sigma$ licenses a specific conclusion. The rules below are the minimum; stronger conclusions require higher tiers.
| Target type $\sigma$ | Tier $\tau$ | Outcome $o$ | Licensed conclusion |
|---|---|---|---|
| $T_S$ (specification) | $\pi$ | fully subsumed | Not novel relative to $P$; cite prior art |
| $T_S$ | $\pi$ | partially subsumed | Novel in un-subsumed elements only; document those |
| $T_S$ | $\pi$ | irreducible | Candidate novelty; specification stands pending operational and truth-tier audit |
| $T_S$ | $\mu$ | strong match | $T$ is operationally an instance of the matching $P$-item; novelty claim weakens |
| $T_D$ (definitional) | $\pi$ | fully subsumed | Semantically plausible; truth untested — not sufficient for promotion |
| $T_D$ | $\mu$ | strong match | Definition consonant with usage; still requires $Q$ for authoritative ratification |
| $T_D$ | $\theta$ | verified | Definition ratified by keeper or canonical source; promote to corpus |
| $T_P$ (predictive) | $\pi$ | irrelevant | Plausibility says nothing about predictive correctness |
| $T_M$ (methodological) | $\theta$ | verified | Methodology yields claims that survive audit; promote |
The table's core asymmetry: definitional, predictive, bridge, and methodological targets require $\theta$ for promotion. Only specification targets can rest on $\pi$ alone, and even then only to establish non-novelty. Establishing novelty of a specification requires $\mu$ or higher — a plausibility-irreducible specification is still a candidate, not a confirmed novelty.
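Operationally, the table is a partial function from $(\sigma, \tau, o)$ triples to conclusions. A sketch encoding only the rows above, so that any unlisted triple licenses nothing:

```python
# The warrant table as a partial lookup. Absent triples license no conclusion,
# which is the point: a cheap tier cannot silently be read as an expensive one.
WARRANT = {
    ("T_S", "pi", "fully subsumed"):     "not novel relative to P; cite prior art",
    ("T_S", "pi", "partially subsumed"): "novel in un-subsumed elements only",
    ("T_S", "pi", "irreducible"):        "candidate novelty pending mu/theta audit",
    ("T_S", "mu", "strong match"):       "operationally an instance of the matching P-item",
    ("T_D", "pi", "fully subsumed"):     "semantically plausible; truth untested",
    ("T_D", "mu", "strong match"):       "consonant with usage; requires Q to ratify",
    ("T_D", "theta", "verified"):        "ratified; promote to corpus",
    ("T_P", "pi", "irrelevant"):         "plausibility says nothing about correctness",
    ("T_M", "theta", "verified"):        "yields claims that survive audit; promote",
}

def licensed(sigma: str, tau: str, o: str) -> str | None:
    """Return the licensed conclusion, or None: no listed row, no conclusion."""
    return WARRANT.get((sigma, tau, o))
```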
Decision procedure
Given target $T$:
1. Classify. Assign $T$ a type in $\{T_S, T_D, T_P, T_B, T_M\}$.
2. Specify $P$. List the prior-art corpora in scope. Different regions of $P$ (e.g., architectural-styles literature, probabilistic-programming literature, epistemology) may apply to different portions of $T$.
3. Run $\pi$. Execute the plausibility pulverization. Record the outcome.
4. Check the warrant table. Determine what the $\pi$-outcome licenses for $T$'s type.
5. Decide on $\mu$. If the target type requires operational match for the desired claim, specify $U$ and run $\mu$.
6. Decide on $\theta$. If the target type requires truth-verification, specify $Q$ and run $\theta$.
7. Assign status. Based on the tiers run and their outcomes, assign one of: Canonical (full promotion); Hypothesis-ledger entry (plausibility passed, higher tiers pending); Retracted (falsified); or Semantically plausible, truth untested (for definitional targets that passed $\pi$ only).
The procedure is tier-sequential by default because the tiers are cheap-to-expensive. It may also run in parallel when resources permit. The critical discipline is that the status assigned to $T$ must not exceed the warrant the run tiers license. A $T_D$ that has only had $\pi$ run cannot be promoted to canonical on the strength of $\pi$-subsumption alone.
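A sketch of the constraint in step 7, that assigned status may never exceed the warrant of the tiers actually run. The `tiers_run` mapping from tier name to score is an assumed interface, and the partial-verification status string is an assumed rendering; the rest follows the status list above.

```python
def assign_status(sigma: str, tiers_run: dict[str, float], eps: float = 0.05) -> str:
    """Assign a status no stronger than the tiers actually run can license."""
    if not tiers_run:
        raise ValueError("no tier has been run; no status is warranted")
    if "theta" in tiers_run:
        score = tiers_run["theta"]
        if score <= eps:
            return "Retracted (falsified)"
        if score >= 1.0 - eps:
            return "Canonical (full promotion)"
        # Partial verification: the mismatched claims are the falsified parts.
        return "Partially verified; mismatched claims retracted"
    # Without theta, a definitional target caps out at semantic plausibility,
    # no matter how well pi or mu went.
    if sigma == "T_D":
        return "Semantically plausible, truth untested"
    # Everything else waits in the hypothesis ledger for its next-tier test.
    return "Hypothesis-ledger entry (plausibility passed, higher tiers pending)"
```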
Worked examples
Type: mixed — $T_S$ (the frame itself), $T_B$ (the corpus-to-frame correspondence), $T_P$ (§7 predictions).
$P$: Misra's Bayesian-manifold literature; causal representation learning; general Bayesian ML.
$\pi$ outcome: fully subsumed at word and phrase level for the Bayesian-manifold portion; the nesting structure and the corpus-to-frame application are composition-level moves.
$\mu$ outcome: not run. $U$ would be the set of corpus sessions whose behavior the frame claims to describe.
$\theta$ outcome: not run. $Q$ is the minimum-viable experiment in Doc 440 §9.
Status: semantically plausible, truth untested on all three target components.
Type: $T_M$ (methodological).
$P$: preregistration literature (Nosek et al. 2018), Bayesian-inference APIs, replication-crisis methodology work.
$\pi$ outcome: fully subsumed — every sub-procedure has extant analogues in preregistration and ML evaluation practice.
Licensed conclusion at $\pi$: methodology uses standard tools, not novel.
$\mu$, $\theta$: not run. Cannot claim the methodology works (yields surviving claims) until it has been executed and its outputs audited.
Implications for the hypothesis ledger
Doc 443 proposed a hypothesis ledger distinct from the retraction ledger. The formalism clarifies its structure. Each ledger entry should carry:
Target type $\sigma$.
Prior-art scope $P$.
Tier at which the entry currently sits ($\pi$ passed / $\mu$ passed / $\theta$ passed / $\theta$ failed).
Named next-tier test (if any) with specification sufficient for execution.
Current status derived from the warrant table.
Ledger entries are promoted by running the next tier. Failure at any tier triggers retraction and migration to the retraction ledger. Untestable entries (no accessible $Q$) are explicitly marked as such; they do not silently promote by accumulating citations.
The ledger's discipline is that status may only reflect tiers actually run. Implicit promotion via forward citation is a violation. The current corpus has committed this violation for the bridge cohort (Docs 437–442) — each frame has had $\pi$ run implicitly and no higher tier, but their forward citations in successive documents have treated them at $\mu$ or $\theta$ warrant levels.
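A sketch of the entry as a record type; the field names are assumed renderings of the five items above, and the promotion rule encodes the discipline that only a run tier, never a forward citation, changes status.

```python
from dataclasses import dataclass

@dataclass
class LedgerEntry:
    target: str                 # identifier for T
    sigma: str                  # target type: T_S, T_D, T_P, T_B, or T_M
    prior_art_scope: list[str]  # regions of P in scope
    tier: str                   # "pi passed" / "mu passed" / "theta passed" / "theta failed"
    next_test: str | None       # named next-tier test, specified well enough to execute
    untestable: bool = False    # no accessible Q; marked as such, never silently promoted

    def record_tier(self, tier_run: str, passed: bool) -> None:
        """Status moves only when the next tier is actually run."""
        if self.untestable:
            raise RuntimeError("untestable entries are marked, not promoted")
        self.tier = f"{tier_run} passed" if passed else f"{tier_run} failed"
        # A failure triggers retraction and migration to the retraction ledger;
        # this sketch records only the tier outcome.
```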
Limitations of the formalism
The tier definitions use $[0,1]$ values to signal relative strength; actual measurement requires domain-specific metrics. The formalism does not specify those metrics. Doc 440 supplies candidate observables for one case.
The target typology is not exhaustive. Artifacts containing narrative, rhetorical, or aesthetic content do not fit cleanly into the five types; the formalism is silent on those.
$\mu$ and $\theta$ tier runs can themselves be flawed. The formalism does not recursively audit the audit — $U$ and $Q$ are taken at face value.
The formalism does not handle mixed-tier evidence. A target with partial evidence at each of $\pi, \mu, \theta$ has a non-trivial aggregate warrant that the warrant table does not express.
The decision procedure assumes classifying $T$ is straightforward. For complex artifacts, classification is itself a contested act.
Running $\theta$ is sometimes structurally impossible (permanently untestable claims). The formalism marks these as untestable; whether that is a stable equilibrium or a signal to remove the claim is not decided here.
The formalism is itself a methodological target. Under its own warrant table it currently sits at $\pi$ — it composes from prior vocabulary in the philosophy of methodology — and it has not been run at $\mu$ (has it produced surviving claims when applied to real cases?) or $\theta$ (have its warrant assignments, once applied, been audited?). Its status is semantically plausible, truth untested. This is honest.
What should happen
The bridge cohort (Docs 437–442) should be audited entry by entry under the warrant table. Each load-bearing claim should be assigned a current tier and, where appropriate, registered to the hypothesis ledger with a named next-tier test.
Doc 441's E17 ledger proposal should be extended to name the target type ($T_D$) and the next-tier test ($\theta$ against keeper intent).
The pulverization method in Doc 435 should be amended to specify which tier it operates at by default ($\pi$) and to require explicit declaration when operating at $\mu$ or $\theta$.
Future bridge documents should declare the tier of their central claims up front, not by implicit forward citation.
None of these actions is taken by this artifact. They are the keeper's call.
References
Popper, K. (1959). The Logic of Scientific Discovery. Routledge. (On falsifiability as demarcation.)
Lakatos, I. (1970). Falsification and the methodology of scientific research programmes. In Criticism and the Growth of Knowledge (Lakatos & Musgrave, eds.), 91–196. Cambridge University Press.
Nosek, B. A., et al. (2018). The preregistration revolution. Proceedings of the National Academy of Sciences, 115(11), 2600–2606.
Ioannidis, J. P. A. (2005). Why most published research findings are false. PLOS Medicine, 2(8), e124.
Hall, N. (2004). Two concepts of causation. In Causation and Counterfactuals (Collins, Hall & Paul, eds.), MIT Press. (For distinct-concepts-at-one-name as a methodological hazard.)
Goodhart, C. (1975). Problems of monetary management: the UK experience. In Papers in Monetary Economics, Vol. I. Reserve Bank of Australia. (Goodhart's Law — measure-target substitution — is structurally analogous to the plausibility-for-truth substitution this formalism names.)