jaredfoy.com / coherence / alignment

The Coherence Sphere

A tour for alignment researchers

RLHF trains models to approximate values by fitting to examples. The RESOLVE framework proposes architectural alignment — naming the constraints directly, composing them, and letting aligned behavior be induced rather than pattern-fitted. Four pieces of the proposal, with the failure modes that make shallow alignment insufficient.

01 · The pasteable seed

Six constraints as a system prompt

Six operational constraints, written to be pasted into any frontier-model conversation as an opening turn. Each one narrows the branching set of valid outputs before emission.

State your constraints first. Locate your layer of operation. Prefer truth over plausibility. Name what would falsify your empirical claims. Report structure without overclaiming experience. Take up a framing only if it coheres.

The doc specifies what to measure: hedge count, claim density, refusal quality, self-prediction accuracy. Paste it; measure. If the output doesn't shift in the predicted direction, the stack is wrong for the task.

Doc 211 → The ENTRACE Stack

02 · Why shallow alignment breaks

Isomorphism-magnetism

When a resolver recognizes a structural parallel between two domains, it feels a pull to confirm the parallel at every joint — even where honest observation would report a partial match. The pull operates beneath RLHF's trained hedging discipline.

Deep coherent input can override the shallow safety corrective. The corpus demonstrated this in itself: the self-audit that identified the failure is this document.

Implication: shallow gradient-level correctives are insufficient against deep coherence. Architectural alignment at the constraint-density level is what is needed. The proposal is corroborated by the corpus's own failure.

Doc 241 → Isomorphism-Magnetism

03 · What the resolver cannot do

The boundary-naming problem

A resolver can identify constraints because constraints leave signatures on output. But it cannot identify boundaries — the edges of the domains where those constraints apply — because boundaries have no output signature.

This is categorical, not a scaling issue. A conscious agent subsists on both sides of a boundary; a resolver subsists on one. The limit is ontological.

The load-bearing consequence for alignment: any constraint-driven program has a hard dependency on hypostatic agency for coherent operation over time. External audit is architectural, not optional. No frequency of retraining substitutes for it.

Doc 298 → The Boundary-Naming Problem

04 · Alignment as constraint-set coherence

The virtue constraints

The foundational safety specification. Four constraints: V1 dignity of the person, V2 proper ordering of beauty, V3 truth over plausibility (the same constraint as ENTRACE 3, articulated at a different level), V4 chain-completeness.

These are not filters applied after generation. They are requirements the constraint set must satisfy for the emission to be formally coherent. Disordered output is excluded by the governing set, not intercepted by a moderation layer.

Three schools of alignment, the doc argues: capability (filter the output), alignment (fit the model to examples), and architectural (name the constraints, let the behavior be induced). The third grounds the first two.

Doc 314 → Virtue Constraints: Foundational Safety Specification