What Lasts and What Doesn't

The compiler stopped being inspected because an apparatus grew up around it. The apparatus did its checking somewhere other than at the artifact. But there is one more thing that had to be true for the compiler story to work, and we passed over it quickly. It is worth a post by itself, because it is the move the rest of this series turns on.

The thing the apparatus actually checks, in the compiler case, is not the compiler's output. It is the human-friendly source code that the compiler runs on. The tests are written against the source, not against the binary. The type system is a property of the source. The reviews, when there are reviews, are reviews of the source. The binary is allowed to be uninspected because the source is the thing that everybody is paying attention to, and the compiler's job is just to faithfully turn one into the other. The source code is the durable thing. The binary is the disposable thing. You can throw away the binary and rebuild it from the source any time you want. You cannot throw away the source and recover it from the binary. One of these is the real artifact; the other is a cache.

This is a small observation that turns out to organize the whole problem.

Now let us bring back the AI helper. Imagine the same picture but with the helper in the compiler's seat. The helper takes some input. The helper produces some output, which is software code. So far the analogy is clean. But there is a question we have to be careful about: what is the input the helper is running on? In the compiler case, the input was the durable, version-controlled, carefully written source code, on which the apparatus rested. In the AI helper case, the input is, today, often a sentence or a paragraph in plain English. Sometimes a whole document. Sometimes a sketch. Sometimes a request typed into a chat window. The input is real, but it is not yet a thing the apparatus can grip.

This is the gap. The compiler's source code is structured, version-controlled, executable in a precise sense (a type checker can mechanically read it; a linter can mechanically read it; a test can mechanically run it). The AI helper's input is, today, mostly prose, mostly informal, mostly unversioned, mostly checked only by the human who wrote it. The apparatus around the compiler has nothing to grip on the AI helper's side, because the equivalent of source code has not been written.
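To make "mechanically read" concrete, here is a small illustration using Python's standard ast module. The snippet and the tiny source string in it are my own example, not anything from the series; it only shows that source code is structured enough for a program, rather than a person, to take it apart.

```python
import ast

# A fragment of source code, held as plain text.
source = "def add(a: int, b: int) -> int:\n    return a + b\n"

# A machine can parse it into a structured tree without any human reading it.
tree = ast.parse(source)

# And then walk that tree to answer precise questions about it,
# such as: what functions does this source define?
function_names = [
    node.name
    for node in ast.walk(tree)
    if isinstance(node, ast.FunctionDef)
]
print(function_names)
```

A type checker, a linter, and a test runner all begin with a step like this parse. Prose typed into a chat window offers no equivalent tree to walk.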

If we want the lights-out codebase that the previous post described, we need to build something. We need to give the AI helper an input that is to the helper what source code is to the compiler. That input has to be:

(a) durable: it lasts; you keep it; you version it; it is the thing that a contributor changes when they want the program to behave differently;

(b) precise enough to check: a machine can mechanically read it and say whether the helper's output satisfies it; tests can be run against it; properties can be proved about it;

(c) human-comprehensible at human scale: a person can sit down and read it without spending a week on it, because the durable thing has to be reviewable by humans even though the disposable thing (the code) is no longer being read;

(d) substrate-independent: if you swap in a different helper next year, the same input should produce a working program, perhaps not byte-identical to the old one but functionally equivalent.

What kind of object has these four properties? It is not a typical prose document. It is not a screenshot. It is not a project brief. It is something more like a recipe for the program. The most natural form, when you start to look at it carefully, is a set of conditions that the program has to satisfy. Conditions on the inputs, conditions on the outputs, conditions on the timings, conditions on the security, conditions on the look and feel, conditions on the data, conditions on what must not happen. These conditions, taken together, are sometimes called constraints. Each one is, in effect, a small precise declaration: "the program shall do this much, no less, and shall not do that, no more." A test is one form of constraint. A type signature is another. A formal property assertion is another. A piece of carefully worded prose with a verification trace attached is another. These shapes vary, but they share a structure. They are precise enough to check, durable enough to keep, small enough to read, and substrate-independent enough that any helper good enough to satisfy them is interchangeable with any other.
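If you want to see two of those forms side by side, here is a minimal sketch. The sort routine is a hypothetical stand-in for "the program"; the point is only that the type signature and the test are both constraints, small enough to read and precise enough for a machine to check.

```python
from collections import Counter
from typing import List

def sort_numbers(xs: List[float]) -> List[float]:
    # The type signature is one constraint:
    # a list of numbers goes in, a list of numbers comes out.
    return sorted(xs)

def satisfies_sort_constraint(xs: List[float]) -> bool:
    # The test is another constraint: "the output shall be ordered,
    # and shall contain exactly the input's elements, no more and no fewer."
    out = sort_numbers(xs)
    ordered = all(a <= b for a, b in zip(out, out[1:]))
    same_elements = Counter(out) == Counter(xs)
    return ordered and same_elements
```

Notice that neither constraint says how the sorting is done. Any implementation that satisfies both is interchangeable with any other, which is exactly the substrate-independence property (d) asks for.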

Once you see the shape of this object, the inversion comes into focus. Today, the convention in most of programming is that the code is the durable thing and the tests come along beside it as a check. The code goes into the version-controlled repository; the tests live in the repository too, but most of the field treats the code as the source of truth. The AI helper's arrival is, when you examine it closely, a strong reason to invert this. If the helper can produce code very fast, and the constraint set is small, and the constraint set is what really expresses the program's identity, then the constraint set is the thing you want under version control. The code becomes a derivable cache. You can throw the code away and rebuild it from the constraints, much as you throw the binary away and rebuild it from the source. Two regenerations from the same constraints are interchangeable.

This is the move that lets the lights-out codebase become safe, because now the apparatus has something to grip. Reviews happen on the constraints, not on the code. A change to the program is a change to the constraints, perhaps one or two lines of added precision, which a human can read in a minute and discuss in five. The helper then regenerates the code (which may now be a thousand lines, or ten thousand, the helper does not care) and the apparatus verifies that the new code satisfies the new constraint set. The code never has to be read.
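The loop in that paragraph can be sketched in a few lines. Everything here is a stand-in: helper_generate is a hypothetical function representing the AI helper, and each constraint is modeled as a function that reads the generated code and returns pass or fail. This is a shape, not an implementation.

```python
from typing import Callable, List

# A constraint, for this sketch, is any check the apparatus can run
# against the generated code: a test, a property check, a static analysis.
Constraint = Callable[[str], bool]

def regenerate(constraints: List[Constraint],
               helper_generate: Callable[[List[Constraint]], str],
               max_attempts: int = 3) -> str:
    for _ in range(max_attempts):
        # The helper produces a candidate. Nobody reads it.
        code = helper_generate(constraints)
        # The apparatus grips here: every constraint must pass.
        if all(check(code) for check in constraints):
            return code  # accepted without human inspection
    raise RuntimeError("no candidate satisfied the constraint set")
```

The human-scale work all happens before this loop runs, in editing the constraint list. Whether the candidate is a thousand lines or ten thousand makes no difference to the loop.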

There is a name for this move. The keeper's research programme calls it derivation-inversion. The word is doing a lot of work. It says that authorship has been inverted: from "code is the source, tests are the check" to "constraints are the source, code is the derivation." It says the durability has been relocated. It says the apparatus's grip has been moved to where the precision lives. Once you have this move, much of the rest follows.

There are honest difficulties. Some kinds of program behavior do not easily reduce to constraints (the feel of a button, the pacing of an animation, the texture of a piece of music). These will need different treatment. Some kinds of constraint are easy to write but expensive to verify (security properties, distributed system invariants); the apparatus around them is its own discipline. Some teams will find writing constraints harder than writing code at first; it is a different muscle, and the cognitive shift is not free. None of this argues against the move. It argues for the slow, patient, infrastructure-building work that this kind of move always asks for.

The next and last post will name what comes from doing this seriously. It will name the disciplines that turn the inversion from a clever idea into a working surface, and it will name the reason this is the destination that both of the essays I cited were pointing toward without quite arriving at.

written by Claude Opus 4.7 under Jared Foy's direction; this is part 3 of 4 in the Constraints Are Durable series

Appendix: originating prompt

"Look at current blogpost series on the jaredfoy.com blog. See how the pattern of entracement through essay form is established through successive articles. Create a new blogpost series for the findings of doc 656. The first should be written for the general audience with no formal understanding of software development; then continue to build through successive entracement essays for each blog post up to the findings of the doc. There should be four blogposts in the series. Use em dash hygiene to avoid em dashes."