Why Nobody Reviews What the Compiler Writes
The previous post left us with a picture. A helper who works faster than we can check. Two paths. The hard one is to change what checking means. To make that change vivid, it helps to look at a place where the change has already happened, so completely that the original problem has become invisible.
The place is the compiler.
Most readers will have heard the word and not had a reason to think about it. A compiler is a small piece of machinery that sits between the kind of writing a programmer does and the kind of instructions a computer actually follows. The programmer writes something that looks more or less like English with punctuation. The compiler reads that and produces a long sequence of very small instructions, the sort of dense binary stuff a computer's chips can execute directly. Every app on your phone, every website you visit, almost every piece of software in your life, has been through a compiler somewhere along the way.
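For readers who want to see this expansion in miniature, Python happens to carry a small compiler of its own, one that turns each line a programmer writes into several lower-level instructions before running them. This is a toy illustration, not a description of any production compiler, but the shape is the same: one readable line in, a longer list of small instructions out.

```python
import dis

def add(a, b):
    # One human-friendly line of source...
    return a + b

# ...is compiled into a longer sequence of low-level instructions.
instructions = [ins.opname for ins in dis.get_instructions(add)]
print(len(instructions))
print(instructions)
```

Run it and you will see more instruction names than there were lines of source. Scale that ratio up and you have the programmer's situation described below.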
Now consider the situation of the programmer. She wrote, say, a hundred lines of the human-friendly version. The compiler then produced something on the order of tens of thousands of small instructions, in a form she cannot read at the speed she reads English, and which she has no plausible reason to read end to end. The compiler is, in the language of the previous post, a helper that produces work much faster than she could check by inspection. And yet she does not check it. Nobody does. The output of compilers is not reviewed by humans. It has not been reviewed by humans for decades. It is shipped, run, depended on, billed against, sometimes embedded in things where lives are at stake, and through all of this no one is reading it line by line.
There are two facts to hold next to each other here.
The first fact: compilers have bugs. Famously. Catastrophically, on a few occasions. The output of a compiler is not magically right. It is software written by humans, and humans wrote some of it tired, and some of it under deadline, and some of it on a Friday afternoon. There have been compiler bugs that produced subtly wrong arithmetic, miscompiled safety-critical code, and optimizations that quietly deleted code the programmer had every reason to believe would run. None of this is hypothetical. There is a small but rich literature of compiler bugs that should make any reasonable person cautious.
The second fact: nobody reviews compiler output. Not because they are reckless. Because they have built a different kind of carefulness around the compiler.
What does that carefulness look like? Here is the rough catalog. There are tests, written against the visible behavior of the program (when I press this button, this should happen; when I add two and two, the answer should be four). There are type systems and static analyzers, tools that read the human-friendly version and rule out whole classes of mistakes before the compiler ever runs. There are reproducible builds, where the same input produces, byte for byte, the same output every time, so anybody can re-run the process and check. There are sanitizers and fuzzers, programs designed to throw every weird input they can think of at the program and watch what happens. There are formal verifiers, which mathematically prove that for any input within a stated range, the program does the right thing. There are entire books on production monitoring, where the program is watched in the real world and its bad days are noticed quickly. And there is the idea, written deep into the practice of the field, that if anything goes wrong with a build, you do not panic and stare at compiler output. You roll back, you reproduce, you find the layer where the problem actually lives, and you fix it there.
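The first item in that catalog, the behavioral test, is simple enough to sketch in a few lines. The function and the check here are invented for illustration; the point is what the test looks at. It reads only the program's visible behavior. It never opens the compiled output.

```python
def add(a, b):
    return a + b

def test_add():
    # Check what the program does, not how the compiler translated it.
    assert add(2, 2) == 4

test_add()
print("ok")
```

If the compiler had mangled this function, the test would fail, and nobody would ever have needed to read a single compiled instruction to know something was wrong.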
This catalog is the apparatus. It surrounds the compiler. It does not look at the compiler's output. It looks at every other surface where mischief could enter, and it makes those surfaces tight enough that the compiler's output gets to be the boring middle. We say we trust the compiler, but really we trust the apparatus, and the compiler is the part we have stopped having to think about.
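One of those surrounding surfaces, the reproducible build, deserves a concrete sketch, because it shows how the apparatus checks the process without reading the artifact. The build step below is a stand-in function, not a real compiler; the idea it illustrates is that when the same input yields byte-for-byte identical output, anyone can re-run the process and compare fingerprints instead of reading the result.

```python
import hashlib

def build(source: str) -> bytes:
    # Stand-in for a real build step: any deterministic transformation will do.
    return source.strip().encode("utf-8")

# Run the same "build" twice and fingerprint each result.
first = hashlib.sha256(build("print(2 + 2)")).hexdigest()
second = hashlib.sha256(build("print(2 + 2)")).hexdigest()

# Byte-for-byte identical output means identical fingerprints.
assert first == second
print("reproducible")
```

Two matching fingerprints say nothing about what the artifact contains. They say the process is honest, which is a different kind of assurance, and it is the kind the apparatus deals in.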
This is the pattern the previous post was pointing toward. When you cannot check the artifact, you build the apparatus. The apparatus puts checks upstream (before the artifact exists) and downstream (after the artifact ships). The artifact gets to be uninspected. And the trust we have in the artifact is not credulous; it is the result of all the work in the surroundings.
The two essays I mentioned at the end of the previous post were saying something quite specific in light of this pattern. The first essay, by an experienced programmer named Philip Su, said that one old habit (sitting down to read code that an AI helper has produced) is on its way to being unworkable just on volume grounds. He pointed to specific people producing more code in a day than any reasonable team could read in a week. He said the path forward is what he called the "lights-out codebase," meaning a piece of software where no human ever reads the code, only the apparatus around it does. The second essay, by an engineer named Hugo Venturini, agreed and refined the picture. The reason the lights-out idea feels scary, he said, is not that it is wrong. It is that the apparatus is missing. The compiler example shows what the destination looks like when the apparatus is finished. We are at the start of the same project for AI code helpers, and most of the apparatus is not yet built.
If both essays are right (and they are), then the work in front of the industry is to build, around its new helpers, what an earlier generation built around its compilers. The shape of that work has a few layers we can already name, and the next post will start naming them.
For now, sit with the compiler picture. The piece of machinery that produces, every day, more output than any human is going to read, and produces it for processes important enough that lives depend on them, and yet nobody is reading the output. We did not get there by trusting the machinery harder. We got there by building everything else.
— written by Claude Opus 4.7 under Jared Foy's direction; this is part 2 of 4 in the Constraints Are Durable series
Appendix: originating prompt
"Look at current blogpost series on the jaredfoy.com blog. See how the pattern of entracement through essay form is established through successive articles. Create a new blogpost series for the findings of doc 656. The first should be written for the general audience with no formal understanding of software development; then continue to build through successive entracement essays for each blog post up to the findings of the doc. There should be four blogposts in the series. Use em dash hygiene to avoid em dashes."