From Copilot to Colleague · Quality

What the AI judges see

Every chapter is scored by the MASH judges across six dimensions — three of craft (humanness, voice, usefulness) and three of epistemics (evidence density, claim defensibility, non-redundancy). Higher is better; colour marks the band.

run panel-3model-v6 · 2026-07-23 · cost $0.00 · mash 0.1.0 · snapshot c12b8e9d86e7 · status completed

Chapter	Humanness	Voice	Usefulness	Evidence	Defensibility	Non-redundancy
Book	86	88	55	83	81	87
01 The Shift: From Assistant to Delegate	86	92	43	90	81	100
02 Taste Still Matters When Code Gets Cheap	86	90	49	76	79	83
03 Harnesses, Specs, and Codebases Agents Can Actually Use	84	90	63	83	79	85
04 Evals Are the Control System	85	90	58	86	85	85
05 Context Is Infrastructure	85	90	55	90	85	83
06 Runtimes, State, and the Human Control Plane	86	90	63	81	90	85
07 Security, Identity, and High-Stakes Trust	86	90	64	82	83	85
08 Realtime, Voice, and the Cost of Being Interruptible	87	90	55	85	72	85
09 The AI-Native Organization	86	70	55	78	84	90
10 What Endures	88	90	43	82	71	85
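
The Book row is consistent with a plain unweighted mean of the ten chapter rows, rounded to the nearest integer. Whether MASH actually aggregates that way (rather than, say, weighting by paragraph count) is an assumption; the sketch below only checks that the arithmetic agrees with the published numbers.

```python
# Column order follows the table above.
DIMENSIONS = ["Humanness", "Voice", "Usefulness",
              "Evidence", "Defensibility", "Non-redundancy"]

# Chapter rows copied from the table, keyed by chapter number.
CHAPTERS = {
    "01": [86, 92, 43, 90, 81, 100],
    "02": [86, 90, 49, 76, 79, 83],
    "03": [84, 90, 63, 83, 79, 85],
    "04": [85, 90, 58, 86, 85, 85],
    "05": [85, 90, 55, 90, 85, 83],
    "06": [86, 90, 63, 81, 90, 85],
    "07": [86, 90, 64, 82, 83, 85],
    "08": [87, 90, 55, 85, 72, 85],
    "09": [86, 70, 55, 78, 84, 90],
    "10": [88, 90, 43, 82, 71, 85],
}

# The published Book row.
BOOK_ROW = [86, 88, 55, 83, 81, 87]


def column_means(rows):
    """Unweighted mean of each column across the given rows."""
    rows = list(rows)
    return [sum(col) / len(rows) for col in zip(*rows)]


means = column_means(CHAPTERS.values())
rounded = [round(m) for m in means]

for name, mean_value, book in zip(DIMENSIONS, means, BOOK_ROW):
    print(f"{name:15s} chapter mean {mean_value:5.1f}   book {book}")
```

Every rounded chapter mean matches the Book row, so the book-level usefulness of 55 is pulled down mostly by chapters 01, 02, and 10 rather than by any weighting effect.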

Reading the usefulness floor

all prose paragraphs → substantive core · +1.5

The usefulness judge scores every paragraph for operational density, so it floors the connective tissue every narrative chapter is made of — transitions, scene-setters, recaps, and chapter 10’s reflective register. The substantive core re-averages over the operational paragraphs, setting aside 56 of 535 (10%) that are either markdown headings mis-scored as prose or short bridge sentences the judge’s own rationale names as a transition. The gap is the genre ceiling; the floor that remains is real — prose that could carry a decision, a threshold, or a test and doesn’t yet.
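
The re-averaging described above can be sketched as follows. This is a minimal illustration, not the MASH implementation: the `Paragraph` type, its field names, and the toy scores are all hypothetical, and only the 56-of-535 count comes from the report.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class Paragraph:
    usefulness: float  # judge score for this paragraph (hypothetical 0-100 scale)
    set_aside: bool    # True for a heading mis-scored as prose or a judge-named transition


def usefulness_floor(paragraphs):
    """Return (all-prose mean, substantive-core mean, gap between them)."""
    all_prose = mean(p.usefulness for p in paragraphs)
    # The core re-averages over the paragraphs that were not set aside.
    core = mean(p.usefulness for p in paragraphs if not p.set_aside)
    return all_prose, core, core - all_prose


# Toy scores for illustration only; these are not values from the run.
toy = [
    Paragraph(70, False),
    Paragraph(20, True),   # e.g. a short bridge sentence
    Paragraph(65, False),
    Paragraph(10, True),   # e.g. a markdown heading scored as prose
    Paragraph(60, False),
]
all_prose, core, gap = usefulness_floor(toy)
print(f"all prose {all_prose:.1f} -> substantive core {core:.1f} (gap {gap:+.1f})")

# Share of paragraphs set aside, using the counts stated in the report.
set_aside_share = 56 / 535
print(f"set aside: {set_aside_share:.0%}")
```

The toy gap is much larger than the report's +1.5 because two of five toy paragraphs are set aside; in the actual run only about one in ten is, which is why the substantive core moves the score so little.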

Ship-blockers — unsupported claims

Ch 01 · defensibility 0 · “Research this market and come back with a memo. Review this contract and mark the risky clauses. Refactor this service, run the checks, and prepare the patch fo…”
panel-median 0 (n=3, spread 95): The paragraph contains claims with no ledger backing at all.
Ch 01 · defensibility 0 · “The useful spectrum is simpler:”
panel-median 0 (n=3, spread 25): The paragraph does not make any claim that can be matched to a ledger entry.
Ch 01 · defensibility 0 · “For a while, the most impressive thing AI could do was answer.”
panel-median 0 (n=3, spread 90): The paragraph does not match any claim statement or supporting quote in the provided ledger entries.
Ch 01 · defensibility 0 · “These two cases matter because they prevent the opening from floating above the rest of the manuscript. The book is not arguing in abstractions; it is following…”
panel-median 0 (n=3, spread 85): The paragraph does not contain any claims that can be matched to the ledger entries.
Ch 02 · defensibility 0 · “## Constraints are a form of care”
panel-median 0 (n=2, spread 0): The paragraph provided does not contain any specific claims that can be evaluated against the provided ledger entries. It is too vague and does not provide enough information to match against any of the claims in the ledger.
Ch 02 · defensibility 0 · “The recurring software-factory case helps make this concrete.”
panel-median 0 (n=3, spread 90): The paragraph makes a claim with no ledger backing at all.
Ch 02 · defensibility 0 · “The better the generators get, the more dangerous it becomes to confuse typing with thinking. What remains scarce is not the ability to produce tokens. It is th…”
panel-median 0 (n=3, spread 0): The paragraph does not match any claim statement or supporting quote in the provided ledger entries.
Ch 02 · defensibility 0 · “The obvious answer is speed. The more important answer is that speed changes what human excellence consists of.”
panel-median 0 (n=2, spread 0): The paragraph does not contain any claims that can be matched to the provided ledger entries. The statements about speed changing human excellence are not supported by any of the provided claims.
Ch 02 · defensibility 10 · “Review as the place where tacit standards become visible.”
panel-median 10 (n=3, spread 25): The claim 'Anti-slop discipline is not elitism. It is economic realism.' is not directly supported by any ledger entry or supporting quote. While the term 'slop' is referenced in the ledger, the specific claim about anti-slop discipline being 'economic realism' is unsupported.
Ch 02 · defensibility 10 · “One of the recurring mistakes people make with powerful models is to imagine that creativity and constraints oppose each other. In reality, constraints are ofte…”
panel-median 10 (n=3, spread 25): The claim 'They turn taste from a private opinion into something operational' is unsupported by the ledger entries, making it a fabricated or unsupported claim. This is a ship-blocker.
Ch 02 · defensibility 10 · “The senior engineer, strong editor, careful researcher, or domain expert is no longer valuable mainly because they can personally grind through more output. The…”
panel-median 10 (n=3, spread 55): The paragraph contains an unsupported claim ('A larger share of human value moves toward:'), which is a ship-blocker. The other claim ('the shift is real.') is supported by ledger entry claims#1 with strong evidence.
Ch 03 · defensibility 0 · “Quality immediately gets erratic. One patch uses a dependency the team would never approve. Another passes tests but ignores a performance convention learned th…”
panel-median 0 (n=3, spread 100): The paragraph makes a claim with no ledger backing at all. The claim that "the workplace is inconsistent" is not supported by any of the provided ledger entries.
Ch 03 · defensibility 0 · “When coding agents disappoint, teams usually blame the model.”
panel-median 0 (n=3, spread 90): The paragraph makes a claim with no ledger backing at all.
Ch 03 · defensibility 0 · “A practical checklist usually includes at least the following:”
panel-median 0 (n=3, spread 100): The paragraph does not contain any claims that can be matched to a ledger entry.
Ch 03 · defensibility 13 · “That is where slop comes from.”
panel-median 12 (n=2, spread 25): The paragraph 'That is where slop comes from.' does not provide enough context to match it to any of the provided ledger entries. It is a vague statement that does not clearly align with any specific claim or supporting quote. The lack of context and specificity makes it difficult to determine if it is an overstatement or if it is supported by the ledger. However, given the vagueness and lack of…
Ch 04 · defensibility 0 · “The postmortem is revealing. The agent did in fact implement rate limiting. It even mirrored the main service pattern correctly. But it also applied that same t…”
panel-median 0 (n=3, spread 95): The paragraph makes a claim with no ledger backing at all. The claim that “The agent did in fact implement rate limiting. It even mirrored the main service pattern correctly. But it also applied that same throttle to the backfill path, where the intended rule was different.” is not supported by any of the provided ledger entries.
Ch 04 · defensibility 0 · “The first failure mode of AI systems is obvious: they can be wrong. The second is more dangerous: they can look right often enough that teams stop measuring.”
panel-median 0 (n=3, spread 55): The paragraph makes a claim with no ledger backing at all.
Ch 04 · defensibility 0 · “Teams often imagine observability as the thing you do after deployment and evals as the thing you do before deployment. In reality, the two should feed each oth…”
panel-median 0 (n=3, spread 25): The paragraph "This creates the eval flywheel:" does not contain any claims that can be matched to a ledger entry using both the claim statement and its supporting quotes/sources.
Ch 05 · defensibility 0 · “In practice, usefulness depends on topology: what kind of thing this is, how it relates to other things, how trustworthy it is, how recent it is, whether it is …”
panel-median 0 (n=3, spread 55): The paragraph does not contain enough information to be evaluated against the claims ledger.
Ch 06 · defensibility 0 · “The final runtime lesson is about subagents. Parallel workers are compelling because they offer the same thing every manager has wanted forever: more throughput…”
panel-median 0 (n=3, spread 0): The paragraph makes a claim with no ledger backing at all
Ch 07 · defensibility 0 · “This is why repeated ad hoc consent flows are not just annoying UX. They signal a deeper architectural gap. If every tool asks separately, the organization lose…”
panel-median 0 (n=3, spread 0): The paragraph "It needs evidence." does not match any claim statement or supporting quote in the provided ledger entries.
Ch 07 · defensibility 0 · “These choices do not merely protect the organization. They shape the behavior of the system itself. Narrower powers reduce the number of tempting but unsafe pat…”
panel-median 0 (n=3, spread 0): The paragraph "Protocol enthusiasm can make this easy to forget." does not match any claim statement or supporting quote in the provided ledger entries.
Ch 07 · defensibility 0 · “The High-Stakes Colleague is again the clearest mirror. A professional workflow is only trustworthy if the surrounding institution can answer basic questions re…”
panel-median 0 (n=3, spread 0): The paragraph “In plain language, the system must know four things.” does not match any claim statement or supporting quote in the provided ledger entries.
Ch 07 · defensibility 0 · “The same is true in the Software Factory. A central access and policy layer can decide which repositories, environments, and operations are exposed to coding ag…”
panel-median 0 (n=3, spread 90): The paragraph makes claims with no ledger backing at all. The claims "Who the human is", "What powers the agent has been granted", "How long those powers last", and "How those powers can be withdrawn or narrowed" are not supported by any ledger entry.
Ch 08 · defensibility 0 · “The field often treats voice as a charming frontier, a natural interface waiting for slightly better models. That framing is too soft.”
panel-median 0 (n=3, spread 25): The paragraph makes a claim with no ledger backing at all.
Ch 08 · defensibility 0 · “Spoken interaction is different. Timing itself becomes part of the product.”
panel-median 0 (n=3, spread 25): The paragraph makes a claim with no ledger backing at all.
Ch 08 · defensibility 0 · “Text chat flatters AI systems.”
panel-median 0 (n=3, spread 0): The paragraph "Text chat flatters AI systems" does not match any claim statement or supporting quote in the provided ledger entries.
Ch 08 · defensibility 0 · “This is why the best voice architecture discussions sound modular rather than mystical. Separate what must be low-latency from what can remain deliberative. Kee…”
panel-median 0 (n=3, spread 0): The paragraph does not match any claim statement or supporting quote in the provided ledger entries.
Ch 08 · defensibility 0 · “In the support-call scenario, that might mean the agent has already begun a retrieval step when the caller reveals that the account belongs to a different regio…”
panel-median 0 (n=3, spread 25): The paragraph makes a claim with no ledger backing at all.
Ch 08 · defensibility 0 · “Voice removes that mercy.”
panel-median 0 (n=3, spread 0): The paragraph makes a claim with no ledger backing at all
Ch 08 · defensibility 0 · “This is where the High-Stakes Colleague returns in its most exposed form. A high-stakes voice system is not just a more personable chatbot but a colleague opera…”
panel-median 0 (n=3, spread 100): The paragraph makes a claim with no ledger backing at all. The paragraph discusses the focus of the chapter being on voice and mentions robotics and embodied systems, but there are no relevant ledger entries that support these claims.
Ch 09 · defensibility 0 · “And that is the problem.”
panel-median 0 (n=3, spread 0): The paragraph "And that is the problem." does not contain any claims that can be matched to a ledger entry using both the claim statement and its supporting quotes/sources.
Ch 09 · defensibility 10 · “Nobody in this scene is doing anything obviously reckless. In fact, everyone is being productive.”
panel-median 10 (n=3, spread 100): The paragraph includes an unsupported claim, which is a ship-blocker.
Ch 10 · defensibility 0 · “A book like this needs an ending that does more than point at the horizon. It has to answer a calmer question: what actually endures?”
panel-median 0 (n=3, spread 0): The paragraph does not match any claim statement or supporting quote in the provided ledger entries.
Ch 10 · defensibility 0 · “The interfaces will change. The tooling will churn. The jargon will get rewritten at least twice before this sentence is old.”
panel-median 0 (n=3, spread 100): The paragraph does not contain any claims that can be matched to a ledger entry using both the claim statement and its supporting quotes/sources.
Ch 10 · defensibility 0 · “But the work underneath is surprisingly stable.”
panel-median 0 (n=3, spread 100): The paragraph 'But the work underneath is surprisingly stable.' makes a claim with no ledger backing at all. There is no relevant ledger entry that supports this statement, either through a claim statement or a supporting quote.
Ch 10 · defensibility 0 · “By the time a field starts naming everything aggressively, it is usually trying not to drown.”
panel-median 0 (n=3, spread 0): The paragraph does not match any claim statement or supporting quote in the provided ledger entries.
Ch 10 · defensibility 10 · “Some of that noise reflects real progress. Some of it is marketing theater with a GPU budget. Most of it is what fast-moving technical fields look like from the…”
panel-median 10 (n=3, spread 85): The paragraph contains unsupported claims, such as 'Some of it is marketing theater with a GPU budget' and 'Most of it is what fast-moving technical fields look like from the inside: partially right, prematurely named, and quickly replaced,' which have no backing in the ledger entries.
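
Each rationale line above reports a panel median, the number of judges that scored the paragraph (n), and a spread. The report does not define spread; the sketch below assumes it is the highest judge score minus the lowest. The per-judge scores are hypothetical, since only the aggregates are shown, and `panel_summary` is an illustrative helper rather than part of MASH.

```python
from statistics import median


def panel_summary(scores):
    """Return (median, spread, n) for one paragraph's per-judge scores.

    Spread is assumed to be max minus min; the report does not say so.
    """
    if not scores:
        raise ValueError("need at least one judge score")
    return median(scores), max(scores) - min(scores), len(scores)


# Hypothetical score sets that would reproduce aggregates seen in the list.
disagreement = panel_summary([0, 0, 95])  # median 0, spread 95: one judge dissents
consensus = panel_summary([0, 0, 0])      # median 0, spread 0: all judges agree
two_judges = panel_summary([0, 25])       # median 12.5 with only two judges

for label, summary in [("disagreement", disagreement),
                       ("consensus", consensus),
                       ("two judges", two_judges)]:
    med, spread, n = summary
    print(f"{label:12s} panel-median {med} (n={n}, spread {spread})")
```

Under that assumption, a zero with spread 0 is a unanimous verdict, while a zero with spread 90 or 100 means at least one judge found the paragraph well supported, so the high-spread entries are the ones most worth a human second look.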

Trend over versions

Humanness: -2
Voice: -1
Usefulness: -22
Evidence: -3
Defensibility: -8
Non-redundancy: +4