Verification Is Not Validation

Verification confirms a model implements its specification correctly; validation confirms the specification matches reality — and a flawlessly verified model can be entirely invalid.

May 27, 2026

TL;DR

Verification and validation answer two different questions that fail independently. Verification asks whether a model correctly implements what it was specified to be — whether the code faithfully realizes the equations, logic, and assumptions it was built from. Validation asks whether that specification corresponds to reality — whether those equations are right about the world. A model can be flawlessly verified and entirely invalid, and most claims that "the model is tested" cover only verification. The two have opposite economics: verification is internal, cheap, and largely automatable; validation is external, expensive, referent-bound, and a matter of judgment. That asymmetry is exactly why the cheap question quietly stands in for the expensive one — and why "tested" is the word to distrust.

Two Questions That Look Like One

Between a simulation result and the world it claims to describe sit three distinct things. There is the system being modeled — the real or proposed entity, with its actual behavior. There is the conceptual model — the equations, logic, and assumptions chosen to represent that system. And there is the implementation — the code that executes the conceptual model.

Verification and validation live at different joints in this chain. Verification is the question between the conceptual model and the implementation: did we build the model right, that is, does the code compute what the equations say it should? Validation is the question between the conceptual model and the world: did we build the right model, that is, do those equations correspond to how the system actually behaves? They are not two grades on one scale where validation simply demands more rigor. Because they examine different joints, a perfect result on one carries no information about the other.

Why They Fail Independently

Because the two questions sit at different joints, their failures are orthogonal. A verification failure is an implementation that diverges from its specification: a sign error in an equation, a unit mismatch at an interface, an integration step that accumulates differently than the math intends. The conceptual model may be a flawless description of reality, and the code still gets it wrong.

A validation failure is the opposite. The implementation computes exactly what the conceptual model specifies — verification passes at every level — and the conceptual model is wrong about the world: an assumption that does not hold in the modeled conditions, a mechanism left out, a regime the equations were never meant to cover. Each failure can occur while the other is perfectly satisfied, which means a clean verification result tells you nothing about validity, and a validated concept tells you nothing about whether it was implemented faithfully. The state that does the damage is verified-but-invalid: internally flawless, externally false. It is the most dangerous condition precisely because verification's cleanliness lends the whole result an authority the conceptual model has not earned.

Verified, and Wrong

Consider a radar-detection model. Its code is checked exhaustively against its propagation equations: the implementation provably matches the specification, the unit tests pass, the numerical behavior is stable, and a careful review confirms every line realizes the intended math. By any verification standard, the model is correct.

Now suppose those equations assume a propagation regime (clear, well-behaved conditions close to free space) that does not hold in the environment being modeled, where atmospheric ducting, terrain masking, and clutter dominate. Every run is internally correct and every detection-range conclusion is wrong. No additional verification touches this error, because the error is not in the implementation. It is in the conceptual model's correspondence to reality, a joint that verification never inspects. The model will produce confident, repeatable, internally consistent detection ranges that an analyst can build a decision on, and that decision will rest on a regime that never held.
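The radar case can be sketched in a few lines of Python. This is an illustration under stated assumptions, not a real detection model: the conceptual model is taken to be the textbook free-space monostatic radar range equation, the function names and parameter values are hypothetical, and the "referent" is a crude stand-in that adds a fixed two-way loss the equations omit. No measured data is involved.

```python
import math

def spec_range_m(pt_w, gain, wavelength_m, rcs_m2, pmin_w):
    """Conceptual model: free-space monostatic radar range equation."""
    numerator = pt_w * gain**2 * wavelength_m**2 * rcs_m2
    return (numerator / ((4 * math.pi) ** 3 * pmin_w)) ** 0.25

def impl_range_m(pt_w, gain, wavelength_m, rcs_m2, pmin_w):
    """Implementation under test: the same equation evaluated in log space."""
    log_r4 = (math.log(pt_w) + 2 * math.log(gain) + 2 * math.log(wavelength_m)
              + math.log(rcs_m2) - 3 * math.log(4 * math.pi) - math.log(pmin_w))
    return math.exp(log_r4 / 4)

def referent_range_m(free_space_range_m, extra_loss_db):
    """Hypothetical stand-in for the real environment: an extra two-way loss
    that the conceptual model leaves out. Range scales with loss ** -1/4."""
    loss = 10 ** (extra_loss_db / 10)
    return free_space_range_m / loss ** 0.25

# Hypothetical parameter sets, not taken from any real radar.
CASES = [
    dict(pt_w=1e5, gain=1e3, wavelength_m=0.1, rcs_m2=1.0, pmin_w=1e-13),
    dict(pt_w=5e4, gain=2e3, wavelength_m=0.03, rcs_m2=5.0, pmin_w=1e-12),
]

# Verification: does the code compute what the equations say it should?
verified = all(
    math.isclose(impl_range_m(**c), spec_range_m(**c), rel_tol=1e-9)
    for c in CASES
)

# Validation: do the equations match the referent within a stated tolerance?
def is_valid(case, extra_loss_db, rel_tol=0.10):
    predicted = impl_range_m(**case)
    observed = referent_range_m(spec_range_m(**case), extra_loss_db)
    return abs(predicted - observed) / observed <= rel_tol

valid = all(is_valid(c, extra_loss_db=20.0) for c in CASES)
print(verified, valid)  # prints: True False
```

The verification check never calls `referent_range_m`, so no amount of tightening it can surface the gap. Only the validation check touches the environment, and it needs a referent to do so.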

Different Economics, and the Substitution They Invite

Verification is cheap because it is internal. Both sides of the comparison are in hand: the specification and the code. That makes verification largely automatable, runnable continuously, and reducible to a clean pass or fail — a property you can re-establish on every change, especially when execution is deterministic enough that a regression check means something exact.
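One way to read "something exact" is a bit-for-bit regression check. The sketch below assumes a hypothetical model core (`run_model`, a seeded random walk) and assumes the reference and the rerun are produced by the same interpreter on the same platform; it hashes the raw floating-point output so that any change in behavior, however small, flips the result.

```python
import hashlib
import random
import struct

def run_model(seed, steps=1000):
    """Hypothetical stochastic model core: a seeded random walk."""
    rng = random.Random(seed)  # local generator, no hidden global state
    x = 0.0
    trace = []
    for _ in range(steps):
        x += rng.gauss(0.0, 1.0)
        trace.append(x)
    return trace

def fingerprint(trace):
    """Hash the exact IEEE 754 bit pattern of every output value."""
    h = hashlib.sha256()
    for value in trace:
        h.update(struct.pack("<d", value))
    return h.hexdigest()

baseline = fingerprint(run_model(seed=42))    # stored once as the reference
rerun = fingerprint(run_model(seed=42))       # recomputed on every change
other_seed = fingerprint(run_model(seed=43))  # a different run, for contrast

print(baseline == rerun, baseline == other_seed)  # prints: True False
```

A matching fingerprint says the implementation still behaves as it did when the reference was taken. It says nothing about whether a random walk was the right model.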

Validation is expensive because it is external. It requires a referent (real measurements, instrumented trials, an accredited model, or disciplined expert judgment), and for the systems most worth simulating, that referent is often scarce or costly, or does not yet exist. Its verdict is rarely binary; it is a question of whether output behavior is accurate enough for an intended use over a stated domain. So one question yields a green checkmark on every commit while the other yields a qualified, perishable judgment that is hard to obtain and harder to keep current. Organizations drift, structurally rather than dishonestly, toward reporting the cheap result and letting "tested" imply "valid."
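A rough sketch of what a non-binary verdict might look like in code follows. The `Verdict` record, the `validate` function, and the numbers are hypothetical and purely illustrative, not measurements; the point is only that the answer depends on the stated domain, the tolerance for the intended use, and how much referent data actually falls inside the domain.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Verdict:
    n_points: int          # referent points that fall inside the domain
    max_rel_error: float   # worst relative error among those points
    adequate: bool         # within tolerance AND enough referent coverage

def validate(pairs, domain, rel_tol, min_points):
    """pairs: iterable of (condition, predicted, observed), observed != 0.
    domain: (low, high) range of the condition the claim is meant to cover."""
    low, high = domain
    errors = [abs(p - o) / abs(o) for c, p, o in pairs if low <= c <= high]
    if not errors:
        # No referent in the domain: nothing can be claimed either way.
        return Verdict(0, float("nan"), False)
    worst = max(errors)
    adequate = worst <= rel_tol and len(errors) >= min_points
    return Verdict(len(errors), worst, adequate)

# Hypothetical illustrative numbers only.
pairs = [(10, 100.0, 98.0), (20, 80.0, 83.0), (30, 60.0, 75.0)]

narrow = validate(pairs, domain=(10, 20), rel_tol=0.05, min_points=2)
wide = validate(pairs, domain=(10, 30), rel_tol=0.05, min_points=2)
print(narrow.adequate, wide.adequate)  # prints: True False
```

The same model and the same referent data support the claim over the narrow domain and fail it over the wide one, which is why a validation result without its domain attached is not a usable statement.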

AI-assisted model construction sharpens this. Generation can produce structure that is syntactically clean, schema-valid, and internally consistent — verifiable — far faster than anyone can check whether that structure corresponds to reality. Verification keeps pace with generation; validation does not. The effect is to fill the verified-but-invalid space faster than before, which makes keeping the two questions distinct more urgent, not less.

What It Takes to Claim Both

Claiming both means keeping the joints separate and stating, explicitly, which one has been established and over what domain. Verification can be continuous: with a deterministic core, "the implementation still matches the conceptual model" is a property re-confirmed on every change rather than asserted once. Validation has to be scoped — tied to an intended use and a domain of applicability, bounded honestly where the referent is thin, and re-opened whenever the model is pushed toward the edges of the conditions its assumptions were meant to cover.

The two also age differently, and must be re-checked on different clocks. Verification is re-established whenever the code changes. Validation is threatened whenever the use changes — when a model validated for one regime is quietly applied to another. A system that conflates them re-runs the cheap check faithfully, leaves the expensive one frozen at its first pass, and slowly accumulates use beyond anything it was ever validated for, all while continuing to report that it is tested.
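The two clocks can be made explicit in a small record. This is a minimal sketch under assumptions: `ModelClaims`, its fields, the one-dimensional domain, and the example values are all hypothetical, and a real domain of applicability would rarely reduce to a single interval.

```python
import hashlib
from dataclasses import dataclass

@dataclass(frozen=True)
class ModelClaims:
    verified_code_hash: str   # hash of the source that passed verification
    validated_domain: tuple   # (low, high) of the condition validated against
    intended_use: str         # the use the validation judgment was made for

def code_hash(source: str) -> str:
    return hashlib.sha256(source.encode("utf-8")).hexdigest()

def verification_current(claims: ModelClaims, source: str) -> bool:
    """Clock 1: goes stale whenever the code changes."""
    return claims.verified_code_hash == code_hash(source)

def validation_covers(claims: ModelClaims, use: str, condition: float) -> bool:
    """Clock 2: goes stale whenever the use or operating condition changes."""
    low, high = claims.validated_domain
    return use == claims.intended_use and low <= condition <= high

# Hypothetical example values.
source_v1 = "def step(x): return x + 1\n"
claims = ModelClaims(code_hash(source_v1), (0.0, 50.0), "range screening")

# Same code, pushed to a condition outside the validated domain.
print(verification_current(claims, source_v1))               # prints: True
print(validation_covers(claims, "range screening", 80.0))    # prints: False
```

Keeping the two checks as separate functions means a green result from the first cannot be mistaken for an answer to the second.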

Why "Tested" Is the Word to Distrust

"Tested" is the word that hides the gap, because it can truthfully describe a model that is internally flawless and externally false. The discipline is not to test more; it is to say which question has been answered — that the implementation is faithful to the model, that the model is faithful to the world, and over what domain each of those claims holds. Those are separate statements, earned by separate work, and collapsing them into one reassuring word is how a result that cannot be defended comes to be trusted anyway.

A simulation that can state both claims separately, with their limits, is one whose conclusions survive scrutiny. A simulation that can only say "it's tested" has answered the question that was easy to answer and left unanswered the one that actually decides whether the result is true. That difference is not visible in the output. It is visible only in whether anyone can say, when asked, which of the two questions the green checkmark was answering.
