How much testing do we need? You're asking the wrong question

Testing is an economic investment, not a cost to minimize. The question "how much testing do we need?" is broken: it should be "which risks justify which investments?". This post introduces the Cost of Quality framework, portfolio thinking, and a four-step process for making testing decisions grounded in risk.

I see arguments and battles about "the proper coverage number" on the internet all the time. This saddens me.

To me, this question is completely broken: it frames testing as a quantity problem. And quantity problems get quantity answers: "80% code coverage". Arbitrary numbers that feel responsible but aren't grounded in anything real.

To me, the right question is: which risks justify which investments?

A story about cutting coverage and improving quality

A few years ago I worked on a project sitting at 70% code coverage. In reports, this looked nice. In practice, the suite was full of flaky tests which failed intermittently for reasons unrelated to actual defects.

The costs were layered. The direct cost was obvious: the team was spending up to two man-days a week investigating failures that turned out to be noise. The planning cost was worse: the spikes were unpredictable, some weeks nothing, other weeks the whole team blocked, so sprint planning became guesswork. And the trust cost was the most damaging of all: even when tests passed, nobody was confident they meant anything. The suite had lost its credibility. There was no way to tell "we're fine" apart from "the tests just aren't catching anything".

I ran the numbers: the effort going into maintaining and investigating those flaky tests wasn't producing the value we expected from it. So we cut them. Coverage dropped to 40%. We filled the gaps with manual testing while we slowly rebuilt, writing reliable tests for the things that actually mattered.

Quality improved, and not because we tested less, but because the evidence that remained was trustworthy. When the suite passed, it meant something. When it flagged a problem, we acted on it instead of adding it to the "probably flaky" pile.

"Less testing" was actually "better investment".

Same question, completely different answers

I can think of two very different systems. An insulin pump: a defect can kill a patient. The software is FDA-regulated, subject to formal verification, extensive documentation, and extensive testing at every level. A corporate internal CMS used by 50 people to publish blog posts: a defect means wrong formatting, or a post doesn't save on the first try.

Ask both teams "how much testing do you need?" and you should get completely different answers. Not because one team is more professional or more serious than the other, but because the risks are different. The stakes are different. The consequences of failure are different.

Testing strategy follows from risk, not from a generic idea of "the right amount".

What testing actually does

Testing is a learning activity; you explore a system, build understanding of how it behaves, and discover where it breaks. That learning produces evidence: about defects, failure modes, risks. In economic terms, this is appraisal: generating information that supports decisions.

It does not reduce risk on its own.

Risk is reduced when findings lead to action: a fix, a mitigation, a control, a rollback plan, a guardrail. "We tested it" does not mean "it's safe". "We tested it, found X, and did Y about it" — that's risk reduction.

This is what made the flaky test situation so corrosive. The tests were running, but the evidence they produced couldn't be trusted. So no risk was actually being reduced. The suite looked like an investment. It was a cost.

Testing as investment

Investment, in the economic sense: an asset obtained at cost, with an expectation that the future value it creates will exceed that cost.

This is exactly what testing is. You spend time, money, and attention today to reduce the probability and impact of costly failures tomorrow. The return isn't guaranteed — you're estimating expected value under uncertainty, not booking a fixed outcome. But that's true of any investment.

Insurance works the same way. You pay premiums expecting never to file a claim; if you never do, that isn't waste, because the protection was real. The decision to insure is based on risk exposure, not on the assumption that you will definitely suffer a loss. Nobody buys scuba diving insurance for their daily commute, and construction sites invest heavily in safety equipment because the risks are real and the consequences of getting it wrong are severe.

Preventive healthcare works the same way. Catching a problem early is cheaper than treating it late. That's not a guarantee; it's a reason to invest.

But in my case, 70% coverage was a bad investment, closer to a pure cost: the effort wasn't producing the future value we needed, namely reliable evidence about risk.

The Cost of Quality framework

There's a useful vocabulary for where quality-related money goes. The Cost of Quality framework breaks it into four categories.

Prevention: training, standards, design reviews, threat modeling — activities that stop defects from being introduced.

Appraisal: testing, code reviews, static analysis, audits — activities that detect defects that exist.

Internal failure: rework, debugging, re-testing, delayed releases — costs incurred when defects are found before they reach users.

External failure: incidents, support load, lost revenue, fines, reputation damage — costs incurred when defects reach users.

External failure is the most expensive category, because it combines remediation cost with customer impact, operational disruption, and often lasting reputational damage.

The economic goal is to invest in prevention and appraisal at a level that reduces the more expensive failure costs. The two man-days a week we were spending on flaky-test investigation were internal failure cost: rework caused by unreliable appraisal. We were paying the failure tax without getting the appraisal benefit.
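The categorization above can be made concrete with a small tally. This is a hedged sketch, not the author's actual accounting: the activity names, the `CoQ` enum, and every number are hypothetical, chosen only so the internal-failure line roughly matches the "two man-days a week" from the story.

```python
# Hedged sketch: tallying weekly quality spend into the four
# Cost of Quality categories. All activities and hours are hypothetical.
from enum import Enum

class CoQ(Enum):
    PREVENTION = "prevention"
    APPRAISAL = "appraisal"
    INTERNAL_FAILURE = "internal failure"
    EXTERNAL_FAILURE = "external failure"

# (activity, category, person-hours per week) -- illustrative only
spend = [
    ("design reviews",               CoQ.PREVENTION,       4),
    ("writing/maintaining tests",    CoQ.APPRAISAL,       12),
    ("investigating flaky failures", CoQ.INTERNAL_FAILURE, 16),  # ~2 man-days
    ("production incident triage",   CoQ.EXTERNAL_FAILURE, 3),
]

totals = {c: 0 for c in CoQ}
for _, category, hours in spend:
    totals[category] += hours

for category, hours in totals.items():
    print(f"{category.value:>16}: {hours:>3} h/week")
```

Even a rough tally like this makes the imbalance visible: in this made-up example, internal failure consumes more hours than all appraisal work combined, which is exactly the pattern the flaky suite produced.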

Diminishing returns and portfolio thinking

So the framework helps you see where your money goes. But even when you're investing in the right category — appraisal — not every dollar spent there delivers equal value.

The first tests you write tend to catch the most obvious, highest-impact problems. Each additional test catches progressively rarer or lower-impact issues. At some point, the cost of the next test exceeds the expected value of the risk it covers.
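That break-even point can be written down directly: a test is worth adding while the expected loss it covers (probability of the failure times the cost of that failure) exceeds the cost of writing and maintaining it. The sketch below assumes made-up probabilities, impacts, and a flat per-test cost purely for illustration.

```python
# Hedged sketch: marginal value of the next test.
# All probabilities, impacts, and costs are hypothetical.

def expected_value(p_failure: float, failure_cost: float) -> float:
    """Expected loss avoided if the test catches this defect."""
    return p_failure * failure_cost

candidates = [
    ("checkout total is wrong", 0.05, 50_000),  # rare but expensive
    ("tooltip text truncated",  0.20,    200),  # common but cheap
]
test_cost = 500  # rough cost to write + maintain one test

for name, p, impact in candidates:
    ev = expected_value(p, impact)
    verdict = "worth it" if ev > test_cost else "skip for now"
    print(f"{name}: EV={ev:.0f} vs cost={test_cost} -> {verdict}")
```

The point of the arithmetic is the asymmetry: a rare-but-expensive failure can justify a test that a frequent-but-trivial one cannot, which is why "just add more tests" is the wrong heuristic.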

So you can't just "add more" — you need to choose what kind of investment you need for which risks. That's portfolio thinking: selecting a mix of complementary activities to achieve maximum risk reduction per resource spent. Static analysis catches things integration tests don't. Exploratory testing finds what scripted tests miss. The combination matters.

To build a portfolio, you need to know what you're investing against.

Start from risks

Investing against what, exactly? Risks to specific qualities your system needs to have. ISO 25010 gives a useful vocabulary for those: eight product quality characteristics that map to the categories of risk your system may carry: functional suitability, performance efficiency, compatibility, usability, reliability, security, maintainability, and portability.

For an insulin pump, the dominant risks sit in reliability, functional suitability, and security. For an internal CMS, they're mostly usability and maybe functional suitability. Different risk profiles mean different portfolios — different mixes of testing activity, different intensity, different coverage targets.

When investment decisions are grounded in risks, they become traceable and reviewable. You can explain why you're testing what you're testing. You can revisit the decision when conditions change.

What comes next

What we did on that project — cutting coverage from 70% to 40%, adding manual testing, rebuilding selectively — was an instinctive version of a more systematic process: calculate the cost, cut what isn't delivering, rebuild deliberately.

That process has four steps: identify and quantify risks, categorize and prioritize by exposure, decide where and how to invest in testing, then review and rebalance. It's a continuous cycle, not a one-time plan.
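The first two steps of that cycle reduce to a simple calculation: quantify each risk as exposure (probability times impact), then rank by exposure to decide where investment goes first. The sketch below is a minimal illustration with invented risks and numbers, echoing the insulin pump vs. CMS contrast from earlier.

```python
# Hedged sketch of steps 1-2 of the cycle: quantify risks as
# exposure = probability x impact, then rank by exposure.
# Risks, probabilities, and impacts are all hypothetical.

risks = [
    # (risk, probability per release, impact in $)
    ("dosage calculation error",   0.01, 1_000_000),
    ("report page renders slowly", 0.30,     2_000),
    ("draft post lost on save",    0.10,       500),
]

by_exposure = sorted(risks, key=lambda r: r[1] * r[2], reverse=True)

for name, p, impact in by_exposure:
    print(f"{name}: exposure = {p * impact:,.0f}")
```

Note how the ranking flips intuition: the least likely failure tops the list because its impact dominates. That ordering is what makes the resulting testing decisions traceable and reviewable.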

The next post in this series will cover the first step: how to identify and quantify the risks you're actually managing.

The full research behind this series is at BeyondQuality.
