7 votes

Coverage is not strongly correlated with test suite effectiveness

2 comments

  1. Seirdy

    I felt this article missed the forest for the trees. Coverage isn't just for verifying correctness; it's also useful for detecting API changes, eliminating dead code, and building a better understanding of program behavior.

    From a comment I posted on lobste.rs:

    I also find coverage extremely valuable for finding dead or unreachable code.

    I frequently find that uncovered code turns out to be unreachable by design, e.g. error handling for a function that never errors when provided with certain inputs. That unreachable-by-design error handling should be replaced with panics, since reaching it would imply a critical bug. Doing so combines well with fuzz-testing.
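
    To make that concrete, here is a minimal Go sketch (parsePort, its precondition, and the package name are made up for illustration, and the fuzz target is condensed into the same file for brevity; it uses Go's built-in fuzzing from the testing package):

    ```go
    package ports

    import (
        "regexp"
        "strconv"
        "testing"
    )

    // parsePort converts a pre-validated numeric string into a port number.
    // Callers are expected to pass input that already matched ^[0-9]{1,5}$,
    // so a failure from strconv.Atoi here can only mean a bug upstream.
    func parsePort(s string) int {
        n, err := strconv.Atoi(s)
        if err != nil {
            // Unreachable by design: panic rather than return an error, so
            // a fuzzer (or a coverage report) surfaces the violated assumption.
            panic("parsePort: input was not pre-validated: " + s)
        }
        return n
    }

    var validPort = regexp.MustCompile(`^[0-9]{1,5}$`)

    // FuzzParsePort throws arbitrary strings at parsePort, skipping anything
    // that violates the documented precondition. A panic therefore means the
    // "unreachable" error branch was reached with a supposedly valid input.
    func FuzzParsePort(f *testing.F) {
        f.Add("8080")
        f.Fuzz(func(t *testing.T, s string) {
            if !validPort.MatchString(s) {
                t.Skip()
            }
            parsePort(s)
        })
    }
    ```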

    It’s also useful for discovering properties of inputs. Say I run a function isOdd that never returns true and thus never allows a certain branch to be covered. I therefore know that somehow all inputs are even; I can then investigate why this is and perhaps learn more about the algorithms or validation the program uses.
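
    A minimal sketch of that situation (the bucket function is hypothetical; the isOdd check is the one from the example above):

    ```go
    package parity

    // bucket labels a value "odd" or "even". If a test suite that exercises
    // the public API never covers the "odd" branch, that is evidence that
    // every value reaching bucket is even, a property worth investigating
    // (perhaps upstream validation or generation guarantees it).
    func bucket(n int) string {
        if isOdd(n) {
            return "odd" // branch never covered: no observed input is odd
        }
        return "even"
    }

    func isOdd(n int) bool { return n%2 != 0 }
    ```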

    In other words, good coverage helps me design better programs; it’s not just a bug-finding tool.

    This only holds true if I have a plethora of test cases (especially if I employ something like property testing) and if tests lean a little towards integration on the (contrived) “unit -> integration” test spectrum. That is, only test user-facing parts and see what gets covered, and see how much code gets covered for each user-facing component.

    3 votes
  2. mtset

    This whole article is quite short, and well summarized in just one sentence:

    More tests do find more bugs, but it's the number of tests and not their code coverage that has most of the predictive value.

    That's quite a blow to the conventional wisdom that high code coverage is predictive of - even necessary for - an effective test programme.

    2 votes