What is mutation testing?

How it works in 51 words

Mutation testing is conceptually quite simple.

Faults (or mutations) are automatically seeded into your code, then your tests are run. If your tests fail then the mutation is killed; if your tests pass then the mutation lived.

The quality of your tests can be gauged from the percentage of mutations killed.

What?

Really it is quite simple

To put it another way: PIT runs your unit tests against automatically modified versions of your application code. When the application code changes, it should produce different results and cause the unit tests to fail. If a unit test does not fail in this situation, it may indicate an issue with the test suite.
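The idea can be sketched by hand in a few lines of Java. This is only an illustration, not how PIT works internally: the class and method names are hypothetical, the "mutant" is written out manually rather than generated, and the tests are plain methods instead of real unit tests.

```java
import java.util.function.IntPredicate;

// Hypothetical illustration only: a hand-written mutant, not PIT output.
final class MutationDemo {

    // Original production code: adults are 18 or older.
    static boolean isAdult(int age) {
        return age >= 18;
    }

    // A mutant of the kind a tool might seed: ">=" changed to ">".
    static boolean isAdultMutant(int age) {
        return age > 18;
    }

    // Weak test: only checks values far away from the boundary.
    static boolean weakTestPasses(IntPredicate impl) {
        return impl.test(30) && !impl.test(5);
    }

    // Stronger test: also checks both sides of the boundary.
    static boolean strongTestPasses(IntPredicate impl) {
        return impl.test(30) && !impl.test(5)
                && impl.test(18) && !impl.test(17);
    }

    public static void main(String[] args) {
        // Both tests pass against the original code.
        System.out.println(weakTestPasses(MutationDemo::isAdult));         // true
        System.out.println(strongTestPasses(MutationDemo::isAdult));       // true

        // The weak test still passes against the mutant: the mutation lived.
        System.out.println(weakTestPasses(MutationDemo::isAdultMutant));   // true

        // The strong test fails against the mutant: the mutation is killed.
        System.out.println(strongTestPasses(MutationDemo::isAdultMutant)); // false
    }
}
```

The surviving mutant points at the missing boundary check in the weak test, which is exactly the kind of gap a mutation testing tool is meant to surface.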

Why?

What's wrong with line coverage?

Traditional test coverage (i.e. line, statement, branch, etc.) measures only which code is executed by your tests. It does not check that your tests are actually able to detect faults in the executed code. It is therefore only able to identify code that is definitely not tested.

The most extreme examples of the problem are tests with no assertions. Fortunately, these are uncommon in most code bases. Much more common is code that is only partially tested by its suite. A suite that only partially tests code can still execute all its branches.

As it is actually able to detect whether each statement is meaningfully tested, mutation testing is the gold standard against which all other types of coverage are measured.
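The gap between executing code and checking it can be shown with a small, hypothetical Java sketch. The names below are invented for illustration, the tests are plain methods rather than real unit tests, and the mutant is written by hand.

```java
import java.util.function.IntUnaryOperator;

// Hypothetical illustration only: full branch coverage with a partial check.
final class CoverageGapDemo {

    // Original production code with two branches.
    static int shippingCost(int orderTotal) {
        if (orderTotal >= 100) {
            return 0; // free shipping
        }
        return 5;     // flat fee
    }

    // Hand-written mutant: the flat fee constant has been changed.
    static int shippingCostMutant(int orderTotal) {
        if (orderTotal >= 100) {
            return 0;
        }
        return 6;
    }

    // Executes both branches, but only checks the free-shipping result.
    // A coverage tool would report every line and branch as covered.
    static boolean partialTestPasses(IntUnaryOperator impl) {
        int free = impl.applyAsInt(150);
        impl.applyAsInt(20); // branch executed, result ignored
        return free == 0;
    }

    // Executes both branches and checks both results.
    static boolean fullTestPasses(IntUnaryOperator impl) {
        return impl.applyAsInt(150) == 0 && impl.applyAsInt(20) == 5;
    }

    public static void main(String[] args) {
        // The partial test cannot tell the mutant from the original.
        System.out.println(partialTestPasses(CoverageGapDemo::shippingCost));       // true
        System.out.println(partialTestPasses(CoverageGapDemo::shippingCostMutant)); // true

        // The full test detects the changed constant.
        System.out.println(fullTestPasses(CoverageGapDemo::shippingCost));          // true
        System.out.println(fullTestPasses(CoverageGapDemo::shippingCostMutant));    // false
    }
}
```

Both tests would yield the same line and branch coverage figures; only running them against the mutant distinguishes them.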

Why PIT?

There are other mutation testing systems for Java, but they are not widely used.

They are mostly slow, difficult to use and written to meet the needs of academic research rather than real development teams.

PIT is different. It's:

fast - can analyse in minutes what would take earlier systems days
easy to use - works with Ant, Maven, Gradle and others
actively developed
actively supported

The reports produced by PIT are in an easy-to-read format combining line coverage and mutation coverage information.

Example snippet taken from coverage report of Wicket Core

Light green shows line coverage, dark green shows mutation coverage.

Light pink shows lack of line coverage, dark pink shows lack of mutation coverage.

How to use it?

The most effective way to use mutation testing is to run it frequently against only the code that has been changed.

Once it has been integrated into the build file, pitest can be run locally by developers, or automatically against pull requests and merge requests using arcmutate.

Pro Version

Produced by the same team, arcmutate extends pitest, adding support for Kotlin, Spring, Git and more.

Success stories

"... While we also used Clover for basic code coverage, as we got our PIT mutation coverage up into the 90s I stopped paying much attention to Clover"

"... This gave us extreme confidence in our tests ... The effects of that confidence were outstanding."

Kyle Winter, Lead Software Engineer, The Ladders

". . . from my own personal experience of using PIT I've found it not only gives me confidence in the quality of both my own and others unit test quality, but has actually been a design aid in so much that as well as finding untested code, it can also find redundant code that when deleted still implements the intended functionality."

Matt Kirk, Lead Developer, British Sky Broadcasting

Real world mutation testing