Unit tests: the obvious, the bad and the good

A while ago I ran the technical screening of a developer for one of our biggest clients. This developer was a huge fan of unit testing. Whatever software puzzle I submitted could be solved with a unit test.

Here is a sample of the interview:

Say I have a C++ program that crashes because it exhausts its virtual memory space. What do you do to solve the problem?
Well... Err... I'd write a unit test... And then... Err... Problem solved!

Let's sum this up with a chart:

Needless to say, the person wasn't hired. What I realized then is that unit testing is not only often misunderstood but also overrated. I'm not implying that we ignore unit testing at Bureau 14; we just exercise some moderation in its use...

The obvious unit test

Let's say we want to implement a string library in C++ for some reason. I'm talking about something very simple, with an interface similar to the STL string. We're going to call it xt_string.

Test-driven development is about writing tests before writing code. If you ask me, when you write the tests is not that important. I'd even say that writing tests before writing code can make you believe you don't need to write specifications or documentation (agile or not, write specifications, thanks).

Nevertheless, we're going to write obvious tests that I often call "space continuum integrity tests". Basically when these tests fail, something is terribly wrong either with your code or the physical laws of the universe. Generally, it's a problem within the code.

xt_string st1;
BOOST_CHECK(st1.empty());
xt_string st2("test");
BOOST_CHECK_EQUAL(st2, "test");

The bad unit test

Because we care about performance, our xt_string uses custom allocated buffers aligned on the cache. And lucky us, we have a function to validate that! Quick, to the unit tester!

xt_string st("oh my");
BOOST_CHECK(is_cache_aligned(st.buffer()));

The buffer method returns the underlying buffer, as you probably inferred.

This test is horribly bad. Horribly. Why? Well because it's in "The bad unit test" section for instance.

It seems clever at first, but you're shooting yourself in the foot with a bullet that travels into the future. You pull the trigger now and your foot explodes one year later.

We have a simple policy here: each unit test must be ignorant of the internals. In other words, we only do black box unit tests.

Here is a non-exhaustive list of reasons:

The most obvious: you're making any internal rework of the class twice as expensive. First you change the class, then the unit test.
Unit testing is all about enabling you to modify your code and get some early validation. If you need to change the unit test whenever you change the class, you no longer have that validation. You're just doing the same work twice, which doubles the opportunities for error.
People reading the unit test might base some code on it, as unit tests are often used as an example of "how to use the object". Thanks to this test, users will come to rely on assumptions that prevent modification, or make it really risky.

In short: a good unit test validates features and their implications without assuming anything about the implementation.
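To make the point concrete, here is a minimal sketch. The two classes string_a and string_b are hypothetical stand-ins (they are not xt_string) that expose the same public interface on top of different internals. The check function black_box_checks is also made up for this illustration and uses plain boolean logic rather than Boost.Test so that it compiles on its own.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstring>
#include <string>
#include <vector>

// Hypothetical implementation A: stores its characters in a std::string.
class string_a {
public:
    string_a() = default;
    string_a(const char* s) : data_(s) {}
    bool empty() const { return data_.empty(); }
    std::size_t size() const { return data_.size(); }
    bool operator==(const char* s) const { return data_ == s; }
private:
    std::string data_;
};

// Hypothetical implementation B: same public interface, different internals.
class string_b {
public:
    string_b() = default;
    string_b(const char* s) : data_(s, s + std::strlen(s)) {}
    bool empty() const { return data_.empty(); }
    std::size_t size() const { return data_.size(); }
    bool operator==(const char* s) const {
        return data_.size() == std::strlen(s) &&
               std::equal(data_.begin(), data_.end(), s);
    }
private:
    std::vector<char> data_;
};

// Black-box checks: only the public interface is used, so the same
// function validates either implementation without modification.
template <typename String>
bool black_box_checks() {
    String empty_one;
    String filled("test");
    return empty_one.empty()
        && !filled.empty()
        && filled.size() == 4
        && filled == "test"
        && !(filled == "tes")
        && !(filled == "testing");
}
```

Swapping string_a for string_b is the kind of internal rework discussed above. The checks don't change, which is exactly what makes them useful as validation.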

The good unit test

So that's all there is to a good unit test? Actually, there's more to it.

Let's have a look at this:

xt_string st1("yes");
xt_string st2("no");
BOOST_CHECK_EQUAL(st1, "yes");
BOOST_CHECK_EQUAL(st2, "no");
st1 = st2;
BOOST_CHECK_EQUAL(st1, st2);
xt_string st3("maybe");
BOOST_CHECK_EQUAL(st3, "maybe");
st2 = st3;
BOOST_CHECK_EQUAL(st2, st3);
BOOST_CHECK(st1 != st2);
BOOST_CHECK(st1 != st3);

Looks pretty much like an obvious test of assignment and equality, doesn't it? Except that it's more than that. It also tests the underlying memory management. What's good about this test is that if you start sharing string buffers at some point in the future, it's going to make sure that copy-on-write works.

Of course this example is far from complete and much more should be written to have some reasonable degree of validation.

Whatever implementation of strings you choose, the above test must remain true. You can rewrite the string class from scratch and you won't have to change the test. It's obvious, easy to understand and detects side effects.
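As an illustration of the kind of side effect such a test can catch, here is a rough sketch of a string that shares its buffer between copies. Everything in it is an assumption made for the example: cow_string is not xt_string, the append method is hypothetical, and sharing through std::shared_ptr is just one simple way to get copy-on-write in single-threaded code. The check adds a mutation after the assignment, which is where a broken copy-on-write would show up.

```cpp
#include <memory>
#include <string>

// Hypothetical copy-on-write string (single-threaded sketch only).
class cow_string {
public:
    cow_string(const char* s = "") : buf_(std::make_shared<std::string>(s)) {}

    // Copy construction and assignment are compiler-generated:
    // they share the buffer instead of duplicating it.

    // Hypothetical mutator: detach from a shared buffer before writing.
    void append(const char* s) {
        if (buf_.use_count() > 1) {
            buf_ = std::make_shared<std::string>(*buf_);
        }
        buf_->append(s);
    }

    friend bool operator==(const cow_string& a, const cow_string& b) {
        return *a.buf_ == *b.buf_;
    }
    friend bool operator!=(const cow_string& a, const cow_string& b) {
        return !(a == b);
    }

private:
    std::shared_ptr<std::string> buf_;
};

// Black-box check: assignment, equality, then a mutation on one side.
bool assignment_and_mutation_checks() {
    cow_string st1("yes");
    cow_string st2("no");
    bool ok = st1 == "yes" && st2 == "no";

    st1 = st2;                    // both now share one buffer
    ok = ok && st1 == st2;

    st2.append("pe");             // must not leak into st1
    ok = ok && st2 == "nope" && st1 == "no" && st1 != st2;
    return ok;
}
```

If you remove the detach step in append, the last line of the check fails because st1 silently becomes "nope" too. The check itself never looks at the buffer, so it would work just as well against a string class that copies eagerly.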

A good unit test must have a certain degree of clairvoyance: the ability to detect future issues is the hallmark of a good unit test.

Think about what you might do with the tested code in the forthcoming years and you'll write good unit tests.

More about writing good unit tests

The real benefits of unit testing come when you validate the interactions of your different structures and functions.

Ideally, your unit tests should reflect what the users are going to do with your program. Don't try to have 100% coverage immediately (or ever, actually). Instead, aim to cover all typical usage scenarios. That means that after the build, when the tests are run, you know your program won't crash immediately or spit out blatantly wrong results. This doesn't replace integration testing and regression testing, but it gives quick feedback about what you're doing.

The second good reflex: whenever a bug is found and reproduced, write a unit test that exhibits the problem and then fix it. In doing so you reduce the probability of getting the same bug twice to almost zero.
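Here is a small sketch of that reflex. The trim helper and the bug are both invented for the example: imagine an earlier version without the early return, which called substr with std::string::npos as the position on an all-space input and therefore threw std::out_of_range. The regression check pins that exact input down, next to a check of typical usage.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: strips leading and trailing spaces.
std::string trim(const std::string& s) {
    const std::size_t first = s.find_first_not_of(' ');
    if (first == std::string::npos) {
        return std::string();     // the fix: nothing but spaces (or empty)
    }
    const std::size_t last = s.find_last_not_of(' ');
    return s.substr(first, last - first + 1);
}

// Regression check: reproduces the input that triggered the imagined bug.
bool regression_trim_all_spaces() {
    return trim("   ").empty() && trim("").empty();
}

// Typical usage, so that the fix is known not to break the normal case.
bool trim_typical_usage() {
    return trim("  oh my ") == "oh my" && trim("test") == "test";
}
```

The regression check is written against the observable behaviour only, so it stays valid if trim is later rewritten in a completely different way.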

A few more words

Passing unit tests doesn't mean your program works. Never overestimate the reliability and coverage of unit testing.

Most of all, never forget that unit testing is here to increase software quality and save time. Never put your team in a position where writing tests takes too much time away from designing new features and fixing bugs.

Your customers won't like it.