Good and Bad Tests

16. January, 2017

How do you distinguish good from bad tests in your code?

Check these criteria. Good tests

  • Nail down expectations
  • Monitor assumptions
  • Help to locate the cause of a failure
  • Document usage patterns
  • Allow to change code
  • Allow to verify changes
  • Are short (LOC + time)

Bad tests

  • Waste development time
  • Execute many, many lines of code
  • Prevent code changes
  • Need more time to write than the code they test
  • Need a lot of code to set up
  • Take ages to execute
  • Are hard to run

Expectations

There are a lot of checks in your compiler. Those help to catch mistakes you make. Do the same with your tests. There are a lot of things that compilers don’t check: File encodings, existence of files, existence of config options, types of config options.

Use tests to nail down your expectations. Read config files and validate the odd option.

Create a test which collects the whole configuration of your program and checks it against a known state. Check that each config option is set only once (or at least that it has the same value in all places).

When you need to translate your texts, add tests which make sure that you have all the texts that you need, that texts are unique, etc.

Assumptions

Convention over configuration only works when everyone agrees what the convention is. Conventions are assumptions. Your brain has to know them since they are no longer in the code. If this approach fails for you, write a test that validates your assumptions.

Check that code throws the exceptions that you expect.

If you have found a bug in a framework and added a workaround, add a test which fails when the bug is fixed. Add a comment “If this test fails, you can remove the workaround.”

Speed

The world speeds up. No one can afford slow tests, tests that are hard to understand, hard to maintain, tests that get in the way of Get-Things-Done™. Make sure you can run your tests at the touch of a button. Make sure they never fail unless they should. Make sure they fail when they should. Make sure they are small (= execute fast, few lines to understand, little code to write, easy to change, cheap to delete). Ten tests, each asserting a single fact, are better than one test that asserts ten facts. If your tests run for more than ten seconds, you lose.

Documentation

There is code rot. But long before that, there is documentation rot. Who has time to update the comments and documentation after a code change?

Why not document code usage in tests? Tests tell you the Truth™. Why give someone 100 pages of words they can’t trust when you can give them 100 unit tests they can execute?

Conclusion

Make your life easier. Stop wasting time in your debugger, begging for production log files, running code in your head. Write a good test today, it will watch your back for as long as the project lives. Write a thousand good tests and they will be like an army of angels, warding you from suffering, every time you press that button.


Quality in Software

19. November, 2016

Every software developer has seen too much bad code. Which raises the question what good examples are there?

Python, for one. Coverity, which scans software for all kinds of flaws, ranks the quality of Python among the best. Just 0.005 defects per 1,000 lines of code (the average being between 15/kloc and 50/kloc).


Open Source As Good As Proprietary Software

28. February, 2012

The Coverity Scan 2011 Open Source Integrity Report (registration necessary) says: “Open source quality is on par with proprietary code quality, particularly in cases where codebases are of similar size.”

Which isn’t that surprising considering that it’s the same people who write both.

But there are a couple of hard number in the report which are interesting:

Linux 2.6 has about 0.62 defects per 1000 lines of code (KLOC) which Coverity says “is roughly identical to that of its proprietary codebase counterparts.” They can’t tell names but I guess the counterparts are Windows and Mac OS X. They have 0.64 defects per KLOC.

The industry average is 1.0 defects per KLOC which matches well with my (more anecdotal) knowledge that the best software developers make about 3-4 mistakes per KLOC of which 75% are found during development.