What does TDD help you do?

published: Fri, 17-Jun-2005   |   updated: Fri, 17-Jun-2005

This is a bit of a trick question. Many developers, when they first come across TDD (test-driven development), assume that it is a technique for writing quality code. Well, it certainly looks that way: you write the test and then you write the code to pass the test. Refactor a bit to get rid of duplication and, bam, you get well-tested code of high quality and, as a side benefit, 100% code coverage. Not bad, eh?

Actually, those who do TDD for any length of time recognize that writing quality code is merely a side benefit of the methodology. If you do TDD properly and really refactor the heck out of your code, you'll find that TDD turns into a methodology for discovering the ultimate design of your application. You are simply discovering the design through writing code rather than through drawing UML diagrams in Visio.

Let's see how this works. The first step in any TDD project is to write a test. What test? Where does the idea for that come from? Well, presumably we have some requirements for the software we're trying to write, otherwise why would we be writing it?

OK. Let's suppose at this point that we've been asked to write a program that goes through a project's C# code, extracts the literal strings, adds them to a resource file on the fly, and replaces each literal string with code that gets the string from the resource. This program should run as part of the build process.

So we have our requirements. I'm sure the problem space is sufficiently familiar that we don't have to ask the customer to divide the requirements any further. But what should our first test be?

It's difficult, eh? There are several issues that we must solve in these requirements. How do we find the source files in a project? Where do we get the name of the resource file? Is it part of the project file too? What about scanning a source file looking for literal strings? We'll have to worry about comments too, since a "literal string" may be embedded in one and we should ignore it. Should we worry about column names and the like in SQL statements, since these would not have to be localized? And so on and so forth.

So, ignore all these questions. Or rather list them on a pad so we don't forget them. What is the simplest test we can think of?

I'd start at the top: write a test to open a project file. One that exists.

using NUnit.Framework;
[TestFixture]
public class TestProjectFileProcessor {
  [Test] public void EnsureExistingFileCanBeOpened() {
    ProjectFileProcessor pfp = new ProjectFileProcessor("someproject.csproj");
    Assert.IsTrue(pfp != null, "Processor object is null after new");
  }
}

The Assert.IsTrue() call here is a little superfluous (after all, if the new succeeds, the language guarantees that the object will be non-null, providing there wasn't an out-of-band exception prior to the assignment), but the test as a whole also ensures that the constructor does not throw an exception. More importantly, in writing the test I've had to make a decision: I need a class that processes a project file. It also makes sense that the constructor accepts the name of a project file. I don't know yet what this class may do or how it will do it, but at least I've made a start.

Compile and it fails: there is no ProjectFileProcessor class. I implement it the simplest way possible (which doesn't involve opening the file <g>). Compile, run, test passes.
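As a rough sketch, the simplest implementation that gets this first test to pass could be as small as the following. The parameter name is my own choice, and the empty constructor body is deliberate: no test yet demands that the file be touched.

```csharp
// Minimal sketch: just enough code to make EnsureExistingFileCanBeOpened pass.
public class ProjectFileProcessor {
  // The file name is accepted but not used yet, because no test
  // requires the file to be opened or even to exist.
  public ProjectFileProcessor(string fileName) {
  }
}
```

It looks like cheating, but that is the point: the next test is what forces the class to do something real.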

So what happens when the file doesn't exist? I think the constructor should throw an exception. Write a test to test for it (hint, use the [ExpectedException] attribute), then write the code.
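Here is one way that step might look. Treat it as a sketch under a few assumptions: I've picked FileNotFoundException as the exception type, the file name nosuchproject.csproj is just a name that shouldn't exist, and I've used Assert.Throws rather than the [ExpectedException] attribute because the attribute was dropped in NUnit 3. On NUnit 2.x the attribute form works just as well.

```csharp
using System.IO;
using NUnit.Framework;

public class ProjectFileProcessor {
  public ProjectFileProcessor(string fileName) {
    // Assumption: a missing project file is reported with FileNotFoundException.
    if (!File.Exists(fileName)) {
      throw new FileNotFoundException("Project file not found", fileName);
    }
  }
}

[TestFixture]
public class TestProjectFileProcessor {
  [Test] public void EnsureMissingFileThrows() {
    // Hypothetical file name; the test assumes no such file is present.
    Assert.Throws<FileNotFoundException>(
      () => { new ProjectFileProcessor("nosuchproject.csproj"); });
  }
}
```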

Maybe the constructor shouldn't throw. Perhaps a better model should use the Open/Close pattern (that is, there should be an Open() method that opens the file, and a Close() method that closes it after processing is over). Alter the two tests for this, write the Open() and Close() methods on the main class and see the test pass.
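A sketch of the Open/Close version might look like this. The IsOpen property is a hypothetical helper I've added so the tests have something to assert on, the temporary file stands in for a real project file, and the choice of File.OpenText (which throws FileNotFoundException for a missing file) is my assumption, not a requirement.

```csharp
using System.IO;
using NUnit.Framework;

public class ProjectFileProcessor {
  private readonly string fileName;
  private StreamReader reader;  // null while the file is closed

  public ProjectFileProcessor(string fileName) {
    this.fileName = fileName;   // the constructor no longer touches the file
  }

  // Hypothetical helper so that tests can observe the state.
  public bool IsOpen { get { return reader != null; } }

  public void Open() {
    reader = File.OpenText(fileName);  // throws if the file is missing
  }

  public void Close() {
    if (reader != null) {
      reader.Close();
      reader = null;
    }
  }
}

[TestFixture]
public class TestProjectFileProcessor {
  [Test] public void EnsureExistingFileCanBeOpenedAndClosed() {
    string path = Path.GetTempFileName();  // stand-in for a project file
    try {
      ProjectFileProcessor pfp = new ProjectFileProcessor(path);
      pfp.Open();
      Assert.IsTrue(pfp.IsOpen, "File should be open after Open()");
      pfp.Close();
      Assert.IsFalse(pfp.IsOpen, "File should be closed after Close()");
    } finally {
      File.Delete(path);
    }
  }

  [Test] public void EnsureOpeningMissingFileThrows() {
    ProjectFileProcessor pfp = new ProjectFileProcessor("nosuchproject.csproj");
    Assert.Throws<FileNotFoundException>(() => pfp.Open());
  }
}
```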

Hang on, though, the Open/Close pattern implies IDisposable, especially for a file. The Dispose() method should call Close() (or vice-versa, of course). So, write a test for Dispose(), and implement IDisposable and Dispose() in your class.
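Sketched out, the Dispose() step could look something like this. As before, IsOpen is a hypothetical helper for the test's benefit, the temporary file is a stand-in for a project file, and having Dispose() delegate to Close() is only one of the two options mentioned.

```csharp
using System;
using System.IO;
using NUnit.Framework;

public class ProjectFileProcessor : IDisposable {
  private readonly string fileName;
  private StreamReader reader;  // null while the file is closed

  public ProjectFileProcessor(string fileName) { this.fileName = fileName; }

  // Hypothetical helper so that tests can observe the state.
  public bool IsOpen { get { return reader != null; } }

  public void Open() { reader = File.OpenText(fileName); }

  public void Close() {
    if (reader != null) {
      reader.Close();
      reader = null;
    }
  }

  // Dispose delegates to Close, so calling either one (or both) is safe.
  public void Dispose() { Close(); }
}

[TestFixture]
public class TestProjectFileProcessorDispose {
  [Test] public void EnsureDisposeClosesFile() {
    string path = Path.GetTempFileName();  // stand-in for a project file
    try {
      ProjectFileProcessor pfp = new ProjectFileProcessor(path);
      using (pfp) {
        pfp.Open();
        Assert.IsTrue(pfp.IsOpen, "File should be open inside the using block");
      }
      // Leaving the using block calls Dispose(), which should close the file.
      Assert.IsFalse(pfp.IsOpen, "Dispose() should have closed the file");
    } finally {
      File.Delete(path);
    }
  }
}
```

Notice how the test reads: the using block is exactly how I'd want calling code to look, which is the test telling me something about the design.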

And so on and so forth. Notice that (1) I'm making no assumptions about the final design of my program; I'm discovering it through a cycle of writing tests, writing code, and refactoring, and (2) each step is tiny and quick to complete. You'll also find, as you go along in this rapid-fire cycle, that you become loath to violate the single responsibility rule (that is, each class has one responsibility and one only). At some point I can imagine that I'll come to the conclusion that ProjectFileProcessor is the wrong name for this class: it deals with opening and closing a project file, and maybe reading the lines from it. Time to rename it to ProjectFile, say.

I can also imagine a SourceFileExtractor class that knows how to extract source file names from an open project file instance. But, whoa, watch out! Notice that I'm trying to design up front now. Stop that; let yourself be guided by the tests and how you want the tests to read. They will guide the class model and the calls on the class model.

So, returning to the original question: TDD is a technique for discovering a class model, a domain design. The fact that you write code in doing so is a double benefit: you automatically get an implementation of your design, and you know that the implementation is fully tested.