Thread pool (part 2)

published: Mon, 12-Dec-2005   |   updated: Mon, 12-Dec-2005

Time to start on this little project of mine to design and write a thread pool class. No sooner had I decided on this course of action, though, than first, work and my personal life exploded (sigh), and second, Maxx, a fellow architect here, showed me Joe Duffy's post on why you shouldn't write a thread pool or, to be fair, why many people's reasons for doing so are not valid. Hah! I shall forge on, since Joe's argument is mostly about performance. Nevertheless, I shall bear his posts on multithreading in mind as I TDD myself to a solution.

Last time I wrote down a set of requirements for a thread pool that would suit the particular scenario that kicked off all this activity: we want to initiate several requests to other machines to get data from them and, since the request/reply protocol is lengthy, the executing thread for a request/reply ends up being blocked for a good period of time.

First thing is to create a new solution in Visual Studio. (Be warned: I'm going to be using .NET 2.0 in this series so you'll need VS2005. If you don't have it, use C# Express instead. While you're at it, grab NUnit and TestDriven.NET, since I'll be using those as well.)

Now I already have a JmBucknall.Threading project and a JmBucknall.Structures project, so I created a new solution for my ThreadPool implementation and added these two projects (and their related test projects) to it. The reason for adding the JmBucknall.Structures project is that it has the limited priority queue that I've already intimated I was going to use.

So at this point, I've just set up my environment with some stuff I already had. Nothing new. Just to make sure I had it all set up right, I built the solution and ran my tests. Green bar (although since I was using TestDriven.NET, it merely said "X succeeded, 0 failed" in the output window).

Time now for something new. Looking at my requirements list, there are several points of attack. There are all the requirements for the thread pool for a start, there are some for the tasks, and there's logging and statistics.

I think I'm going to start with the task. It seems pretty isolated from the rest and, apart from the thread pool itself, is the primary interface point for the user. It also promotes a kind of bottom-up design which is how I tend to think of things.

"The task should have a priority" seems a good place to start. I add a new item for the Task class in the JmBucknall.Threading project and a test class for it in the .Test project. I then write the following test:

using NUnit.Framework;

namespace JmBucknall.Threading.Test {
  [TestFixture]
  public class TaskTester {

    [Test]
    public void CheckNewTaskHasPriority() {
      // Construct with priority 1 and expect to read the same value back
      Task<int> t = new Task<int>(1);
      Assert.AreEqual(1, t.Priority,
        "Task's priority should be same as parameter to constructor");
    }
  }
}

In other words: the Task class is generic on the priority type, and if I construct a task with an int priority of 1 then the value of the Priority property must be 1. I compile, it fails (obviously, since I haven't written the implementation code yet).

Aside: many people don't view this step -- a failed compilation -- as a failing test. Indeed, it's hard to argue that it is since the only thing that has been run up to now is the compiler. I don't really care to be honest; I tend to do it anyway since it gives me a list of code things I need to write: the compile errors in the Task list.

So I write the simplest thing to make the test pass.

namespace JmBucknall.Threading {
  public class Task<T> {

    public Task(T priority) {
    }

    public T Priority {
      get { return 1; }
    }
  }
}

But it doesn't even compile, since the constant 1 is an int and can't be converted to the type parameter T. Cool, the compiler is making sure that the simplest thing is not even valid code. So, I give up on that and write the simplest thing that would work correctly.

  public class Task<T> {
    private T priority;

    public Task(T priority) {
      this.priority = priority;
    }

    public T Priority {
      get { return priority; }
    }
  }

Actually, this is an important question in TDD, one that doesn't have a satisfactory answer apart from "it depends" or even "do what you feel comfortable with". How much should you write to make a test pass? The example I gave is the usual one: write something that uses a constant, the constant that the test is expecting. I must admit I tend not to do something so simple, since I can foresee the next test coming up that will cause the simple code to fail.
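To make that "next test" concrete, here is a sketch of what it might look like. The fixture name, the test name, and the value 7 are made up for illustration; the Task<T> class is the one shown above, repeated only so the sketch compiles on its own (it still needs a reference to NUnit).

```csharp
using NUnit.Framework;

namespace JmBucknall.Threading {
  // Same class as above, repeated so the sketch stands alone.
  public class Task<T> {
    private T priority;

    public Task(T priority) {
      this.priority = priority;
    }

    public T Priority {
      get { return priority; }
    }
  }
}

namespace JmBucknall.Threading.Test {
  [TestFixture]
  public class TaskTriangulationTester {  // hypothetical fixture name

    [Test]
    public void CheckTasksKeepTheirOwnPriorities() {
      // Two different priorities: a getter that returns one fixed value
      // cannot satisfy both assertions.
      Task<int> low = new Task<int>(1);
      Task<int> high = new Task<int>(7);
      Assert.AreEqual(1, low.Priority, "first task should keep priority 1");
      Assert.AreEqual(7, high.Priority, "second task should keep priority 7");
    }
  }
}
```

Writing the field-backed property straight away simply skips the cycle in which this second test would have failed.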

Should I make the property writable? No: there are no requirements that talk about altering the priority of a task once created and therefore I defer to the YAGNI ("you ain't gonna need it") principle. In other words, only write code you need right now; don't try to second-guess what you might need in the future: that particular future may never turn up and you'll have wasted all that time.

Next test then. A task should be executable (the thread pool will be executing it eventually), but I'm unsure of how to test this. I think the problem is that so far I'm creating a concrete class, when in reality what will probably be required is an abstract class (or an interface?) from which the user will be designing other task classes to do specific work. I could just add an Execute() virtual method and make sure it's called by the thread pool later on. But in reality I won't just yet since I should only write it once I have a test that calls it. And that will only happen once I get to the thread pool implementation.
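For reference, the abstract-class option might look roughly like the sketch below. Nothing here is committed code: ExecutableTask and CountingTask are hypothetical names chosen for illustration, and only the Execute() method comes from the discussion above.

```csharp
using System;

// Hypothetical base class: a thread pool would only ever call Execute().
public abstract class ExecutableTask {
  // Subclasses override this to do their specific work.
  public abstract void Execute();
}

// Hypothetical concrete task, here only to show the shape of a subclass.
public class CountingTask : ExecutableTask {
  private int executions;

  // Number of times Execute() has been called on this instance.
  public int Executions {
    get { return executions; }
  }

  public override void Execute() {
    executions++;
  }
}
```

A subclass like CountingTask is also the sort of thing a later thread pool test could use to check that Execute() really was called.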

Having written this (admittedly minor) post (and left it in a state of limbo for several days), I was left with a weird feeling that I'd gone wrong somewhere. I know that the requirements state that a task should have a priority, but does that actually mean that the task object should have the priority as an attribute, or should the priority be considered more of a parameter that you pass along when you enqueue the task? Perhaps the priority should be more related to the enqueuing mechanism in the thread pool than to the task itself?

After all, consider this: if the priority were part of the task, the thread pool would have to ask the task what priority it was. That goes against the "Tell, Don't Ask" principle. This principle, which people tend to forget (like me just now!) or just haven't heard of, states that you should write client code (code that uses an object) to tell an object what to do rather than ask the object for various bits of data and then do the work itself.

So, the thread pool can legitimately tell the task to "Execute yourself" and perhaps eventually to "Abort yourself", but should refrain from asking the task "Hey, what's your priority?" in order to do some work with it.
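A minimal sketch of that shape, assuming the priority is passed in at enqueue time. Everything named here (IRunnable, PriorityRunner, RecordingTask) is hypothetical, the runner is single-threaded with no worker threads at all, and treating a lower number as a higher priority is an arbitrary choice. It is a stand-in to show who tells what to whom, not the thread pool or the limited priority queue themselves.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical task contract: the runner only ever tells a task to execute.
public interface IRunnable {
  void Execute();
}

// Stand-in for the pool: it stores the priority itself and never asks the task.
public class PriorityRunner {
  // One FIFO queue per priority; SortedDictionary enumerates keys in ascending order.
  private readonly SortedDictionary<int, Queue<IRunnable>> queues =
    new SortedDictionary<int, Queue<IRunnable>>();

  // The caller supplies the priority; the task exposes no Priority property.
  public void Enqueue(IRunnable task, int priority) {
    Queue<IRunnable> queue;
    if (!queues.TryGetValue(priority, out queue)) {
      queue = new Queue<IRunnable>();
      queues.Add(priority, queue);
    }
    queue.Enqueue(task);
  }

  // Tells every queued task to execute, lowest priority number first.
  public void RunAll() {
    foreach (KeyValuePair<int, Queue<IRunnable>> pair in queues) {
      while (pair.Value.Count > 0) {
        pair.Value.Dequeue().Execute();
      }
    }
  }
}

// Hypothetical task that appends its name to a shared log when executed.
public class RecordingTask : IRunnable {
  private readonly string name;
  private readonly List<string> log;

  public RecordingTask(string name, List<string> log) {
    this.name = name;
    this.log = log;
  }

  public void Execute() {
    log.Add(name);
  }
}
```

Enqueuing a RecordingTask named "low" with priority 5 and then one named "high" with priority 1 should, after RunAll(), leave the log in the order "high", "low", without the runner ever having queried either task.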

All in all, I think I should've started with the thread pool class instead. Rats. But (as Julian valiantly manages to scrape something positive from the situation) it is an important point to make about TDD: you will write code that won't get used, that you'll alter, that you'll throw away for something better. The first thing you write to satisfy a requirement or a user story is most likely to have completely disappeared after a few cycles. However, because TDD encourages you to write code that is simple and well-contained, it'll cost you less than if you design a whole bunch of stuff up front as a detailed design.

But, hey, at least I got the Visual Studio solution set up. Next time, we'll approach the problem from another angle.