Wake up, Delphi Informant!
published: Wed, 12-Nov-2003 | updated: Thu, 27-Oct-2005
Last week I attended the 2003 Borland Conference as speaker and as Microsoft booth babe. I'll talk more about my experiences there at some other time, but today I'd like to talk about something else.
In the Borland backpack that was given out to all attendees there was a copy of Delphi Informant. Now, I haven't looked at this magazine for a long time. When I worked at TurboPower (now 20 months ago), I used to borrow another developer's copy to read every now and then, but I never subscribed personally.
I was always of the opinion that the mag was too light and that its articles suffered from too little technical review. It didn't help that they used Rod Stephens as Algorithms Guy (bias alert: I was Algorithms Guy for The Delphi Magazine), whom I considered good academically but bad practically (time after time, he would assume that the largest value an integer variable could have was 32767, and this was with 32-bit Delphi).
Anyway, so I got this copy of Delphi Informant. I started to read some of the articles.
Right away, in one article on using multiple threads in a Windows service, I spotted a race condition. The author was making the point that a service becomes more responsive if the work done by the service is handed off to a worker thread, allowing the main thread to respond quickly to Service Control Manager messages. The stop service message handler was coded like this:
procedure TMyService.ServiceStop(Sender : TService;
  var Stopped : boolean);
begin
  SetEvent(hStopEvent);
  FMyWorkerThread.Terminate;
  CloseHandle(hStopEvent);
  FMyWorkerThread.Free;
  Stopped := true;
end;
Whereas the thread class' Execute method did this:
procedure TMyWorkerThread.Execute;
var
  iWaitResult : DWORD;
begin
  Init;
  repeat
    DoWork;
    iWaitResult := WaitForSingleObject(hStopEvent, 2000);
  until iWaitResult = WAIT_OBJECT_0;
  CleanUp;
end;
So, in essence, to stop the service, it signals the event shared by the service and its worker thread. The worker thread is waiting on the event (or soon will be) with a timeout wait. Since the event is signaled, the method exits the loop, and the thread object does some cleanup. The service then terminates the thread, closes the event handle, frees the thread and returns, saying the service has stopped.
Explained like that, it's hard to see the race condition. And that's how the author explains it. See if you can work out the problem before reading on.
Let's be awkward and explain it like this. The service signals the event. It calls Terminate on the thread object, which, if you didn't know, merely sets a flag in the thread object to say "terminate as soon as possible". It then closes the event handle, and frees the Delphi thread object. Freeing the thread object will cause the main thread to wait for the other thread to complete (that's how TThread.Destroy is coded). Meanwhile, what about that thread? Say it wasn't waiting, but was in the middle of DoWork while all this was going on (or maybe it wasn't even running because the machine has a single CPU). It reaches the WaitForSingleObject code, but the handle has already been closed. So the Wait call returns an error, but, since the code doesn't check for that, the Execute code merrily goes round the loop again. And again. And again. The thread never stops, and the service freezes.
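Incidentally, even once the shutdown ordering is fixed, it's cheap insurance for Execute to bail out when the wait fails rather than spinning forever. A sketch, using the same names as the article's code (the only change is the loop condition); it treats the symptom, of course, not the real bug:

```pascal
procedure TMyWorkerThread.Execute;
var
  iWaitResult : DWORD;
begin
  Init;
  repeat
    DoWork;
    iWaitResult := WaitForSingleObject(hStopEvent, 2000);
    // anything other than a timeout -- the event being signaled, or
    // WAIT_FAILED from a closed handle -- now exits the loop, instead
    // of the thread looping forever on an error
  until iWaitResult <> WAIT_TIMEOUT;
  CleanUp;
end;
```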
To be fair the author does say this: "The only thing to be aware of is that, as written, the ServiceStop event doesn't know whether the worker threads actually terminated. If this is a problem, you can add some code to ServiceStop to confirm the Terminated property of the worker threads before saying: Stopped := true;" Unfortunately, although the first sentence is spot on, the second sentence is completely wrong and shows that the author isn't aware of the race condition (since he postulates adding some code just before setting Stopped, which, as I've shown, is way too late), or of how TThread.Destroy actually works (if we hope the race condition doesn't occur, the thread will definitely have stopped and been cleaned up by the time Free returns).
The simple answer is to add a WaitFor call just after signaling the event (instead of the utterly useless call to Terminate) and to close the event handle afterwards:
procedure TMyService.ServiceStop(Sender : TService;
  var Stopped : boolean);
begin
  SetEvent(hStopEvent);
  FMyWorkerThread.WaitFor;
  FMyWorkerThread.Free;
  CloseHandle(hStopEvent);
  Stopped := true;
end;
Of course, you can easily set FreeOnTerminate for the thread object, and then you don't even need the call to Free.
Another point in the same article recommends adding a global variable to hold the event handle. Not true at all. You have written a descendant thread class, so add it as a field in that class. You can access the field (through a property, presumably) from the main code and from within Execute. This makes it a lot simpler to have many worker threads, maybe each with a different "stop" event. It also avoids the incorrect "we don't need to create a name for the event because the handle is global" explanation in the article (in essence, we don't need to because we don't have to).
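To sketch what I mean (the class and member names here are my own, not the article's; DoWork stands in for the article's worker code): the thread creates its own event before it can possibly start running, and closes it only after the thread has finished.

```pascal
type
  TMyWorkerThread = class(TThread)
  private
    FStopEvent : THandle;
  protected
    procedure Execute; override;
  public
    constructor Create;
    destructor Destroy; override;
    property StopEvent : THandle read FStopEvent;
  end;

constructor TMyWorkerThread.Create;
begin
  // create the event *before* starting the thread: manual-reset,
  // initially unsignaled, and unnamed (no name needed -- we have the
  // handle in a field, not in a global)
  FStopEvent := CreateEvent(nil, true, false, nil);
  inherited Create(false);
end;

destructor TMyWorkerThread.Destroy;
begin
  SetEvent(FStopEvent);     // make sure Execute will leave its loop
  inherited Destroy;        // TThread.Destroy waits for the thread to finish
  CloseHandle(FStopEvent);  // only now is it safe to close the handle
end;

procedure TMyWorkerThread.Execute;
begin
  repeat
    DoWork;
  until WaitForSingleObject(FStopEvent, 2000) = WAIT_OBJECT_0;
end;
```

With this, stopping the service collapses to a single FMyWorkerThread.Free: the destructor signals the event, waits, and cleans up in the right order, and each of many worker threads carries its own stop event.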
And, yes, you guessed it, there are more inaccuracies in the very same article. First, the author talks about TMultiReadExclusiveWriteSynchronizer as a class to help you achieve thread safety for your data, without talking about what the heck it was really designed for. He rightly says that the class has problems (it does, apart from its ridiculously long name, and these problems have been beaten to death in the newsgroups) and so you shouldn't use it, and then recommends using a critical section instead, despite the fact that it's "more expensive". Huh? The TMultiBlah class is written using critical sections, for heaven's sake! In fact, Danny Thorpe has agreed that they should've ditched TMultiBlah for simple critical sections a long time ago.
And then we get the beginner bug in the example code for using critical sections:
procedure Foo(...);
begin
  try
    FMyCriticalSection.Enter;
    // do something now we're protected
  finally
    FMyCriticalSection.Leave;
  end;
end;
Admittedly, the bug will never raise its head -- it's more of a not-following-standard-programming-patterns type error -- but it shouldn't be here in this "advanced" article. So, all in all, the article suffered from being too light and from not having had enough technical review (and you should note that the masthead does list a Technical Editor: it seems like this person is treating his job as a sinecure).
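For the record, the standard pattern acquires the lock before entering the try block, so that the finally clause only ever releases a lock the thread actually took:

```pascal
procedure Foo(...);
begin
  FMyCriticalSection.Enter;   // acquire *outside* the try...
  try
    // do something now we're protected
  finally
    FMyCriticalSection.Leave; // ...so Leave runs only if Enter succeeded
  end;
end;
```

If Enter sits inside the try and were ever to raise an exception, the finally clause would call Leave on a critical section the thread never acquired.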
Then I read Xavier Pacheco's article on using the .NET Framework, the first in a series. He was discussing some collection classes: Stack, Queue, ArrayList. Hey, I thought, this article is in my area of expertise. Squared. Unfortunately, it was way too light again.
Some quick points raised by this article, in no particular order. Although they are all fairly minor, they point to the article being too "light."

- He calls stacks a "FILO" data structure, but all of my algorithms books tend to call it LIFO (yeah, nitpicking, I know).
- The Peek() method is called Peak at one point.
- One of the constructors for a Stack allows you to specify the initial capacity for the Stack instance; what is the default capacity if you don't use this constructor?
- He mentions that you can create a stack using the items from another collection; in which order are the items added to the stack (i.e., in which order would they be popped)?
- He shows how to enumerate a stack, but doesn't say in which order the items are returned: in pop order, or in push order?
- How is the stack implemented internally?
- Similar questions arise about the Queue class, again unanswered. What's a "growth factor" for a Queue (the answer requires knowing how the Queue class is implemented, of course), and why isn't there one for a Stack?
- For the ArrayList there are yet more similar questions.
Come on Delphi Informant, kick your Technical Editor awake: you can do better than this.