Deadlock when capturing standard output
published: Thu, 17-Nov-2005 | updated: Thu, 17-Nov-2005
Over the course of my work over the past month or so, I've been very involved in multithreading and concurrency, especially with regard to C# and .NET. As a first approximation (!) to my involvement, witness my recent articles on lock-free data structures in C# 2.0 (here's the final one; it has links to the others), but here's another item in the same general area that you may also find interesting and useful.
This one came about because of some code I wrote and that worked in my tests but that failed in the wide (or should that be, wild?) world out there.
In the middle tier of our main application, we'll sometimes shell out
to SQL Server's bcp.exe
command line program to bulk load
lots of data into the database. Simple, you cry, the Process
class is what you want, and indeed it is. Furthermore we need to
"screen-scrape" the output from bcp.exe
for monitoring
and analysis purposes.
Again, you cry, simple: you can redirect the standard output from a process you launch, by setting a flag, and then you can read the stream created. The word "then" in that sentence was the nub of my problem. Here's how I wrote the code to begin with.
BAD CODE!!! string output; using (Process bcp = new Process()) { bcp.StartInfo.RedirectStandardOutput = true; ...more setting up of bcp.StartInfo... bcp.Start(); bcp.WaitForExit(); output = bcp.StandardOutput.ReadToEnd(); }
Seems pretty obvious, no? You set up the info needed to start the
child process, start the child process, wait for it to finish, and
then grab all the output from it. Cool, and it worked fine on my
development machine. Unfortunately, on the first heavily-used server
we tried it on, we seemed to deadlock. (I say "seemed to" because you
could also argue that bcp.exe
was waiting for user input,
and that was a trail I went down for an hour or so since I couldn't
see the deadlock in my code. So, before reading any further, can you
see the deadlock?)
The problem lies in what
Joel Spolsky calls
The
Law of Leaky Abstractions. I'd assumed that the way this code
worked was that the .NET Framework set things up in such a way that
the code in the Process
class would continue gathering
the output in some kind of stream (like a memory or a file stream)
until the child terminated, and then I could call
ReadToEnd()
on the stream's reader to get at the data.
Pah! Wrong. What happens under the hood, so to speak, is that the
Process
class sets up a pipe between the child process
and the parent calling process. The ends of the pipe are, by
definition, in different threads. Pipes work in that one end is
pushing data into the pipe and the other is taking it out. It's not
instantaneous: the pipe is set up with a buffer (whose size is
determined by the operating system). Providing that the reading end is
getting data as fast as the writing end is putting it there, the
buffer doesn't fill. If the buffer does fill, the writer will block
until the reader reads some or all of the data in the pipe's buffer.
Look again at my code. This code is the reading end of the pipe, but it doesn't read any data until the child process has terminated. Indeed the thread executing this code will block on the wait statement until the child process (the writer) has completed. So, if the buffer fills, both threads will block.
The nasty thing here is that it's not a "traditional" deadlock in the sense that thread A locks X, thread B locks Y, and then thread A tries to lock Y with B trying to lock X. Nope, not this time. The writer is blocked by the pipe because the buffer is full and so has to wait for the reader to read something. The reader is blocked waiting for the writer to finish until it tries to read something from the pipe.
The answer is to put the ReadToEnd()
call before the call
that waits for the child to finish.
string output; using (Process bcp = new Process()) { bcp.StartInfo.RedirectStandardOutput = true; ...more setting up of bcp.StartInfo... bcp.Start(); output = bcp.StandardOutput.ReadToEnd(); bcp.WaitForExit(); }
To me this looks wacky. My thoughts go like this: say the child writes
"hello" to standard output and then doesn't write anything else for 5
minutes. Won't the ReadToEnd()
immediately terminate? The
only thing to read is "hello", therefore it's already read to the end
of the stream, no? Therefore the caller rushes to the statement that
waits for the child to finish and thereby loses all the other output
of the child.
Except it doesn't work like this. Now that I understand the actual
implementation (breaking the abstraction), I can see that
ReadToEnd()
in this case will not finish when it reaches
the end of the current written data as I originally thought. Instead,
it will read until the pipe closes and the child finishes. If there's
nothing to read at a particular point, it'll block until there is
something to read or until the pipe closes.
In fact, I'd argue the call to WaitForExit()
is not
strictly necessary. The pipe will only close once the child has
finished. When ReadToEnd()
completes, the child has
already terminated (modulo a millisecond or two), and so there's no
point in waiting for it to finish: you have all the child's standard
output already.
This is a great example of a leaky abstraction: to code the capture of
the child's standard output correctly you have to understand the way
it's been implemented. Since the StandardOutput
property
of the Process
class is typed as a simple
StreamReader
(the abstraction) there are no hints that
there is anything untoward going on (the abstraction leaking).
Having said all this: if I'd read the
MSDN
page on the StandardOutput
property for
the Process
class it would have warned me about this up
front. But, heh, who reads the documentation?