Posts on algorithms and data structures
published: Thu, 16-Jun-2005 | updated: Sat, 6-Aug-2016
Here are the articles I've written regarding algorithms and data structures.
Red-black trees (part 5, bis)
Something that occurred to me after I'd posted part 5 in my red-black tree series is that I was relying on you, the reader, visualizing the rotations and recolorings in my insertion example. I should have shown each intermediate step for each insertion, rather than just the end result. Doing so would have helped you picture the changes that are happening during this process. Read more..
Red-black trees (part 5)
In our previous installment, we'd introduced the principles behind red-black trees as well as an appreciation of why they would produce more balanced binary search trees than pure randomness by using a process called a rotation. Then we got stuck when actually inserting a node. Obviously this is one algorithm we are going to have to crack in order to be able to build a red-black tree that we can search. Read more...
Red-black trees (part 4)
Last time in this series on red-black trees, we convinced ourselves that leaving the balance of binary search trees to chance was pretty good but not completely optimal. Even if we could guarantee that we added items in a completely random order to our trees (and, to be honest, that's an unrealistic expectation), we would still only get average search times of 1.39 * log2(n), instead of the theoretic log2(n). So we shall have to exploit an "active" balancing algorithm, and the best one we know of is the red-black tree. Read more...
Calculating pi with C#
A couple of evenings ago, my wife was reading an article about a federal prosecutor, when she suddenly exclaimed "How many digits of π (pi) do you know? Because this lawyer knows 500 digits." I had to admit that I'd only memorized π to eight decimal places, 3.14159265, but I immediately jumped in with "I know how to write a program that can print out thousands of digits of π." So she threw down the gauntlet. Read more...
Red-black trees (part 3)
In this continuing series on red-black trees, we've now seen how to insert into a normal binary search tree. In fact the implementation and code were so simple, it's hard to even recognize that anyone would want to do anything else. Why invent more complicated binary trees, when the binary search tree seems to do so well? Read more...
Red-black trees (part 2)
Now that I have the technology available to draw trees quickly and accurately (and you have seen the algorithm and its implementation at first hand), it's time to go back to the main subject: red-black trees. Except that I have some more foundation to lay first. So in this episode we'll look at binary search trees, and how to insert new nodes into them. After all, visually, a red-block tree is a binary search tree with colorful nodes. Read more...
Red-black trees (part 1)
After my few recent rants, I need to cleanse myself with some hardcore algorithms and data structures. That's what you're here for, right? Read more..
Random iTunes playlists (part 2)
In the last installment, I'd got to the point where I could write code in C# to enumerate all the tracks in the iTunes library. I now need to do a little more work in order to select various tracks for inclusion in my random playlist. Read more...
Random iTunes playlists (part 1)
Enough of all this angst, let's get down to some programming and algorithms. Read more...
Hazard Pointers (part 2)
Last time in this series on hazard pointers I looked at how the concept could help provide a node reuse manager for a lock-free stack using C#, admittedly without fleshing out the details of how hazard pointers would work under the hood. This time I just want to look at the lock-free queue in the same way, to see how hazard pointers would help provide lock-free memory management. Next time, we'll delve into the C# code for the hazard pointers themselves. Read more...
Hazard Pointers (part 1)
Back in September, I discovered a subtle bug in my node reuse code for my lock-free data structures. The bug has a name, the ABA problem, and I kind of promised to come back and solve it at some point. Well, it's a New Year, that was then and this is now, so let's go ahead and solve the problem. It helps that my January 2006 article for The Delphi Magazine solves the ABA problem for Delphi, and that implementation was the hardest, at least in theory. Anyway, unlike my thinking at the time, we're going to use hazard pointers to implement a lock free node reuse algorithm in C#. Pointers? C#? A contradiction, or just nasty unsafe code? Read more...
Triangular Matrix
Every time I have to do this, I forget and have to look it up again. The last time, I thought I'd remembered how to do it and then wasted the next 10 minutes because I'd actually remembered it the wrong way round (well, I was tired, and an article had to be finished). The "this" I'm talking about? Implementing triangular matrices. Read more...
A little linked list trick
Let's have a little bit of fun today and talk about a minor yet elegant algorithmic change to a singly-linked list to make it a teensy bit more functional, and in doing so talk a bit about generics. This is certainly not earth-shattering stuff, but merely an enjoyable diversion from the horrors of your daily development. I'll be using C# 2.0 to implement this algorithmic amuse-bouche, so fire up your Visual Studios. Read more...
Linked List Patent
OK, this is hilarious and points out the complete idiocy of the US Patent Office in judging any kind of algorithmic patent: someone has patented (and it was granted in April 2006, no less) a linked list that uses several pointers so that you can iterate the items in the list in one of many orders. And, to make the patent even more delicious, the inventor, Ming-Jen Wang, lives in Colorado Springs. Oh, please Ming-Jen, get in touch, so I can buy you a beer and shake the hand of such an amazing inventor! Read more...
The ABA Problem
Um, I have an apology to make. There is a very subtle bug in my C# implementations of the various lock-free data structures. Sigh. And by subtle I mean very subtle, like the difference in taste between Vittel and Evian. Read more...
Lock-Free Linked List (Part 3)
So last time, just like a cheesy Saturday morning movie serial, I left you hanging, wondering how I was going to solve the insoluble in C#. The insoluble issue in this case is the fact that we have two items of information that will vary independently of each other in different threads, but that we need to see as a unit when we write new values of them using CAS. And, of course, the size of both items is more than the size we can CAS. Read more...
Lock-Free Linked List (part 2)
Last time in this series of articles on lock-free linked lists we pointed out some of the issues we needed to solve in order to have any chance of implementing such a data structure. These were to do with deleting nodes from the list: the first being that deleted nodes could not be disposed or reused since other threads may still be using them, and the other was the problem of inserting a node after a node that was just about to be deleted. I ended with the subtle hint that we would have to add some extra fields to our node class. Read more...
Lock-free linked list (part 1)
Every now and then I just need to do some programming. Not only that, but something I haven't done before, preferably translating some weird code from something like an academic paper or from a book by Knuth into C#. So something algorithmic or data structure-y: I want to continue learning and being stretched. Read more...
Lock-free data structures: redux
Back in November 2005, I wrote a series of articles on writing lock-free data structures in C# and illustrated the concepts with a lock-free stack, queue, limited-priority queue, and a free list. All seemed well, except several people have commented that I never uploaded the source code and please could they have it? Why didn't I acquiesce? Read more...
Kruskal's Algorithm
I noted recently that has been some time since I last blogged here about an algorithm. I do apologize. It's unfortunate, but I can only give the same old excuse as any other tech blogger: sometimes work overpowers everything and writing an algorithm article sometimes can take a little while. Anyway, having offered an apology (which I hope was accepted), let's dive straight into one I've not used before, one of a set that deals with graph data structures. Read more...
Calculate the date from an ISO date
One of the most popular pages on my website is the one where I discuss how to calculate the ISO week number of a date. Today out of the blue I was asked, if given an ISO week, how would you do the reverse calculation and find the original date. Read more...
Lock-free data structures: the limited priority queue
Last time in this series on building lock-free data structures in C# 2.0 we designed and implemented a lock-free queue, a FIFO structure that doesn't use locking to ensure correct thread safety. In doing so we also built a lock-free free list to ensure that we wouldn't block on the memory manager when allocating nodes. This time we'll tie it all up by creating a lock-free priority queue, one for a limited number of priority values, based on the code I wrote earlier. Read more...
Lock-free data structures: the free list
In my previous article in this series on lock-free data structures, I'd implemented a lock-free queue. In doing so, I showed that despite the simplicity of the underlying data structure, making it perform in a lock-free manner in a multithreaded application was no mean feat.
In my closing remarks though, having breezily talked through the design and the implementation of the non-blocking queue, I briefly noted that there was a problem with the dequeuing operation (and indeed there was the same problem with the popping operation in the stack): there was no guarantee that a node had been finished with once the underlying linked list had been updated. Read more...
Lock-free data structures: the queue
Last time we looked at implementing a lock-free stack, a stack that can be used in multithreaded applications but that doesn't need locks (such as mutexes) in order to provide thread-safe operation. This gave us an appreciation for one of the basics of lock-free programming: the Compare-And-Swap (CAS) function. This time we'll look at how to implement a queue, another very basic data structure, but this time things start getting difficult. Read more...
Lock-free data structures: the stack
It's a fact that lock-based concurrency doesn't scale: only one thread at a time can execute the thread-safe code that's being protected by a lock or a mutex. The more threads there are accessing the lock, the more will be just sitting there waiting. Read more...
Priority Queue For Limited Priorities
Recently, my brother-in-law, who programs in VB.NET, needed a priority queue that also had some behaviors of a normal queue: dequeue should remove the highest priority item as usual, but if there were two or more objects with the same priority, they should be removed in arrival order (also known as FIFO or first-in-first-out order). The normal heap implementation of a priority queue could not do that since it takes no account of arrival order; the only ordering is by priority and if there were several items with the same priority, they'd be returned in some non-deterministic order. Read more...
Using the Singleton pattern (part 1)
Whenever a couple of developers have a chat about Design Patterns next to the water cooler, the conversation is almost guaranteed to turn to the Singleton pattern. In fact, when two Delphi developers discuss Singleton, it's like they are talking about two different languages since Delphi can't do the basic trick of making a Singleton work. But no matter which language the developers use, they'll always talk about the implementation of it. Never, it seems, do they take a step back and wonder if Singleton is a "good" pattern to use. Read more...
Writing a parser for CSV data
One of the simplest parsing jobs we can undertake is to read a CSV (comma separated values) file as a set of records, and to retrieve the individual fields in each record. Read more...
Overengineering IOpenable
A week or so ago, I published an
article about an IOpenable
interface and
OpenableBase
class I'd been developing. I went ahead and
posted the article despite a few misgivings: I felt as if I were
missing something important about the code I was posting. I worked on
the code a little bit afterwards but was unable to find out what was
wrong. It took Mike Scott, an old friend from yesteryear, to point out
that I seemed to be making it more difficult than it should be. He was
quite gentle about it for a Scotsman, but in reality he should have
beat me about the head with a haggis. Yes, gentle reader, I've been
guilty of over-engineering. Read more...
Object-Oriented State Machines
In the March issue ofThe Delphi Magazine, I discussed refactoring state machines that had been implemented as giant case (or switch) statements into an object-oriented model. Essentially, the class model I used was the State pattern as described by the Gang of Four in Design Patterns. Anyway one of my readers sent me an email that made a valid point. Read more...
Recursive Selection Sort
I was browsing through the access logs for my website and I came across a search phrase that was used to access my site from Google: "recursive selection sort in vb". We'll ignore the VB part (I'm not in the mood for writing VB code tonight), but the first part got me thinking. Read more...
Named Indexers
So there was a posting on borland.public.delphi.non-technical raving about Delphi's array properties and bemoaning that C# didn't have them. The post seemed to be a standard "Delphi is better than C#" argument, but it had a couple of inaccuracies, which will make for an
interesting article. Read more...
Simple Shuffling
Continuing my trek through the Google search phrases that are used to access my web site, today I'm going to discuss "simple shuffle algorithm," a phrase that was used a few days ago. Read more...
GCD and fraction class
A nice little algorithm this time, calculating the Greatest Common Divisor. However, it was too simple, so then I wrote a type to handle fractions and arithmetic using fractions, but then that led to the algorithm for converting a floating-point number into a fraction. Read more...
Checking that a Binary Tree is a Heap
One of the readers of this web site came to it from Google with the search phrase "algorithms to test
binary tree is a heap." I hadn't solved that particular problem before, so I decided to do so for my regular readers. Read more...
Variable Length Records
Recently a reader asked me how to implement a record manager that could store records that had varying lengths. I came up with three possible schemes, with some variants. Read more...
Thoughts on the Leap Day and the English Tax Year
The local paper ran a piece on the leap day, and they got some facts wrong (not surprisingly). I happened to be thinking about the English tax year for some reason, and suddenly a way to link them together came into my mind. It helped that I could link myself into it as well. Read more...
Reviewing the MSDN article on hash tables
I came across a new series on MSDN on writing and using data structures in .NET. Seeing as this is my area of expertise, I took a look. They're good, but there were a few factual errors in the discussion on hash tables. Read more...
Writing a priority queue in C# (part 3)
We continue our occasional series on writing a priority queue class in C#, by altering what we mean by priority and by adding serialization. Oh, and we fix a subtle garbage collection bug in our previous code. Read more...
Writing a priority queue in C# (part 2)
I had a couple of hours to write another episode in my series on writing a priority queue in C#, such that the result could be a paid-up member of the .NET collections family. This time: writing the enumerator and doing some clean up. Read more...
Writing a priority queue in C# (part 1)
Recently someone asked me whether I'd converted the priority queue class I'd implemented in my book into C# for use by .NET developers. Nope, but it triggered something in my mind: let's kill two birds with one stone by writing a priority queue using TDD. It's going to take a couple of articles though. Read more...
Simple pattern matching
Another good algorithmic question, this time on GotDotNet: how do you match a simple pattern with ? and * wildcards to a given string? Read more...
Calculating the ISO week number
Someone asked this question on The Code Project: how do you calculate the ISO week number for a given date. Read more...
Calculating the difference between two dates in months
This is still something that developers think is easy. It isn't and just shows that they haven't thought enough about the answers they expect. Read more...
Julian's very first article
For a bit of fun, I've uploaded my very first published article. Check it out; if you can remember Turbo Pascal 6 and the program overlay manager (Borland called it VROOM), you'll feel really nostalgic. Read more...