Random iTunes playlists (part 2)

published: Sat, 13-Oct-2007   |   updated: Sat, 13-Oct-2007

In the last installment, I'd got to the point where I could write code in C# to enumerate all the tracks in the iTunes library. I now need to do a little more work in order to select various tracks for inclusion in my random playlist.

I pushed the keyboard away at this point to think about and decide how to select tracks. There were some tracks I would not like to listen to at all, and of the rest there would be some I'd like to listen to more than others. In other words, I had to assign a weighting to each track in the library playlist. The weighting would either be zero or some positive value. The higher the positive value the more likely the track would be picked.

The way I was thinking of it was as a long piece of string. Each track would correspond to a marked off portion of this string, the portion being proportional to the weighting of the track. My random selection algorithm would be akin to the pinning the tail on the donkey nursery game, but instead being blindfolded and pointing at some part of the string.

Stepping back from the metaphor and thinking about the implementation, I would need some kind of object that had a method to calculate the weighting given a track. Thinking even further out, I could imagine having several different versions of this object that would select tracks based on different criteria, but in a much more programmatic way than the iTunes smart playlist feature; for example, using regular expressions on the track name (say, selecting all tracks that matched "lov(er|ing|e|es)"). Already I was thinking of an interface, note, but more on that later.

For simplicity I decided the weighting should be represented by a double value in the range 0 to 100. This meant that I would need some validation happening at some point to force the weighting in that range, but I left it for now.

Here's my first cut at such a class:

  public class WeightingCalculator {
    public double GetWeighting(IITTrack track) {
      if (!(track is IITFileOrCDTrack))
        return 0;
      if (track.Size < 100 * 1024)
        return 0;
      switch (track.PlayedCount) {
        case 0:
          return 10;
        case 1:
        case 2:
          return 20;
        case 3:
        case 4:
        case 5:
        case 6:
          return 40;
        default:
          return 50;
      }
    }
  }

(Note: the reason for the size calculation there is that I have several ripped CDs where there's a chat track in between song tracks. This would remove most of those chat tracks from consideration. It's a little heavy handed — maybe I should set the track type to "chat" or something — but it's OK for now. One of the issues with iTunes is that the time for each track is returned as a string (e.g., "3:21"), which needs converting before I could use it for comparisons and calculations (and I was too lazy to work it out straightaway). It also gives me a method that will return zero for some tracks, which is important for testing. Note also that I'm explicitly looking for tracks that either appear as a file or as a CD track. For now I'm excluding other types of tracks that may be in the iTunes library.)

Now to use this class. I know I can enumerate all the tracks in the library, so I should create an instance of this class and call the GetWeighting() method for each track. If the weighting is zero, I can ignore the track. If the weighting is non-zero I should store it with the track somehow. That implies a wrapper or helper class to hold both the original iTunes track object and this weighting.

A few moments with the c, p, and cc templates in CodeRush, and I hasd this:

    internal class Track {
      private IITTrack iTunesTrack;
      public IITTrack ITunesTrack {
        get { return iTunesTrack; }
      }
      private double weighting;
      public double Weighting {
        get { return weighting; }
      }

      public Track(IITTrack iTunesTrack, double weighting) {
        this.iTunesTrack = iTunesTrack;
        this.weighting = weighting;
      }
    }

Now I had this class, I could instantiate a List<Track> and add instances of this Track class to the list instance.

      Console.WriteLine("Reading all tracks from library...");
      WeightingCalculator calculator = new WeightingCalculator();
      library = new List<Track>();
      foreach (IITTrack track in trackCollection) {
        double weighting = calculator.GetWeighting(track);
        if (weighting > 0.0) {
          library.Add(new Track(track, weighting));
        }
      }

Having got this list of tracks, I now needed to randomly select an element from it and that selection should take into account the weighting of each track. Going back to my string metaphor, I quickly realized that I didn't have enough information. If I were to randomly select a point on my string, the one marked off with sections of arbitrary lengths, I would generate a random value between zero and the full length of the string, and measure that length from that start of the string.

To do this with my track weightings I needed to store where the track appeared in the "string". In other words, I needed to calculate a start (or end) point of each track as a cumulative weighting value. So, the first track in my selection list would have a start point of zero, the next would have a start point equal to the weighting of the first track, the third, a start point of the sum of the first two weightings, and so on.

First, I altered my Track class (and in the process noticed it was an immutable class, a pattern I'm striving for more and more, in order to help with multithreaded applications). Handy CodeRush hint: I added the new property then used the smart cut feature to delete the previous constructor, and then used cc again to create a brand new one that initialized all the internal fields. I named the cumulative weighting of the track as it's "position":

    internal class Track {
      private IITTrack iTunesTrack;
      public IITTrack ITunesTrack {
        get { return iTunesTrack; }
      }
      private double weighting;
      public double Weighting {
        get { return weighting; }
      }
      private double position;
      public double Position {
        get { return position; }
      }
      public Track(IITTrack iTunesTrack, double weighting, double position) {
        this.iTunesTrack = iTunesTrack;
        this.weighting = weighting;
        this.position = position;
      }
    }

Then I changed the selection loop

      Console.WriteLine("Reading all tracks from library...");
      WeightingCalculator calculator = new WeightingCalculator();
      double cumulativeWeighting = 0;
      library = new List<Track>();
      foreach (IITTrack track in trackCollection) {
        double weighting = calculator.GetWeighting(track);
        if (weighting > 0.0) {
          library.Add(new Track(track, weighting, cumulativeWeighting));
          cumulativeWeighting += weighting;
        }
      }

Now the fun bit: selecting a random element according to the weighting. The first thing to notice is that, because of the way I've constructed the loop, the elements in the list are automatically sorted according to the cumulative weighting. Any time you have a sorted list and you're trying to find a particular item taking advantage of this sort order, you should be thing binary search.

This will be a slightly different binary search though since we're not looking for an exact match. In essence, given a value, we're going to look for the element whose position is less than or equal to this value, and whose next neighbor's position is greater than this value.

I started off with a standard binary search, as if I were trying to find an exact match:

    private static Track GetSelectedTrack(List<Track> library, int target) {
      int l, r, m;
      l = 0;
      r = library.Count - 1;
      while (l <= r) {
        m = l + (r - l) / 2;
        if (library[m].Position < target)
          l = m + 1;
        else if (library[m].Position > target)
          r = m - 1;
        else 
          return library[m];
      }
      return null;
    }

Now, that's all very well — and you should note that the line that returns the mth item is very unlikely to ever be executed because of the approximations inherent in using doubles — but I need to change that last line that returns null to return the item that the target value falls in. Consider the state of play at this point: all we know straightaway is that l > r since that's the condition that terminates the loop.

We actually can derive another conclusion: in the previous cycle through the loop either l was equal to r or was equal to r-1. If there had been a greater difference between l and r, there would have been at least one more cycle through the loop. To prove this, suppose l was equal to some value x and r equal to x+2. m would then be equal to x+1. If l was altered, it would be set to x + 2, which is the value of r, and which still satisfies the loop condition. If r was altered, it would be set to x, which is the value of l, and which again satisfies the loop condition.

We can also ascertain that at the beginning of the cycle, the target value will either fall in the element at l-1, or l, or, if r is different from l, at r. The reason for this is that either l has its original value, 0, or was advanced by one in some previous cycle from an item that was checked. When that previous item was checked all we found out was that the target was greater than the position. For the item at r+1 (if r has changed) we know that target is less than its position.

So we have four cases to consider: either they were equal or not, and l was incremented or r decremented at the end of the cycle. In every case, when calculated, m would have been equal to l.

  1. Equal and l incremented. The item to choose is at m (which is equal to r and to l-1).
  2. Equal and r decremented. The item to choose is at m-1 (which is equal to r and to l-1).
  3. Not equal and l incremented. This one is easy: we actually have to go round the loop again, since now l == r.
  4. Not equal and r decremented. Again, we choose the item at m-1 (which is equal to r and to l-1).

Hence the best answer is that we return the item at r if we fall out of the loop.

    private static Track GetSelectedTrack(List<Track> library, int target) {
      int l, r, m;
      l = 0;
      r = library.Count - 1;
      while (l <= r) {
        m = l + (r - l) / 2;
        if (library[m].Position < target)
          l = m + 1;
        else if (library[m].Position > target)
          r = m - 1;
        else 
          return library[m];
      }
      return library[r];
    }

And having explained how to select items randomly from a weighted distribution, I'll end this installment now. To be continued...

(In this series: part 1, part 2)