Typing (the programming kind)

published: Mon, 19-Dec-2005   |   updated: Mon, 19-Dec-2005

There was a time a couple of years ago when I thought generics were the best thing since sliced bread. They still are, and it is pretty amazing that Anders and the C# team could build something like LINQ and lambda expressions on this foundation for the C# 3.0 preview. But, in a sense, all these new experimental features give us is the ability to fake dynamic typing on a static typing foundation. And that leads me to dynamically-typed languages.

You see, last week, Ruby on Rails 1.0 was released. And I finished a book called "Beyond Java" by Bruce Tate. (This article is about neither, though.) And I got to thinking about languages and language design and typing.

Think of it this way. When we talk about typing we're thinking along two main axes. There's the strong typing versus weak typing axis, and then there's the dynamic typing versus static typing axis.

C# and Java and Delphi for .NET are statically- and strongly-typed languages. When you want to use a variable you have to declare that variable first, and when you declare a variable, you give it a type. From then on that variable's type is fixed and unchanging, all the way to the end of its lifetime. You can sometimes refer to the variable by an ancestor type, all the way to "object" (the base class, whatever it may be called in your neck of the language woods), but it still retains its declared type.

Easy-peasy stuff. The compiler does a lot of work making sure that you use your variables in a consistent manner with their types (throwing off errors, if you don't) and everything's hunky dory. You can't cast a variable to a totally different type (say from a boolean to a string) because the compiler won't let you. Type-safety means it's harder to shoot yourself in the foot, but there are still myriad other ways to do so.

Weakly-typed languages, on the other hand, are fraught with danger. The best example of a language of this type is C. You can change the type of a variable really easily: just take its address, cast the pointer to another type, and the world's your oyster. This flexibility does give you great power and expressivity, but, boy, pay attention or you can mess things up drastically. Buffer overflows, trashing memory, double-deallocations, the whole nine yards. C++ is a lot like this too, especially if you forsake the safe type coercions. So is 32-bit Delphi. Essentially the compiler assumes that you know what you're doing when you cast a Foo as a Bar and merrily lets you do it.

All the language examples I gave (C, C++ and Delphi) can be viewed as weakly- but statically-typed languages. You have to declare a variable's type when you declare the variable but you can munge (a technical term, that) the variable to one of another type through the magic of non-typesafe casts.

The languages of interest these days are the dynamically-typed languages, like Ruby, Python and Perl. Most people view them as being weakly-typed (they conflate dynamically-typed with weakly-typed), but in reality they're mostly not (Perl, which converts freely between strings and numbers, is the nearest to an exception).

Let's take Ruby since I'm in the process of teaching myself it. In Ruby you can write this:

x = 10
y = x * x
x = "hello"

Looking at this, people will say that this example definitely shows that Ruby is weakly typed. Not true. It may be dynamically typed (at first x is a Fixnum, but from the third line on x is a String), but it is most certainly not weakly typed. Once x is reassigned, it remembers nothing of its previous existence.
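To see the dynamic half of that claim on its own, here is a minimal sketch. The helper name type_name is made up for illustration, and the class reported for 10 depends on the Ruby version (Fixnum on older Rubies, Integer from 2.4 onward).

```ruby
# Hypothetical helper (not part of Ruby): names the class of whatever it is given.
def type_name(value)
  value.class.name
end

x = 10
puts type_name(x)   # Fixnum on older Rubies, Integer on 2.4 and later
y = x * x
puts y              # 100

x = "hello"         # same name, now bound to a completely different object
puts type_name(x)   # String
```

The type travels with the object, not with the name x, which is why nothing needs declaring.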

Here's the result of multiplying the second instance of x by itself (in case you didn't know, irb is the interactive Ruby shell):

irb(main):003:0> x = "hello"
=> "hello"
irb(main):004:0> y = x*x
TypeError: cannot convert String into Integer
	from (irb):4:in '*'
	from (irb):4

So, as you can see, Ruby is strongly typed: x is a String and there is no way to "multiply" one string by another, nor will Ruby implicitly convert a string to an integer to make such a multiplication work. So the statement fails, bearing all the hallmarks of a strongly-typed language.
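Here is a quick sketch of the same behavior outside irb, assuming nothing beyond core Ruby. The helper error_class is a made-up name; all it does is report which exception, if any, a block raises.

```ruby
# Hypothetical helper: returns the class of the exception the block raises,
# or nil if the block completes normally.
def error_class
  yield
  nil
rescue StandardError => e
  e.class
end

# Neither expression is silently "fixed up" by converting the String.
puts error_class { "hello" * "hello" }   # TypeError
puts error_class { 10 + "5" }            # TypeError

# Conversions have to be asked for explicitly.
puts "5".to_i * "5".to_i                 # 25

# String#* does exist, but only with an Integer count: it repeats the string.
puts "ab" * 3                            # ababab
```

The last line is why the irb error talks about converting a String into an Integer: Ruby wanted a repeat count and refused to guess one.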

Here's the equivalent in Delphi32:

  var
    x : PChar;
    y : integer;
  begin
    x := AllocMem(10);
    StrCopy(x, 'hello');
    y := integer(x) * integer(x);  // compiles: the pointer values are treated as integers
    FreeMem(x);
  end;

Yep, you just multiplied two pointers to a string together by inappropriately casting them to integers. That's weak typing for you: great for raw speed, but awful if you like walking on your feet.

Now, just because the first example shows some syntactically valid Ruby code doesn't mean you should code like this, just as the Delphi example above is not something you'd usually write. Give me a break. That's not what it's all about. I kind of like not having to explicitly declare that x is of this type or another. And it's also true that dynamic typing means that you have to be careful of misspellings: with no declarations, assigning to a mistyped variable name quietly creates a new variable instead of producing an error.
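A small sketch of the misspelling hazard, with made-up method and variable names. A typo on the left of an assignment goes unnoticed, whereas a typo in a name being read fails, but only when that line actually runs.

```ruby
# Hypothetical example: summing prices, with a typo on the assignment.
def total_with_typo(prices)
  total = 0
  prices.each do |price|
    totl = total + price   # typo: quietly creates a new variable named totl
  end
  total                    # still 0, and no error was raised
end

# Hypothetical example: the typo is in a name being read.
def read_with_typo
  count = 3
  cuont + 1                # typo: raises NameError when this line executes
end

puts total_with_typo([5, 10, 15])   # 0 rather than the intended 30

begin
  read_with_typo
rescue NameError => e
  puts "caught #{e.class}"
end
```

Running Ruby with warnings enabled (ruby -w) flags the unused variable in the first method, but it is a warning, not an error.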

However, I would venture that just because the compiler does a whole lot of type-checking in statically-typed languages it doesn't mean that every application that compiles in such a language is correct. No way. You still have to test the application, the assembly, the module, or whatever. It's the same in Ruby as well. You can't just write a whole bunch of dynamically typed code and hope that it works. Nope: you test it.
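As a rough sketch of what that testing looks like, here is a made-up square method and a hand-rolled check_equal helper standing in for a real unit testing framework's equality assertion. Both names are hypothetical, and a real project would use a proper test framework instead.

```ruby
# Hypothetical production code: no type declarations anywhere.
def square(x)
  x * x
end

# Hand-rolled stand-in for a test framework's equality assertion.
# Returns "pass", or a failure message describing what went wrong.
def check_equal(expected)
  actual = yield
  actual == expected ? "pass" : "FAIL: expected #{expected.inspect}, got #{actual.inspect}"
rescue StandardError => e
  "FAIL: #{e.class}"
end

puts check_equal(100) { square(10) }        # pass
puts check_equal(100) { square("hello") }   # FAIL: TypeError
```

The second check is the string multiplication from the irb session: no compiler complained, but the first test run does.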

If you're test-infected (that is, you always write unit test code for your production code), you will naturally write Ruby code that can be shown (that is, tested) to be correct. If you do write code like I did in the irb session, then running your tests will show you that it is wrong. That's not very different from shipping code like this C# fragment without testing it:

ArrayList list = new ArrayList();
Foo foo = new Foo();
list.Add(foo);
Bar bar = (Bar) list[0];  // compiles, but throws InvalidCastException at run time

Here the first element of the list is actually a Foo, but the ArrayList indexer was written to return a bare object. The compiler knows nothing more than this, so it will allow a cast to a Bar, even though flow analysis would show that the assumption is wrong. The interesting thing is that the above fragment is one justification for implementing generics in C# 2.0: to improve the type information for the compiler so that you don't shoot yourself in the foot. But, in general, good developers don't make this kind of mistake in the first place.

Ah, the mythical good developer. I've heard the argument made that developers, in general, are not "good developers" and therefore they need all the support they can get from the compiler, hence static typing will avoid lots of mistakes. Yet these same developers are the ones that don't run FxCop as a rule either: the compiler can't catch everything. A bad developer is going to do the minimum needed to get the job done (or at least the "nominal" job done) and whether the language he uses is statically typed or not will make not a blind bit of difference.

If you, like me, prefer testing your code thoroughly, dynamically typed languages should hold no fear for you. You already know that testing is the best way of proving your application to be correct: the compiler only catches some obvious mistakes. But to catch even those mistakes you must run the compiler, and let it work out the dependencies and track the variables and report back with its (hopefully empty) list of warnings and errors. Now imagine that you don't have to run a compiler in order to run your tests (dynamically-typed languages tend to be interpreted): how much time would you save?

Now, before you launch your email app to blast me to smithereens, rest assured that languages like Ruby are not going to solve every problem. But, heck, neither are languages like assembly, C, C++, Java, Delphi, Haskell, C#, VB, Smalltalk. At the moment, for attaching a database to a web browser application, Ruby on Rails seems the way to go. For other types of application, other languages may be better. (And "better" in what sense? Faster to write and test? Faster to execute? A smaller memory footprint? More able to utilize the L1 cache? All of the above? Which business driver is driving you? Don't be pushed by business drivers that aren't important.)

I've been dabbling in Ruby too little over too long a timeframe. Time to twist the knob to the max and do some real work with it to discover what it can be like to write Ruby code. Stay tuned.