julian m bucknall >> To Underscore or Not to Underscore

To Underscore or Not to Underscore

published: Thu, 21-Jul-2005 | updated: Thu, 21-Jul-2005

Recently I've been involved in an effort to compile formal coding standards for the languages we use here at Configuresoft. We've had standards for some languages for some time (for example, our C++ standards have been complete for many years), whereas if you'd been developing in other languages you'd have had to consult "race memory" (that is, look at what has been written before and mimic it).

Of course, the rationale for having coding standards in the first place is threefold: to make the code readable, understandable, and maintainable. Only by having coding standards can we develop as a team in an agile environment. (As a matter of fact, having coding standards is one of the tenets of eXtreme Programming.)

For the C# coding standards, I started out with the Naming Guidelines section of the Design Guidelines for Class Library Developers document, took into account what FxCop likes to flag as erroneous, and then threw in a bunch of preferences culled from several years of writing C# code. There have been some lively (that's a good word) debates internally about some of the rules.

One of the topics for interesting debate was the use (or not) of an underscore prefix for private field members in a class or structure, in order to differentiate them from local method variables. (One of the alternatives, using "m_" as a prefix to mimic our C++ standards, was universally abandoned early on.)

That is, should we write:

public class Foo {
  private ArrayList list;
  // ..etc..
  public void Add(Bar bar) {
    list.Add(bar);
  }
}

Or, should we have:

public class Foo {
  private ArrayList _list;
  // ..etc..
  public void Add(Bar bar) {
    _list.Add(bar);
  }
}

Now, I've worked at places where the first convention was used and at others where the second was in force. Having used both, I'm a fan of the former: not using an underscore at all and having no need to differentiate fields in any special way.

So, to try and keep the discussion on track, I came up with a set of pros and cons for using an underscore:

Pro: In reading code, it’s obvious which identifiers denote the private fields of the class compared with local variables, so you're less likely to confuse them.
Pro: Intellisense will group together all of the private fields in one area of the list making it easy to find the one you need as you're coding head down.
Pro: You don’t have to use the this keyword so often to distinguish a field from a parameter or local variable of the same name.
Con: If the Intellisense field grouping matters, it probably means the class has too many private fields and needs refactoring.
Con: If the reader of a method could get confused as to whether an identifier is a class field or a local variable, possibly the method is too long and should be refactored.
Con: Many times a private field is the backing store for a public property; the code in the class’s methods should be using the property instead of the field, in which case the field is only used in a couple of places within the class anyway.
Con: Code generators such as the WinForms and ASP.NET designers do not label private fields with an underscore; if we require underscores, then either we make an exception in this case or the developer must manually go through and change the names.
Con: If an underscore is required for private fields, it could indicate that we should also have a special prefix for method parameters, for local variables, for class static variables, and so on. (And the arguments against all of these would follow the same points as above.)
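
To make the this point and the backing-store point concrete, here is a minimal sketch of the no-underscore style. It reuses the Foo and Bar names from the snippets above; the name field, the Name and Count properties, and the Describe method are hypothetical additions for illustration only, not part of any real code base.

```csharp
using System.Collections;

public class Bar { }

public class Foo {
  private ArrayList list;  // no underscore prefix
  private string name;     // hypothetical backing store for the Name property

  // Without a prefix, the "name" parameter hides the "name" field, so the
  // assignment needs "this." to reach the field. With an underscore
  // convention the same line would read: _name = name;
  public Foo(string name) {
    this.list = new ArrayList();
    this.name = name;
  }

  // The backing fields are touched only here and in the constructor.
  public string Name {
    get { return name; }
  }

  public int Count {
    get { return list.Count; }
  }

  public void Add(Bar bar) {
    list.Add(bar);
  }

  // Other members go through the properties rather than the fields, so the
  // question of how the fields are spelled rarely comes up.
  public string Describe() {
    return Name + " holds " + Count + " item(s)";
  }
}
```

In this sketch the constructor is the only place where this is actually required, and each field appears in just a handful of lines, which is roughly the situation the backing-store argument describes.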

Notice that some of my arguments against using an underscore have to do with complexity: if you need the extra information provided by the underscore, it points to the class or method being too long, too complex, or having too much behavior. To my mind, not being able to tell immediately whether an identifier is a local variable or a field of the object as you're reading a method points to a too-much-complexity issue, not to a naming convention problem. If the method is long or complicated, having a bunch of identifiers prefixed with underscores is just not going to help matters.

Look at the two code snippets above. Does the readability of the Foo.Add() method increase with the use of the underscore? No, I'd say not. In fact, I'd even posit that the underscore adds a little "jolt" to reading the method. It looks as if I've messed up the alignment or indentation, perhaps. Underscores just seem to disappear into the line underneath. I'm distracted by this possibility instead of concentrating on what the method does.

Now, I must admit that, in my years of programming, I've hardly ever used an underscore. Just by having some present, code looks weird to me. I seem to have the hardest time reading Ruby code, for example. So I do have some bias in the matter.

Anyway, the feedback I received either supported the "no underscore" position or didn't care either way. There were only a couple of people who really pushed for the underscore. With the choice down to me, I opted for simplicity: our C# coding standards do not require the extra decoration for private fields. In fact, it's an error to use an underscore.