Archive for Programming

The Curse of the Singleton

It took us six weeks to break the curse of the singleton. Six weeks! By the end of it, we’d rewritten most of our data access layer.

We began the process of removing singletons innocently enough. I thought I was well prepared for the task. I’d just finished reading The Pragmatic Programmer and Working Effectively with Legacy Code (both reviewed below). I remember telling Gordon I’d tackle the problem over the weekend…

What’s a singleton?

The Singleton Design Pattern is one of the first patterns introduced in many software design books. But don’t let this fool you like it did me. Its prominent position has nothing to do with its importance. The Singleton is usually listed first because it’s the easiest pattern to explain and implement. It made a convenient place for the author to start, but the Singleton’s real uses are very limited.

Which is appropriate, actually, since the real use of the singleton is to limit usage. A class that implements the Singleton pattern allows only one object to be instantiated at a time. There are a few cases where this is desirable, such as classes that control access to a single hardware device or that set up global variables. But the danger of the Singleton is that there are many cases where you’ll want to misuse it.
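For readers who haven’t met the pattern, here is a minimal sketch of what a Singleton usually looks like in C#. The DeviceController class and its members are hypothetical names made up for illustration; they are not taken from the Infovark code.

```csharp
using System;

// Hypothetical example class; not from the Infovark code base.
public sealed class DeviceController
{
    // The runtime initializes this static field once, in a thread-safe way.
    private static readonly DeviceController _instance = new DeviceController();

    // A private constructor stops callers from creating more instances.
    private DeviceController() { }

    // The single, globally reachable access point.
    public static DeviceController Instance
    {
        get { return _instance; }
    }

    // Shared state: every caller sees and changes the same counter.
    public int CommandsSent { get; private set; }

    public void Send(string command)
    {
        CommandsSent++;
    }
}
```

Any code anywhere can call DeviceController.Instance, which is exactly what makes the pattern so convenient and so easy to abuse.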

Why are they bad?

Scott Densmore lists the four key characteristics of the Singleton and how each can get you into trouble in his Why Singletons are Evil blog post.

For another cautionary tale of the cycle of attraction, infatuation, disappointment, and rejection, read Singleton, I love you, but you’re bringing me down.

In our case, we’d gleefully implemented Singletons for database access, content indexing, security and access control, and in a few other places where we thought we needed just one instance. If Steve Yegge were here, he’d call what we’d done an instance of the Simpleton pattern — a failure to grasp basic principles of object-oriented programming. You can read more about Yegge’s thoughts on the singleton and design patterns for dummies.

Our automated tests were running slowly because we had to set up and tear down the database for every test. Making a change to one component would frequently cause several tests to fail. Everything was tied together at the hip — at the Singleton classes — and it was impossible to disentangle our code to test particular items in isolation. We had tests, but not unit tests. They were integration tests, and the points of integration were the handful of singleton classes we’d built.

Worse, our database performance was lousy. Since we had a global variable for our database object, we could sprinkle database access code throughout the rest of our object model. We discovered that we were opening and closing database connections all the time. And we’d had to implement tricky locking code to guarantee that our SQL statements would get executed in the right order.

What did we do about them?

The Singleton let us be lazy about our programming habits. It allowed us to make assumptions we shouldn’t have. You can call it premature optimization or a retreat into procedural programming techniques from an earlier era. Ultimately, we’d found that it allowed us to cut too many corners.

So we slowly rooted out each Singleton class from our API and reimplemented the functionality in other ways. Fortunately, we had a large battery of integration tests to help guide us. And luckily, we’d decided to tackle the problem during our first Alpha test, when we could still afford to make sweeping changes. But correcting bad design takes much longer than avoiding it in the first place — even if you’ve read all the right books.
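This post doesn’t spell out what replaced each Singleton, so treat the following as a sketch of one common alternative rather than a description of our actual design: pass the dependency in through the constructor. IDataStore, InMemoryDataStore, and ContentIndexer are hypothetical names invented for this example.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical abstraction over the data access layer.
public interface IDataStore
{
    string Load(long id);
}

// In-memory stand-in that a unit test can use instead of a real database.
public class InMemoryDataStore : IDataStore
{
    private readonly Dictionary<long, string> _rows = new Dictionary<long, string>();

    public void Add(long id, string value)
    {
        _rows[id] = value;
    }

    public string Load(long id)
    {
        string value;
        return _rows.TryGetValue(id, out value) ? value : null;
    }
}

// The consumer receives its dependency instead of reaching for a global.
public class ContentIndexer
{
    private readonly IDataStore _store;

    public ContentIndexer(IDataStore store)
    {
        if (store == null) throw new ArgumentNullException("store");
        _store = store;
    }

    // Counts the space-separated words stored under the given id.
    public int CountWords(long id)
    {
        string text = _store.Load(id);
        if (string.IsNullOrEmpty(text)) return 0;
        return text.Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries).Length;
    }
}
```

Because ContentIndexer only knows about the interface, a test can hand it an InMemoryDataStore and never touch a database, which is the kind of isolation the Singletons made impossible.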

Six weeks later, we finally sorted out the mess we’d made for ourselves. There’s a handful of odds and ends left to do, but the design feels better. My gut tells me it’s an improvement, and our tests — now we have both unit and integration tests — show that we’ve almost tripled the speed of the data access layer.

It was worth our time to break the Curse of the Singleton. Beware lest ye, too, fall under its spell!

Review: Working Effectively with Legacy Code

I know what you’re thinking: “Infovark’s been around for barely a year! Surely you guys aren’t having to deal with legacy code already?” If you accept Michael Feathers’s expansive definition of legacy code (code without unit tests), then yes, despite our best efforts, we have lots of legacy code.

But even if you don’t buy that definition, and even if you’re working on a completely greenfield application, chances are you’ll have a lot of code in your project that isn’t fully understood. Or perhaps isn’t fully understood by all members of the team. And it’s in dealing with this issue that Working Effectively with Legacy Code really shines.

What happens when you need to change code that isn’t fully understood? Are you making it better or worse? The author says that you can’t know the answer to this question without tests in place. Having documentation is nice, but unit testing provides measurable output.

The brass tacks

Unlike many programming books, this one is organized in a Q&A format. Once past the introduction — which you can skip if you already understand the importance of refactoring and test-driven development — you’ll find the chapters organized by topic. Here’s a sample of a few chapter headings:

  • It Takes Forever to Make a Change
  • I Can’t Run This Method in a Test Harness
  • Dependencies on Libraries are Killing Me
  • I Don’t Understand the Code Well Enough to Change It
  • This Class is Too Big and I Don’t Want It to Get Any Bigger

This is a great way to organize a highly technical book. Each chapter has a specific purpose. The author then spends the chapter discussing the ways you can get out of the jam and weighing the pros and cons of each.

You’ll find in-depth examples of each of the techniques used, but be prepared to shift between languages. To get the most out of the book, you’ll need to be comfortable scanning unfamiliar syntax.

As a bonus, the book contains an index of common refactoring patterns. Certain patterns make appearances in more than one chapter, and the index provides another place for the author to work through some real-world examples.

All in all, this is a practical field manual for a set of problems that occur too often out in the wild. I highly recommend it.

Review: The Pragmatic Programmer

You’ll find The Pragmatic Programmer on many software developers’ must-read books lists. After reading it from cover to cover, I’ve added it to my essential reading list as well.

It’s not a book for beginners, though. The subtitle of the book, “From Journeyman to Master,” sums it up. The Pragmatic Programmer describes the skills, attributes, and attitudes that a mid-level programmer needs to become a professional developer.

Its purpose is to distill the wisdom gathered from a career in programming into about 70 tips. Each of these tips is explained and illustrated with examples that most programmers will find familiar.

The tips are not necessarily about writing code. The authors, Andrew Hunt and David Thomas, take a holistic approach to the craft of programming. They cover topics like communicating effectively, planning and scheduling, and building teams.

I’d read somewhere that you can judge the quality of a craftsman by the quality of his tools. The Pragmatic Programmer is a book I’d expect to see on any professional developer’s shelf.

Write Big to Write Small

Writing code is like writing literature. Sometimes you have to write big to write small.

OK, if you’re a rock star, you might be able to think big and write small, but I can’t manage that feat. I often need a day of hunting down nasty copy-and-paste bugs before I realize, hey, all this stuff is redundant. I can write a function that makes this duplication unnecessary.

I can’t count the number of times I’ve tried to write concise, elegant code from the start, only to find that I’ve been writing small by thinking small. Getting lost in the details is a great way to write optimized code that works like a charm — but doesn’t actually do anything of value.

Once I thought that design patterns might be the answer to my programming struggles. Then I discovered that my accuracy rate in picking the right pattern for any given programming problem was really poor. More often, it’s led me to implement overly complex solutions for simple problems.

For a long time, I thought it might be my lack of a computer science degree. But I haven’t found a programming style or development framework that fixes the problem. Maybe there just isn’t an efficient way for me to get my thoughts into the computer. The only way I know to do it is to write a big, sloppy mess at the start. Then I’ll slowly, carefully, edit it down to its bare essentials.

Mozart might be able to write a symphony in a single sitting without penning a stray note. Not me.

Switch By Type in C#

The introduction of generics into C# 2.0 simplified many programming tasks. It especially helped in the creation of type-safe collections.

During our reworking of the Infovark data access layer, we created several generic methods to return items from our database. This let us remove many duplicated methods and a lot of type casting. For example, before we began using generics, we had methods with signatures like this:

    public Entity GetEntity(long id, int revision) { /* ... */ }
    public Relationship GetRelationship(long id, int revision) { /* ... */ }
    public Resource GetResource(long id, int revision) { /* ... */ }

They became this after our refactoring:

    public T GetObject<T>(long id, int revision) where T : MetaInstance
    {
        Type type = typeof(T);

        // Compare the requested type against each supported type in turn.
        if (type == typeof(Entity)) return _DatabaseHandler.GetEntity(id, revision) as T;
        if (type == typeof(Relationship)) return _DatabaseHandler.GetRelationship(id, revision) as T;
        if (type == typeof(Resource)) return _DatabaseHandler.GetResource(id, revision) as T;

        return null;
    }

Wait a minute. All we’re doing here is wrapping all our unique methods up in a generic method! Why bother? Good question. We abandoned that approach in favor of something more sensible later on.

That’s actually not the point of this post. The point is that C# doesn’t have the ability to perform a switch on types. See all those if statements in our generic method? That’s how we worked around the lack of type switching. If anyone has a better approach, we’d like to hear it.

Peter Hallam’s WebLog provides more information about why C# doesn’t have the ability to switch on a type. I’m not convinced by the reasons outlined there. As one commenter noted, there’s already a type-switching construct in C#: the catch statement.
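One workaround that avoids the chain of if statements is a dictionary keyed by Type that maps to a delegate. The sketch below is only an illustration of that idea, not what we ended up shipping. The MetaInstance, Entity, Relationship, and Resource classes here are empty stand-ins for the real types, ObjectLoader is a hypothetical name, and the lambdas are placeholders for calls into the database handler.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-ins for the types named in the post.
public abstract class MetaInstance { public long Id; }
public class Entity : MetaInstance { }
public class Relationship : MetaInstance { }
public class Resource : MetaInstance { }

public class ObjectLoader
{
    // Maps each supported type to the function that loads it.
    private readonly Dictionary<Type, Func<long, int, MetaInstance>> _loaders =
        new Dictionary<Type, Func<long, int, MetaInstance>>();

    public ObjectLoader()
    {
        // Placeholder loaders; real ones would call the database handler.
        _loaders[typeof(Entity)] = (id, revision) => new Entity { Id = id };
        _loaders[typeof(Relationship)] = (id, revision) => new Relationship { Id = id };
        _loaders[typeof(Resource)] = (id, revision) => new Resource { Id = id };
    }

    public T GetObject<T>(long id, int revision) where T : MetaInstance
    {
        Func<long, int, MetaInstance> loader;
        if (_loaders.TryGetValue(typeof(T), out loader))
        {
            return loader(id, revision) as T;
        }
        return null; // Unknown type, mirroring the original method's fallback.
    }
}
```

Supporting a new type becomes one more dictionary entry rather than one more branch. The lookup matches exact types only, so a subclass of Entity would need its own entry.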

Don’t Mix Your Serialization

Who doesn’t like mixing their Raisin Flakes with their Oaty-O’s in the morning? Yum! But it’s not a good idea if you’re talking about serial formats in .NET 3.5 instead of breakfast cereals. You’ll get output that might leave a bad taste in your mouth.

Breakfast Quiz

Question: You’re writing a web API for an application. To give developers the most flexibility in interacting with your system, you want to expose classes that can be serialized to either XML or JSON. Using WCF and .NET 3.5 SP1, what are your options?

Answer: There’s only one option unless you rely on 3rd party serialization libraries. You must mark the class with the [DataContract] attribute and mark each serializable member with [DataMember]. This allows you to serialize and deserialize using the DataContractSerializer and DataContractJsonSerializer for XML and JSON respectively.
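Here is a minimal sketch of that approach. The Note class and the NoteSerializer helper are hypothetical names made up for this example; the attributes and the two serializer classes are the framework’s own.

```csharp
using System.IO;
using System.Runtime.Serialization;
using System.Runtime.Serialization.Json;
using System.Text;

// Hypothetical class for illustration only.
[DataContract]
public class Note
{
    [DataMember]
    public long Id { get; set; }

    [DataMember]
    public string Title { get; set; }
}

public static class NoteSerializer
{
    public static string ToXml(Note note)
    {
        return Write(new DataContractSerializer(typeof(Note)), note);
    }

    public static string ToJson(Note note)
    {
        return Write(new DataContractJsonSerializer(typeof(Note)), note);
    }

    // Both serializers derive from XmlObjectSerializer, so one helper serves both.
    private static string Write(XmlObjectSerializer serializer, Note note)
    {
        using (MemoryStream stream = new MemoryStream())
        {
            serializer.WriteObject(stream, note);
            return Encoding.UTF8.GetString(stream.ToArray());
        }
    }
}
```

The same annotated class feeds both formats, which is the whole appeal. The trade-off is that you take the default element and property layout in each.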

I mention this because we’d gone to great lengths to customize our XML using the IXmlSerializable interface. This gave us fine control over the properties we wanted to appear in our XML output and how they were formatted. But if you use the IXmlSerializable interface, you can’t also annotate the class with the [DataContract] attribute. The code compiles, but the serializer rejects the type with an exception at run time. Sowmy Srinivasan explains this serialization restriction.

I know what you’re thinking: If the framework provides an IXmlSerializable interface, isn’t there also an IJsonSerializable interface? Sadly, no. There’s no way to fine-tune the JSON output. Sigh.

So, if you’re currently using IXmlSerializable, you can forget about the DataContractJsonSerializer. Or you can accept that you’re fighting the framework, forget about your fancy-pants XML format, and accept the default serialization, keeping these data member best practices in mind.

What did we choose?

Infovark has too much invested in our XML layout at this point. We’ve built our XSD files, XSL Transforms, and many, many unit tests. So we gave up on the DataContractJsonSerializer and turned to the excellent JSON.NET, written by James Newton-King. It’s now version 3.0 and fully supports the new LINQ constructs.

It’s a little more work, but we think it’s worth it.

Converting IEnumerable to a Comma-Delimited String

I’m not sure whether it’s the fastest way to convert an enumerable collection of longs or ints to a comma-delimited list in C#, but it might be the shortest.

    // Requires: using System.Collections.Generic; using System.Linq;
    IEnumerable<long> ids = new long[] { 1, 3, 4, 5 };
    string delimitedIds = string.Join(",", ids.Select(x => x.ToString()).ToArray());

If you need a LINQ-free version for backward compatibility, check out Missing Functions on IEnumerable on Steve Cooper’s blog.
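For comparison, a LINQ-free version can be as simple as a loop over a StringBuilder. This is my own minimal sketch, not the code from Steve Cooper’s post, and the DelimitedList class name is made up for the example.

```csharp
using System.Collections.Generic;
using System.Text;

public static class DelimitedList
{
    // Joins any sequence into a single string without relying on LINQ.
    public static string Join<T>(string separator, IEnumerable<T> items)
    {
        StringBuilder builder = new StringBuilder();
        bool first = true;
        foreach (T item in items)
        {
            // Write the separator before every item except the first.
            if (!first) builder.Append(separator);
            builder.Append(item);
            first = false;
        }
        return builder.ToString();
    }
}
```

Calling DelimitedList.Join(",", ids) with the ids collection from the earlier snippet builds the same comma-delimited list, and an empty collection produces an empty string.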

Creating Dummy Targets for Configuration Objects

The ConfigurationManager class introduced in .NET 2.0 makes it easy to read application settings from an XML file. I especially like the ability to derive a class from ConfigurationSection to hold custom settings for your application. This MSDN tutorial on creating custom configuration sections can help you get started.

I used this to make the configuration files for several of our Infovark add-ins, but ran into a snag with our main API library. In order to interoperate with COM, we had to put our Infovark.Api.dll in the GAC.

This presents a big problem for using *.config files. If your assembly is in the GAC, your configuration file must live in the GAC as well. (By default, configuration files are sidecar files located in the same directory as your *.exe file.) Since the GAC lives in a special place on a Windows machine, it’s difficult to read and write from that location without special permissions. And you can forget about browsing to it using Windows Explorer. This makes it tough for folks to change configuration options, which defeats the whole point of XML-based configuration files.

It’d be nice if we could load the configuration file from a specific spot on the computer. But while the Configuration object has both Save() and SaveAs() methods, there’s no corresponding Load() method. Huh? According to MSDN, the “right” way to point your application at a different configuration file is to create a whole new app domain with the appropriate settings. Um… sure.

How about we just hack up a workaround instead?

Using a dummy target

You can fool the configuration system into loading settings from whatever .config file you want, if you don’t mind a hack or two. The ConfigurationManager class exposes an OpenExeConfiguration() method that takes a string. Despite its name, you don’t have to pass it an .exe file. Any file path will do, as long as the file exists.

Since my .dll was in the GAC, I didn’t have a target for OpenExeConfiguration() to use. I could have pointed it at another .dll (or at a .txt file, for that matter), but that wouldn’t be very intuitive. Instead, I created a temporary file without an extension in the location where I wanted to save the configuration file. Then I opened a Configuration object using the dummy target. Saving the Configuration object causes it to write a file named “[configurationTarget].config” to the path I specified. You can see the code I used below.

    // Requires: using System; using System.Configuration; using System.IO;

    /// <summary>
    /// Loads a .NET configuration file using the specified target.
    /// Since configuration files are normally sidecar files, you
    /// normally provide the path to an .exe or .dll file. Unlike
    /// ConfigurationManager.OpenExeConfiguration(), this method
    /// creates a dummy file without an extension to use as its target
    /// if the target file does not already exist.
    /// </summary>
    /// <param name="configurationTarget">The path and name of the dummy file used as the target.</param>
    /// <returns>A Configuration object</returns>
    public Configuration LoadConfiguration(string configurationTarget)
    {
        bool useDummyTarget = false;
        try
        {
            FileInfo fi = new FileInfo(configurationTarget);
            if (!fi.Exists)
            {
                useDummyTarget = true;
                // Disposing the writer flushes and closes the file.
                using (StreamWriter sw = fi.CreateText())
                {
                    sw.WriteLine("Hi! This file only exists to make the Microsoft .NET framework happy.");
                    sw.WriteLine("It's important because Infovark can't load its configuration file without it.");
                    sw.WriteLine("(Don't ask. It's a long, long story.)");
                }
            }

            return ConfigurationManager.OpenExeConfiguration(configurationTarget);
        }
        catch (Exception e)
        {
            throw new ConfigurationErrorsException("Unable to load a configuration file using " + configurationTarget + " as a target. See inner exception for details.", e);
        }
        finally
        {
            // Clean up our dummy file.
            if (useDummyTarget) File.Delete(configurationTarget);
        }
    }

Once I’ve opened the Configuration object, I don’t need the dummy file any more. I delete it to avoid having weird extension-less files hanging around.

It’s not pretty, but it gets the job done.
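If the dummy file bothers you, the framework also has ConfigurationManager.OpenMappedExeConfiguration(), which takes an ExeConfigurationFileMap naming the .config file directly. The sketch below shows the call; I haven’t verified it against the GAC scenario described above, so treat it as an option to try rather than a proven fix. The ConfigurationLoader class name is made up for the example, and the code assumes a reference to the System.Configuration assembly.

```csharp
using System.Configuration;

public static class ConfigurationLoader
{
    // Opens the configuration stored at an explicit path, with no dummy target.
    public static Configuration Load(string configFilePath)
    {
        ExeConfigurationFileMap map = new ExeConfigurationFileMap();
        map.ExeConfigFilename = configFilePath;
        return ConfigurationManager.OpenMappedExeConfiguration(map, ConfigurationUserLevel.None);
    }
}
```

Unlike OpenExeConfiguration(), the path here is the configuration file itself rather than the file it sits beside.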

Tools: ReSharper 4.0

We just finished our trial period for ReSharper from JetBrains. We’re buying licenses right now. It’s become indispensable to us. It’s that good.

ReSharper is like pair programming for introverts. It’s like a real-time FxCop, offering refactorings and best practices advice while you type.

Gordon had used ReSharper in its 2.0 days. I’d heard many positive things about ReSharper, but hadn’t tried it myself. The recently released 4.0 version offers support for C# 3.0, including the var keyword, object and collection initializers, and lambda expressions. Check out the in-depth review by Simon Hart if you want more details. Or just try it yourself.

Enums are Ints that Ain’t

Enumerations are incredibly useful in Microsoft .NET, but they can be odd to work with at times. While researching something to do with the new System.AddIn namespace in .NET 3.5, I was reminded of some enum craziness I’d forgotten.

Enumerations are implemented as a collection of integer constants. You can cast any integer to an enum type, regardless of whether it’s been defined in the collection. That makes the code below legal, despite the fact that no item in our enum has a value of 55.

    public enum Color
    {
        Black = 0,
        Red = 1,
        Green = 2,
        Blue = 3
    }

    Brush.Color = (Color)55;

You can’t rely on the compiler to enforce legal enum values. This means if you’re using enums in case statements, you ought to include a default statement to catch those cases where an unexpected value gets passed. It’s always a good defensive coding measure, but I’d mistakenly assumed I could skip it in the case of enums. Not anymore.
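Here is a short sketch of both defenses: checking a value with Enum.IsDefined() and adding a default branch to the switch. The Color enum is repeated so the example stands alone, and the ColorNames helper is a hypothetical name made up for illustration.

```csharp
using System;

public enum Color
{
    Black = 0,
    Red = 1,
    Green = 2,
    Blue = 3
}

public static class ColorNames
{
    // Returns true only for values that the enum actually declares.
    public static bool IsValid(Color color)
    {
        return Enum.IsDefined(typeof(Color), color);
    }

    public static string Describe(Color color)
    {
        switch (color)
        {
            case Color.Black: return "black";
            case Color.Red: return "red";
            case Color.Green: return "green";
            case Color.Blue: return "blue";
            default:
                // Reached for casts such as (Color)55.
                throw new ArgumentOutOfRangeException("color", "Unexpected value: " + (int)color);
        }
    }
}
```

With this in place, ColorNames.IsValid((Color)55) returns false, and ColorNames.Describe((Color)55) throws instead of quietly falling through.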

[Edit: See this old post from Greg Vaughn for some other examples of Enum wackiness. Weird.]