Author Archive

Converting IEnumerable to a Comma-Delimited String

I’m not sure whether it’s the fastest way to convert an enumerable collection of longs or ints to a comma-delimited list in C#, but it might be the shortest.

  1. IEnumerable<long> ids = new long[]{1,3,4,5};
  2. string delimitedIds = string.Join(",", ids.Select(x => x.ToString()).ToArray());

If you need a LINQ-free version for backward compatibility, check out Missing Functions on IEnumerable on Steve Cooper’s blog.

Tools: Beyond Compare 3

With my roots in maintenance programming, I’ve used a wide assortment of programming tools. Maintaining legacy code is not much fun. Anything that makes it easier gets added to my personal collection of productivity applications. I make a point of learning them inside and out.

I carry the tools with me from project to project. I recommend them to coworkers. I’ll use them even if my employer provides alternate tools for free. I’ll complain loudly if the company I work for won’t allow me to use them — for policy reasons, consistency reasons, or security reasons. (Most of my bosses eventually concede the point. After all, they’re interfering with my ability to get things done, which is presumably why they hired me.)

Beyond Compare by Scooter Software is one of those tools. It’s the best file comparison tool I’ve ever used. And not just me - It’s also earned praise from other programmers.

When I heard that version 3 was soon to be released, I was both excited and a bit nervous. What new features would they add? Were they going to break any behavior I relied on?

The website lists the new features, but the best new feature is the polished user interface, which makes it easier to move blocks of text and make inline edits to files.

File Comparison in Beyond Compare 3

File Comparison in Beyond Compare 3

And no, they didn’t mess up anything in the process.

I played with the beta for several weeks, and recently bought the full version. If you’re not using it already, you should try it.

Creating Dummy Targets for Configuration Objects

The ConfigurationManager class introduced in .NET 2.0 makes it easy to read application settings from an XML file. I especially like the ability to derive a class from ConfigurationSection to hold custom settings for your application. This MSDN tutorial on creating custom configuration sections can help you get started.

I used this to make the configuration files for several of our Infovark add-ins, but ran into a snag with our main API library. In order to interoperate with COM, we had to put out Infovark.Api.dll in the GAC.

This presents a big problem for using *.config files. If your assembly is in the GAC, your configuration file must live in the GAC as well. (By default, configuration files are sidecar files located in the same directory as your *.exe file.) Since the GAC lives in a special place on a Windows machine, it’s difficult to read and write from that location without special permissions. And you can forget about browsing to it using Windows Explorer. This makes it tough for folks to change configuration options, which defeats the whole point of XML-based configuration files.

It’d be nice if we could load the configuration file from an specific spot on the computer. But while the Configuration object has both Save() and SaveAs() methods, there’s no corresponding Load() method. Huh? According to MSDN, the “right” way to point your application at a different configuration file is to create a whole new app domain with the appropriate settings. Um… sure.

How about we just hack up a workaround instead?

Using a dummy target

You can fool the configuration object into loading settings from whatever .config file you want, if you don’t mind a hack or two. The Configuration object exposes an OpenExeConfiguration() method that takes a string. Despite its name, you don’t have to pass it an .exe file. Any file path will do, as long as the path exists.

Since my .dll was in the GAC, I didn’t have a target for the OpenExeConfiguration() to use. I could have pointed it at another .dll — or at a .txt file for that matter — but that wouldn’t be very intuitive. Instead, I created a temporary file without an extension in the location I wanted to save the configuration file. Then I can open a Configuration object using the dummy target. Saving the Configuration object will cause it to write a file named “[configurationTarget].config” to the path I specified. You can see the code I used below.

///
  1.         /// Loads a .NET configuration file using the specified target.
  2.         /// Since configuration files are normally sidecar files, you
  3.         /// normally provide the path to an .exe or .dll file. Unlike
  4.         /// ConfigurationManager.OpenExeConfiguration(), this method
  5.         /// creates a dummy file without an extension to use as its target
  6.         /// if the target file does not always exist.
  7.         ///
  8.         ///
  9. The path and name of the dummy file used as the target.
  10.         /// A Configuration object
  11.         public Configuration LoadConfiguration(string configurationTarget)
  12.         {
  13.             bool useDummyTarget=false;
  14.             try
  15.             {
  16.                 FileInfo fi = new FileInfo(configurationTarget);
  17.                 if (!fi.Exists)
  18.                 {
  19.                     useDummyTarget = true;
  20.                     using (StreamWriter sw = fi.CreateText())
  21.                     {
  22.                         sw.WriteLine("Hi! This file only exists to make the Microsoft .NET framework happy.");
  23.                         sw.WriteLine("It's important because Infovark can't load its configuration file without it.");
  24.                         sw.WriteLine("(Don't ask. It's a long, long story.)");
  25.                         sw.Flush();
  26.                         sw.Close();
  27.                     }
  28.                 }
  29.  
  30.                 return ConfigurationManager.OpenExeConfiguration(configurationTarget);
  31.             }
  32.             catch(Exception e)
  33.             {
  34.                 throw new ConfigurationErrorsException("Unable to load a configuration file using " + configurationTarget + " as a target. See inner exception for details.", e);
  35.             }
  36.             finally
  37.             {
  38.                 // Clean up our dummy file.
  39.                 if (useDummyTarget) File.Delete(configurationTarget);
  40.             }
  41.         }

Once I’ve opened the Configuration object, I don’t need the dummy file any more. I delete it to avoid have weird extension-less files hanging around.

It’s not pretty, but it gets the job done.

It Doesn’t Get Better Than This

Our latest revision got high marks from our source control tools.

We are totally 1337.

Tools: ReSharper 4.0

We just finished our trial period for ReSharper from JetBrains. We’re buying licenses right now. It’s become indispensable to us. It’s that good.

ReSharper is like pair programming for introverts. It’s like a real-time FxCop, offering refactorings and best practices advice while you type.

Gordon had used ReSharper in its 2.0 days. I’d heard many positive things about ReSharper, but hadn’t tried it myself. The recently released 4.0 version offers support for C# 3.5, including the var keyword, object and collection initializers, and lambda expressions. Check out the in-depth review by Simon Hart if you want more details. Or just try it yourself.

Enums are Ints that Ain’t

Enumerations are incredibly useful in Microsoft .NET, but they can be odd to work with at times. While researching something to do with the new System.Addin namespace in C# 3.5, I was reminded of some enum craziness I’d forgotten.

Enumerations are implemented as collection of integer constants. You can cast any integer to an enum type, regardless of whether it’s been defined in the collection. That makes the code below legal, despite the fact that no item in our enum has a value of 55.

  1. public enum Color
  2. {
  3.     Black = 0,    
  4.     Red = 1,
  5.     Green = 2,
  6.     Blue = 3
  7. }
  8.  
  9. Brush.Color = (Color)55;

You can’t rely on the compiler to enforce legal enum values. This means if you’re using enums in case statements, you ought to include a default statement to catch those cases where an unexpected value gets passed. It’s always a good defensive coding measure, but I’d mistakenly assumed I could skip it in the case of enums. Not anymore.

ISO Date Validation RegEx

I needed a regular expression to correctly parse ISO 8601 format dates and times. The standard includes many alternative representations, but I was particularly concerned about the subset of ISO 8601 formats allowed in XML.

Paul Ward had posted a date parsing expression on the RegExLib site. I extended it a bit to handle times and time zones. I’m pretty sure they work, though it can be hard to tell sometimes. (See Jeff Atwood’s Now You Have Two Problems.)

ISO date
^(\d{4})\D?(0[1-9]|1[0-2])\D?([12]\d|0[1-9]|3[01])$

ISO time
^([01]\d|2[0-3])\D?([0-5]\d)\D?([0-5]\d)?\D?(\d{3})?$

ISO offset
^([zZ]|([\+-])([01]\d|2[0-3])\D?([0-5]\d)?)?$

ISO date and time
^(\d{4})\D?(0[1-9]|1[0-2])\D?([12]\d|0[1-9]|3[01])(\D?([01]\d|2[0-3])\D?([0-5]\d)\D?([0-5]\d)?\D?(\d{3})?)?$

ISO date, time, and offset (the works)
^(\d{4})\D?(0[1-9]|1[0-2])\D?([12]\d|0[1-9]|3[01])(\D?([01]\d|2[0-3])\D?([0-5]\d)\D?([0-5]\d)?\D?(\d{3})?([zZ]|([\+-])([01]\d|2[0-3])\D?([0-5]\d)?)?)?$

If you find it useful — or find a bug — let me know.

256 Character Filenames Should be Enough for Anybody

One of the key components of Infovark is a file crawler. We monitor specified folders for additions, updates and deletes, so that we can let users know what changes have occurred. I figured that making a recursive descent through files and folders in Windows would be a snap. And it would be, if not for the details.

Walking a file hierarchy is one of those basic examples included in just about every programming book, just like “Hello World!”. You can find sample code everywhere, but here’s the MSDN article on Iterating Through a Directory Tree. Notice how it recommends that you read about how NTFS works, at the end of the article? That’s where the important details hide.

Inconstant Constants

Implement the naive version of file recursion, and you’ll likely get a System.IO.PathTooLongException in fairly short order. This is because Windows filenames have a maximum limit of 260 characters. Most of the time. There are a few gotchas. Check out the Naming a File article. After reading it, you’ll probably have similar reactions to the folks on this Joel on Software forum thread about Windows MAX_PATH.

Here’s the gist: The Windows shell has a 260 character limit on its filenames. This is the MAX_PATH constant. However, the OS Kernel itself supports filenames with up to 32,000 characters (for compatibility with UNIX systems). So it’s trivially easy to work around this “constant” using a variety of hacks. For example:

  1. Sometimes you can squeeze in a few extra characters by using the DOS 8.3 short name format. This gives your files bizarre names like C:\PROGRA~1\DOCUME~1\LongName.txt.
  2. You can drop down into Win32 API calls and unmanaged code. Certain file-handling functions accept the “\\?\” prefix, which lets you use the UNIX-style names. Naturally, this comes with additional baggage that I can’t even begin to describe.
  3. You can map a drive to a folder deep in your tree. This effectively fakes out the Windows shell into thinking the path is much shorter than it really is.
  4. You can create a shared folder and bypass the length restriction. This works for the same reason that mapping a drive does.

The most common case where these hacks crop up is when a Windows server administrator maps network drives. While end users can create file hierarchies nested right up to the 260 character limit on their H: drives, the administrator of the file server can’t actually navigate that deeply. That’s right: The file server administrator can’t actually reach the deepest files on the machine because he sees C:\File Shares\Shared Drives\ where end users see H:. As you can imagine, this makes backing up network drives no fun at all.

But there’s a fix, right?

Not really. You’ll have to figure out how to deal with all these exceptions and hacks yourself. The Base Class Library (BCL) Team at Microsoft is working on both temporary and long-term solutions for the .NET Framework. You can read Part 1, Part 2 and Part 3 on the BCL Team Blog for all the gory details.

The core issue is the trade-off between consistency and backward compatibility. It’s a challenge that becomes tougher every year, both for Microsoft itself and for developers using the MS platform. It’s amazing how backward-compatible Microsoft is, given its quarter century accumulation of legacy code. It’s great for businesses and consumers, but from a developer’s perspective, all those gotchas add up over time. I don’t want to have the institutional memory of Raymond Chen just to navigate the filesystem. And is 256 characters a reasonable limit for filenames in a modern OS?

Oh, in case you were wondering, the maximum URL length is 2,083 characters in Internet Explorer.

Swapping NUnit for xUnit?

In our previous programming jobs, neither Gordon nor I had to do much testing. We tended to focus on architecture and development issues. Just prior to forming Infovark, we were acting as consultants for an ECM vendor that has since been snapped up by HP. So while we were aware of things like unit testing and test-driven development, we’d never had to do much of it ourselves.

We knew it was important, though. We’d seen too many things go wrong in software projects that didn’t make testing a core part of their standard operating procedures. So we incorporated testing into our product development and we’ve been learning about this testing stuff as we go. (It always happens that way, doesn’t it?)

We began with NUnit, the gold standard for unit testing on the .NET Platform, and quickly discovered some things that everyone else already knew:

  1. Good unit tests are essential for refactoring. It’s the only way you can make big OO changes with confidence that you haven’t broken anything that used to work.
  2. Good unit tests are are essential for troubleshooting. It makes it much easier to pinpoint the source of the errors when things go wrong.
  3. Good unit tests are essential for making your product fail gracefully. Every program will fail at some point. It’s how a program reacts to a failure condition that counts.
  4. You can’t write good test suites without good specifications. You’ll forget to test key assumptions and dependencies. You won’t confirm error conditions. You’ll couple your tests to your current object model, rather than to program requirements.
  5. If you’re doing it right, you’ll write far more code for testing than for the product itself. It’s much harder to verify correctness than it is to write correct code.
  6. Unit testing is different from integration testing. Integration testing is different from system testing. All three are important.

In addition to these common nuggets of TDD wisdom, we also discovered something else: It’s dreadfully easy to shoot yourself in the foot with NUnit tests. It’s easy to write tests that aren’t atomic, for example, or tests that expect errors to occur in one line of code that pass because another line of code throws a similar error.

To address some of these quibbles, a splinter group has launched xUnit on CodePlex. It aims to simplify testing syntax, incorporate more .NET 3.5 constructs, and prevent some of the more common NUnit anti-patterns that testing neophytes (like us) regularly develop.

We’ve messed around with XUnit a bit, and so far we like what we see. It’s not enough yet to make us abandon NUnit though, especially with NUnit 2.5 on the horizon. Besides, the NUnit framework is integrated with a variety of other development tools we’ve found indispensable. (I.e., everything made by JetBrains.)

But that’s just our opinion. What do you think? How do these things stack up? And what about the testing framework built in to Visual Studio Team System? We’d love to hear your experiences with these tools.

Using WCF for REST, Part 3

The easiest way to explain the issues we encountered when implementing REST is to work through the design principles we followed. I think much of our trouble came from the fact that we come from a web applications background, not a SOAP services background. I’m hoping that by laying out our REST design, some of the Microsoft gurus can help us do things “the WCF way.” And perhaps we can help the WCF team out by highlighting a handful of places where we found WCF unintuitive.

The Importance of Being Addressable

Fundamentally, the REST pattern is about making resources available. This means that each item stored within your system can be accessed by someone with the correct permissions. Every one of these items has its own unique address, and its address should not change. This consistency is important, because it allows both people and computer programs to remember and reference items in your system.

Note that we’re talking about resources or items. In contrast with the SOAP model of web services, which allows programmers to invoke procedures on remote computers, REST is about providing data. SOAP is about verbs, while REST is about nouns. A SOAP service might CalculateTotalSale(); a REST service provides CustomerRecieptNo_12345. The kind of web services architecture you use will depend on the kind of application you’re building. The choice has major implications for the other components of your system.

REST imposes restrictions on what sort of things you can do, because it supports only a handful of actions: GET, POST, PUT, and DELETE. (There are a few other HTTP methods, but these four are the most important.) Fortunately, with these four actions, you can accomplish most basic programming tasks. There’s a close parallel to these actions and create/read/update/delete, or CRUD, the building blocks of data storage systems.

UriTemplate

Since the address, or URI, is the primary way to access information in your system, it’s effectively part of your user interface. All the principles of good user interface design apply. So when designing a REST service, you need fine control over the structure of these identifiers.

SOAP, by contrast, typically has just one endpoint. The address itself conveys no information about what services are provided — that’s why SOAP services require a separate WSDL file to tell folks what’s possible. With REST, it should be easy to discover the extent of the system by looking at the URIs alone.

Coming up with good REST URI patterns can be tricky. Using short, descriptive naming conventions for your resources makes them easier to type. But URI patterns must also be distinct and unambiguous.

In the .NET framework, you use the UriTemplate class to define patterns. The UriTemplate implementation that shipped with .NET 3.5 allowed you to define variables that fit into slots in your URI. A typical UriTemplate might look like this: http://restserver/{object}/{id}?view={viewname}.

WCF looks for incoming URIs that match the patterns you define. The pattern defined above would match the following URIs:

  • http://restserver/customer/5?view=profile
  • http://restserver/article/how_to_do_stuff?view=print
  • http://restserver/author/John-Smith?view=1

Once you’ve defined a UriTemplate, you bind it to a method that has the same number of parameters. (I won’t go into the ABCs of WCF here, but you can check out this MSDN Introduction to WCF if you need a refresher.)

In WCF 3.5, you could only define a variable for a whole segment. A segment is basically the bit between the one forward slash and another, or one querystring parameter. A few bloggers requested more flexibility in UriTemplates, and the WCF team answered with the soon-to-be-released 3.5 SP1. The ability to define variables for partial segments was crucial for our URI design.

Representation Matters

Most books about web services, including RESTful Web Services, advocate leaving off file extensions from your URIs. This makes sense for SOAP, where you’re accessing methods and all responses are transmitted in XML. But in REST, you’re serving up items.

In our case, some of these items being served were files and some were records from a database. It seemed inconsistent to have some endpoints that had file extensions and others that didn’t. And we also wanted to be able to serve up different representations for our database records. Our REST service supports both JSON, XML, and HTML. It made sense to use a file extension to distinguish between the different representations.

One workaround would have been to create endpoints like http://restserver/form/1040/xml but that looked funny next to URIs like http://resterver/file/documentation.pdf. True RESTafarians would point out that neither the “/xml” or the “.pdf” are needed, since you can request an appropriate representation using the HTTP ACCEPT header. We decided against the header approach because not all browsers use the ACCEPT HTTP header. Besides, it might be useful for us humans to be able to reach alternate formats by simply changing the URL in the browser address bar.

In WCF 3.5, this required us to create three times the number of endpoints, with a separate method to handle each. We can’t wait for the official release of 3.5 SP1 to make UriTemplates like http://restserver/user/{id}.{ext} possible.

The Final Slash

Another source of endpoint duplication was the need to have two different endpoints for http://restserver/folder and http://restserver/folder/. Because the slash is used as a segment delimiter, the dispatcher in WCF 3.5 saw these two URLs as fundamentally different.

So handling what we thought were fairly trivial cases in URI patterns led us to create FIVE TIMES the number of endpoints we wanted. It’s a maintenance nightmare. SP1 can’t get here soon enough.