Enumerations are incredibly useful in Microsoft .NET, but they can be odd to work with at times. While researching something to do with the new System.AddIn namespace in .NET 3.5, I was reminded of some enum craziness I’d forgotten.
Enumerations are implemented as a collection of named integer constants. You can cast any integer to an enum type, regardless of whether it’s been defined in the collection. That makes it legal to cast 55 to an enum type, despite the fact that no item in the enum has a value of 55.
You can’t rely on the compiler to enforce legal enum values. This means if you’re using enums in switch statements, you ought to include a default case to catch those times when an unexpected value gets passed. It’s always a good defensive coding measure, but I’d mistakenly assumed I could skip it in the case of enums. Not anymore.
[Edit: See this old post from Greg Vaughn for some other examples of Enum wackiness. Weird.]
I needed a regular expression to correctly parse ISO 8601 format dates and times. The standard includes many alternative representations, but I was particularly concerned about the subset of ISO 8601 formats allowed in XML.
Edit 26 May 2009: If you want full ISO compliance, check out the expression Cameron Brooks lists in his comment below.
Paul Ward had posted a date parsing expression on the RegExLib site. I extended it a bit to handle times and time zones. I’m pretty sure they work, though it can be hard to tell sometimes. (See Jeff Atwood’s Now You Have Two Problems.)
ISO date
^(\d{4})\D?(0[1-9]|1[0-2])\D?([12]\d|0[1-9]|3[01])$
ISO time
^([01]\d|2[0-3])\D?([0-5]\d)\D?([0-5]\d)?\D?(\d{3})?$
ISO offset
^([zZ]|([\+-])([01]\d|2[0-3])\D?([0-5]\d)?)?$
ISO date and time
^(\d{4})\D?(0[1-9]|1[0-2])\D?([12]\d|0[1-9]|3[01])(\D?([01]\d|2[0-3])\D?([0-5]\d)\D?([0-5]\d)?\D?(\d{3})?)?$
ISO date, time, and offset (the works)
^(\d{4})\D?(0[1-9]|1[0-2])\D?([12]\d|0[1-9]|3[01])(\D?([01]\d|2[0-3])\D?([0-5]\d)\D?([0-5]\d)?\D?(\d{3})?([zZ]|([\+-])([01]\d|2[0-3])\D?([0-5]\d)?)?)?$
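To make the capture groups concrete, here is a minimal sketch in Python (my choice for illustration; the expression itself is copied unchanged). The `parse_iso` helper and the dictionary keys are hypothetical names, not part of the original expression.

```python
import re

# "The works" expression from above, split across lines for readability only.
ISO_DATETIME = re.compile(
    r"^(\d{4})\D?(0[1-9]|1[0-2])\D?([12]\d|0[1-9]|3[01])"
    r"(\D?([01]\d|2[0-3])\D?([0-5]\d)\D?([0-5]\d)?\D?(\d{3})?"
    r"([zZ]|([\+-])([01]\d|2[0-3])\D?([0-5]\d)?)?)?$"
)


def _to_int(text):
    """Convert a captured group to int, keeping None for absent groups."""
    return int(text) if text is not None else None


def parse_iso(text):
    """Hypothetical helper: return the captured parts as a dict, or None."""
    m = ISO_DATETIME.match(text)
    if m is None:
        return None
    # Groups 4, 10, 11 and 12 (whole time part, offset sign, offset hours,
    # offset minutes) are not needed for this summary.
    return {
        "year": _to_int(m.group(1)),
        "month": _to_int(m.group(2)),
        "day": _to_int(m.group(3)),
        "hour": _to_int(m.group(5)),
        "minute": _to_int(m.group(6)),
        "second": _to_int(m.group(7)),
        "millis": _to_int(m.group(8)),
        "offset": m.group(9),  # e.g. "Z" or "-05:00", None when absent
    }
```

For example, `parse_iso("2008-01-21T12:30:45-05:00")` reports the offset as `"-05:00"`, and `parse_iso("2008-13-01")` returns `None`. One behaviour worth knowing: because each separator is a permissive `\D?`, a trailing `Z` directly after the seconds (as in `2008-01-21T12:30:45Z`) is consumed as a separator, so the string still matches but the offset group stays empty. With milliseconds present (`...45.123Z`) the `Z` lands in the offset group as expected.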
If you find it useful — or find a bug — let me know.
Edit 21 Jan 2008: Stan James sent us some enhancements to the validation routines. Here’s what he had to say:
I needed a RegExp that could detect ISO dates with varying precision (e.g. “1945”, “1945-12”, “1945-12-01”, “1945-12-01T12:15”, etc.).
For future readers, here’s what I came up with:
Date only
^[0-9][0-9][0-9][0-9](-[0-1][0-9](-[0-3][0-9])?)?$
Date and Time
^[0-9][0-9][0-9][0-9](-[0-1][0-9](-[0-3][0-9](T[0-9][0-9](:[0-9][0-9](:[0-9][0-9])?)?)?)?)?$
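Because each optional part of Stan's date-and-time expression is nested inside the previous one, the number of groups that matched tells you how precise the input was. A minimal Python sketch of that idea follows; the `precision` function and the level names are hypothetical, added only for illustration.

```python
import re

# Stan's date-and-time expression, unchanged apart from the line split.
ISO_VARYING = re.compile(
    r"^[0-9][0-9][0-9][0-9](-[0-1][0-9](-[0-3][0-9]"
    r"(T[0-9][0-9](:[0-9][0-9](:[0-9][0-9])?)?)?)?)?$"
)

# One name per nesting depth: zero matched groups means year only.
LEVELS = ["year", "month", "day", "hour", "minute", "second"]


def precision(text):
    """Hypothetical helper: name the finest component present, or None."""
    m = ISO_VARYING.match(text)
    if m is None:
        return None
    depth = sum(1 for group in m.groups() if group is not None)
    return LEVELS[depth]
```

So `precision("1945")` gives `"year"` and `precision("1945-12-01T12:15")` gives `"minute"`. Note that these expressions check shape rather than range: `[0-1][0-9]` and `[0-3][0-9]` accept a month of 19 or a day of 39, so `"1945-19-39"` is reported as `"day"` precision.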
Thanks, Stan!
One of the key components of Infovark is a file crawler. We monitor specified folders for additions, updates and deletes, so that we can let users know what changes have occurred. I figured that making a recursive descent through files and folders in Windows would be a snap. And it would be, if not for the details.
Walking a file hierarchy is one of those basic examples included in just about every programming book, just like “Hello World!”. You can find sample code everywhere, but here’s the MSDN article on Iterating Through a Directory Tree. Notice how it recommends that you read about how NTFS works, at the end of the article? That’s where the important details hide.
Implement the naive version of file recursion, and you’ll likely get a System.IO.PathTooLongException in fairly short order. This is because Windows paths have a maximum length of 260 characters. Most of the time. There are a few gotchas. Check out the Naming a File article. After reading it, you’ll probably have similar reactions to the folks on this Joel on Software forum thread about Windows MAX_PATH.
Here’s the gist: The Windows shell has a 260-character limit on its paths. This is the MAX_PATH constant. However, the OS kernel itself supports paths of roughly 32,000 characters (32,767, to be precise), for compatibility with UNIX systems. So it’s trivially easy to work around this “constant” using a variety of hacks. For example:
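As a rough illustration of a more defensive crawl, here is a minimal sketch in Python rather than .NET (so failures arrive as `OSError` rather than `PathTooLongException`). It is not Infovark's crawler; the `crawl` function and its `limit` parameter are hypothetical. It keeps walking when a folder can't be listed and sets aside any path at or beyond the limit instead of trying to open it.

```python
import os

# MAX_PATH includes the terminating null character, so a path of
# 260 visible characters is already one too many.
MAX_PATH = 260


def crawl(root, limit=MAX_PATH):
    """Hypothetical helper: return (files, too_long, errors) under root."""
    files, too_long, errors = [], [], []
    # os.walk hands listing failures to onerror instead of raising,
    # so one unreadable folder does not end the whole crawl.
    for folder, _subdirs, names in os.walk(root, onerror=errors.append):
        for name in names:
            path = os.path.join(folder, name)
            if len(path) >= limit:
                too_long.append(path)  # report it, don't touch it
            else:
                files.append(path)
    return files, too_long, errors
```

The point of the design is simply that the crawl produces a report of problem paths alongside its results, rather than dying on the first one.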
C:\PROGRA~1\DOCUME~1\LongName.txt.
The most common case where these hacks crop up is when a Windows server administrator maps network drives. While end users can create file hierarchies nested right up to the 260 character limit on their H: drives, the administrator of the file server can’t actually navigate that deeply. That’s right: The file server administrator can’t actually reach the deepest files on the machine because he sees C:\File Shares\Shared Drives\ where end users see H:. As you can imagine, this makes backing up network drives no fun at all.
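One documented way to address such deep files is the extended-length `\\?\` prefix described in the Naming a File article. Whether it belongs with the hacks mentioned above is my assumption. Here is a minimal Python sketch of building such a path; `to_extended` is a hypothetical helper, and it only manipulates strings, so it runs on any platform.

```python
import ntpath  # Windows path rules, importable on any platform

PREFIX = "\\\\?\\"  # the four characters \\?\


def to_extended(path):
    """Hypothetical helper: return path in extended-length form."""
    if path.startswith(PREFIX):
        return path  # already extended
    if not ntpath.isabs(path):
        raise ValueError("extended-length paths must be absolute")
    # The prefix switches off Windows' own normalization, so resolve
    # "..", "." and forward slashes before adding it.
    path = ntpath.normpath(path)
    if path.startswith("\\\\"):
        # UNC share: \\server\share\... becomes \\?\UNC\server\share\...
        return PREFIX + "UNC\\" + path[2:]
    return PREFIX + path
```

For instance, `C:\File Shares\Shared Drives\report.txt` becomes `\\?\C:\File Shares\Shared Drives\report.txt`. Whether a given API or tool accepts the prefixed form is something to check case by case.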
There’s no clean fix yet. You’ll have to figure out how to deal with all these exceptions and hacks yourself. The Base Class Library (BCL) Team at Microsoft is working on both temporary and long-term solutions for the .NET Framework. You can read Part 1, Part 2 and Part 3 on the BCL Team Blog for all the gory details.
The core issue is the trade-off between consistency and backward compatibility. It’s a challenge that becomes tougher every year, both for Microsoft itself and for developers using the MS platform. It’s amazing how backward-compatible Microsoft is, given its quarter-century accumulation of legacy code. It’s great for businesses and consumers, but from a developer’s perspective, all those gotchas add up over time. I don’t want to have the institutional memory of Raymond Chen just to navigate the filesystem. And is 260 characters a reasonable limit for paths in a modern OS?
Oh, in case you were wondering, the maximum URL length is 2,083 characters in Internet Explorer.