256 Character Filenames Should be Enough for Anybody

One of the key components of Infovark is a file crawler. We monitor specified folders for additions, updates and deletions, so that we can let users know what changes have occurred. I figured that making a recursive descent through files and folders in Windows would be a snap. And it would be, if not for the details.

Walking a file hierarchy is one of those basic examples included in just about every programming book, just like “Hello World!”. You can find sample code everywhere, but here’s the MSDN article on Iterating Through a Directory Tree. Notice how it recommends that you read about how NTFS works, at the end of the article? That’s where the important details hide.

Inconstant Constants

Implement the naive version of file recursion, and you’ll likely get a System.IO.PathTooLongException in fairly short order. This is because a full Windows path, not just the filename at the end of it, is limited to 260 characters. Most of the time. There are a few gotchas. Check out the Naming a File article. After reading it, you’ll probably react much like the folks on this Joel on Software forum thread about Windows MAX_PATH.
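To make the problem concrete, here is a minimal sketch of a crawler that reports the offenders up front instead of tripping over them. It is written in Python rather than .NET purely for brevity, and the name find_long_paths is my own invention for illustration. It treats any path of 260 or more characters as too long, on the assumption that MAX_PATH counts the terminating null. Whether the walk itself can descend into such folders depends on the platform and its long-path settings, so treat it as a starting point only.

```python
import os

MAX_PATH = 260  # Win32 constant; the count includes the terminating NUL


def find_long_paths(root, limit=MAX_PATH):
    """Yield (path, length) for every file or folder under root whose
    full path is `limit` characters or longer."""
    # onerror swallows folders that cannot be listed, so one bad
    # directory does not abort the whole crawl.
    for dirpath, dirnames, filenames in os.walk(root, onerror=lambda err: None):
        for name in dirnames + filenames:
            full = os.path.join(dirpath, name)
            if len(full) >= limit:
                yield full, len(full)
```

Calling list(find_long_paths(r"C:\Some\Folder")) returns the paths a naive crawler would be likely to choke on, so they can be logged or handled separately.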

Here’s the gist: The Windows shell and most of the Win32 API limit a full path to 260 characters, a count that includes the drive letter, the separators and the terminating null. This is the MAX_PATH constant. However, NTFS and the Unicode versions of the file functions accept paths of roughly 32,767 characters. So it’s trivially easy to work around this “constant” using a variety of hacks. For example:

  1. Sometimes you can squeeze in a few extra characters by using the DOS 8.3 short name format. This gives your files bizarre names like C:\PROGRA~1\DOCUME~1\LongName.txt.
  2. You can drop down into Win32 API calls and unmanaged code. The Unicode versions of certain file-handling functions accept the “\\?\” prefix, which switches off the usual path parsing and lets you use the extended-length names. Naturally, this comes with additional baggage that I can’t even begin to describe.
  3. You can map a drive to a folder deep in your tree. This effectively fakes out the Windows shell into thinking the path is much shorter than it really is.
  4. You can create a shared folder and bypass the length restriction. This works for the same reason that mapping a drive does.
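The “\\?\” trick from the second item can be sketched as a small string helper. This is an illustration in Python, not a vetted library routine: to_extended_length is a hypothetical name, and it assumes the caller passes an absolute drive-letter or UNC path. Because the prefix switches off path parsing, the sketch normalizes the path first and then adds the prefix, using the UNC form for network shares.

```python
import ntpath  # Windows path rules, importable on any platform

EXTENDED_PREFIX = "\\\\?\\"            # the literal text \\?\
EXTENDED_UNC_PREFIX = "\\\\?\\UNC\\"   # the literal text \\?\UNC\
DEVICE_PREFIX = "\\\\.\\"              # the literal text \\.\


def to_extended_length(path):
    """Return an absolute Windows path in extended-length form."""
    if path.startswith((EXTENDED_PREFIX, DEVICE_PREFIX)):
        return path  # already prefixed; leave it alone
    drive, _ = ntpath.splitdrive(path)
    if not drive or not ntpath.isabs(path):
        raise ValueError("expected an absolute path, got %r" % path)
    # Resolve "." and ".." and convert forward slashes now, because
    # nothing will do it once the prefix is in place.
    path = ntpath.normpath(path)
    if path.startswith("\\\\"):
        # A share such as \\server\share\dir takes the UNC form.
        return EXTENDED_UNC_PREFIX + path[2:]
    return EXTENDED_PREFIX + path
```

The helper only builds the string. Whether a given function accepts the result is a separate question, which is part of the baggage mentioned above.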

The most common case where these hacks crop up is when a Windows server administrator maps network drives. While end users can create file hierarchies nested right up to the 260-character limit on their H: drives, the administrator of the file server can’t actually navigate that deeply. That’s right: The file server administrator can’t actually reach the deepest files on the machine because he sees C:\File Shares\Shared Drives\ where end users see H:, and that longer prefix pushes the deepest paths past the limit. As you can imagine, this makes backing up network drives no fun at all.
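The arithmetic behind the administrator’s problem is easy to check. The sketch below is Python, with function names I made up for illustration. It translates a path as seen through a mapped drive into the server’s local path, using the share root from the example above, and tests both against the limit.

```python
MAX_PATH = 260  # includes the terminating NUL, so 259 usable characters


def server_side_path(user_path, mapped_drive, share_root):
    """Translate a path seen through a mapped drive into the path the
    file server sees locally."""
    if not user_path.upper().startswith(mapped_drive.upper()):
        raise ValueError("path is not on drive %s" % mapped_drive)
    return share_root.rstrip("\\") + user_path[len(mapped_drive):]


def fits_in_max_path(path):
    """True if the path is short enough for MAX_PATH-limited APIs."""
    return len(path) < MAX_PATH


# A file the end user can reach comfortably through H:
user_path = "H:\\" + "a" * 250 + ".txt"
server_path = server_side_path(user_path, "H:", "C:\\File Shares\\Shared Drives")

print(len(user_path), fits_in_max_path(user_path))
print(len(server_path), fits_in_max_path(server_path))
```

The two-character H: is swapped for a 28-character root, so every path grows by 26 characters on the server side. Anything within 26 characters of the limit on the user’s side is out of reach on the server.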

But there’s a fix, right?

Not really. You’ll have to figure out how to deal with all these exceptions and hacks yourself. The Base Class Library (BCL) Team at Microsoft is working on both temporary and long-term solutions for the .NET Framework. You can read Part 1, Part 2 and Part 3 on the BCL Team Blog for all the gory details.

The core issue is the trade-off between consistency and backward compatibility. It’s a challenge that becomes tougher every year, both for Microsoft itself and for developers using the MS platform. It’s amazing how backward-compatible Microsoft is, given its quarter-century accumulation of legacy code. It’s great for businesses and consumers, but from a developer’s perspective, all those gotchas add up over time. I don’t want to have the institutional memory of Raymond Chen just to navigate the filesystem. And is 260 characters a reasonable limit for paths in a modern OS?

Oh, in case you were wondering, the maximum URL length is 2,083 characters in Internet Explorer.
