Archive for REST

Using WCF for REST, Part 3

The easiest way to explain the issues we encountered when implementing REST is to work through the design principles we followed. I think much of our trouble came from the fact that we come from a web applications background, not a SOAP services background. I’m hoping that by laying out our REST design, some of the Microsoft gurus can help us do things “the WCF way.” And perhaps we can help the WCF team out by highlighting a handful of places where we found WCF unintuitive.

The Importance of Being Addressable

Fundamentally, the REST pattern is about making resources available. This means that each item stored within your system can be accessed by someone with the correct permissions. Every one of these items has its own unique address, and its address should not change. This consistency is important, because it allows both people and computer programs to remember and reference items in your system.

Note that we’re talking about resources or items. In contrast with the SOAP model of web services, which allows programmers to invoke procedures on remote computers, REST is about providing data. SOAP is about verbs, while REST is about nouns. A SOAP service might CalculateTotalSale(); a REST service provides CustomerReceiptNo_12345. The kind of web services architecture you use will depend on the kind of application you’re building. The choice has major implications for the other components of your system.

REST imposes restrictions on what sort of things you can do, because it supports only a handful of actions: GET, POST, PUT, and DELETE. (There are a few other HTTP methods, but these four are the most important.) Fortunately, with these four actions, you can accomplish most basic programming tasks. There’s a close parallel between these actions and create/read/update/delete, or CRUD, the building blocks of data storage systems.
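To make the parallel concrete, here is a minimal sketch of the usual rule-of-thumb mapping. The class and method names are made up for illustration, and the mapping itself is a simplification: in practice PUT is sometimes used to create a resource at a known address, for example.

```csharp
using System;

public static class HttpVerbs
{
    // Hypothetical helper: maps an HTTP method to its rough CRUD counterpart.
    public static string CrudFor(string httpMethod)
    {
        if (httpMethod == null) throw new ArgumentNullException("httpMethod");

        switch (httpMethod.ToUpperInvariant())
        {
            case "POST": return "Create";
            case "GET": return "Read";
            case "PUT": return "Update";
            case "DELETE": return "Delete";
            default: return "Unsupported"; // HEAD, OPTIONS, and friends are out of scope here
        }
    }
}
```

Calling HttpVerbs.CrudFor("get") returns "Read"; anything outside the four main methods comes back as "Unsupported".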

UriTemplate

Since the address, or URI, is the primary way to access information in your system, it’s effectively part of your user interface. All the principles of good user interface design apply. So when designing a REST service, you need fine control over the structure of these identifiers.

SOAP, by contrast, typically has just one endpoint. The address itself conveys no information about what services are provided — that’s why SOAP services require a separate WSDL file to tell folks what’s possible. With REST, it should be easy to discover the extent of the system by looking at the URIs alone.

Coming up with good REST URI patterns can be tricky. Using short, descriptive naming conventions for your resources makes them easier to type. But URI patterns must also be distinct and unambiguous.

In the .NET Framework, you use the UriTemplate class to define patterns. The UriTemplate implementation that shipped with .NET 3.5 allowed you to define variables that fit into slots in your URI. A typical UriTemplate might look like this: http://restserver/{object}/{id}?view={viewname}.

WCF looks for incoming URIs that match the patterns you define. The pattern defined above would match the following URIs:

  • http://restserver/customer/5?view=profile
  • http://restserver/article/how_to_do_stuff?view=print
  • http://restserver/author/John-Smith?view=1

Once you’ve defined a UriTemplate, you bind it to a method that has the same number of parameters. (I won’t go into the ABCs of WCF here, but you can check out this MSDN Introduction to WCF if you need a refresher.)
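If you want to see the matching on its own, outside of a service, something like the following sketch should do it. It assumes a project reference to System.ServiceModel.Web (where UriTemplate lives in .NET 3.5); the UriTemplateDemo class and its Describe method are names I made up for the example.

```csharp
using System;

// Assumes a reference to System.ServiceModel.Web (.NET Framework 3.5).
public static class UriTemplateDemo
{
    // Hypothetical helper: matches a candidate URI against the template
    // from the text and reports the values bound to each variable.
    public static string Describe(string candidate)
    {
        Uri baseAddress = new Uri("http://restserver/");
        UriTemplate template = new UriTemplate("{object}/{id}?view={viewname}");

        // Match returns null when the candidate does not fit the template.
        UriTemplateMatch match = template.Match(baseAddress, new Uri(candidate));
        if (match == null)
        {
            return "no match";
        }

        return match.BoundVariables["object"] + " / "
             + match.BoundVariables["id"] + " / "
             + match.BoundVariables["viewname"];
    }
}
```

For the first sample URI above, I would expect the three variables to come back as customer, 5, and profile. A URI with an extra path segment, such as http://restserver/customer/5/orders, should not match at all.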

In WCF 3.5, you could only define a variable for a whole segment. A segment is basically the bit between one forward slash and the next, or a single querystring parameter. A few bloggers requested more flexibility in UriTemplates, and the WCF team answered with the soon-to-be-released 3.5 SP1. The ability to define variables for partial segments was crucial for our URI design.

Representation Matters

Most books about web services, including RESTful Web Services, advocate leaving off file extensions from your URIs. This makes sense for SOAP, where you’re accessing methods and all responses are transmitted in XML. But in REST, you’re serving up items.

In our case, some of the items being served were files and some were records from a database. It seemed inconsistent to have some endpoints that had file extensions and others that didn’t. And we also wanted to be able to serve up different representations for our database records. Our REST service supports JSON, XML, and HTML. It made sense to use a file extension to distinguish between the different representations.

One workaround would have been to create endpoints like http://restserver/form/1040/xml but that looked funny next to URIs like http://restserver/file/documentation.pdf. True RESTafarians would point out that neither the “/xml” nor the “.pdf” is needed, since you can request an appropriate representation using the HTTP Accept header. We decided against the header approach because not all browsers use the Accept header. Besides, it might be useful for us humans to be able to reach alternate formats by simply changing the URL in the browser address bar.

In WCF 3.5, this required us to create three times the number of endpoints, with a separate method to handle each. We can’t wait for the official release of 3.5 SP1 to make UriTemplates like http://restserver/user/{id}.{ext} possible.
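The extension-to-representation idea itself is easy to sketch without any WCF at all. The helper below is purely illustrative (the Representations class, the TryParse name, and the exact content types are my assumptions, not code from our service): it splits a final segment such as 1040.xml into an id and a MIME type.

```csharp
using System;
using System.Collections.Generic;

public static class Representations
{
    // Assumed mapping from file extension to MIME type for the three
    // formats mentioned in the text.
    private static readonly Dictionary<string, string> ContentTypes =
        new Dictionary<string, string>(StringComparer.OrdinalIgnoreCase)
        {
            { "json", "application/json" },
            { "xml", "application/xml" },
            { "html", "text/html" }
        };

    // Splits "1040.xml" into id "1040" and content type "application/xml".
    // Returns false when there is no extension or the extension is unknown.
    public static bool TryParse(string segment, out string id, out string contentType)
    {
        id = null;
        contentType = null;
        if (string.IsNullOrEmpty(segment)) return false;

        int dot = segment.LastIndexOf('.');
        if (dot <= 0 || dot == segment.Length - 1) return false;

        string extension = segment.Substring(dot + 1);
        if (!ContentTypes.TryGetValue(extension, out contentType)) return false;

        id = segment.Substring(0, dot);
        return true;
    }
}
```

With a partial-segment template like user/{id}.{ext}, the framework would hand you the id and ext values directly, and only the lookup table would remain.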

The Final Slash

Another source of endpoint duplication was the need to have two different endpoints for http://restserver/folder and http://restserver/folder/. Because the slash is used as a segment delimiter, the dispatcher in WCF 3.5 saw these two URLs as fundamentally different.

So handling what we thought were fairly trivial cases in URI patterns led us to create FIVE TIMES the number of endpoints we wanted. It’s a maintenance nightmare. SP1 can’t get here soon enough.
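For what it’s worth, the normalization we wanted is tiny when written out by hand. This is a framework-independent sketch with made-up names; it does not change how the WCF 3.5 dispatcher behaves, and where you could hook something like it in depends on your hosting setup.

```csharp
using System;

public static class PathNormalizer
{
    // Hypothetical helper: treats "/folder" and "/folder/" as the same
    // resource by dropping trailing slashes, while keeping the root "/".
    public static string Normalize(string path)
    {
        if (string.IsNullOrEmpty(path)) return "/";

        string trimmed = path.TrimEnd('/');
        return trimmed.Length == 0 ? "/" : trimmed;
    }
}
```

Both PathNormalizer.Normalize("/folder") and PathNormalizer.Normalize("/folder/") return "/folder", so a single lookup key can serve both forms.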

WCF Instance Context

I finally figured out the source of my HTTP 400 problem. Apparently the Windows Communication Foundation deals with exceptions differently depending on your InstanceContextMode settings. I had been using the Single setting but I should have used the PerCall setting. In PerCall mode, the try/catch block works as expected.

I think it has something to do with the way that WCF distinguishes between channel exceptions and message faults.

Anyway, if you’re building a REST web service, you’ll want to make sure your class is decorated with the following ServiceBehavior attribute.

  [ServiceBehavior(InstanceContextMode = InstanceContextMode.PerCall)]
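For context, here is a rough sketch of where that attribute sits in a small REST-style service. The ICustomerService contract, the CustomerService class, and the customer/{id} template are invented for the example; the attributes themselves come from System.ServiceModel and System.ServiceModel.Web, which the project is assumed to reference.

```csharp
using System.IO;
using System.Security;
using System.ServiceModel;
using System.ServiceModel.Web;
using System.Text;

// Hypothetical contract: one GET endpoint bound to a UriTemplate.
[ServiceContract]
public interface ICustomerService
{
    [OperationContract]
    [WebGet(UriTemplate = "customer/{id}")]
    Stream GetCustomer(string id);
}

// PerCall creates a fresh service instance for every request.
[ServiceBehavior(InstanceContextMode = InstanceContextMode.PerCall)]
public class CustomerService : ICustomerService
{
    public Stream GetCustomer(string id)
    {
        // Only valid while WCF is dispatching a request; Current is null otherwise.
        WebOperationContext.Current.OutgoingResponse.ContentType = "application/xml";

        // Escape the id so it cannot break out of the attribute value.
        string xml = "<customer id=\"" + SecurityElement.Escape(id) + "\" />";
        return new MemoryStream(Encoding.UTF8.GetBytes(xml));
    }
}
```

Note that the method takes one string parameter named id, matching the single {id} variable in the template.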

Using WCF for REST, Part 2

In part one of this series, I listed several websites and blogs that had useful information on the Windows Communication Foundation (WCF) and REST. I also mentioned that if I were starting again, I’d probably use something other than WCF. Perhaps building my own REST server on System.Net.HttpListener, for example.

Vish asked for some additional details in his comment to that post. He works on the Microsoft WCF development team and was curious about our experience.

I had just begun putting together my response when I noticed Scott Guthrie’s post about Service Pack 1 for the .NET Framework 3.5 beta release. Steve Maine also posted specifics about the ADO.Net Data Services and WCF changes.

So, Vish, it seems your team’s beaten me to the punch on some of these issues! Many of the difficulties I was having with WCF and REST were addressed by the service pack. Here’s an overview of our key stumbling blocks:

  1. REST requires much greater control over the URI than SOAP does, and the UriTemplate class just wasn’t up to the task. We had to hardcode most of our endpoints to compensate. (Fixed in SP1. Hooray!)
  2. Supporting multiple formats, such as serving both XML and JSON, either requires you to program against Stream or requires twice the number of endpoints.
  3. Existing serializers had trouble with complicated object graphs, forcing us to perform serialization/deserialization ourselves. (This seems greatly improved in SP1.)
  4. WCF allows only one contract/interface per endpoint. This makes it tricky to factor out common contract patterns.
  5. Good REST practice would have you return many kinds of errors as HTTP status messages. SOAP embeds all error information in the returned XML. WCF is closely aligned with the SOAP approach, which means that you’ve got to be very careful distinguishing exceptions from faults when implementing REST in WCF. It was an unpleasant surprise, and we had to do quite a bit of work to deal with that.
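To illustrate the last point, this is the kind of translation a REST service ends up doing somewhere: turning an internal failure into an HTTP status code rather than a fault embedded in the response body. The sketch is not WCF-specific, and the class name and the particular exception-to-status choices are my own assumptions.

```csharp
using System;
using System.Collections.Generic;
using System.Net;

public static class ErrorMapping
{
    // Hypothetical mapping from common exception types to HTTP status codes.
    public static HttpStatusCode ToStatusCode(Exception error)
    {
        if (error is ArgumentException) return HttpStatusCode.BadRequest;            // 400: client sent bad input
        if (error is KeyNotFoundException) return HttpStatusCode.NotFound;           // 404: no such resource
        if (error is UnauthorizedAccessException) return HttpStatusCode.Forbidden;   // 403: not permitted
        return HttpStatusCode.InternalServerError;                                   // 500: anything else is our fault
    }
}
```

The design choice worth noting is the fallback: anything unrecognized becomes a 500, so a server-side bug is never reported as the client’s mistake.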

I’ll talk about all five of these areas in more detail in upcoming posts in this series. And I’ll be sure to check out the SP1 beta once we get our Infovark Alpha release out the door.

Using WCF for REST, Part 1

Just because you can do something doesn’t mean it’s a good idea.

We decided to use the Windows Communication Foundation to drive our REST-based web service. In hindsight, it was a poor choice. REST support in WCF seems like it was a last-minute addition to .NET 3.5. You can certainly hack something together, but I’ve found few real-world examples on the Internet, and most of those sidestep the tricky issues.

Here’s the short of it: WCF was designed for RPC-SOAP. More importantly, it was designed to SOA-enable legacy services that used older communications channels like DCOM. If you’re starting from scratch, and have full control over the output of your web service and the design of your object model, I’d recommend using a different (and simpler) framework.

We’ve gotten a good bit of blog traffic from people looking for help with Windows Communication Foundation and the REST architectural pattern. (It’s good to know that we’re not the only ones needing advice.) Here are the better sources we’ve found so far.

Windows Communication Foundation documentation on MSDN

Good overview presentation on REST and Syndication using WCF

Microsoft’s Picture Services Sample

Justin Smith’s WCF articles on Cybertopian Chronicles

Nicholas Allen’s Indigo blog

Steve Maine’s blog

Assorted posts on Rick Strahl’s blog

That Indigo Girl

If you find other useful places to look, let us know!

Review: RESTful Web Services

As we mentioned recently, we’re building the infovark server using the REST pattern. Since REST is more a loose set of guidelines than a strict series of rules, it’s hard for implementers to know where to begin.

OK, you could go to the source, chapter 5 of Roy Fielding’s dissertation. Or you could check out the somewhat academic discussions on the REST wiki, though there hasn’t been much activity lately. You can occasionally find good advice from the odd blog post, like the REST for the Rest of Us article at Open Garden. But ironically, there’s not a whole lot of material about implementing REST web services available on the web yet. (If you know of good links, leave a comment.)

RESTful Web Services

For the practical, gritty details of how it’s actually done, you’ll need the RESTful Web Services book by Sam Ruby and Leonard Richardson. They describe the principles that inform REST-ian (RESTafarian?) design in detail, taking you step-by-step through two different sample applications. If you’re a Ruby programmer using Rails, you’ll find the book especially valuable, since that’s the language and framework in which most of the examples are done. For those of us using different technology, it’s the thought process behind the examples that is most illuminating.

This is because the key challenge of the REST paradigm is the fact that it can’t really be implemented on today’s web without some workarounds. REST will come into its own with HTML 5. The book steers an interesting course between how REST web services might be done in HTML 5 and how they must be done today. I think the authors get the balance right, but at times it can make for a frustrating read for someone wanting practical advice about building a REST service right now.

But that’s less a criticism of the book than of the openness of the REST concept itself. The occasional what-if digression the authors make is a small price to pay for the amount of sound guidance you get. The appendixes alone, which discuss things like which HTTP status codes and headers are worth implementing and which are worth forgetting, will save you far more time than you’ll lose in reading how great things will be when HTML forms finally support the PUT method.

Until that day comes, keep this book handy.

WCF Bad Request

I’ve just identified a horrible bug in WCF for the .NET Framework 3.5.

A caught exception in a WebInvoke operation will cause WCF to return an HTTP 400 Bad Request status code to the client. Any caught exception. Every time. Regardless of whatever error code you might want to send back.

I found the error by mistake. I’d used “BadGateway” instead of “BadRequest” in my code. If it weren’t for other odd WCF behavior, I wouldn’t have noticed that my status code was being ignored.

Consider the following example:

  // Read the Xml into our object and save.
  try
  {
      // The following line triggered the error.
      obj.ReadXml(reader);
      obj.Save();
      // Set HTTP Cache Options and MIME Type.
      Utilities.SetCaching(WebOperationContext.Current, obj.DateModified, 60);
      Utilities.SetMimeType(Format.Xml);
      return Utilities.GetXmlStream(obj);
  }
  catch (Exception e)
  {
      // Was it a schema validation error? If so, provide detail.
      if (!string.IsNullOrEmpty(_XmlValidationErrors))
      {
          // I slipped here, using BadGateway 502 instead of Bad Request 400.
          // But WCF doesn't care. If you enter the catch block it's _always_ 400.
          WebOperationContext.Current.OutgoingResponse.StatusCode = HttpStatusCode.BadGateway;
          WebOperationContext.Current.OutgoingResponse.StatusDescription = _XmlValidationErrors;
          WebOperationContext.Current.OutgoingResponse.SuppressEntityBody = false;
          return null;
      }
  }

If no error occurs, WCF will return the status code you specify. A try/finally block will work just fine; WCF returns whatever status code you specify. Enter a catch block, though, and it’s nothing but 400 Bad Request.

Hey, if there’s an error, it must be the client’s fault, right?

The Address Bar as the New Command Line

It’s been a common meme over the past few years that command-line interfaces (CLIs) are making a resurgence. Jeff Atwood noted in 2005 that the Google search box is a command line of sorts. In early 2007, Lifehacker’s Gina Trapani listed several examples of CLIs showing up in a variety of applications.

The return of the command-line interface is striking. It was pronounced dead, replaced by the graphical user interface (GUI) more than two decades ago, after usability research discovered that novice users were much more comfortable with GUIs. But a CLI has three big advantages over a GUI.

The first advantage is speed. Expert users find that command-line interfaces are significantly faster than GUIs. People can type much faster than they can identify a link or button, move the mouse to it, and click. This is especially true if the user’s hands are already positioned above the keyboard. People who do a significant amount of word processing (or writing code) often memorize keyboard shortcuts to avoid having to grab the mouse. It might save only a fraction of a second, but if you do it many, many times over the course of the day, those split-second differences add up.

The second advantage is that a command-line interface is a linear interface. Neither the user nor the application has to worry about the exact position of the pointing device over a two-dimensional surface. This makes command-line interfaces ideal for devices that have tiny screens, or whose pointing devices aren’t very accurate. It’s much easier to text a few commands on a cell phone or PDA than it is to use a stylus or manipulate a touch screen.

These first two advantages of CLIs are well known. The third advantage is less obvious: It turns out that CLIs are also much easier for computers to use.

Why do we care about making things easier on our silicon-based friends? Well, one reason is that it indirectly makes things easier on those of us humans that give instructions to computers. (As a programmer myself, that’s a big selling point.) Another reason is that if it’s easy enough for a computer to figure out, then it ought to be brain-dead simple for a human to understand. As I’ve pointed out before, computers just aren’t very bright.

So the advantage of having a command-line interface is that if you, a human, get bored of typing the same silly commands over and over, you can easily write a small program that instructs a computer to do it for you. This is the reason why every office suite has some kind of macro language. It allows humans to hand off repetitive tasks to a machine so that people can get on with the important creative stuff.

So what’s this about using a browser’s address bar as a command line? With AJAX technologies, the URI is no longer merely something that a human types into the address bar to get a web page. It’s also an interface that computers use to interact with data. This means the humble web URL shares most of the key characteristics and advantages of a command line interface.

It’s this insight that’s led to increased interest in the REST architectural style. REST has begun moving into the mainstream because the patterns it promotes allow the URI to serve an audience of computers as well as an audience of humans.

REST for the Weary

Those of you with a technical background may have noticed a close correspondence between the Web 2.0 principles I described in our design series and Representational State Transfer, or REST. This is no coincidence. Gordon’s been a backer of RESTful approaches to web application design for some time now; I’m a more recent convert. More importantly, the REST architectural pattern fit what we were trying to do with our infovark project.

REST is a design pattern used to create Internet applications. It’s been growing in popularity, but hasn’t been fully adopted by any of the major vendors yet. (Microsoft’s efforts to lump REST into the Windows Communication Foundation notwithstanding.) This is probably due to the fact that the World Wide Web Consortium, or W3C, put its weight behind an earlier, competing design philosophy called SOAP. (SOAP used to stand for Simple Object Access Protocol, but the “simple” part was dropped long ago.)

SOAP was designed to help loosely connected computer systems communicate with each other. Many previous frameworks and standards had attempted to do the same thing, but with so many different hardware and software vendors building systems, most were doomed to fail. SOAP is likely to stick around for a while due to its close association with Web Services and Service Oriented Architectures. As a practical matter, however, the class of problems that SOAP solves is actually rather limited. Strike that; the class of problems that only SOAP can solve is rather limited. For most applications, there’s an easier web services alternative: REST.

An Illustration

Pardon me while I geek out for a moment.

When I was a kid, me and my friend Rajeev would dial each other up — yes, literally dial each other — with our 300 baud modems. I know, I know, you young ‘uns are thinking, “What’s baud? What’s a modem?” Suffice it to say that it was a slow way to get two computers to talk with each other. And when I say slow, I mean S… L… O… W. You could literally watch the letters appear one by one in your monochrome terminal window. If you can imagine sending a message via Twitter one letter at a time, you’ve got the idea.

It was so painfully slow that there really wasn’t much point in sending messages back and forth. Other than the nerd-cool factor of making two computers located in different parts of town communicate, there wasn’t much to do. So Rajeev and I hit upon an idea. We’d play a game online. Being nerds, we naturally picked Chess.

Chess was actually a great application for modem-to-modem communication. There was a well-known initial starting state in the traditional arrangement of the pieces. There was an established protocol: white moves, then black moves. And there was even a short messaging format: chess notation.

So we started playing Chess online. Each of us kept a small chess board by the computer. We slowly took turns typing out our moves to each other and updating our game boards: “P-K4”, “P-K4”, “Kt-KB3”, “Kt-QB3” and so on. Not exactly riveting entertainment, but hey, we were doing something new and different.

Every now and then, we’d run into a problem. I’d get a message from Rajeev with a nonsensical move, or he’d get a message from me that moved a nonexistent piece. Then one or the other of us would see these letters slowly print across the screen: P… I… C… K… U… P… T… H… E… P… H… O… N… E.

We’d then try to figure out what had happened, based on the log of all the messages that went back and forth and the current position of the pieces on each of our game boards. Sometimes we were able to figure out the mistake. Sometimes we agreed to go back to the last time we picked up the phone to reconcile our respective chess boards. Sometimes we started over.

As you can imagine, this sort of troubleshooting got old fast. And it happened all the time. Eventually we gave up on trying to play by computer, and we just bugged our Moms to drive us over so we could play using the same board.

REST in a Nutshell

The point of the story above is not to establish my geek cred, but to offer an analogy.

In the early days of network computing, bandwidth was low, latency was high, and it was vitally important to make your messages as concise as possible. A wide variety of message formats and protocols evolved to respect the limits of early computers and networking technologies, all designed to get as much useful data packed into as little space as possible. It’s exactly like the chess notation Rajeev and I used. We could have sent snapshots of the chess board after each turn, but it would have taken hours to transmit a single move that way. All we really needed to know was which piece needed to move where. Using the shorthand, a single move — one procedure — could be described in just a few letters.

Though network computing technology has come a long way since then, the most common way for computers to talk with each other is still via Remote Procedure Calls, or RPC. Rather than describe the entire gameboard, computers just tell each other how to move the pieces.

If you’re wondering how computers handle mistakes or lost transmissions, well, a staggering amount of effort in computer science has focused on error detection and correction algorithms and secure transaction processing. Believe me, the last thing your credit card company and your bank want to do is pick up the phone to work out whose set of accounts is more accurate.

This is why data replication is such a huge problem. If you only transmit the moves to each other, you have to start from a known initial state. The chessboard at Rajeev’s house and my house had to match at the start of the game. It’s also why synchronization is a big deal. If the moves are sent out of order, all sorts of problems occur.

If, on the other hand, you could send pictures of the game board back and forth, a seasoned player could probably reassemble the images in something close to the right order. Better yet, if both players could look at the same board at the same time, then you’d never get out of synch.

This is the essence of the REST architectural pattern. It’s a little less respectful of network resources, but by transmitting the current state of the game at any point in time, you can simplify the amount of work you need to do to get two players to agree. Most of the transaction issues and handshaking protocols become unnecessary. The World Wide Web — hypertext over HTTP — works in a RESTful way, and it’s the most successful computer application ever built.
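Here is the analogy as a toy sketch. Everything in it is hypothetical: the Board class, its two-piece starting position, and the use of square names like e2 instead of the older notation Rajeev and I typed. Move plays the role of the remote procedure call, which only works if both sides already agree on the board. Snapshot and Load play the role of state transfer, which works no matter what the receiver believed beforehand.

```csharp
using System;
using System.Collections.Generic;

public sealed class Board
{
    // Square name -> piece, for example "e2" -> "P". A toy position, not a full game.
    private readonly Dictionary<string, string> squares = new Dictionary<string, string>();

    public Board()
    {
        squares["e2"] = "P";
        squares["g1"] = "N";
    }

    // RPC style: send only the change. Fails if the receiver's board has drifted.
    public void Move(string from, string to)
    {
        string piece;
        if (!squares.TryGetValue(from, out piece))
        {
            throw new InvalidOperationException("No piece on " + from);
        }
        squares.Remove(from);
        squares[to] = piece;
    }

    // REST style: send the whole current state as a representation.
    public string Snapshot()
    {
        List<string> keys = new List<string>(squares.Keys);
        keys.Sort(StringComparer.Ordinal);

        List<string> parts = new List<string>();
        foreach (string key in keys)
        {
            parts.Add(key + "=" + squares[key]);
        }
        return string.Join(";", parts.ToArray());
    }

    // Replaces local state with the received representation.
    public void Load(string snapshot)
    {
        squares.Clear();
        if (string.IsNullOrEmpty(snapshot)) return;

        foreach (string entry in snapshot.Split(';'))
        {
            string[] pair = entry.Split('=');
            squares[pair[0]] = pair[1];
        }
    }
}
```

If one board applies two moves and the other misses the first message, replaying the second move on the stale board throws, which is the code equivalent of “pick up the phone.” Calling Load with the other side’s Snapshot brings the two boards back into agreement in one step, at the cost of sending more data.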

Enterprise 2.0 is about applying the lessons from the Web to the enterprise, so it makes sense that we should start with the core design principles.