Archive for the ‘Programming’ Category
WCF WebInvoke Body Format Error
I spent Sunday afternoon battling an odd WCF error.
System.InvalidOperationException: Incoming message for operation [your operation here] contains an unrecognized http body format value ‘Xml’. The expected body format value is ‘Raw’. This can be because a WebContentTypeMapper has not been configured on the binding. See the documentation of WebContentTypeMapper for more details.
Troubleshooting this issue sucked. Here’s the deal: WCF helpfully attempts to parse any incoming “text/xml” requests automatically. If you’ve defined the input parameter on a WebInvoke operation to be a Stream, WCF can’t bind to the method and returns an HTTP 400.
There are two ways to solve this problem: change the input parameter from Stream to XmlElement, or configure WCF to treat the request as Raw. I picked the former. Carlos Figueira explains the latter.
If folks are interested, I can post some more detail about the problem and the resolution. For now, I have to finish making things work.
Edit: So here’s the rest of the story, since Brad asked.
I wasn’t sure exactly what triggered the issue. I got this behavior with just one WebInvoke and one WebGet operation using the same URI template. What I’d done was to create a generic ObjectService that exposed the same RESTian operations for several different types of objects. The particular operation in question looked something like this:
[OperationContract]
[WebGet(UriTemplate = ItemUris.IndexXml, RequestFormat = WebMessageFormat.Xml, ResponseFormat = WebMessageFormat.Xml)]
Stream GetListAsXml();

[OperationContract]
[WebInvoke(UriTemplate = ItemUris.IndexXml, Method = "POST", BodyStyle = WebMessageBodyStyle.Bare, RequestFormat = WebMessageFormat.Xml, ResponseFormat = WebMessageFormat.Xml)]
Stream PostListAsXml(XmlElement input);
Originally, PostListAsXml had accepted a Stream. That seemed to work in other places, but then I started noticing the InvalidOperationException messages.
I think the right way to solve this problem is to follow Carlos’ advice, and create a new WebContentTypeMapper-derived class. But I didn’t have time to figure out exactly where to plug it in, and I was afraid that I might introduce other problems. I just didn’t know enough about the inner workings of WCF to know whether that was a safe operation.
Since we hadn’t shipped the interface yet, I was free to change the input parameter type from Stream to XmlElement for the one or two WebInvoke operations that were returning errors. Fewer lines of code needed to change, and I knew I wouldn’t break anything else.
Of course, I’m probably just setting myself up for more pain down the road, but sometimes you just need to get things done, y’know?
Anyway, since I’ll likely revisit this decision in a later release, I’d love to hear what other folks did in this situation.
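For anyone weighing the other option, here is a minimal sketch of what the WebContentTypeMapper route might look like. It has not been run against the service above. The names RawContentTypeMapper and RawBindingFactory are made up for illustration; WebContentTypeMapper, WebMessageEncodingBindingElement, CustomBinding and WebHttpBinding are the WCF types involved. The sketch maps every content type to Raw, which assumes no other operation on the same endpoint relies on WCF parsing XML or JSON bodies for it.

```csharp
using System.ServiceModel;
using System.ServiceModel.Channels;

// Hypothetical mapper: tells WCF to hand every request body to the
// operation untouched, so a Stream parameter can bind even when the
// client sends "text/xml".
public class RawContentTypeMapper : WebContentTypeMapper
{
    public override WebContentFormat GetMessageFormatForContentType(string contentType)
    {
        // Assumption: every operation on this endpoint wants the raw bytes.
        return WebContentFormat.Raw;
    }
}

public static class RawBindingFactory
{
    // Wraps a standard WebHttpBinding in a CustomBinding so the message
    // encoder can be reached, then attaches the mapper to it.
    public static Binding Create()
    {
        var binding = new CustomBinding(new WebHttpBinding());
        var encoder = binding.Elements.Find<WebMessageEncodingBindingElement>();
        encoder.ContentTypeMapper = new RawContentTypeMapper();
        return binding;
    }
}
```

The binding returned by RawBindingFactory.Create() would then be passed wherever the endpoint is added, in place of a plain WebHttpBinding. If some operations on the endpoint still need XmlElement input, the mapper would have to return Raw only for the content types that should bypass parsing.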
An Outlook Conversation
One of the things I had to do in Outlook this week was determine whether an Outlook MailItem is part of a conversation.
After much googlework, I discovered two properties on the Outlook MailItem – ConversationIndex and ConversationTopic.
To determine whether two Outlook items belong to the same conversation, you need to look at the first 22 bytes of ConversationIndex. The property is reported as a hex string with two characters per byte, so 22 bytes means the first 44 characters. If they are the same, the messages are part of the same conversation.

public static bool SameConversation(MailItem item1, MailItem item2)
{
    // The 22-byte header occupies 44 characters of the hex string.
    const int HeaderLength = 44;
    return item1.ConversationIndex.Substring(0, HeaderLength) ==
           item2.ConversationIndex.Substring(0, HeaderLength);
}
That’s all well and good, but note that this property is somewhat unreliable. For starters, in versions of Outlook prior to 2003, it returned binary data instead of a hex string (but if you’re working with versions of Outlook prior to 2003, you probably have other problems…). The other reason is that the property is set by the client: Outlook appends a 5-byte timestamp to the ConversationIndex when you reply. Which is cool, as long as you reply through Outlook.
But our Infovark mail server is hosted by Google, and I occasionally use the Gmail web client to reply to mail instead of Outlook. When those replies were eventually retrieved into Outlook via IMAP, they ended up with unique ConversationIndex values, so I couldn’t identify them as part of the same conversation.
That’s where the ConversationTopic property can give you a clue. The ConversationTopic is the normalized subject of the message, that is, the subject without all the prefix strings (“Re:Re:” etc.). By comparing ConversationTopics, you can usually piece together the conversation, even if the ConversationIndex is not correct.
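Putting the two checks together might look something like the sketch below. To keep it self-contained it takes the ConversationIndex and ConversationTopic values as plain strings rather than MailItem objects. ConversationMatcher is a made-up name, and the 44-character header length assumes the hex string uses two characters per byte.

```csharp
using System;

// Hypothetical helper: compares ConversationIndex headers first, then
// falls back to ConversationTopic when the headers do not match.
public static class ConversationMatcher
{
    // Assumption: 22-byte header, two hex characters per byte.
    private const int HeaderLength = 44;

    public static bool SameConversation(string index1, string topic1,
                                        string index2, string topic2)
    {
        if (HeadersMatch(index1, index2))
        {
            return true;
        }

        // Fallback for replies sent from other clients, whose index
        // does not continue the original thread.
        return !string.IsNullOrEmpty(topic1) &&
               string.Equals(topic1, topic2, StringComparison.OrdinalIgnoreCase);
    }

    private static bool HeadersMatch(string a, string b)
    {
        // Guard against missing or short values instead of letting
        // Substring throw.
        if (a == null || b == null || a.Length < HeaderLength || b.Length < HeaderLength)
        {
            return false;
        }

        return string.Compare(a, 0, b, 0, HeaderLength, StringComparison.OrdinalIgnoreCase) == 0;
    }
}
```

With MailItem objects this would be called as ConversationMatcher.SameConversation(item1.ConversationIndex, item1.ConversationTopic, item2.ConversationIndex, item2.ConversationTopic). The topic fallback is only a heuristic: two unrelated threads that share a subject line will be reported as one conversation.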
Microformats Introduction
I’m on record as being skeptical of the semantic web. Or rather, I’m skeptical of much of the marketing hype around the semantic web. That’s not to say that semantic technologies won’t be useful.
I still believe that both Resource Description Framework (RDF) and the Web Ontology Language (OWL) are too complicated to gain widespread adoption. But maybe we don’t need their academic rigor. Microformats offer a way to get some of the benefits of the semantic web using plain ol’ HTML.
What are microformats? How do they work? Emily Lewis wrote a great series of blog posts introducing microformats. You can also go direct to the source, the microformats homepage, at microformats.org.
An example
Here’s an example of our company address in hCard format.
The address above is marked up in such a way that (some) web browsers can identify it as a street address. But it’s nothing more than ordinary HTML. Here’s what the code looks like:
<div id="infovark_vcard" class="vcard">
  <a class="url fn n" href="http://www.infovark.com">
    <span class="given-name">Infovark</span>
    <span class="additional-name"></span>
    <span class="family-name"></span>
  </a>
  <div class="org">Infovark</div>
  <a class="email" href="mailto:info@infovark.com">info@infovark.com</a>
  <div class="adr">
    <div class="street-address">10104 Bushman Dr.</div>
    <span class="locality">Oakton</span>,
    <span class="region">VA</span>,
    <span class="postal-code">22124</span>
    <span class="country-name">USA</span>
  </div>
  <div class="tel">800-833-9796</div>
</div>
It’s simple enough that it just might deliver where RDF and OWL fail, becoming part of every web developer’s toolkit.
Get started
You can experiment by creating your own hCards using the hCard creator.
And if you’re using Mozilla Firefox, you can download the Operator add-in to see — and use — microformatted data embedded in ordinary web pages.
Hat tip: Ajaxian for Getting Semantic with Microformats
jQuery Turns 3
When we began work on the Infovark user interface, we decided to base it on HTML and JavaScript. Both Gordon and I are very comfortable with web development, so it was a natural choice. We also felt this would give us the most flexibility to run on different platforms with different screen sizes. For better or worse, HTML and JavaScript have together become the lingua franca of interactive design.
Both have their drawbacks, of course. HTML and JavaScript have evolved over time. They each have quirks, particularly with regard to the Document Object Model (DOM). Fortunately, there is a wide variety of JavaScript libraries that help programmers work with both.
We love jQuery
After doing a little research, we settled on jQuery. jQuery makes us love JavaScript again. It’s a simple, small library that works across all major browsers. It deals with all the inconsistencies that have emerged from the last decade of tinkering with web standards. Most importantly, it helps us get things done.
Microsoft has decided that they love jQuery, too. John Resig, the progenitor of the jQuery project, announced in September that jQuery will be distributed with Visual Studio. Two prominent Microsoft bloggers, Scott Guthrie and Scott Hanselman, also discussed the news.
It keeps getting better
Momentum around the project continues to build. The jQuery blog just posted news about the jQuery 1.3 release and the jQuery Foundation. Most exciting of all (from our perspective as developers) is the release of revamped jQuery API documentation.
Congratulations to the jQuery team! It’s come a long way in three short years.
Digital Watches are a Pretty Neat Idea
The Hitchhiker’s Guide to the Galaxy opens with the following quote:
Far out in the uncharted backwaters of the unfashionable end of the western spiral arm of the Galaxy lies a small unregarded yellow sun.
Orbiting this at a distance of roughly ninety-two million miles is an utterly insignificant little blue green planet whose ape-descended life forms are so amazingly primitive that they still think digital watches are a pretty neat idea.
Though it was published three decades ago, the quote is relevant today. Not only because of our continuing enthusiasm for digital watches, in all their forms, but because our neat little clocks are still amazingly, stunningly primitive.
Computers Can’t Tell Time
From the Y2K bug to the recent Zune bug, the inability of computers to tell time properly — and the inability of programmers to process dates and times correctly — leads to recurring problems.
While some tech bloggers feel the issue is poor datetime programming practices, I think the issue is more fundamental than that. Why else would an established software platform vendor like Microsoft still be struggling to come up with a decent datetime implementation?
The problem is simply that computers can’t tell time. Or rather, that computers worldwide can’t agree on what time it is.
A history lesson
In the early prehistory of the Information Age — prior to 1970 — computers were large, unwieldy assemblages of equipment. ENIAC filled an entire room. Mainframes and supercomputers, descendants of those early digital beasts, share two characteristics with the early computing devices.
- They are not portable.
- They do not communicate with other computers, or if they do so, they usually determine the operating environment of the other terminal computers.
Personal computers, even today, have yet to shake some of this legacy. The upshot of this is, your computer thinks its clock is always right. It never checks the position of the sun, or better yet, checks with the International Bureau of Weights and Measures to get Coordinated Universal Time (UTC). As a result, most computers today rely on the accuracy of their internal clocks, which are typically set to local time or the time at the chip factory.
Some operating systems will adjust this for you. If you’re running Microsoft Windows, for example, you’ll default to U.S. Pacific Standard Time. Which is wrong for most of us on the planet, but you have to pick somewhere to start, right?
All your base are belong to us
I know, you’re thinking, “All that crappy legacy left over from the mainframe days. Can’t we scrap it and move on?” Well, buckle up, friends, because there’s far more legacy thinking involved in timekeeping than that! Our system for time is based on work originally done in ancient Sumeria.
The Sumerians used a sexagesimal number system, which is almost as interesting as it sounds. Programmers are used to the everyday decimal number system, the binary number system, and hexadecimal. But base-60 is a complicated system indeed, with many fascinating properties, including the fact that it’s evenly divisible by the first six counting numbers.
So the reason why hours have 60 minutes and minutes have 60 seconds is all the Sumerians’ fault. As is the convention that a circle has 360 degrees.
And yes, they tried hard, but 360 is not quite the number of days in a calendar year. But they felt it was close enough. Though after two millennia, they’d made April Fools of many Catholics.
A modest proposal
My point? Math is hard, and converting between different numeric systems makes it harder, and when real life calendars don’t quite fit into a neat mathematical box, you’ve got the makings of a complicated system that’s ripe for failure. So fail we do — and often!
In the interests of making one tiny, tiny optimization in this all-too-complicated matter of time (and we haven’t even factored in general or special relativity yet!), I propose that we all adopt UTC immediately. Forget about daylight saving time, forget about time zones: UTC and datetime calculations are troublesome enough without having to worry about localization.
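To make the complaint concrete, here is a small sketch of what goes wrong when elapsed time is computed from local wall-clock readings instead of UTC. The class name ElapsedTimeDemo is made up for illustration, and the example assumes US Eastern time on 8 March 2009, when clocks jumped from 02:00 to 03:00.

```csharp
using System;

// Hypothetical demo: elapsed time from wall-clock readings versus UTC.
public static class ElapsedTimeDemo
{
    // Difference between the local clock faces, ignoring UTC offsets.
    public static TimeSpan WallClockElapsed(DateTimeOffset start, DateTimeOffset end)
    {
        return end.DateTime - start.DateTime;
    }

    // Difference between the same two instants expressed in UTC.
    public static TimeSpan UtcElapsed(DateTimeOffset start, DateTimeOffset end)
    {
        return end.UtcDateTime - start.UtcDateTime;
    }

    // 01:30 Eastern Standard Time (UTC-5), just before the change.
    public static readonly DateTimeOffset BeforeChange =
        new DateTimeOffset(2009, 3, 8, 1, 30, 0, TimeSpan.FromHours(-5));

    // 03:30 Eastern Daylight Time (UTC-4), just after the change.
    public static readonly DateTimeOffset AfterChange =
        new DateTimeOffset(2009, 3, 8, 3, 30, 0, TimeSpan.FromHours(-4));
}
```

WallClockElapsed(BeforeChange, AfterChange) comes out as two hours, while UtcElapsed for the same pair is one hour, which is the time that actually passed. Storing and comparing timestamps in UTC, and converting to local time only for display, sidesteps this class of bug.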
We now have a globe-spanning GPS system, constantly beaming timing signals to a wide array of portable devices. Our laptops, workstations, minicomputers and mainframes ought to use these as well. Let’s all synchronize our neat digital watches.
Ready? Mark!
The Loop Snooper
Some programmers claim that their code is poetry, but there are few programmers whose poetry might count as code. Recently, Phil Windley posted A Halting Problem in Verse to his Technometria blog. It contains a link to a proof that the halting problem is unsolvable, written by Geoffrey Pullum in the style of Dr. Seuss.
Been Caught Stealing
Jeremy Miller writes, “if you’re writing ADO.Net code by hand, you’re stealing from your employer or client,” in his How to Design your Data Connectivity Strategy post last month on CodeBetter.com.
Guilty as charged
When we first started laying the groundwork for Infovark, we assumed that our back end would be a full-fledged Enterprise Content Management system. It was only later, once we realized that Infovark required a separate object persistence layer on the client side, that we began thinking about data storage.
Now, there were dozens of object-relational mapping tools available. From our perspective, though, they all shared a common flaw: We didn’t know how to use any of them.
Old dogs avoiding new tricks
Gordon and I got started in web development back in the “classic ASP” days. We knew how to work with ADO.NET. We had battled with object-relational impedance before, and had the scars to prove it.
So after gazing longingly at ActiveRecord and NHibernate, we decided to roll our own data access layer.
Months later, after much refactoring, we finally have a reasonably solid platform on which to build our application. It’s something we might have had in the first six weeks, had we done our homework.
But Infovark is an unusual project. We were having to learn many, many new things at the start. The thought of adding to the pile of books to read and websites to scan… especially when it was something we actually knew how to do…
What can we say? We fell prey to temptation. We promise we won’t do it again. It’s the straight and narrow from now on.
Coding Aphorisms
Nat Pryce posted part of his collection of Laws of Software Development to his blog, Mistaeks I Hav Made.
My, my, we programmers are a cynical lot.
The Language of Programming
Scott Hanselman asks, Do you have to know English to be a Programmer?
No. But you might find it easier to learn programming — and to keep up with the state of the art — if you did.
Knowledge of English is helpful
This has nothing to do with the merits of English as a language. The use of English in programming and computer science is an artifact of history. Much of the early work in computation was done in English, and the transistor and integrated circuit were invented in English-speaking countries. It became an informal convention. Most of today’s programming resources are available first in English, and many resources are available only in English.
For similar reasons, it’s good for programmers to be familiar with more than one programming language. Most of the code we write at Infovark is written in C#. But we look at many open source projects written in Java, Ruby, or PHP for inspiration. If we relied solely on the information and code available to us in C#, or in the .NET framework, or on the Microsoft platform, we’d be limiting ourselves.
Internationalization still matters
While I think programmers ought to have a working knowledge of English, that’s not an excuse for software companies to produce English-only products.
Today, software companies that want to reach the broadest audience of programmers should provide English documentation and samples for their software. But they shouldn’t stop at just one language. There are large communities of programmers (and customers) who speak German, Hindi, Chinese, French, Spanish, Japanese, and Russian. The languages chosen will depend on the particular software application or market. English may be necessary, but supporting it alone is not sufficient.
The Curse of the Singleton
It took us six weeks to break the curse of the singleton. Six weeks! By the end of it, we’d rewritten most of our data access layer.
We began the process of removing singletons innocently enough. I thought I was well prepared for the task. I’d just finished reading The Pragmatic Programmer (my review of The Pragmatic Programmer) and Working Effectively with Legacy Code (my review of Legacy Code). I remember telling Gordon I’d tackle the problem over the weekend…
What’s a singleton?
The Singleton Design Pattern is one of the first patterns introduced in many software design books. But don’t let this fool you like it did me. Its prominent position has nothing to do with its importance. The Singleton is usually listed first because it’s the easiest pattern to explain and implement. It made a convenient place for the author to start, but the Singleton’s real uses are very limited.
Which is appropriate, actually, since the real use of the singleton is to limit usage. A class that implements the Singleton pattern allows only one object to be instantiated at a time. There are a few cases where this is desirable. For example, classes that control access to a single hardware device or that set up global variables. But the danger of the Singleton is that there are many cases where you’ll want to misuse it.
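For readers who haven’t met the pattern, a bare-bones C# version looks roughly like this. DeviceController is a made-up example class, not something from the Infovark code base, and this is only one of several common ways to write a singleton in C#.

```csharp
using System;

// Hypothetical example: a class that guards access to a single device.
public sealed class DeviceController
{
    // Lazy<T> creates the one instance on first use, in a thread-safe way.
    private static readonly Lazy<DeviceController> instance =
        new Lazy<DeviceController>(() => new DeviceController());

    // A private constructor stops anyone else from calling "new".
    private DeviceController()
    {
    }

    // The single, globally reachable instance.
    public static DeviceController Instance
    {
        get { return instance.Value; }
    }
}
```

Every caller that touches DeviceController.Instance gets the same object. That global reachability is the convenience, and it is also what makes the pattern so easy to misuse.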
Why are they bad?
Scott Densmore lists the four key characteristics of the Singleton and how each can get you into trouble in his Why Singletons are Evil blog post.
For another cautionary tale of the cycle of attraction, infatuation, disappointment, and rejection, read Singleton, I love you, but you’re bringing me down.
In our case, we’d gleefully implemented Singletons for database access, content indexing, security and access control, and in a few other places where we thought we needed just one instance. If Steve Yegge were here, he’d call what we’d done an instance of the Simpleton pattern — a failure to grasp basic principles of object-oriented programming. You can read more about Yegge’s thoughts on the singleton and design patterns for dummies.
Our automated tests were running slowly because we had to set up and tear down the database for every test. Making a change to one component would frequently cause several tests to fail. Everything was tied together at the hip — at the Singleton classes — and it was impossible to disentangle our code to test particular items in isolation. We had tests, but not unit tests. They were integration tests, and the points of integration were the handful of singleton classes we’d built.
Worse, our database performance was lousy. Since we had a global variable for our database object, we could sprinkle database access code throughout the rest of our object model. We discovered that we were opening and closing database connections all the time. And we’d had to implement tricky locking code to guarantee that our SQL statements would get executed in the right order.
What did we do about them?
The Singleton let us be lazy about our programming habits. It allowed us to make assumptions we shouldn’t have. You can call it premature optimization or a retreat into procedural programming techniques from an earlier era. Ultimately, we’d found that it allowed us to cut too many corners.
So we slowly rooted out each Singleton class from our API and reimplemented the functionality in other ways. Fortunately, we had a large battery of integration tests to help guide us. And luckily, we’d decided to tackle the problem during our first Alpha test, when we could still afford to make sweeping changes. But correcting bad design takes much longer than avoiding it in the first place — even if you’ve read all the right books.
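The details of what replaced each Singleton aren’t covered here, but one common alternative is to pass the dependency in through the constructor. The sketch below shows the idea. IItemStore, InMemoryItemStore and ItemService are hypothetical names invented for this example; they are not the Infovark API.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical abstraction over the data access layer.
public interface IItemStore
{
    void Save(string id, string title);
    string Load(string id);
}

// In-memory stand-in for unit tests: no database to set up or tear down.
public sealed class InMemoryItemStore : IItemStore
{
    private readonly Dictionary<string, string> items = new Dictionary<string, string>();

    public void Save(string id, string title)
    {
        items[id] = title;
    }

    public string Load(string id)
    {
        string title;
        return items.TryGetValue(id, out title) ? title : null;
    }
}

// The service is handed its store instead of reaching for a global
// instance, so a test can swap in whatever store it likes.
public sealed class ItemService
{
    private readonly IItemStore store;

    public ItemService(IItemStore store)
    {
        if (store == null) throw new ArgumentNullException("store");
        this.store = store;
    }

    public void Rename(string id, string newTitle)
    {
        if (store.Load(id) == null)
        {
            throw new InvalidOperationException("No item with id " + id);
        }
        store.Save(id, newTitle);
    }
}
```

A unit test can then create an InMemoryItemStore, save an item under id "42", call new ItemService(store).Rename("42", "Final"), and check that store.Load("42") returns "Final", all without a database. The production code passes in a database-backed IItemStore instead.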
Six weeks later, we finally sorted out the mess we’d made for ourselves. There’s a handful of odds and ends left to do, but the design feels better. My gut tells me it’s an improvement, and our tests — now we have both unit and integration tests — show that we’ve almost tripled the speed of the data access layer.
It was worth our time to break the Curse of the Singleton. Beware lest ye, too, fall under its spell!