Infovark Underground

  • news
    • infoblog
    • underground
  • product
  • download
  • buy
  • support
  • about
  • architecture

    • Getting It Out of Our System

      28 Oct 2009 by Dean / 1 Comment

      On the main Infovark blog — the “business” blog — I talked about how we threw the first version of Infovark away. Not the core idea, of course, but we dumped our initial database schema and restructured most of the code. Here’s the technical story behind that tough call.

      Fighting the last war

      We’d originally intended to build Infovark on top of an existing Enterprise Content Management platform. At the time, Gordon and I thought of Infovark mostly as an alternative ECM interface targeted at knowledge workers.

      When we decided to strike out on our own, we realized that we couldn’t take the underlying ECM services for granted anymore. We’d have to build things like object storage and version control ourselves. And if we were going to create a mini-platform for our own use, we were going to do it right, by golly.

      Drawing on my background with real estate MLS systems, database reporting tools, and electronic records management, I began to work on an amazingly flexible data storage tier. Gordon dove into the data access classes. We began creating the sort of system we’d always wanted to use as highfalutin IT consultants.

      The system allowed us to do neat things like define metadata for an object on the fly and roundtrip it to the database. We could access those objects in code, or via XML or JSON web services. We could define multiple views for each object. Every object was searchable using a full text index or via SQL. We were quite proud of ourselves.

      It was an utter waste of time.

      Deep magic

      Technically, what we’d done was construct an entity-attribute-value database model. EAV systems are technically complex beasts designed to solve one tricky problem: client domains that don’t have well-defined metadata.

      The EAV model was initially developed for clinical records systems. These needed to accommodate a huge array of possible symptoms, complaints, effects, diagnoses, and interactions. A particular patient is unlikely to have more than a handful of issues at a time, however. It’s a rich set of metadata with a sparse array of values.

      Think of those questionnaires you get at your doctor’s office. They normally give you a long form with lots of checkboxes on it. You tick a few boxes here and there to record your or your family’s medical history.

      An EAV model is designed to store that type of information efficiently in a standard relational database, at the cost of doing some major code gymnastics.

      Except in very limited circumstances, an EAV design is considered a database smell. Some go so far as to list it as a SQL design error. Joe Celko, author of several books on SQL, has an article on how to avoid the EAV of destruction.

      EAV remains popular though, perhaps because of its close ties to Steve Yegge’s universal Properties pattern.

      In fact, there’s a whole slew of alternative databases designed specifically to help with EAV problems: Column-based databases like Vertica, meant for data warehousing; XML databases like Mark Logic, for structured documents; CouchDB for unstructured content; and key-value stores like MongoDB. But I digress.

      All of the hoops we jumped through to help store arbitrary data was overkill. We were making the problem harder than it needed to be.

      The only positive thing I can say about the effort was that it was an itch we had to scratch. We had to get the old stuff from our previous jobs out of our system before we could focus on Infovark.

      What we really needed

      While the features we built were exactly the sorts of things a consultant or systems integrator might want, end users couldn’t care less about them. We’d unconsciously built a product for ourselves, not for our customers.

      Our customers didn’t want to define their own data structures. They don’t want to learn about metadata or record types. They just want a product that helps them remember stuff. Figuring out what data to store or columns to index was our job.

      So while the Alpha build was incredibly cool from a techie perspective, it wasn’t easy or fun for the typical knowledge worker to use.

      We needed to do our homework. What do our customers need to get out of a personal information wiki? What items will they want to reference later?

      How we manage that information under the hood should be completely invisible to them. As far as they’re concerned, Infovark is an actual animal that lives inside their computer that helps them find interesting things.

      Back to the drawing board

      Once we started looking at the problem from the user’s perspective, things got much simpler. We threw out the EAV approach and went with a much simpler data model. We gathered requirements to figure out what were the bare minimum number of data types that a typical knowledge worker would need. Then we began defining templates that gave users the ability to interact with these types in (what we hope) will be natural ways.

      I guess it’s another example of the write big to write small principle. We built a general framework at first, capable of handling nearly any sort of object we threw at it, then drastically edited it back to hold the bare minimum needed.

      Conclusion

      It wasn’t that the EAV approach was wrong. It worked. We could have built on it. But it was a huge framework and it consumed a lot of our engineering effort. That’s time much better spent on things that our customers actually care about.

      I wish we’d started with the simple solution. But I’m not sure we would have understood or appreciated it without trying the EAV approach first. We needed to get it out of our system.

      And then we needed to get it out of our system.

      Continue Reading

    • Domain Models in High Performance Systems

      23 Jul 2009 by Dean / No Comments

      I skipped today’s DC Alt.NET meeting on JavaScript. With the other half of the Infovark tech team on vacation, I’m holding down the fort.

      Fortunately, I was able to expand my programming knowledge by catching up on my blog reading, and particularly by watching Greg Young of IMIS give a presentation called Unshackle Your Domain at QCon in June.

      If you’ve ever had to built a high-performance system or one that has strict auditing and reporting requirements, this presentation is for you. Greg’s company deals with financial systems, and you can tell he’s learned many best practices the hard way.

      While I doubt we’ll need an architecture as robust as he describes for Infovark, I recognize many of the the problems and patterns he describes from my old jobs in software companies making records management software (auditing) and real estate systems (transactions and reporting).

      The key insight is that for certain software solutions, it’s important to model state transitions as part of the problem domain.

      But what I found most interesting was how his example system combined the principles of Domain-Driven Design with the older notion of Command-Query Separation.

      I’d explain in more detail, but it’d probably be easier to just watch the presentation yourself.

      Continue Reading

    • Review: Framework Design Guidelines

      11 May 2009 by Dean / No Comments

      Framework Design Guidelines

      I’m almost embarrassed to admit that I really enjoyed Framework Design Guidelines by Krzysztof Cwalina and Brad Abrams. I mean, it’s a book about coding standards.

      Maybe I ought to get out more… but before I leave the glare of my monitor behind, I’ll type up my review.

      Code is literature, not language

      Computing languages, just like human languages, have grammar and syntax. There are correct ways to form sentences and paragraphs, with well-defined rules (and exceptions). Just like word processors can check spelling and verify basic sentence structure, most IDEs today can ensure your code will compile and run.

      That doesn’t mean that your story or program is an easy or enjoyable read, though. Most newspapers have accumulated extensive guidelines for matters of style and substance, and most software companies have their own guides as well. If you’re writing as part of a team project, or writing programs intended to be used by other programmers, it’s important to make your code consistent, clear, and direct.

      Just like many journalists keep a copy of The Associated Press Stylebook or the New York Times Manual of Style and Usage handy — even if they don’t actually work for the New York Times or the AP — lots of programmers keep a copy of Microsoft’s Framework Design Guidelines as reference.

      Or they should. That’s probably my roots as a maintenance programmer showing.

      Know your genre

      Ideally, you’d want any code you write for other Windows programmers’ use to look as if it came from Microsoft itself. That is, you want it to feel like a natural extension of the .NET framework and not some third-party bolt-on with odd stylistic touches. You’d also like your code to use the full power and expressiveness of .NET, and not appear like some watered-down Java-esque port. (Far too many open source projects retain awkward Java-isms after being converted to .NET, in my opinion. NUnit is a notable exception.)

      This helps your fellow programmers gain a better understanding of your code in less time. And it can also make your programming tasks easier, too. Just like design patterns can help you lay out your application architecture, programming guidelines can help you structure your code at the class or method level.

      About the book

      The Framework Design Guidelines covers a lot of ground in its 400 pages. It describes what conventions Microsoft uses when designing types, methods, and exceptions. It also describes the naming and design patterns that Microsoft uses in their public APIs. The topics are grouped by category and heavily cross referenced, making it easy to find your way around. The rationale of each guideline is explained, and the authors indicate the strength of each recommendation by marking it with the terms Do, Consider, Avoid or Do Not.

      But the best part of the book is the stories and comments given by members of the Microsoft team. These are sprinkled throughout the book and give insight into why the guideline exists. Some of these discuss lessons Microsoft learned the hard way, places in the .NET Framework where the rules are violated, and how real-world programmers feel about certain guidelines. You can get a flavor of these by checking out the Framework Design Guidelines section of Brad Abrams‘ blog.

      If you find his posts interesting or helpful, you’ll feel the same way about the book. Highly recommended.

      Continue Reading

    • Categories

      • .NET (41)
      • AJAX (3)
      • Books (7)
      • HTML (9)
      • Infovark (8)
      • Programming (48)
      • REST (11)
      • SQL (3)
      • Testing (3)
      • Tools (13)
      • UI (3)
      • WCF (11)
      • Web Services (8)
      • WPF (4)
      • XML (4)
    • Archives

    • Get future articles


       

    • Blogroll

      • Ajaxian
      • Anne Van Kesteren
      • Brain.Save()
      • Coding Horror
      • Eric Sink
      • Joel Spolsky
      • John Resig
      • Mark Pilgrim
      • Raymond Chen
      • Scott Hansleman
      • Secret Geek
      • Steve Yegge
      • The Daily WTF
      • The Database Programmer
    • Meta

      • Log in
      • Entries RSS
      • Comments RSS
      • WordPress.org
  • Site map

    • News
    • Product
    • Download
    • Buy
    • Support
    • About
  • Recent Posts

    • Review: Brownfield Application Development in .NET
    • Using Modal Dialogs with a Splash Screen in WPF
    • Highlighting query terms in a WPF TextBlock
    • Getting XAML Hyperlink text to wrap
    • How to format the XAML Hyperlink NavigateUri
  • Twitter

    Copyright 2011 Infovark, Inc. All rights reserved.