A Warning About the Real Cost of Microformats

I’m done with microformats. From now on, I’m either building separate developer tools and relationships, or I’m not. I say that having been through the cycle of adopting hCalendar and hCard a few times, not just as an industry commentator. My reasons are threefold. First, in the real world, publishing microformats requires you to rewrite DOM structures, publish extraneous invisible elements, adopt new schemas, and adopt data publishing-like structures on frontend pages intended for the browser. Second, the relationship between publisher and developer is not significantly improved by microformats, and would be better served by a separate pathway for developers. Third, their proclaimed benefits as a standard are extraneous because hCard, hCal, etc. are valiant but incomplete redefinitions of existing standards like vCard and iCalendar in XHTML-like formats.

The idea of microformats sounds swell, inexpensive, and easy. Take your existing data, and surround the data-ish bits with tags that separate it into parts with semantic meaning. Unfortunately, saying that something is easy does not make it so. I don’t mean to disparage the work that microformats mavens have done, but my experience with being a microformat publisher has shown that things are exponentially more complex than they let on in the “sales pitch.” I don’t think they realized what they were getting into when they started on the process of actually getting publishers to conform to this stuff.

Surrounding your existing markup sounds simple enough, right? But consider how you’ll have to nest things together. Should you publish microformats on both listings and detail pages? Are there required fields that aren’t present in the content you already have on the page? Do you just publish lots of invisible XHTML content to the page to fill in the missing stuff? Do you dare to deal with recursions? What if your data is split up in separate divs? What happens when your data model does not fit with the standard? What if you are not a very good HTML contortionist? What happens when your presentation is different based on varying pieces of available data? Is hCard publishing even useful at all on public pages without private, uniquely identifiable information? Oops, there are no microformat validators, either. This is especially difficult for something with as many ominous implications as a stepchild of iCalendar.
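
To make a couple of those questions concrete, here is a minimal sketch in Python of what a naive consumer might do with hCard-style class names. It is not a real microformats parser: the sample markup, the `HCardSketch` class, and the `parse_hcards` and `missing_required` helpers are all hypothetical names made up for illustration, it only looks at three properties, and it ignores nested cards entirely. It treats `fn` as required, as hCard does.

```python
from html.parser import HTMLParser

# Hypothetical page fragment: one complete card, and one listing-style card
# where the page simply never displays a person or organization name as "fn".
SAMPLE = """
<div class="vcard">
  <span class="fn">Jane Doe</span>
  <span class="org">Example Co</span>
  <span class="tel">555-0100</span>
</div>
<div class="vcard listing">
  <span class="org">Nameless Venue</span>
</div>
"""


class HCardSketch(HTMLParser):
    """Collects text from elements whose class names match a few hCard properties."""

    PROPS = {"fn", "org", "tel"}  # deliberately tiny subset, for illustration only

    def __init__(self):
        super().__init__()
        self.cards = []    # one dict per class="vcard" root element
        self._stack = []   # open elements as (tag, is_vcard_root, property_or_None)

    def handle_starttag(self, tag, attrs):
        classes = (dict(attrs).get("class") or "").split()
        is_root = "vcard" in classes
        if is_root:
            self.cards.append({})
        prop = next((c for c in classes if c in self.PROPS), None)
        self._stack.append((tag, is_root, prop))

    def handle_endtag(self, tag):
        # Pop back to the matching open tag (tolerates unclosed void elements).
        while self._stack:
            if self._stack.pop()[0] == tag:
                break

    def handle_data(self, data):
        text = data.strip()
        inside_card = any(is_root for _, is_root, _ in self._stack)
        if not text or not inside_card:
            return
        # Assign the text to the nearest enclosing property element, if any.
        for _, _, prop in reversed(self._stack):
            if prop:
                self.cards[-1].setdefault(prop, text)
                break


def parse_hcards(html):
    parser = HCardSketch()
    parser.feed(html)
    parser.close()
    return parser.cards


def missing_required(card):
    """Return the required properties (here just 'fn') that the card lacks."""
    return [name for name in ("fn",) if name not in card]


if __name__ == "__main__":
    for card in parse_hcards(SAMPLE):
        print(card, "missing:", missing_required(card))
```

Even in this toy version, the second card shows the "required fields that aren’t on the page" problem: the consumer either rejects the card or the publisher has to add markup the reader never sees.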

Much like an oral agreement, publishing microformats is an informal agreement between you and (hopefully) a developer community that sets up a relationship with plenty of vagueness, inertial resistance to change, and potential landmines to step on. Would you create a real developer API without a TOS, agreement, or at the very least, guidelines? Are you prepared to deal with objections if, when cutting costs, you rev a frontend design and lose some important aspect of microformat structure on the page (or, god forbid, you just don’t bring microformats over at all)? Alternatively, are you prepared to announce all frontend markup changes? Does publishing a microformat without a special agreement mean that you are implicitly allowing comprehensive scraping of your web data? If you spend an hour seriously considering the costs of treating your frontend interface as a programming API for the sake of your relationship with developers, would you then rather spend those costs there, or on proper versioning, documentation, and communication with a developer community over a real publishing protocol or API? Publishing microformats without formal consumer support is commonly what happens, but it is a poor midway point to leave yourself at.

The only place I can still imagine microformats surviving a cost/benefit analysis is in the case of preparing for search engine crawlers. In theory, if everyone publishes their websites according to a few semantic standards, the big SEs can embed structured data in their search results and act as aggregators of real data. There are a few practical issues that you’d have to ignore in order to go for this pipe dream, though. First, you’d have to hope that the structure of microformats is fault-tolerant enough to survive the endless mangling of random developers trying to publish junk data, and that whatever was clean enough to make it through would be parseable, high quality, not eliminated in dupe checks, and relevant to a big enough segment of searches to justify the cost for the search engines to link to it. Second, you’d have to accept the fact that you probably wouldn’t get any permanent special treatment in search results, lest microformats become a new meta tag for SEO. Third, you’d have to assume that the big SEs wouldn’t take this handful of high-order data entities and send them to their big topical datastores for aggregation and republishing themselves. I realize the benefit of a “standard.” However, why wouldn’t big aggregators crawl the web for content that conforms to the existing, more mature standards that microformats are based on? Especially when that content has often been passed through working, real-world consumers of vCard/iCalendar before making it to production.

Anyway, here’s the question I want to put into the reader’s mind: should one spend time and effort making a frontend into an informal API through microformats, or instead spend it on building a fully supported API or data publishing system that exists and operates separately? I think my stance is clear – I’m not against the theory of microformats, but I’m certainly going to differ with anyone who thinks it’s practical. If you can really think all that through, and still think microformats are a good thing to spend your resources on, then by all means, give it a shot. Just don’t say you weren’t warned.

8 thoughts on “A Warning About the Real Cost of Microformats”

  1. Thank you for the comment. I’ve known of the hCard validator for some time, but don’t consider that a complete testing suite.

    However, I didn’t know about the general Optimus validator that can validate hCalendar, which someone just pointed out to me. I’ve reported a bug in its hCalendar handling, so I’m going to consider that about as good as Sunbird (mozilla calendar) was at version ~0.3. It wasn’t until Sunbird 0.5 that I felt I had a good debugging companion for publishing iCalendar, and that took a very long time.

    Anyway, I’m totally willing to accept that if a rock-solid validator were available, it would make development go more quickly, but I would still classify it as difficult.

  2. I feel your pain, but those arguments you’ve presented are more tied-in with your publishing platform than the concept itself. Still, they shouldn’t be ignored. 😉 I’ve seen people implementing hatom in front of me, using Smarty templating engine, in a matter of minutes. Of course it varies from project to project, but keep the eye on the ball. 😉

    These things are based in (X)HTML and they aim at embedding semantics where they’re being shown to the user… so external files representing data have their own place, and no one said you can’t use them all at the same time (microformats + ical + vcard + rdf + owl + ___)

    As for validating… As a rule-of-thumb, I’ve used Operator a lot as validator and it’s treated me good, so far.

    This brings me to yet another point. I think you’ve overlooked the added value to the user. I know user-agent support isn’t half as good as it should be by now, but there are efforts both in the Mozilla camp and on the Redmond team. And why not allow power-users to act on your data much quicker and, even, use it to tie in their data spread across the web? I think the usage of XFN on Google Profiles is a good example of how simple implementations of microformats can add a good deal of value.

    Given all this, I think if you simplify and apply them where they’re most suitable, you’ll run into few shortcomings. For instance, recurring events are a bit of a stretch for some parsers, but that doesn’t render the actual event useless.

  3. It seems to me that all the problems you’ve listed could be avoided once the detailed specifications are agreed on. The microformats project is a work in progress and most of the issues listed here are ones that are being discussed and addressed. It will of course take time but the issues you’ve raised don’t strike me as sufficient reasons to abandon what is essentially a good idea.

  4. I think I agree with you. I’ve always supported Microformats and will continue to do so. However I think that adding any sort of semantics to your markup is beneficial, even if those semantics are broken Microformats.

    I’ve always taken the stance to use useful Microformats. By this I mean, can a Microformat be used by someone of average computer skill? The only one that could fit into this category is hCard. Still, though, the user would have to know it exists and use the appropriate browser plug-in.

    Microformats aren’t for general consumption yet and who knows if they ever will be.
