I just released Freetag v0.210. The releases so far have been mostly minor improvements and bugfixes, but this one also has a new function that is pretty cool.
Myles Grant contributed a similar_tags() function that accepts a tag and returns other related tags, in descending order of how strongly they are related. It’s pretty awesome, and I’ve got it working on eatlunch.at and in beta code for Upcoming at the moment. It really makes browsing the tagged world of your database pretty compelling. Brian Del Vecchio recently explained the difficulty of stemming tags on the system level instead of extracting related tags from the correlations in the database. Well, here’s one way of doing the latter. 🙂
Think of it as an Amazon-like “Other users who bought this also bought…” kind of feature. Except this is more like, “Many other objects tagged with <Source Tag> were also tagged with <Related Tag>.”
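To make the idea concrete, here is a minimal Python sketch of the co-occurrence counting. It is not Freetag's code: the related_tags() helper and the sample data are made up for illustration, and the only thing taken from the description above is the rule that tags appearing on the same objects as the source tag are ranked by how often they appear.

```python
from collections import Counter


def related_tags(objects, source_tag):
    """Return (tag, count) pairs for tags that share objects with source_tag.

    `objects` maps an object id to the set of tags on that object.
    This helper is hypothetical and only illustrates the counting idea.
    """
    counts = Counter()
    for tags in objects.values():
        if source_tag in tags:
            # Every other tag on this object co-occurs with the source tag once.
            counts.update(t for t in tags if t != source_tag)
    # Highest count first; ties are broken alphabetically so output is stable.
    return sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))


# Hypothetical sample data: object id -> tags on that object.
objects = {
    1: {"lunch", "sushi", "cheap"},
    2: {"lunch", "sushi"},
    3: {"lunch", "pizza"},
    4: {"dinner", "sushi"},
}

print(related_tags(objects, "lunch"))
```

With this sample data, "sushi" ranks first for "lunch" because two of the three lunch objects carry it, while "cheap" and "pizza" each appear on only one.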
Brian’s original posting was in response to the new blog Tagsonomy’s discussion of overcoming tagging interface challenges. The key problems it describes are identifying similar tags, and providing standardized, usable delimiters for user input. I’ll have more of a discussion of this in a separate post, as this one is already getting kinda long. But if you’re wondering about how these problems are addressed in Freetag, here’s a quick summary:
Freetag’s current delimiters are spaces, with quoted phrases accepted. All raw tag items are normalized into alphanumeric-only strings. Most API functions return the raw tag and normalized tag in each tag entry for display in either format. Related tags are now dynamically extrapolated from correlations via the similar_tags() function.
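As a rough illustration of those rules, the Python sketch below splits input on spaces, keeps double-quoted phrases together, and strips everything that is not alphanumeric. The function names and the dictionary keys are hypothetical, and lowercasing during normalization is my assumption here rather than something stated above; the sketch also only treats ASCII letters and digits as alphanumeric.

```python
import re

# Either a double-quoted phrase or a run of non-space characters.
_TOKEN = re.compile(r'"([^"]*)"|(\S+)')


def parse_tags(raw_input):
    """Split user input into raw tags: space-delimited, quoted phrases kept whole."""
    raw_tags = []
    for quoted, bare in _TOKEN.findall(raw_input):
        raw = (quoted if quoted else bare).strip()
        if raw:
            raw_tags.append(raw)
    return raw_tags


def normalize_tag(raw_tag):
    """Reduce a raw tag to an alphanumeric-only string.

    Lowercasing is an assumption made for this sketch.
    """
    return re.sub(r"[^a-z0-9]", "", raw_tag.lower())


def tag_entries(raw_input):
    """Return one entry per tag holding both the raw and the normalized form."""
    entries = []
    for raw in parse_tags(raw_input):
        normalized = normalize_tag(raw)
        if normalized:  # skip tags that normalize to nothing, e.g. "!!!"
            entries.append({"raw_tag": raw, "tag": normalized})
    return entries


print(tag_entries('sushi "San Francisco" cheap-eats'))
```

Keeping both forms in each entry mirrors the point above: the raw tag is what you show the user, and the normalized tag is what you match on.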
I believe the original algorithm idea is from Snippets here: http://www.bigbold.com/snippets/posts/show/34. The query was modified from subquery syntax to a self-join on the freetagged_objects table in order to support MySQL 3.X, and it seems to work pretty fast on Upcoming’s database with the indexes from v0.202.
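For anyone curious what the subquery-to-self-join rewrite looks like, here is a sketch using Python's sqlite3 module and an in-memory database. This is not the query that ships in Freetag. Only the freetagged_objects table name comes from the text above; the two-column schema, the index names, the sample rows, and the similar_tags_sketch() helper are all assumptions made for illustration. The two queries only agree when each (object_id, tag) pair is stored once.

```python
import sqlite3

# Hypothetical, simplified schema: one row per (object, tag) pair.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE freetagged_objects (object_id INTEGER, tag TEXT);
    CREATE INDEX idx_tag ON freetagged_objects (tag);
    CREATE INDEX idx_object ON freetagged_objects (object_id);
""")
rows = [
    (1, "lunch"), (1, "sushi"), (1, "cheap"),
    (2, "lunch"), (2, "sushi"),
    (3, "lunch"), (3, "pizza"),
    (4, "dinner"), (4, "sushi"),
]
conn.executemany("INSERT INTO freetagged_objects VALUES (?, ?)", rows)

# Subquery form: needs a database that supports subqueries.
SUBQUERY_SQL = """
    SELECT tag, COUNT(*) AS quantity
    FROM freetagged_objects
    WHERE object_id IN (
        SELECT object_id FROM freetagged_objects WHERE tag = ?
    )
      AND tag <> ?
    GROUP BY tag
    ORDER BY quantity DESC, tag ASC
"""

# Self-join form: o1 finds objects carrying the source tag,
# o2 picks up every other tag on those same objects.
SELF_JOIN_SQL = """
    SELECT o2.tag, COUNT(*) AS quantity
    FROM freetagged_objects o1
    INNER JOIN freetagged_objects o2 ON o1.object_id = o2.object_id
    WHERE o1.tag = ? AND o2.tag <> ?
    GROUP BY o2.tag
    ORDER BY quantity DESC, o2.tag ASC
"""


def similar_tags_sketch(connection, tag, sql=SELF_JOIN_SQL):
    """Return (tag, quantity) rows for tags co-occurring with `tag`."""
    return connection.execute(sql, (tag, tag)).fetchall()


print(similar_tags_sketch(conn, "lunch"))
print(similar_tags_sketch(conn, "lunch", SUBQUERY_SQL))
```

Both forms return the same ranking on this data. The self-join version avoids the nested SELECT entirely, which is what matters for a database version without subquery support.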
The concept of finding related tags by seeing what tags occur in relation with each other is not particularly new, but I think having that available in an open-source tagging platform is kinda neat.