Freetag v0.210 Released! Now with Related Tags ability!

I just released Freetag v0.210. Releases so far have been mostly minor improvements and bugfixes, but this one also has a new function that is pretty cool.

Myles Grant contributed a similar_tags() function that accepts a tag and spits back other related tags, in descending order of how strong the relation is. It’s pretty awesome, and I’ve got it working in beta code for Upcoming at the moment. It really makes browsing the tagged world of your database pretty compelling. Brian Del Vecchio recently explained the difficulty of stemming tags at the system level, as opposed to extracting related tags from the correlations in the database. Well, here’s one way of doing the latter. 🙂

Think of it as an Amazon-like “Other users who bought this also bought…” kind of feature. Except this is more like, “Many other objects tagged with <Source Tag> were also tagged with <Related Tag>.”
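To make the co-occurrence idea concrete, here is a minimal in-memory sketch in Python. It is not Freetag's PHP code: the function name related_tags_sketch, the dict-of-sets input shape, and the sample data are all made up for illustration.

```python
from collections import Counter


def related_tags_sketch(tagged_objects, source_tag, limit=10):
    """Return (tag, count) pairs that co-occur with source_tag, strongest first.

    tagged_objects is a hypothetical shape: a dict mapping object id -> set of tags.
    """
    counts = Counter()
    for tags in tagged_objects.values():
        if source_tag in tags:
            # Every other tag on this object counts as one co-occurrence.
            counts.update(t for t in tags if t != source_tag)
    # Descending by count, then alphabetical so ties come out in a stable order.
    return sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))[:limit]


# Made-up sample data, purely for illustration.
objects = {
    1: {"concert", "music", "sanfrancisco"},
    2: {"concert", "music"},
    3: {"concert", "jazz", "music"},
    4: {"film", "sanfrancisco"},
}
related = related_tags_sketch(objects, "concert")
```

With this sample data, "music" comes back first because it shares three objects with "concert", while "jazz" and "sanfrancisco" share one each.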

Brian’s original posting was in response to the new blog Tagsonomy’s discussion of overcoming tagging interface challenges. The key problems it describes are identifying similar tags, and providing standardized, usable delimiters for user input. I’ll have more of a discussion of this in a separate post, as this one is already getting kinda long. But if you’re wondering about how these problems are addressed in Freetag, here’s a quick summary:

Freetag’s current delimiters are spaces, with quoted phrases accepted. All raw tag items are normalized into alphanumeric-only strings. Most API functions return the raw tag and normalized tag in each tag entry for display in either format. Related tags are now dynamically extrapolated from correlations via the similar_tags() function.
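The delimiter and normalization rules above could be sketched roughly like this in Python. Again, this is an illustration and not Freetag's actual code: the parse_tags name, the raw_tag / tag keys, and the lowercasing step are my assumptions (the rule stated above is only "alphanumeric-only").

```python
import re

# A token is either a double-quoted phrase or a run of non-space characters.
TOKEN = re.compile(r'"([^"]+)"|(\S+)')


def parse_tags(text):
    """Split user input into tag entries holding both raw and normalized forms."""
    entries = []
    for quoted, bare in TOKEN.findall(text):
        raw = quoted or bare
        # Assumption: lowercase first, then strip everything non-alphanumeric.
        normalized = re.sub(r"[^a-z0-9]", "", raw.lower())
        if normalized:  # skip input that normalizes to nothing, e.g. "!!!"
            entries.append({"raw_tag": raw, "tag": normalized})
    return entries


parsed = parse_tags('music "San Francisco" live-jazz')
```

Keeping both forms in each entry lets the display layer show "San Francisco" while lookups and correlations run against "sanfrancisco".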

I believe the original algorithm idea is from Snippets. The query was modified from subquery syntax to a self-join on the freetagged_objects table in order to support MySQL 3.X, and it seems to work pretty fast on Upcoming’s database with the indexes from v0.202.
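For the curious, the self-join idea looks roughly like the sketch below, run against an in-memory SQLite database from Python so it is easy to try. Only the freetagged_objects table name comes from Freetag. The simplified two-column layout (object_id, tag), the sample rows, and the exact SQL are assumptions for illustration, not the query that ships in v0.210.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Simplified, assumed layout: one row per (object, tag) pair.
conn.execute("CREATE TABLE freetagged_objects (object_id INTEGER, tag TEXT)")
conn.executemany(
    "INSERT INTO freetagged_objects VALUES (?, ?)",
    [
        (1, "concert"), (1, "music"),
        (2, "concert"), (2, "music"), (2, "jazz"),
        (3, "film"), (3, "music"),
    ],
)

# Self-join instead of "WHERE object_id IN (SELECT ...)":
# o1 picks the objects carrying the source tag, o2 picks their other tags.
SELF_JOIN = """
SELECT o2.tag, COUNT(*) AS strength
FROM freetagged_objects AS o1
JOIN freetagged_objects AS o2 ON o1.object_id = o2.object_id
WHERE o1.tag = ? AND o2.tag <> ?
GROUP BY o2.tag
ORDER BY strength DESC, o2.tag ASC
"""

related = conn.execute(SELF_JOIN, ("concert", "concert")).fetchall()
```

The join expresses the same "objects that have the source tag" filter as the subquery would, without needing subselect support. COUNT(*) is only a fair measure here because the sketch assumes each (object, tag) pair appears once.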

The concept of finding related tags by seeing what tags occur in relation with each other is not particularly new, but I think having that available in an open-source tagging platform is kinda neat.


» Freetag project page, with Download link.

2 thoughts on “Freetag v0.210 Released! Now with Related Tags ability!”

  1. Peter (Snippets) here. Interesting work! I actually don’t use the subselect version myself now, as it was found to be inefficient. If you split it up into two separate queries and then use your language’s logic to tie them together, it’s a lot faster anyway 🙂 Not sure why you’re supporting MySQL 3.x though.. that’s just scary 😉

  2. Thanks, Peter! MySQL 3.X is being used for the same reason that PHP4 is being used: it’s a stable config that a lot of people (RHEL 3 users) still get by default. I wanted Freetag to be usable by as many web devs as possible, so I built it for an old platform, plus I distributed it with a dual BSD / LGPL license to support that. Also, subselect query performance in MySQL still leaves much to be desired! If I run into scaling issues with this function, I’ll certainly try populating the original IN clause with results from another query.
