Folksonomies Grow Out of Personal Value

I saw a well-researched article on Tagsonomy containing a short quote that I’ve been wanting to expound upon for quite some time.

“Provided a tagging system is mainly for personal value, with social value as a second-order benefit (as del.icio.us is), then scale increases variability and reduces the constraints of consensus.” – http://tagsonomy.com/index.php/dynamic-growth-of-tag-clouds/

It’s my personal belief that tagging systems which provide personal mnemonic value are inherently more likely to succeed in scaling into a community tool of usefulness. I personally love del.icio.us because I bookmark things with my own personal view of the world, rendered through tags in my own words. Browsing through my tagspace is similar to browsing through my personal memory while having helpful navigation aids. Whenever I need to find a link, I can typically remember at least one tag that I used when I filed it away.

In my opinion, giving users some of the more advanced tagging features, such as boolean searching, full tag stemming, or tag categorization, does not promote growth in the early phases of a folksonomy, and hence is not as useful for creating one. That’s not to say that these features would not be useful on a full, lively, and robust tagged database. But there’s a reason you’re only recently seeing related tags and tag combos on del.icio.us.

For that initial phase, herd mentality is enough to get people into consensus about sharing certain tags, and people are more likely to tag at that first moment that they sense their data will have a long, useful shelf life on a particular database. That’s when you see tagging systems succeed – when people care enough about their metadata to really work hard on it. At that stage, when a user carries a long-term view of their data, it makes sense to tag as a mnemonic device. Once a critical mass has been reached, you start to see more socially-aware tag groups arise, such as the mattsrecumbentbike fad.

Do you know the way that little kids try to jog the memory of others? I grew up with three younger siblings, and I’d constantly hear social requests for memory retrieval (wow, I’m a nerd) that might go something like this:

“Do you remember when we saw that thing that went ‘boom?!’”

And, as an older sibling, you’d have to remind the younger kid that she wasn’t specific enough, and that you’d need more details to know what she was talking about. The younger kid might not fully remember the thing in question, but they remember at least one characteristic of it, even if outsiders have no way to discover the content from the tag context. The way that mnemonic tagging systems work is very similar.

The mnemonic tags that you use are absolutely going to be different than the ones that I use for a particular item. These personally-oriented mnemonic hooks into long-term data dumps of our lives are what form the basis of tagging systems in social software. It’s my opinion that any project that wishes to implement Freetag, or a similar tagging feature, should plan on taking a deep sip of the web 2.0 kool-aid, and figure out how to get their users to take a long term view of their personal identity and data stored within the project.

Freetag v0.220 Released! Tag combos, here we come!

It’s been a few weeks since an update of Freetag, and I’ve got a couple of really nice features built into this one. If you’re just hearing about this now, Freetag is an open-source tagging and folksonomy module that you can use to retrofit existing PHP/MySQL applications with tagging functionality like del.icio.us and flickr.

Powerful constructor options

The first, and really most important feature, is one that should have been in there from the beginning. You can now pass in an array of options to the constructor of Freetag, to customize the instance to your database parameters. It allows you to essentially build all of the instance-specific stuff into your own Freetag instance, instead of having to re-edit the class file every time you download a new version.

In addition to making upgrades work more smoothly, I’ve abstracted out the characters that the normalize_tag function uses to one of these constructor parameters. That means you can pass along your own preg_replace-compatible normalization filter. This allows you to effectively allow underscores, spaces, etc. in your normalized tags, and theoretically will also work for including high-ASCII from other charsets, like accents in Spanish and French.
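
To make the idea concrete, here is a rough sketch of the pattern in Python rather than Freetag's PHP. The class name, the option key normalize_filter, and the default pattern are all made up for illustration and are not Freetag's actual names. It only shows how constructor options can override defaults, and how a regex filter passed in that way changes what survives normalization.

```python
import re

# Hypothetical defaults; the option name is illustrative, not Freetag's.
DEFAULT_OPTIONS = {
    # Every character matching this pattern is stripped during normalization.
    "normalize_filter": r"[^a-z0-9]",
}


class Tagger:
    def __init__(self, options=None):
        # Caller-supplied options override the defaults, so the class file
        # itself never has to be edited for a particular installation.
        self.options = {**DEFAULT_OPTIONS, **(options or {})}

    def normalize_tag(self, raw_tag):
        # Lowercase first (an assumption of this sketch), then strip
        # everything the configured filter matches.
        return re.sub(self.options["normalize_filter"], "", raw_tag.lower())


# Default instance: alphanumeric-only normalized tags.
default_tagger = Tagger()

# Custom instance: the filter also lets underscores and spaces through.
loose_tagger = Tagger({"normalize_filter": r"[^a-z0-9_ ]"})
```

With these two instances, default_tagger.normalize_tag("San Francisco!") gives "sanfrancisco", while loose_tagger.normalize_tag("San_Francisco trip!") gives "san_francisco trip". Letting accented characters through would just be a matter of adding them to the character class.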

Tag Combos

The second major feature is one I’ve been sitting on for a while, because I wanted to get it working nicely with the user-specific tagging schema in Freetag. Tag combos in Freetag let you get all objects tagged with a combination of an arbitrary number of tags. This lets Freetag-based sites offer ‘drilldown’ browsing through tags – so you’ll be able to browse your tags for something like “la+dining”, which is super useful.

I owe Kellan some thanks for pointing out the link to Peter Cooper’s snippets sample of tag combos. I took his code as a starting point, and worked it into the Freetag schema, so that now it works well, plus it supports restricting to a particular user ID’s set. That should be very useful for browsing into especially complicated personally tagged datasets by highly metadata-prolific users.
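
For the curious, here is one common way to express a tag combo query, sketched in Python with an in-memory SQLite database. This is not Freetag's code: the table name tagged, its columns, and the function objects_with_all_tags are assumptions made up for the example. The idea is to keep only rows matching any of the requested tags, group by object, and accept the objects whose count of distinct matching tags equals the number of tags requested.

```python
import sqlite3


def objects_with_all_tags(conn, tags, tagger_id=None):
    """Return IDs of objects carrying every tag in `tags`.

    If `tagger_id` is given, only that user's tags are considered.
    """
    wanted = sorted(set(tags))
    if not wanted:
        return []
    placeholders = ", ".join("?" for _ in wanted)
    sql = f"SELECT object_id FROM tagged WHERE tag IN ({placeholders})"
    params = list(wanted)
    if tagger_id is not None:
        sql += " AND tagger_id = ?"
        params.append(tagger_id)
    # An object qualifies only if it matched all of the distinct tags.
    sql += " GROUP BY object_id HAVING COUNT(DISTINCT tag) = ?"
    sql += " ORDER BY object_id"
    params.append(len(wanted))
    return [row[0] for row in conn.execute(sql, params)]


# Hypothetical schema and sample data, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tagged (object_id INTEGER, tagger_id INTEGER, tag TEXT)")
conn.executemany(
    "INSERT INTO tagged VALUES (?, ?, ?)",
    [
        (1, 10, "la"), (1, 10, "dining"), (1, 20, "la"),
        (2, 10, "la"),
        (3, 20, "la"), (3, 20, "dining"),
        (4, 10, "dining"),
    ],
)

everyone = objects_with_all_tags(conn, ["la", "dining"])
only_user_10 = objects_with_all_tags(conn, ["la", "dining"], tagger_id=10)
```

Here everyone comes back as [1, 3] and only_user_10 as [1], since object 3 was tagged entirely by user 20. A GROUP BY with HAVING avoids subqueries, which matters if you need to stay compatible with older MySQL versions.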

What’s next?

I’ve received some requests for performance numbers, so I’ll be spending my next chunk of time prepping a stress testing suite that should give you a good idea of how Freetag will perform in your environment. If you’ve got some ideas about this, please post your suggestions to the Freetag mailing list!

Other than that, now that tag combos are part of Freetag, I’ll start brainstorming about a full boolean tag logic function that Leonard and I were tossing ideas back and forth about tonight. I can think of one or two ways to implement this with poor performance, but I’m really looking for a MySQL 3.X-compatible solution that will have reasonable performance. I may just include the ‘dumb’ versions in the next release, and save the optimizations for later. If you’ve got an opinion, chime in on the mailing list and let me know what you think.

Freetag v0.210 Released! Now with Related Tags ability!

I just released Freetag v0.210 – I’ve released mostly minor improvements / bugfixes so far, but this one also has a new function that is pretty cool.

Myles Grant contributed a similar_tags() function that accepts a tag, and spits back other related tags, ordered from the strongest relation to the weakest. It’s pretty awesome, and I’ve got it working on eatlunch.at and in beta code for Upcoming at the moment. It really makes browsing the tagged world of your database pretty compelling. Brian Del Vecchio recently explained the difficulty of stemming tags on the system level instead of extracting related tags from the correlations in the database. Well, here’s one way of doing it. :)

Think of it as an Amazon-like “Other users who bought this also bought…” kind of feature. Except this is more like, “Many other objects tagged with <Source Tag> were also tagged with <Related Tag>.”

Brian’s original posting was in response to the new blog Tagsonomy’s discussion of overcoming tagging interface challenges. The key problems it describes are identifying similar tags, and providing standardized, usable delimiters for user input. I’ll have more of a discussion of this in a separate post, as this one is already getting kinda long. But if you’re wondering about how these problems are addressed in Freetag, here’s a quick summary:

Freetag’s current delimiters are spaces, with quoted phrases accepted. All raw tag items are normalized into alphanumeric-only strings. Most API functions return the raw tag and normalized tag in each tag entry for display in either format. Related tags are now dynamically extrapolated from correlations via the similar_tags() function.
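
Here is a small Python sketch of that input handling, not Freetag's PHP. The function name parse_tags and the keys raw_tag and tag are made up for the example, and lowercasing during normalization is an assumption of the sketch. It splits on spaces, keeps double-quoted phrases together, and returns both forms for each entry.

```python
import re


def parse_tags(tag_string):
    """Split user input into tag entries with raw and normalized forms."""
    # Each token is either a double-quoted phrase or a run of non-space
    # characters; findall returns a (quoted, bare) pair for every token.
    tokens = re.findall(r'"([^"]+)"|(\S+)', tag_string)
    entries = []
    for quoted, bare in tokens:
        raw = quoted or bare
        # Alphanumeric-only normalization; lowercasing is an assumption here.
        normalized = re.sub(r"[^a-z0-9]", "", raw.lower())
        if normalized:  # skip tokens that normalize to nothing, like "!!!"
            entries.append({"raw_tag": raw, "tag": normalized})
    return entries


parsed = parse_tags('la "Thai Food" cheap-eats')
```

In this example parsed holds three entries: raw "la" with normalized "la", raw "Thai Food" with normalized "thaifood", and raw "cheap-eats" with normalized "cheapeats". Keeping the raw form around means you can display what the user typed while matching on the normalized form.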

I believe the original algorithm idea is from Snippets here: http://www.bigbold.com/snippets/posts/show/34. The query was modified from subquery syntax to a self-join on the freetagged_objects table in order to support MySQL 3.X, and it seems to work pretty fast on Upcoming’s database with the indexes from v0.202.
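
If you want to see the shape of a self-join like that, here is a rough sketch in Python against an in-memory SQLite database. It is not the query that ships in Freetag: the table tagged, its columns, and the function related_tags are assumptions invented for the example. The table is joined to itself on object ID, so every row carrying the source tag is paired with the other tags on the same object, and the tags are then ranked by how many objects they share with the source tag.

```python
import sqlite3


def related_tags(conn, tag, limit=10):
    """Return (tag, shared_object_count) pairs, strongest relation first."""
    sql = """
        SELECT other.tag, COUNT(DISTINCT other.object_id) AS strength
        FROM tagged AS source
        JOIN tagged AS other
          ON other.object_id = source.object_id
         AND other.tag <> source.tag
        WHERE source.tag = ?
        GROUP BY other.tag
        ORDER BY strength DESC, other.tag ASC
        LIMIT ?
    """
    return conn.execute(sql, (tag, limit)).fetchall()


# Hypothetical schema and sample data, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tagged (object_id INTEGER, tagger_id INTEGER, tag TEXT)")
conn.executemany(
    "INSERT INTO tagged VALUES (?, ?, ?)",
    [
        (1, 10, "la"), (1, 10, "dining"), (1, 10, "thai"),
        (2, 10, "la"), (2, 10, "dining"),
        (3, 20, "la"), (3, 20, "music"),
        (4, 20, "dining"), (4, 20, "thai"),
    ],
)

related_to_la = related_tags(conn, "la")
```

With this data related_to_la is [("dining", 2), ("music", 1), ("thai", 1)]: two objects tagged "la" are also tagged "dining", and one each for the others. Counting distinct objects keeps the score from inflating when several users put the same tag on the same object.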

The concept of finding related tags by seeing what tags occur in relation with each other is not particularly new, but I think having that available in an open-source tagging platform is kinda neat.

Enjoy!

» Freetag project page, with Download link.

New Freetag Implementation Guide

It seems like the #1 problem with releasing Freetag is that it’s confusing to understand exactly what Freetag does (or is meant to do).

I’ve taken a stab at writing up a more gentle introduction to Freetag, along with some sample code that I kinda wrote without testing (but it’s derived from production code, I promise). :)

It explains the basic concepts behind attaching Freetag to an existing schema, and implementing a basic tag and display page on your application’s detail pages. It also has a nice discussion of Raw Tags vs. Normalized Tags that you might not get elsewhere.

The new Freetag Implementation Guide

I’m looking forward to hearing your feedback on this doc, as I spent a few nights getting the formatting nice and pretty, and trying to keep the explanations simple. Feel free to leave your comments on this post. If you would like to see the sample app released in code format, put your voices together and I’ll see what the demand looks like. I’m just super busy lately, but it might be worthwhile if it can really help you implementers out.

Grodon out.

Freetag – an Open Source Tagging / Folksonomy module for PHP/MySQL applications

Tagging and folksonomy are pretty popular terms nowadays. In a nutshell, the concept is simple – allow users to describe content with descriptive words and phrases, or tags, that can then be used as an informal categorization system known as a folksonomy. The model of tagging has really caught on strong with the blogosphere zeitgeist (as well as tech investment), and for good reason: it works, so far.

Freetag is a module that implements a simple, fairly robust beginning of a tagging and folksonomy system. It works with PHP4 and MySQL 3.23 so far, and I imagine that if it gets popular, it should be easy to port to additional databases and/or languages.

You point it at your users, then you point it at your objects, and use the API to allow your users to tag your objects.

Eventually, it will be extended with a RESTful interface, so that you can plop a gateway PHP script somewhere and get instant access and inter-operation between any applications that use Freetag.

I’ve wanted to do this since my star-studded lunch group started thinking about integrating tagging between sites, and I started off on my crazy plan of creating a lightweight way to decide where to go to lunch. I felt that any site I was going to put development effort into should have some sort of tagging / folksonomy support, so I happily generalized the Freetag PHP class so that I could use it in other projects later.

Freetag is now in use in at least one high-profile project – Upcoming.org, where I integrated Freetag within about an evening and a half. Since then, there have been some big improvements to the class, and you can see it in action behind the scenes over there.

For the technical details, head over to the Freetag project page. Or browse through the Freetag API Documentation.

It’s my first ever big contribution to the world of open source, as Freetag is released with a dual BSD and LGPL license. I hope you get some mileage out of it!

Thanks go out to all the people who I bounced ideas off of – Andy Baio, Phil Fibiger, Greg Knauss, Leonard Lin, Christian Newton, the users of eatlunch.at, as well as John Lim, on whose ADOdb for PHP layer I always rely.

Update! I had to make a small change to the library to fix a problem with quoted tags that someone found on Upcoming. I’m sure there will be many more problems found, now that the code is out there, but you should grab the tarball again to get version 0.201, because this is a really annoying bug. Sorry about that! – Sunday 4/17/2005, 10:00 AM.