301 Redirects: The Horror That Cannot Be Uncached

If you’re not a web developer, please ignore this post, as it won’t make any sense to you. :)

It never occurred to me that on-domain 301 redirects might be a bad idea for sending browsers and search indexers to new content URLs, but here’s a use case where they become a big problem: when the person in charge changes their mind and wants to revert anything that you modified with a 301. I felt like an idiot for getting stuck in this technical snafu, and the workaround feels horrible, so let me explain.

Let’s take a simple example. Say you’ve got a page with information about widgets, and your task is to redesign the widget page. You see that it’s called content_123.html. Obviously, this isn’t a good URL, so you would like to guide clients correctly to a new URL, like “/products/widgets”. You heard that 301 redirects pass along PageRank after a little delay, and it’s better than serving a 404, right?
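For reference, the kind of rule I mean looks roughly like this. It’s only a sketch: it assumes mod_rewrite is enabled and that the .htaccess file sits in the document root, and the paths are just the ones from the widget example.

```apache
# .htaccess in the document root (assumed location)
RewriteEngine On

# Permanently redirect the legacy widget page to the new URL.
# In .htaccess context the pattern is matched without the leading slash.
RewriteRule ^content_123\.html$ /products/widgets [R=301,L]
```

The R=301 flag is the part that tells clients the move is permanent, and that is exactly the part that comes back to bite you.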

Well, here’s where it gets hairy. A few weeks later, the person in charge changes their mind and instructs you to revert the site to its original form. Easy, right? Just remove the 301 redirect from .htaccess, restore the old files, and you should be good to go, right?

Wrong. 301 redirects are cached by certain browsers, such as Firefox 3.5+ and Chrome. That means that a large set of users who visited your site, along with search engines, cached that mapping of content_123.html to “/products/widgets”. Even though your webserver is no longer instructing clients about any 301 redirect, every browser that saw it (and likely every search engine) has now saved your 301 redirect. You can’t create the reverse mapping either, because browsers and search engines are usually smart enough to avoid infinite loops, so they’ll just ignore the new instructions. God forbid you redirected anything to “/”, too.

If you’re lucky, this is an intranet app and you can instruct all users to manually clear their browser caches. If it’s public and this was a big redesign, your users may never even be able to get to a valid page if you just dump the old files on the webserver and they’re all redirected via 301s.

The big issue is this: there is no way to tell a browser to clear out or undo a 301 redirect. You have to wait for a browser user to clear the cache, or have to wait for it to expire. This is totally unacceptable from a user’s point of view, so here’s a super-ugly way to work around the problem.

  1. Put the legacy content back.
  2. Eliminate all 301 redirects from your .htaccess / mod_rewrite config. Might as well stop causing damage first.
  3. Rename the legacy file (perhaps append something standard), like content_123_orig.html.
  4. Create new mod_rewrite rules to do 302 redirects from the original legacy URL to the renamed URL. This will send all existing links to the legacy site on to the renamed legacy pages, for any browsers without the cached 301 redirects, such as new visitors or users who clear caches.
  5. Create more mod_rewrite rules that do 302 redirects from the 301 redirect targets (the “new” URLs that are being moved away from) to the renamed legacy URLs. This will redirect clients that were using the new site, and will also serve the correct page for clients with a cached 301 redirect. For example, browser A cached the 301 redirect, so when you type /content_123.html in its address bar, it instead tries to load “/products/widgets”. Because of the new 302 rule, the server will report that “/products/widgets” has been moved temporarily to “/content_123_orig.html”, and the user will load the legacy page contents.
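Steps 4 and 5 might look something like the following for the widget page. Treat it as a sketch under assumptions, not a drop-in config: it assumes mod_rewrite is enabled, that the .htaccess file sits in the document root, that the old 301 rule has already been deleted (step 2), and that the legacy file was renamed to content_123_orig.html (the name is just an illustration).

```apache
# .htaccess in the document root (assumed location)
RewriteEngine On

# Step 4: clients WITHOUT a cached 301 still request the original legacy URL.
# Send them temporarily to the renamed legacy file.
RewriteRule ^content_123\.html$ /content_123_orig.html [R=302,L]

# Step 5: clients WITH a cached 301 (and anyone who bookmarked the new site)
# request the abandoned "new" URL. Send them to the same renamed legacy file.
RewriteRule ^products/widgets$ /content_123_orig.html [R=302,L]
```

Neither pattern matches content_123_orig.html itself, so the two rules can’t bounce a client back and forth. You’d need one such pair of rules for every URL that was originally 301’d.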

This is obviously a really horrible workaround, but changing your mind is something that normal humans do in the real world, and irreversible changes deserve more attention. It’s embarrassing to post a “solution” like this, but if you run into this problem I’d rather save you some time than save me some face.

If you’re considering going with 301 redirects in a move to a new URL structure, be aware that you’re moving down a one-way street. It makes HTTP 301 seem awfully out of place in a spec that is otherwise quite comprehensive about downstream caching of content. Caching redirects seems like a logical behavior when the spec says that the change is permanent, and I suspect the main reason that this hasn’t caused more visible problems yet is that the “permanent” part has been ignored by browsers until recently.

Going back

In case you haven’t heard, I’ll be starting to work remotely from the Los Angeles area in only another week. I’ll be continuing to work for Yahoo!, as there are still plenty of projects that I want to work on. Unfortunately for me, this happens to be an extraordinarily busy time. However, I’ve got to follow my priorities, and it’s time for me to move closer to my long-distance girlfriend and try out close-distance for a while. :) It’s going to be hard to stay on top of everything, but knowing myself, it will be a good challenge and probably a beneficial change of scenery.

I’ll be back pretty frequently, so I hope to see all my Bay Area people all the time. Zankou Chicken, here I come!

Yahoo Hack Day is on!

Yesterday was the beginning of Yahoo!’s first open hack day. There are so many good things about this that they’re hard to describe. The essence of the invitation is that we’ll invite hackers to our campus for 24 hours, feed them, show them a good time, and teach them about the services we offer to programmers, in exchange for them working on a project that involves at least one of those services. I do believe that many of the higher-level folks at Yahoo see it as sort of an experiment, to see what good comes out of the process of inviting hackers from all over the world to come and play with Yahoo APIs.

This could be the first time a lot of folks take the time to play with a lot of our APIs, which basically means that the stuff that we do better than our competitors may have a chance to shine in the spotlight, in front of the audience that matters most. Much of the mindshare that Google has captured through applications like the GMaps API, etc. has been held through sheer convenience. Once a coder builds an application on top of a specific interface, switching to another API requires some real motivation. So, open hack day could result in much more experience using Y! APIs for the hackers that build stuff, and that is something that has long-lasting effects for web-as-platform.

Beyond that, it also shows that Y! is an active patron and benefactor of hack culture. Open Hack Day couldn’t exist if it weren’t for the unprecedented coordination and hard work between all the groups within Yahoo (all corporate folks, people!) that made it happen. Hack culture often begins with a single idea being so strong that a small group of people can build up a tidal wave of support and interest in a short amount of time.

Oh, and did I mention that Yahoo! had Beck over as a secret musical guest? It was an unbelievably good concert, and it made me really happy to see this blog post from one of the puppeteers. Apparently he got a little confused over the purpose of the hack day, but I was stoked to see that he thought it was one of their best shows ever.

A few things that I found myself thinking: for one, I can’t really see any other company than Yahoo! ever doing something like this. Sure, I hear about firms like Genentech putting on holiday parties with huge summer festival lineups, but that’s all for its employees. The openness required for a company to release huge amounts of data and functionality via its APIs is rarely mirrored in a corporate culture that also invites non-employees over to hang out, code, and dream for a weekend with them.

It can sometimes be tough to work in a corporate environment. But it’s things like Open Hack Day that make me glad that environment is at Yahoo.

Upcoming.org API at Yahoo! Open Hack Day

If you’re living in the Bay Area, and you’re interested in web development, you really owe it to yourself to make it to my employer’s Open Hack Day. In the afternoon, I’ll be giving a brief talk about how to build out quick hacks using the Upcoming.org Events API. You can check out the full schedule of Friday presentations, and if you decide to come, send a request to attend.

I’ll be around for most of Friday evening to help out with any Upcoming.org hacks, and also probably trying to play some Guitar Hero from time to time. :)

Dimensional Fund Advisors

So, I started my job today at Dimensional Fund Advisors. The first day was both overwhelming and exciting, and I had a nice coffee break tonight thinking about what I intend to accomplish while I’m there.

My last position at Sequoia Broadband gave me the space to fully develop my work ethic and work a great deal on practical solution building. Not ‘solution’ in the BigTime Consulting way of speech; instead, ‘solution’ meaning engineering working systems that actually do what the client needs them to do. I explored and got heavily involved in the core technology at Sequoia, and lent contributions to the software and process based upon direct experience using and operating it. It was a valuable, mutually beneficial experience.

At Dimensional, I intend to take the development of my personal services a step beyond solid work ethic and the ‘practical solutions’ attitude, and truly develop a stateable career philosophy of personal service. As a standalone phrase, it sounds like a freshly pinched loaf, but let me elaborate.

A strong work ethic does not necessarily indicate a strong sense of direction. As a Horse in the Chinese zodiac, I’m supposedly inclined towards hard work, inner strength, and dedication. That’s part of my nature anyway. What’s not so easy is for me to make business, career, and even life decisions based upon a defined core philosophy. By identifying and stating the axioms of my own personal philosophy in words, I aim to begin the process of tackling this problem and thereby increase the value of my personal services.

Rhetoric aside, it’s an aggressive move for me. I believe that the value of being able to honestly state my own philosophy and then build my career around its principles is tremendous. The opportunity to accelerate that process is what drew me strongly to Dimensional. The rhetoric leads me to believe that its founders began the company from a philosophical premise, and built the company and its products around that philosophy. This is the effect I desire to learn and replicate for myself.

I find that it’s better to proactively set the expectations on oneself and then rise to meet them, than to wait for someone else to “set the bar” for oneself based on some myth of an industry average. Being of an entrepreneurial mindset, it’s the closest mental model that I can build to the thrill and potential of starting my own company.

Although I most certainly won’t discuss my new job directly, I hope to use this blog as a professional journal of the above-mentioned process. It goes without saying that all opinions on this website are my own only, and do not reflect in any way the opinions of any of my employers, current or past (note to self: it might be time for a disclaimer footer). It would be interesting to see how it could be a tool to accumulate and refine my philosophy. I’ve tried journals before, but you don’t get a search engine on a steno pad.