Small change to URL processing

As many of our members know, our network uses edge technology (reverse proxies) to deliver static content at high speeds.

Until now, we have always ensured that URLs containing ? or .php or .cgi were never processed by the reverse proxies. To provide more robust support for very active PHP-driven sites that need to integrate with caching functionality, we have elected to stop doing this. By default, the output of PHP and CGI scripts will still not be considered cache-eligible, but you now have the option to override this by sending an Expires: or Cache-Control: header.

In the bad old days, the list of “don’t ever cache URLs that look like this” was very necessary to deal with badly broken browsers and web servers and scripting languages that were often written without so much as a passing glance at the RFCs.

Things have improved a lot since then, so we’re dropping this seven-year-old restriction to let people make the most of the Internet as it is now, not just as it was then. The most visible effect of this change is that it will eliminate the case where one URL might get cached and another might not, even though the same process generates the same output for both URLs.

In other words, this isn’t enabling anything that wasn’t previously possible; you just used to have to use mod_rewrite to conceal the fact that you were running a script. You can still do that, of course; you just don’t have to anymore.

We do strongly recommend the generation of caching headers in your PHP scripts where appropriate. Integration with our network edge can make an enormous difference in the performance of your site. (The edge servers exist because they are fast, fast, fast at serving static content.) This is critically important for our members with large, high-traffic sites and anyone who is willing to tinker to get the most out of their site.
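
As a rough illustration of what generating caching headers in a PHP script can look like, here is a minimal sketch. The helper name cache_headers() and the five-minute lifetime are made up for this example, and exactly how the edge servers treat each directive is an assumption rather than something spelled out here; the point is only that the script sends an Expires: and a Cache-Control: header before producing any output.

```php
<?php
// Hypothetical helper (not part of PHP or of our platform): builds the two
// caching headers for a response that may be reused for $maxAgeSeconds.
// $now is injectable so the result can be checked against a fixed time.
function cache_headers(int $maxAgeSeconds, ?int $now = null): array
{
    $now = $now ?? time();
    return [
        'Cache-Control: public, max-age=' . $maxAgeSeconds,
        // HTTP dates are always expressed in GMT.
        'Expires: ' . gmdate('D, d M Y H:i:s', $now + $maxAgeSeconds) . ' GMT',
    ];
}

// Headers must be sent before the script produces any output.
foreach (cache_headers(300) as $line) {
    header($line);
}

echo "A page that is safe to reuse for five minutes\n";
```

Pick a lifetime that matches how stale the page is allowed to get; output that is personalized or depends on cookies generally should not be marked public.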

As with any new and bigger gun, this change does provide an additional opportunity for people to shoot themselves in the foot. If you have manually added an ExpiresDefault setting in your .htaccess file, that setting now applies to scripts. (Well, it always has, but it was previously ignored by us, if not the browser at the far end.) If you use this setting and you haven’t overridden it and your site depends on GET requests that have side effects (which is forbidden by the HTTP standard), undesirable behavior may result.
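
If you have an ExpiresDefault in place and a particular script must never be cached, one way to override it is to have the script send its own headers. This is a sketch under assumptions: no_cache_headers() is a hypothetical helper name, and it relies on Apache's mod_expires leaving an Expires: header alone when the script has already set one, which is the documented behavior but worth verifying on your own site.

```php
<?php
// Hypothetical helper: headers that mark a response as not cacheable.
function no_cache_headers(): array
{
    return [
        'Cache-Control: no-store, no-cache, must-revalidate, max-age=0',
        // A date in the past tells older caches the response is already stale.
        'Expires: Thu, 01 Jan 1970 00:00:00 GMT',
    ];
}

// Send these before any output from a script whose result must not be reused.
foreach (no_cache_headers() as $line) {
    header($line);
}

echo "This response should be generated fresh on every request\n";
```

The sturdier fix for GET requests with side effects is still to move the side effect behind a POST; headers like these only keep the problem from getting worse.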

This change tested very well, but if we find any particularly common problems cropping up related to it, we will post a followup blog entry about how to deal with them.

1 Comment

  1. Thanks for the heads up! I always wondered how my web host was determining what to cache and what not, so it’s nice that you lay it out so plainly.

    Comment by TOGoS — April 4, 2009 #
