Forwarding sites & URL rewriting

Note: the content of this blog entry is deprecated. Please search our member FAQ for “canonical” for more information about replacement techniques.

We recently got a support inquiry about alternate methods of forwarding visitors from one URL to another.

We have a FAQ entry about using “decoy” sites to forward alternate URLs. This is one method, and probably the easiest, but there are many others that can be useful in different circumstances.

What I’d like to do here is talk about the reasons why we recommend this one and discuss some of the alternatives and when they might be more useful.

The premise of URL forwarding is that a web site should have a single canonical name. Other names might exist, but they’re usually alternate names, or variant spellings, or similar. These names might be common typos you want to catch, easier to type, or whatever. But the actual site accesses should use the canonical name, because that’s best for search engine rankings and long-term recognition of your site.

By far the most common case is when you have a domain name, such as example.com, and you want to use http://www.example.com/ and http://example.com/ to access the site, so that’s the case I’ll focus on here. I’ll also assume that www.example.com is the canonical name of choice. (The temptation to use the bare domain can be very high for some people, but see this FAQ entry for a discussion of reasons why we recommend against it.)

Our FAQ entry suggests creating a forwarding site, assigning example.com to it as an alias, and creating a one-line .htaccess file in its htdocs directory:

RedirectPermanent / http://www.example.com/

We generally recommend creating an alternate site because this is the most efficient approach. With this method, people who enter one of the “wrong” names hit the decoy site and get redirected to the canonical name right up front, and there’s zero extra overhead on any subsequent requests. It’s also really easy to do, and it’s impossible to screw up your working site while setting it up.

Second, we recommend the RedirectPermanent Apache directive for two reasons. The “permanent” redirect code (301) helps keep search engines and the like from continuously trying to index the “wrong” name instead of the right one. Also, the Apache Redirect family works for subordinate URLs. One of the most common redirection mistakes is a setup that will correctly redirect http://example.com/ to http://www.example.com/, but won’t redirect http://example.com/something.html to http://www.example.com/something.html. RedirectPermanent is a one-line solution that handles such cases flawlessly.
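For reference, RedirectPermanent is a shorthand provided by Apache's mod_alias; the same forward can also be written with the general Redirect directive and an explicit status. A sketch of the equivalent spellings, using the same example.com names as above:

```apache
# Any one of these lines is sufficient; they behave the same way.
RedirectPermanent / http://www.example.com/
# Redirect permanent / http://www.example.com/
# Redirect 301 / http://www.example.com/

# Because the URL path given is "/", everything after it is carried over:
#   /something.html  ->  http://www.example.com/something.html
```

Which spelling to use is a matter of taste; the one-line RedirectPermanent form is the one our FAQ uses.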

One thing that’s come up a couple of times is that once you have a forward for one alternate name, you can add as many other alternate names for the same site as you want. This is easily done by just adding the alternate aliases to the forwarding site.

Now, let’s take a look at an entirely different approach. The ultimate tool for URL rewriting is, of course, the Apache rewrite module, mod_rewrite. It’s possible to use it to accomplish the exact same thing, but without the use of a second site. To do so, you just create an .htaccess file in the site root containing the following:

RewriteEngine on
RewriteCond %{HTTP_HOST} !^www\.example\.com$ [NC]
RewriteRule ^.*$ http://www.example.com%{REQUEST_URI} [R=301,L]

Here’s how this works, in a nutshell:

1) “RewriteEngine on” is needed to enable mod_rewrite.

2) RewriteCond applies a condition under which rewriting will occur. In this case, we want to rewrite if the %{HTTP_HOST} (the content of the HTTP Host: header, which is where the requested domain name lives) does not (the !) match www.example.com from beginning to end (the ^ and $, respectively), without regard to uppercase or lowercase (the [NC]). So this will skip www.example.com and www.ExAmPlE.com but not example.com or any other alias you might want to add.

3) This RewriteRule applies to the whole URL, no matter what it is (the ^.*$), changes it to the correct name (the http://www.example.com) while keeping the same URI path (the %{REQUEST_URI}), sends it back to the client as a 301 redirect (the R=301), and skips any other rewrite rules (the L).

The big potential advantage of this method is that it effectively suppresses the use of the default site names we provide (example.nfshost.com), if that’s important to you. It also doesn’t require the existence of an alternate site, if you find that distasteful for some reason. The disadvantage is the extra overhead, which applies to absolutely every request, whether it’s using a URL that needs rewriting or not. (Also, as I personally don’t care for mod_rewrite, I consider the use of it a fundamental drawback, but that is more bias on my part than a viable objection.)

The specific question we got was about creating a rewrite site that can handle multiple independent destinations. In this situation, you have a large number of names to rewrite to as well as from. In other words, not just redirecting http://example.com/ to http://www.example.com/ but also redirecting http://other.com/ to http://www.other.com/, all from a single site. RedirectPermanent isn’t smart enough to handle that, but mod_rewrite is.

For this approach, create one “generic” forwarding site from the “sites” tab, and create an .htaccess file in that site’s htdocs directory that looks like this:

RewriteEngine on
RewriteCond %{HTTP_HOST} ^example\.com$ [NC]
RewriteRule ^.*$ http://www.example.com%{REQUEST_URI} [R=301,L]
RewriteCond %{HTTP_HOST} ^other\.com$ [NC]
RewriteRule ^.*$ http://www.other.com%{REQUEST_URI} [R=301,L]

You can add more sites (to and from) by repeating the RewriteCond and RewriteRule lines. The only real difference between this and the previous example is that we ditched the ! in the RewriteCond lines, meaning that instead of applying to names that don’t match, we apply to names that do.
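If several alternate names should land on the same destination within this multi-destination file, one way to avoid repeating the RewriteRule is to chain conditions with the [OR] flag. The sketch below assumes a hypothetical extra alias, example.net, that should forward to www.example.com alongside example.com; that name is purely illustrative.

```apache
RewriteEngine on

# example.com OR example.net (hypothetical alias) -> www.example.com
RewriteCond %{HTTP_HOST} ^example\.com$ [NC,OR]
RewriteCond %{HTTP_HOST} ^example\.net$ [NC]
RewriteRule ^.*$ http://www.example.com%{REQUEST_URI} [R=301,L]

# other.com -> www.other.com, as before
RewriteCond %{HTTP_HOST} ^other\.com$ [NC]
RewriteRule ^.*$ http://www.other.com%{REQUEST_URI} [R=301,L]
```

RewriteCond lines apply only to the RewriteRule that immediately follows them, which is why each destination gets its own group of conditions plus a rule. Without [OR], stacked conditions must all be true at once, which can never happen for two different host names.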

So that’s three different approaches to the same problem. All work, but they are useful in different circumstances. We can’t document everything in our FAQ, but I’m hoping that one of the uses of our new blog will be the opportunity to explore alternatives like this that might be helpful to our members.

2 Comments


  1. A third rather useful option is to use a regex in the mod_rewrite that takes any non-www host name and rewrites it with the www (or vice-versa, depending on personal preferences). Then point all of your non-www aliases at that site, and it will automatically forward to the www version. This saves you having 20 different redirect sites for 20 different real sites, and maintains the advantage of invoking mod_rewrite only when required.

    A suitable expression would be:
    RewriteEngine on
    RewriteCond %{HTTP_HOST} !^www\.(.+\..+)$ [NC]
    RewriteRule ^.*$ http://www.%1%{REQUEST_URI} [R=301,L]

    Given that this can be applied to all of your sites, one wonders if NFSN could make two sites that we could all CNAME to, one that adds “www” if required, and one that removes “www” if present, to match the most common of these cases. Then the really simple thing to suggest people do is point the alias that they want changing at the appropriate NFSN-wide site.

    Comment by mojo, November 30, 2006

  2. I don’t think that your rewrite example will work. You only get a populated %1 if the pattern matches, but because of the ! the condition only succeeds when the host doesn’t match www.(something), so (something) is never captured when the rule actually fires.

    Also, CNAMEs won’t work for this, because you cannot create a CNAME on a bare domain name: a CNAME must be the only record for a given name. Bare domain names always have an SOA record, which rules out the use of CNAME.

    It’s probably possible to do what you describe entirely in mod_rewrite, but there’s some kind of special psychological term for people who hurt themselves on purpose.

    So I did it using a combo of mod_rewrite and PHP:

    RewriteEngine on
    RewriteRule ^.*$ /index.php

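    <?php
    // Prepend "www." to the requested host and keep the original path.
    $strURL = "http://www.{$_SERVER['HTTP_HOST']}{$_SERVER['REQUEST_URI']}";
    // Send a permanent (301) redirect rather than PHP's default 302.
    header("Location: {$strURL}", true, 301);
    // Escape the URL before echoing it into HTML, since the host is client-supplied.
    $strSafe = htmlspecialchars($strURL);
    print "Redirecting to <a href=\"{$strSafe}\">{$strSafe}</a>";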

    You can use this functionality yourself, but I also liked the idea that we could just provide it since it’s so common. So, we’ve added an “Add Bare Domain Forward” option to our DNS management panel. It’ll appear if the bare domain name (e.g. example.com) is not in use and the www version (e.g. http://www.example.com) is.

    I don’t see us doing the reverse, as it would eat another IP address and it’s far less common. But it’s still easy to do on your own, using any of the techniques described here.

    Thanks for the feedback!

    Comment by jdw, December 1, 2006
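As a footnote to the exchange above: a pure mod_rewrite version of the generic “add www” forward can be sketched without any capture group, by reusing %{HTTP_HOST} directly in the substitution. This is an illustration only, under the assumption that every alias pointed at the forwarding site is a bare name whose www counterpart exists; it is not the configuration NFSN deployed.

```apache
RewriteEngine on
# Fire only when the requested host does not already start with "www."
RewriteCond %{HTTP_HOST} !^www\. [NC]
# No capture needed: put the requested host back behind "www." and keep the path
RewriteRule ^ http://www.%{HTTP_HOST}%{REQUEST_URI} [R=301,L]
```

Because the negated condition never has to populate %1, this sidesteps the problem described in the second comment. Requests that arrive with no Host header at all are an edge case this sketch does not handle.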
