Comments on: A PHP Include Exploit Explained
https://blog.nearlyfreespeech.net/2009/11/05/a-php-include-exploit-explained/
A blog from the staff at NearlyFreeSpeech.NET.

By: James (Sat, 28 Nov 2009 09:29:44 +0000) https://blog.nearlyfreespeech.net/2009/11/05/a-php-include-exploit-explained/#comment-9780

There are at least two reasons in my mind *for some people* not to use an existing CMS. The first is that a lot of the existing ones are fairly monolithic and therefore on NFSN will take up more disk space and so forth, increasing costs. The second is that some of us write our own because that’s what we want to do – whether as a learning experience or just because we can.

As demonstrated in this blog post, though, if any of us intend to roll our own we really need to pay attention to things like this!

By: Douglas Muth (Tue, 17 Nov 2009 20:49:47 +0000) https://blog.nearlyfreespeech.net/2009/11/05/a-php-include-exploit-explained/#comment-9723

I’ve been a full-time PHP programmer for the last 10 years or so. What it comes down to is that writing “infrastructure” is hard. There are lots of ways to screw it up and have Bad Things happen. Not to mention that solving the same problem over and over for every new project gets old quickly.

This is one of the reasons I advocate using frameworks or CMSes for websites. Let an existing product do the heavy lifting so that you can concentrate on writing your business logic. There are literally dozens of free CMS packages out there, and no reason not to start using one right away. (my personal favorite is Drupal, YMMV)

By: Don Delp (Wed, 11 Nov 2009 17:34:49 +0000) https://blog.nearlyfreespeech.net/2009/11/05/a-php-include-exploit-explained/#comment-9700

Jdw, thank you for posting this.

I used to work in web hosting and it breaks my heart to see exploited code like this everywhere.

It got to the point that our sysadmins modified the servers so that any mail message sent from PHP carried the URL of the script that sent it, making abusive scripts easier to find and disable. The added header looked similar to:
X-scripted-mail: web42.example.com/~nesman/htdocs/includes/forms.php
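
Something similar can be approximated in application code; the following is only a rough sketch (the wrapper name and header name are made up), and later PHP versions also added a built-in mail.add_x_header setting that attaches an X-PHP-Originating-Script header automatically.

<?php
// Rough sketch: wrap mail() so every outgoing message carries the path of
// the script that sent it. The function and header names are illustrative.
function tagged_mail($to, $subject, $message, $headers = '')
{
    $script  = $_SERVER['SCRIPT_FILENAME']; // the script handling this request
    $headers = trim($headers . "\r\nX-Scripted-Mail: {$script}");
    return mail($to, $subject, $message, $headers);
}
?>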

I’ve seen so much of this that it makes me a very careful coder but I always worry that my best efforts won’t be enough. After all, the people that get exploited thought they had done enough as well.

By: Will (Tue, 10 Nov 2009 16:36:56 +0000) https://blog.nearlyfreespeech.net/2009/11/05/a-php-include-exploit-explained/#comment-9697

While the principle of validation over “fixing” is certainly the most security-rigorous solution, insisting on it ignores the previous acknowledgement that the exploit as a whole is something that small-site owners should be equally concerned about fixing. If being probed for the exploit is a web inevitability then you probably don’t want to bother with logs telling you that it’s happened. Early home desktop firewalls used to alert you every time someone on the internet probed you for an exploit you were protected against, but we quickly learned that most users have no use for this information.

Reading this blog post has drawn my attention to one old site that “validates” using:
if(file_exists('./include/' . $_GET['page'] . '.php'))
Now it turns out that this is vulnerable in the sense that someone can traverse to the parent directory and try to execute index.php. But since that is harmless, unless there’s a counterexample that allows remote inclusion even when the include path starts with './', there’s not much incentive for me to fix it.

Again, the purpose of our example is to be a good example of good security to people who may not have seen one before. If you look at it with the attitude of “security — why bother?” you probably won’t get much out of it. It only takes a couple of extra lines written one time to get it right.

With respect to your example, the exploit comes not from remote code, but local uploads. Suppose you have a forum website that allows users to upload avatars. Someone uploads their PHP code as myavatar.gif and then calls your script with &page=../../forum/images/avatars/myavatar.gif and 10,000 people you’ve never heard of get phishing mails queued from your site.
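
If rewriting the old site isn’t worth it, one way to close that hole is sketched below. It assumes the './include/' layout from the snippet above plus a made-up 'home' default, and it is containment checking rather than the validate-and-report approach the post recommends: canonicalize the requested path and make sure it still points inside the include directory.

<?php
// Sketch: canonicalize, then check containment. realpath() resolves any
// ../ sequences and returns false if the target doesn't exist, so a
// traversal out to an uploaded avatar resolves outside $dir and is refused.
$dir  = realpath('./include');
$page = isset($_GET['page']) ? $_GET['page'] : 'home';
$path = realpath('./include/' . $page . '.php');

if ($dir !== false && $path !== false && strpos($path, $dir . '/') === 0) {
    include $path;
} else {
    header('HTTP/1.0 404 Not Found');
    echo 'No such page.';
}
?>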

As far as incentive, you are absolutely right. The people who make this mistake aren’t the ones that suffer for it; we are. If “it’s the responsible thing to do” isn’t a good enough reason, I expect we will have to find a way to start charging for the hours we spend each week cleaning up the resulting messes. -jdw

By: dfl (Sun, 08 Nov 2009 22:51:40 +0000) https://blog.nearlyfreespeech.net/2009/11/05/a-php-include-exploit-explained/#comment-9685

A couple comments about this:

I always validate user input to include() [although I don’t have anything hosted on NFSN that needs that right now], but typically my validation consists of looking for slashes rather than your more proactive solution; since you can’t do malicious URL fetches without slashes, what benefit is there to rejecting anything that contains characters you might not want, at the risk of blocking characters (underscores, dashes, Unicode) that you might need later?

Also, if you’re just using a PHP script to put headers and footers onto otherwise static HTML, it might be more worthwhile to do the HTML generation statically, with (say) a Python script that takes in a folder of flat files and spits out static HTML files. (This is what I do.) Not only does it entirely remove the risk of making a code goof and ending up with an exploitable site, but it also saves a few cents by allowing you to deploy your site as static.

The include($_GET[“…”]) approach is based on mapping a portion of a URI to a portion of a filename, so you want to make sure you limit yourself to a subset of characters that work well in both places. Sure, if you want to allow other characters, go ahead. Underscore and dash probably won’t cause havoc. Unicode certainly might.
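
If dashes really are needed, the validation step only has to widen its character class slightly; a minimal sketch, still rejecting rather than repairing anything outside the set:

<?php
// Sketch: permit letters, digits, underscore and dash in the page name.
// Anything else is still treated as an error rather than stripped out.
if (!isset($_GET['page']) || !preg_match('/^[A-Za-z0-9_-]+$/', $_GET['page'])) {
    // ...log the attempt and serve an error page...
}
?>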

Static generation is always a great approach, but the include($_GET[“…”]) approach only has any appeal at all because it’s so easy. Static generation tends not to be, hence doesn’t appeal to the same crowd. -jdw
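
For anyone who wants to try the static route anyway, here is a very rough sketch of such a build step in PHP rather than Python; every file and directory name below is made up, and a real generator would also handle titles, navigation, and so on.

<?php
// Hypothetical one-shot build script: wrap each flat content file in a
// shared header and footer and write out plain .html files that can be
// deployed as a fully static site.
$header = file_get_contents('templates/header.html');
$footer = file_get_contents('templates/footer.html');

foreach (glob('content/*.html') as $source) {
    $name = basename($source);
    file_put_contents("output/{$name}", $header . file_get_contents($source) . $footer);
}
?>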

By: Paul Grill (Sun, 08 Nov 2009 16:21:01 +0000) https://blog.nearlyfreespeech.net/2009/11/05/a-php-include-exploit-explained/#comment-9683

I usually have an array like this:

<?php
## PUT YOUR SITE HEADER HERE

	$pages = array(
		'home' => 'home.php',
		'about' => 'about.php',
		'events' => array(
			'announcements.php',
			'calendar.php'
		),
		'removeCookies' => 'functionA'
	);

if(empty($_GET['page']))
	$_GET['page'] = 'home';

if(isset($pages[$_GET['page']])) {
	$actions = $pages[$_GET['page']];
	if(!is_array($actions))
		$actions = array($actions);

	foreach($actions as $action) {
		if(is_callable($action))
			call_user_func($action);
		else
			@include("{$_SERVER['NFSN_SITE_ROOT']}/protected/pages/{$action}"); // entries in $pages already carry the .php extension
	}
}else{
	echo 'The command you requested is not allowed.';
}

## PUT YOUR SITE FOOTER HERE
?>

This allows you to keep an “instant mini-CMS” while circumventing the include exploit. Moreover, you are way more flexible — when you take a closer look at the $pages array you’ll see that $pages[‘events’] has two files associated with it and $pages[‘removeCookies’] refers to a function (which will be called if it exists).
What I’m doing is following a strict whitelisting policy: the user only gets to do what has been whitelisted. This way you don’t have to worry about attackers breaking out of folders or injecting malicious scripts.
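
For illustration, assuming the script above is saved as index.php, requests map onto the whitelist like this:

<?php
// index.php?page=about          -> includes protected/pages/about.php
// index.php?page=events         -> includes announcements.php, then calendar.php
// index.php?page=removeCookies  -> calls functionA(), if it exists
// index.php?page=../../anything -> not a key in $pages, so it is refused
?>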

I’m a big fan of the whitelist approach, and we almost mentioned it, but it does take extra effort on an ongoing basis since you have to touch that file every time you add a page. NearlyFreeSpeech uses a similar approach internally, and it’s well worth it, but the extra work still makes it a “no sale” for a lot of people. -jdw

By: Brad (Sat, 07 Nov 2009 01:02:53 +0000) https://blog.nearlyfreespeech.net/2009/11/05/a-php-include-exploit-explained/#comment-9677

While I don’t do any custom PHP code on my websites, I really appreciate your blog posts like this one. They are always insightful, and it’s nice to know not only a bit of what you guys are seeing behind the scenes but also that there are actually people behind the scenes doing these types of things. I know the second point seems like it might be obvious, but it can be easier than you think to forget. I hope you post more blog entries in the future.

By: jdw (Fri, 06 Nov 2009 00:30:33 +0000) https://blog.nearlyfreespeech.net/2009/11/05/a-php-include-exploit-explained/#comment-9672

Eric,

I disagree with the design of your “straightforward” example so fundamentally that I almost didn’t approve your comment; I feel it’s bad advice and would hate to see anybody say “well that example is shorter and he says it’s just as good, I will use that.”

However, I did approve it so I could go over why I consider this line of thinking flawed, because I recognize that a lot of people do think that way. Maybe I can talk some of them out of it. 🙂

First, to the extent your approach is “straightforward,” that’s because you omitted any form of error detection, validity checking, or problem reporting. All you’ve done is change the call to preg_match() into a call to basename() and delete basically everything else. Properly implemented, your example would be exactly the same size and complexity as ours. Conversely, the example code can be equivalently oversimplified:

$template = isset($_GET["page"]) ? preg_replace("/[^A-Za-z0-9_]/", "", $_GET["page"]) : "default";
include "../protected/templates/$template.php"; // You should run file_exists() on this path, and 404 out if it returns false

However, my position is that your example cannot be properly implemented because it discards vital information.

basename() is designed to return the filename component of a partial or full pathname, not to validate user input (which is ultimately what a URL is). That you get a “safe” result is kind of a side effect of misusing this function. Stylistically, I have a problem with that, but that’s a matter of opinion and I respect that others may differ. In any case, you’re really just using basename() as a substitute for preg_replace() to save a few characters.

But notice again that the example does not use preg_replace(); it uses preg_match(). That is because the goal of that step is not to fix the input but to validate it, so that appropriate action (e.g. logging, alerting) can be taken if a problem is found.

basename() and preg_replace() throw away that information; they don’t validate anything, they just silently pave over a lot of problems, both accidental and malicious. That basically helps people who attack your site hide their tracks. Which is bad. There are other problems too, like the way that approach creates an infinite number of “valid” alternate URLs that all produce the same content, but for me, voluntarily cooperating with attackers is the big one. You’re in a situation where you can look at the input and know positively that someone is screwing with you; why on earth would you say “I know they’re screwing with me, now how can I turn their malicious nonsense into a possibly valid input on their behalf?”
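
To make the contrast concrete, here is a rough sketch of the validate-and-report pattern; the log message and error handling are illustrative, not lifted from the post.

<?php
// Validation (preg_match) rather than repair (preg_replace/basename):
// bad input is recorded and refused instead of being quietly "fixed".
$page = isset($_GET['page']) ? $_GET['page'] : 'default';

if (!preg_match('/^[A-Za-z0-9_]+$/', $page)) {
    // The input is positively known to be bad: record it, then refuse it.
    error_log("Rejected page parameter from {$_SERVER['REMOTE_ADDR']}: {$page}");
    header('HTTP/1.0 404 Not Found');
    exit('No such page.');
}

include "../protected/templates/{$page}.php";
?>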

My web sites are important to me. When people attack them, I don’t want to help, but I do want to know about it. The “blindly pave over problems” approach makes that impossible, and consequently it’s a style of programming I consider fundamentally problematic.

-jdw

By: Middlerun (Thu, 05 Nov 2009 23:10:31 +0000) https://blog.nearlyfreespeech.net/2009/11/05/a-php-include-exploit-explained/#comment-9671

But why’s it such a problem if people can access your content files over the web? Aren’t we concerned here with stopping people from including scripts from external websites? Sorry if I’m being thick here, I just don’t see the connection between the two issues.

The two issues (securing include() and properly locating secondary files) aren’t directly related, but we wanted to provide a “good role model” example of the right way to do it, and part of the right way to do it is to keep the secondary files out of the web-accessible tree. That’s just good site design and good security. -jdw
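
Concretely, the layout being recommended looks something like the sketch below, using the NFSN_SITE_ROOT variable that Paul Grill’s example above also relies on; on other hosts you would configure the root path yourself.

<?php
// The request handler lives under the web root; the files it includes do not,
// so they can be include()d but never fetched directly with a browser.
//
//   <site root>/public/index.php               <- web-accessible
//   <site root>/protected/templates/about.php  <- reachable only via include()
//
$root = $_SERVER['NFSN_SITE_ROOT']; // provided by the hosting environment
include "{$root}/protected/templates/about.php";
?>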

By: Eric Stern (Thu, 05 Nov 2009 16:49:04 +0000) https://blog.nearlyfreespeech.net/2009/11/05/a-php-include-exploit-explained/#comment-9670

While your solution of not creating the problem in the first place is certainly the most effective, it’s not always practical – especially if you have a lot of pages to manage (if you have a major page layout change, for example, you have to change dozens of files with includes rather than the one index.php file).

Handily, PHP includes the basename() and pathinfo() functions, which all but eliminate this issue – not only for remote file inclusion attacks as demonstrated above, but also for directory traversal attacks (e.g., vulnerable.php?page=../protected/hiddenFile.txt) and other similar exploits.

You can then do something very simple like the following:


<?php $template = isset($_GET['page']) ? basename($_GET['page']) : 'default';
include "../protected/templates/$template.php"; // You should run file_exists() on this path, and 404 out if it returns false
?>

It’s not the cleanest thing in the world (no SEO-friendly URLs, for example), but it’s relatively straightforward, safe, and foolproof.
