Network Status – NearlyFreeSpeech.NET Blog
https://blog.nearlyfreespeech.net
A blog from the staff at NearlyFreeSpeech.NET.

Upcoming updates, upgrades, and maintenance
Tue, 03 Mar 2015 23:45:24 +0000
https://blog.nearlyfreespeech.net/2015/03/03/upcoming-updates-upgrades-and-maintenance/

We have accumulated some housekeeping tasks that we’ll be taking care of over the next couple of months. They’re all necessary things to make sure our service keeps running at its best, and though we work hard to prevent these types of things from impacting services, occasionally they do intrude. As a result, we want to let everyone know what we’re up to and what the effects will be.

Retiring file server f2

We still have quite a few sites using the file server designated as “f2.” This is the oldest file server still in service, and although it has been a great performer for many years, it is reaching the end of its useful life. It is also one of two remaining file servers (and the only one that holds member site files) that has a single point of failure. Our newer file servers use different technology; they are faster (100% SSD), have no single points of failure, allow hardware maintenance while they are running, and allow us to make major changes (like adding capacity or rebalancing files) behind the scenes without you having to change the configuration of your site.

So, we are quite eager to get rid of f2. We’ve been offering voluntary upgrades for some time now, but it’s time to move things along. We’ve set an upgrade date and time in April for every site on f2. If you have a site on this file server, you can see your upgrade time in our member interface and, if it doesn’t suit you, upgrade at any earlier time or postpone it until closer to the end of April.

Please note, the file server f2 is distinct from and has no relation to site storage nodes that contain the text fs2. If your site’s file storage tag contains fs2, you are not affected by this.

Migrating a site does entail placing it into maintenance mode briefly, for a period proportional to the size of the site. Beyond that it usually has no ill effects. Some sites do have complications, especially if they have hardcoded paths in their .htaccess files. After our system migrates your site, it will attempt to scan the site for affected files and send you an email listing them if it finds any. This isn’t 100% foolproof, but we previously did it for a lot more sites under considerably greater pressure with the f5 server, and problems were relatively few and far between.
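For the curious, you can run a rough self-check of this kind ahead of time against a local copy of your site. This Python sketch is an illustration only, not the scan our system actually runs; the `/f2/` prefix matches the server being retired, but the function name and logic here are invented for the example:

```python
import os

def find_hardcoded_paths(site_root, old_prefix="/f2/"):
    """Walk a site tree and report files that mention the old
    file server prefix (e.g. an AuthUserFile line in .htaccess)."""
    hits = []
    for dirpath, _dirnames, filenames in os.walk(site_root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                with open(path, encoding="utf-8", errors="ignore") as f:
                    if old_prefix in f.read():
                        hits.append(path)
            except OSError:
                pass  # unreadable file: skip it rather than abort the scan
    return hits
```

Anything this turns up is a file you would want to re-check after the migration, using the new path shown in your site info panel.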

Discontinuing PHP Flex

As part of our continued (slow) migration away from Apache 2.2, we will be discontinuing PHP Flex. PHP Flex refers to running PHP as a CGI script, which is a terrible way to do things. In the bad old days, it was useful in some cases for compatibility with PHP applications that didn’t work with safe_mode, if you didn’t mind the horrible performance. But, even in the bad old days, it mostly ended up being used not because it was necessary, but because it was easier than dealing with safe_mode.

These days, PHP safe_mode is long gone, so there’s no real reason to have PHP Flex anymore. Our new PHP types are highly compatible with (and much faster than) PHP Flex, and most people have already happily upgraded. However, there are still some stragglers out there and, as time goes by, they are starting to have problems. Those problems often completely go away simply by switching to a currently-supported version of PHP. Thus, we feel it’s time to phase out PHP Flex. In the month of April, we will auto-migrate PHP Flex sites (which mostly run PHP 5.3 and in some cases 5.2) to PHP 5.5.

MySQL software upgrades

We are currently working on both long-term and short-term upgrades for MySQL. In the short term, we need to perform a series of OS and MySQL server updates on existing MySQL processes to keep them up-to-date and secure. This will require either one or two brief downtimes for each MySQL node, typically about 5-10 minutes. We will be performing these updates throughout the month of March, and we will announce them on our network status feed (viewable on our site and Twitter).

In the long term, MariaDB 5.3 is getting a bit long in the tooth, so we are working to jump straight to MariaDB 10, which brings a lot of great new functionality as well as better scalability and configuration flexibility. This is likely to be somewhat more resource intensive, and hence more expensive, so it will be optional for people who are perfectly happy with the way things are. (If you like your MySQL plan, you can keep it!) More on this as it gets closer to release.

Physical maintenance

We also need to do some maintenance on the power feeds to one of our server shelves. Ordinarily that isn’t an issue that affects our members, but in this case it’s being converted between 120V and 208V. Hypothetically that can be done while the equipment is running, but doing so entails a nonzero risk of death by electrocution and after careful consideration we’ve decided that none of the current field techs are expendable at this time. Also, it could burn down the datacenter. So, we’re going to go ahead and do it by the book, which means shutting it off.

That’s a few dozen CPU cores and hundreds of gigs of RAM we need to take offline for a little while. In a real disaster, our infrastructure could survive, but there would be a period of degraded service while things balance out on the remaining hardware. That period would be significantly longer and affect significantly more people than the actual maintenance. So, we feel our best course of action is just to shut it off for the few minutes it will take to rewire the power feeds. The service impact should be low, but will probably not be zero.

We want to complete the MySQL maintenance listed above first, so we are likely to do this toward the end of March. We will post updates on our network status feed with more precise timing as we get closer.

Realm upgrade reminder

We have finally finished rolling sites off of the dreaded “legacy” realms (freebsd6, freebsd72, and 2011Q4). Every site is now on a “color” realm. This means that people who have selected late realm upgrades for their sites in our UI and who are currently running on the red realm will receive an automatic upgrade to violet in April, after the quarterly realm rotation has occurred. Compatibility between the two is excellent and we anticipate very few problems.

That’s all for now. All in all, the upgrades and maintenance shouldn’t affect too many people, but we regret and apologize in advance for any problems they do cause. These steps are part of a process designed to eliminate the very old pieces of our infrastructure that make maintenance like this intrusive. In other words, a big part of the goal of doing this maintenance now is that the next time we do it, you’ll be even less likely to notice.

Thanks for reading!

Automatic file server upgrades
Fri, 01 Aug 2014 06:28:21 +0000
https://blog.nearlyfreespeech.net/2014/08/01/automatic-file-server-upgrades/

As most of our members are aware, one of our older file servers, f5, has been causing intermittent problems. The time has come to move the sites still using it to newer, faster, more reliable equipment. The ability to do that manually has been available in our UI for about a week now, and it has not surprisingly been pretty popular. But after that server caused additional downtime this past week, we’re moving to the next phase: moving sites automatically.

We’ve been testing the replacement file servers for some time now, with hundreds of test sites and various use cases, and they have done very well. Naturally, we’re still paranoid that something will go wrong, but in addition to the testing we have an aggressive backup and replication schedule. So it’s time to move ahead.

Beginning August 4th and continuing through the end of the month, we will start automatically migrating affected sites. If you have any, they are marked with an asterisk on the Sites tab in our UI, with more details on the Site Info Panel for each affected site. The Site Info Panel will also let you adjust the scheduled upgrade, allowing you to migrate a site at any time or (to an extent) postpone an upgrade that is scheduled at a bad time for you.

Most sites don’t need to make any changes as a result of this migration. Based on our testing and the sites that have voluntarily migrated thus far, fewer than 1% of sites need anything modified to continue working after the upgrade. The necessary changes all relate to hardcoded absolute paths that won’t be valid after the migration, i.e., anything starting with /f5/sitename/. These fall into two broad categories.

First, .htaccess files. If you’re using HTTP basic authentication or something similar in your .htaccess file that uses absolute pathnames, those will have to be changed after the migration. You’ll be able to get the new path to use from your site info panel after the migration.

Second, if you’re still using PHP 5.3 Fast and you have hardcoded paths in your PHP code, those will also need to be updated. Using hardcoded paths in this situation was never recommended; it’s always preferable to use a preset variable like $_SERVER['DOCUMENT_ROOT'] or $_SERVER['NFSN_SITE_ROOT'] if at all possible. PHP 5.3 has also been obsolete for a long time. So if you find yourself in that situation, this is a great time to upgrade that from our UI as well. You’ll still have to change the paths, but this will be the last time. All the currently-supported versions of PHP (5.4 and later) use /home-based paths, just like CGI and ssh, and those never change.

To help you find out if your site needs to be modified, we’ve developed a scan that runs during the migration. When the migration is finished, it will email you to let you know it’s done and whether or not it found any potential problems. It may not catch every possible issue, but it does a very good job.

Once f5 is no longer in use, it’ll be tempting to give it the full Office Space treatment due to the problems it has caused, but the truth is that it served us incredibly well for a long time, so giving it a salute as it is ejected into space in a decaying orbit into the sun would better fit the totality of its service. (Although that’s admittedly not in the budget, so recycling is a more likely outcome unless the console prints “Will I dream?” as we shut it down for the last time, in which case we probably won’t have the heart.)

Although only a tiny fraction of our members will have even minor problems with this upgrade, each and every one of our members and each and every one of their sites is important to us. If you do run into any snags related to migrated sites (or, really, anything else), please feel free to post on our forum and we’ll do what we can to help you out. (But please don’t post about them here; blog comments are a terrible venue for providing support, second only to Twitter in sheer awfulness and unnecessary difficulty.)

Post-mortem report of Saturday’s file server failure
Thu, 03 Apr 2014 02:16:39 +0000
https://blog.nearlyfreespeech.net/2014/04/03/post-mortem-report-of-saturdays-file-server-failure/

On Saturday, March 29 at about 4pm US Eastern time, we rebooted one of our file servers that hosts content for member sites. It experienced a critical hardware failure and did not come back online. It took about 28 hours to get things back into service. We’re going to talk briefly about why that happened, and what we’ll be doing differently in the future.

ZFS in one paragraph

This issue has a lot to do with ZFS, so I’ll talk very briefly about what that is and how we use it. ZFS is an advanced filesystem, originally developed by Sun Microsystems back before they got devoured by Oracle. When you upload files to our service, ZFS is what keeps track of them. It performs very, very well on hardware attainable without an IPO, and we’ve been using it for many years because we need stuff that performs very, very well to keep up with you guys. It also has features that we and you are fond of, like snapshots, so if something of yours gets accidentally deleted, we can (almost always) get it back for you. The downside to ZFS is that it is not cluster-able. That means that no matter what we do, there will always be at least one single point of failure somewhere in the system. If we do any maintenance, or if it fails, an outage will result.

What happened

Prior to Saturday’s issue, that file server (f5) had caused problems twice in the past two weeks that caused slow performance. We’ve seen a very similar problem with ZFS-based file servers in the past; when they accumulate a lot of uptime they start to slow down until rebooted. Because it involves downtime, member file servers don’t get rebooted very often; not unless they are having a problem. This one was having a problem we believed would be resolved by rebooting, so we rebooted it. However, at that point, it suffered a hardware failure. Although there’s no direct evidence of a connection, it’s hard to believe that’s a coincidence.

We did have two backup servers available to address this situation, one of which was intended for that purpose. It is based on new technology that we will discuss in more detail later, but what we discovered when we attempted to restore to it is that it misreports its available space. It said it had three and a half times more space than we needed, but it really only had a few hundred gigabytes; nowhere near enough. (Fortunately we now understand why it reports what it does and how to determine what’s really available.) The second option had the space, but was always intended only to be a standby copy to guard against data loss, not as production storage. We determined pretty quickly that it could not sustain the activity necessary.

As a result, we were forced to focus on either fixing the existing server or obtaining a suitable replacement. Unfortunately, Saturday evening is not a good time to be looking for high-performance server components. We do have a service for that, and they eventually came through for us, but it did take until Sunday afternoon to obtain and install the replacement parts. Once that was resolved, we were able to get it back online relatively quickly and get everyone back in service.

What will happen next

As mentioned above, the big problem with ZFS is that it cannot be configured with no single point of failure. This basically makes it the core around which the rest of the service is built. We’ve always done everything we can to get as close as possible; the server that failed has multiple controllers, mirrored drives, and redundant power supplies. Pretty much everything but the motherboard was redundant or had a backup. And, of course, the motherboard is the component that failed.

That’s not a small problem. Nor is it a new one. Single points of failure are bad, and we’ve been struggling for a long time to get rid of this one. We’ve tried a lot of different things, some of them pretty exotic. But what we have found for the past several years is that there’s really not a magic bullet. The list of storage options that genuinely have no single point of failure is pretty short. (There are several more that claim to, but don’t live up to it when we test it.) We have consistently found that the alternatives are some combination of:

– terrible performance (doesn’t matter how reliably it doesn’t get the job done)
– lack of POSIX compatibility (great for static files, but forget running WordPress on it)
– track record of instability or data loss (We’re not trusting your files to SuperMagiCluster v0.0.0.1-alpha. Or btrfs.)
– long rebuild (down)times after crash or failure
– (for commercial hardware solutions) a price tag so high that it is simply incompatible with our business model

The end result is that for the past few years, we have backed ourselves into something of a ZFS-addicted corner. However, what makes Saturday’s failure particularly frustrating is that we actually solved this problem. We’ve been rolling that solution out over the past couple of months. What’s left to be moved at this point is member site content and member MySQL data. The hardware to do that is already on order; it may even arrive this week. Once it does, there will be a week or two of setup and testing, and then we will start moving content. That will involve a brief downtime for each site and MySQL process while it’s moved, and may require a few sites with hardcoded paths to make some updates. We’ll post more about that when we are ready to proceed.

The new fileserver setup has no single points of failure, is scalable, serviceable, and expandable without downtime, preserves our ability to make snapshots, and performs like we need it to. And (crucially) although it is still cripplingly expensive, we could afford it. This is an area where we’ve been working very hard for a very long time, and it simply wasn’t possible to get all the requirements in one solution until recently.

To be perfectly clear, this doesn’t mean our service will never have any more problems. No one can promise that. File server problems were already incredibly rare, but since our service design makes them so catastrophic for so many people (at many hosts, such failures are a lot more common, but don’t affect nearly as many sites at once), we have to do as much as we can to make them rarer still.

There are also plenty of other things besides file servers that can go wrong at a web host, and we continue to work on improving our service in all those areas. We’ll have more to say on that subject as the year progresses, but really, there’s no such thing as “good enough” for us, so that work will never end.

For now, we’re very sorry this happened. As we said during the downtime, there is nothing we hate more than letting you guys down, and we did that here. It’s no more acceptable to us than to anyone else for something like this to happen. What we can tell you is that before this happened we were executing a plan that, if it had been completed, would have prevented this. Completing that plan as quickly as possible is our next step.

Thanks for your time and your support. Problems like this are physically sickening, and seeing that so many of our members were so supportive really helped carry us through.

IPv6, SSL, scheduled tasks, storage discounts & bulk bandwidth
Wed, 26 Dec 2012 05:23:59 +0000
https://blog.nearlyfreespeech.net/2012/12/26/ipv6-ssl-scheduled-tasks-storage-discounts-bulk-bandwidth/

We have quite a few feature announcements to bring you this holiday season. We’ve added support for several features, some of which have been requested for years, like IPv6, SSL, and scheduled tasks (AKA cron jobs). We’ve also introduced new billing options that make our service pricing fairer and more scalable; these options will help a broad variety of our members save money.

IPv6 support for hosted sites

We’ve added IPv6 support for hosted sites. Just select the “Manage IPv6” action from the site info panel and enable it. Each site that enables IPv6 is currently assigned a unique IPv6 address. There is no extra cost for IPv6.

IPv6 isn’t fully deployed on the Internet yet, and consumer ISPs are the worst laggards, so IPv6 isn’t enabled by default and you may want to consider whether it’s right for you. More info is available in our UI.

SSL support

We always said that we would focus on SSL support as soon as IPv6 was done. And that’s just what we did. SSL support is now available in two forms.

First, you can obtain (or generate) an SSL certificate for any alias on your site, upload the certificate, the key, and the chain file (if applicable) to your site, and then request SSL be enabled for that alias through our UI. This is a great option if you want to secure traffic to your site for all visitors.

Second, generic support for securing the (shortname).nfshost.com alias of each site is available without the need for your own certificate. Use of this option is also requested through our UI, but it’s our domain name, so its use is subject to our pre-approval. (For example, we won’t be approving any sites with names like “securebanklogin.”) This option is good if you just want to administer your site through a web UI securely.

Our SSL implementation depends on the SNI feature of the TLS standard, which is now available in all modern browsers, so we are comfortable deploying it. SSL currently has no extra cost associated with it while it retains experimental status. It may get a nominal fee in the future to cover the added CPU cost of encryption once we get a better idea about how tightly we can pack certificates without causing problems.

Scheduled tasks

We’ve added the long-requested ability to run scheduled tasks on specific sites at regular intervals. Great for processing log files or refreshing content, scheduled tasks can be set to run hourly, daily, weekly, or monthly. There’s currently no extra charge for this feature, but we’ll keep an eye on the resources it uses.

Storage discounts and resource-based billing

Most people (including us) will acknowledge that for a long time, the greatest flaw of our service has been that it bills a large amount of the cost based on how much disk space a site uses. That charge then pays for all the CPU and RAM it takes to host sites. This works well enough in terms of covering our costs, but it forces sites that are very large to subsidize sites that are very small but resource-intensive. That’s not fair, so we’ve made two changes.

First, we’ve cut the storage charge for static sites, which by definition use few resources. Our published rate is $0.01 per megabyte per month, but with this change, static sites are now charged at $0.01 per 5 megabytes per month. That’s an automatic across-the-board 80% cut for all static sites.

Second, we’ve introduced a new option for dynamic sites called “stochastic billing.” If selected, this option cuts the storage costs for dynamic sites by 90% to $0.01 per 10 megabytes per month. In its place, it divides sites into groups and once per minute, selects a web request at random and bills the associated site for the resource usage attributed to that group. The likelihood of a given request being selected is proportional to the resources it uses, so over time the random sample converges to a very accurate representation of which sites are using which resources, and everyone who participates is billed fairly for the share of resources they used with a very high degree of accuracy.
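To see why that random sample converges, here is a toy Python simulation (the site names and resource figures are invented, and the real billing system is considerably more involved than this sketch):

```python
import random

def simulate_stochastic_billing(usage, samples, seed=0):
    """Once per interval, pick one request at random, weighted by the
    resources it used, and bill the associated site.  Returns each
    site's (billed share, true share) so the two can be compared."""
    rng = random.Random(seed)
    sites = list(usage)
    weights = [usage[s] for s in sites]
    billed = dict.fromkeys(sites, 0)
    for _ in range(samples):
        billed[rng.choices(sites, weights=weights)[0]] += 1
    total = sum(weights)
    return {s: (billed[s] / samples, usage[s] / total) for s in sites}

# A site using 70% of the group's resources ends up billed ~70% of
# the time, so its long-run bill tracks its actual usage.
shares = simulate_stochastic_billing({"heavy": 70, "light": 30}, 100_000)
```

The more intervals that elapse, the closer each site’s billed share gets to its true share, which is the sense in which everyone is billed fairly over time.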

We’ve set the pricing for stochastic billing in such a way that if everybody switched tomorrow, our bottom line wouldn’t change at all, so this is not a price increase. Most people will actually pay less. Sites that use above-average resources — the ones subsidized under the current plan by sites that use tons of disk space — will naturally cost more if they switch over. But we don’t plan to force anyone to switch. Instead, we intend to preserve both options and allocate to each the hardware resources it is paying for. Over the long term, we expect resource-heavy sites on the old plan will find fewer and fewer disk-heavy sites willing to subsidize them, which may lead them to resource shortages way down the road if they choose not to migrate and pay their own way. But we don’t anticipate any dramatic changes in the short term.

More information about this is available in our FAQ and our forum, and the option is available by changing your site’s server type in our member UI.

In the coming days, we’ll be adding a $0.01/10MiB/month + stochastic billing option for static sites as well. That’ll be better than the $0.01/5MiB/month plan for most but not all static sites, and we understand some people won’t want anything to do with a billing scheme with a random element, so it will be optional.

Bulk bandwidth option

One of the things we do that’s a little unusual is that we demand very high quality bandwidth for our member sites; there are a number of lower-priced providers commonly used by web hosts for connectivity that we don’t consider good enough. A consequence of this is that the bandwidth costs we pay are relatively expensive compared to some of our competitors of a similar size, and of course we pass that along. We feel it’s well worth it.

At the same time, we have wound up connecting to cheaper providers from time to time. This is not to serve member sites, but rather because cheaper providers — combined with clever routing and network management — can be a good way for us to soak off huge surges of incoming traffic associated with DDOS attacks without affecting the rest of our network. However, even though the bandwidth offered by these providers is relatively cheap, DDOS attacks consume a lot of it, and the more of it we have, the more resilient we are, so the overall bill is not trivial. And at the same time, DDOS attacks generate only inbound traffic, so we’re only using the inbound half of those connections (and then only when we’re being attacked).

So, we’ve got a bunch of unused outbound capacity. We’ve made the decision internally that the price/quality tradeoff is not worth using those connections for our regular traffic, but we respect that not everyone’s site is the same and that in light of the pretty significant cost difference, some people might prefer to make that decision for themselves.

As a result, we’re offering a new class of “bulk” bandwidth that will use our excess outbound capacity on a per-site basis. Instead of being priced per byte transferred like our regular offering, bulk bandwidth is priced per megabit per second (Mbps) per month. You select the amount you want and then pay $5.00/Mbps/mo. (But like most of the rest of our services, it is charged one penny at a time and can be added or removed at any time.) Your actual usage is unmetered. It’s also burst-able, meaning it groups sites together and if another site in the group isn’t using its share at any given moment, your site can borrow it at no extra charge.

Bulk bandwidth is typically best suited to sites that steadily use a lot of bandwidth, for example to distribute large files to the general public. Our regular bandwidth plan will still generally provide higher per-connection speeds, better routing and resiliency, and probably slightly better latency.

To determine if bulk bandwidth is right for a site, first figure out if it’s currently spending less than $5.00/mo on bandwidth under our standard plan. If it is, bulk bandwidth is a bad deal: pay more, get less. But if a site’s bandwidth costs more than $5.00/mo, the answer is maybe. Next, you would look at the nature of the site. If the priority is to deliver the most overall bandwidth per dollar, then bulk bandwidth might be a good choice. If the priority is to provide the fastest individual downloads, or if the site has significant interactive elements — particularly stuff like AJAX — it’s better to stick with the standard plan.
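That comparison can be sketched in a few lines of Python. The $5.00/Mbps/mo rate is from above, but the 30-day month and the rounding of fractional commitments up to a whole Mbps are simplifying assumptions for illustration, not statements about how we actually meter:

```python
import math

def sustained_mbps(gib_per_month):
    """Average megabits per second implied by a monthly transfer
    volume, assuming a 30-day month."""
    bits = gib_per_month * 1024**3 * 8
    return bits / (30 * 24 * 3600) / 1e6

def bulk_monthly_cost(avg_mbps, rate=5.00):
    """Bulk option cost: committed Mbps at $5.00/Mbps/mo, rounding
    fractional commitments up to a whole Mbps (an assumption)."""
    return math.ceil(avg_mbps) * rate

def bulk_worth_considering(current_monthly_bill, gib_per_month):
    """The rule of thumb above: under $5/mo on the standard plan,
    bulk is pay-more-get-less; otherwise compare the two prices."""
    if current_monthly_bill < 5.00:
        return False
    return bulk_monthly_cost(sustained_mbps(gib_per_month)) < current_monthly_bill
```

For example, a site pushing 1000 GiB/mo averages a bit over 3.3 Mbps, so a 4 Mbps commitment runs $20/mo; whether that beats the standard plan depends on what the site currently pays.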

In short, the bulk bandwidth option is the freight truck to our standard plan’s sportscar. Both can move a lot of data very quickly, but in very different ways.

Final thoughts

Whew. Densest. Blog post. Ever.

If you follow our Twitter feed, you already know about these updates. But judging by the follower numbers, most people don’t, so we thought we’d mention it.

We think these changes are huge. They address a lot of the pain points that many of our members have been feeling for a long time, both in terms of features and cost. And they represent a mountain of work, especially these past few weeks to carry them over the finish line in time for Christmas.

Going forward, the biggest question will probably be about PHP 5.4. That’s the big one we weren’t able to make happen in time for this announcement. It remains available in Flex mode if you select the 2011Q4 realm, but 5.4 removes safe_mode support and hence there won’t be a “PHP 5.4 Fast.” Instead, “PHP 5.4 Full” is coming, which combines a lot of the best features of Flex (consistent paths, ability to execute external programs) with performance comparable to existing Fast sites. That’s our top feature development priority, and we’re keeping a close eye on the March 2013 timeframe that the PHP developers have announced for phasing out non-critical updates to PHP 5.3, but we can’t offer an ETA at this time. We also have some internal maintenance to do to keep things running smoothly and fix bugs.

Thanks everyone! We never lose sight that our incredible members make our service not just possible but everything it is. (And I allow myself a bit of a smug grin, secure in the knowledge that we have the hands-down smartest member base of any web host, which is the only reason we have the courage to do something as exotic as stochastic billing.)

(Updated 2012-12-26 to reflect that *.nfshost.com no longer uses a self-signed certificate for SSL.)

Security flaw with login corrected
Tue, 02 Aug 2011 17:18:07 +0000
https://blog.nearlyfreespeech.net/2011/08/02/security-flaw-with-login-corrected/

One of our members informed us a couple of days ago that due to a strange combination of actions and circumstances, he hit a flaw in our login system that enabled him to access the membership of another member with a similar name.

Of course we promptly investigated; the problem has been permanently fixed.

After that, we turned our attention to finding out if that particular flaw had been exploited in any other cases. It does have a very distinctive pattern, part of which is failing to log in as the first person, successfully logging in as the second person, and then “reappearing” as the first person. (That’s sufficient “signature” to detect it in our records, but there’s actually more internally required for it to happen — related to cookies and PHP session handling.) We’ve been over the logs back to the point where the problem was introduced and we’re happy to report that we were not able to find any previous similar incidents. So, if you needed any reassurance that most people are basically good, the first person to find this problem reported it to us within minutes.
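As an illustration only (the event format below is invented, and as noted, the real detection also depends on internal cookie and session details), a scan for that signature might look like:

```python
def find_suspicious_sequences(events):
    """Scan an ordered login-event stream for the pattern described
    above: FAIL(a) -> SUCCESS(b != a) -> ACTIVE(a), i.e. a member who
    failed to log in "reappearing" right after someone else's
    successful login.  Events are (action, member) tuples; the format
    is hypothetical."""
    hits = []
    for i in range(len(events) - 2):
        (a1, m1), (a2, m2), (a3, m3) = events[i:i + 3]
        if a1 == "FAIL" and a2 == "SUCCESS" and a3 == "ACTIVE":
            if m1 == m3 and m1 != m2:
                hits.append((i, m1, m2))
    return hits
```

A scan like this over historical logs is what let us say with confidence that the flaw had not been exploited before it was reported.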

Obviously the person who did this is aware of it, and we have already notified the person affected. So if you haven’t already heard from us about this, it doesn’t affect you and you don’t need to take any steps. We are posting this anyway simply because it’s security related. Security is our top priority; it’s the foundation upon which the rest of our service has to be built. So, as transparent and forthright as we try to be when we have service problems and downtime, I feel we need to be twice as forthcoming when we have problems like these, however small.

I also feel it’s appropriate to personally apologize to all of our members because this was a security problem and it was caused by a coding error introduced by me. This is an area where only perfection is acceptable; falling short even a little bit is not. I’m sorry, and I will work hard to keep it from happening again.

(Ironically, we are already developing a new certificate-based backend that is so secure, the goal is to open-source our entire UI when it is complete.)

]]>
https://blog.nearlyfreespeech.net/2011/08/02/security-flaw-with-login-corrected/feed/ 10
Scheduled maintenance November 22 and December 15 https://blog.nearlyfreespeech.net/2010/11/16/scheduled-maintenance-november-22-and-december-15/ https://blog.nearlyfreespeech.net/2010/11/16/scheduled-maintenance-november-22-and-december-15/#comments Tue, 16 Nov 2010 18:46:42 +0000 http://blog.nearlyfreespeech.net/?p=199 We are scheduling two maintenance windows in the next month to move some equipment:

Date: November 22nd, 2010
Window: 9am to noon UTC (4am to 7am US Eastern, 1am to 4am US Pacific)
Affecting: MySQL nodes m2, m3, and m21

Date: December 15th, 2010
Window: 8am to 1pm UTC (3am to 8am US Eastern, midnight to 5am US Pacific)
Affecting: File servers f2 and f5

Each server should be offline for about one hour, not the whole window. This will cause some downtime. While the MySQL nodes are offline, those MySQL processes hosted on them will be unreachable. While the file servers are offline, sites hosted by those file servers will show an official maintenance page.

Due to the nature of our network, we can generally move equipment around without disrupting services, and we frequently do. Unfortunately, that’s not true in this case. These servers will be physically moving to a bigger space in a new facility with more (and more reliable) power, so that we will have room to continue to grow for the next several years. We’ve listened to past comments on these types of issues, and we’ve endeavored to give you as much advance notice as possible, especially on the file server moves, so that you’ll have time to make any plans or announcements you feel are appropriate.

Thanks very much, and we apologize for any inconvenience as we (as always) continue to work tirelessly to make our service better for our members.

Update: The first batch of maintenance is long since completed, and we’re on schedule for the second, but we noticed that this was in the wrong blog category. We’ve now moved it.

]]>
https://blog.nearlyfreespeech.net/2010/11/16/scheduled-maintenance-november-22-and-december-15/feed/ 1
Removing deprecated IP block https://blog.nearlyfreespeech.net/2010/11/16/removing-deprecated-ip-block/ https://blog.nearlyfreespeech.net/2010/11/16/removing-deprecated-ip-block/#comments Tue, 16 Nov 2010 18:44:59 +0000 http://blog.nearlyfreespeech.net/?p=206 Many years ago, we were assigned the IP address block 64.238.220.0/23 by one of our upstream network providers. We officially deprecated the use of that block way back in 2008, and we will be returning it on December 1st, 2010, so it will not work after that point.

The only possible way this could affect you is:

1) You use third-party DNS to point at a site hosted here.
2) You hardcoded A records in the 64.238.220.0/23 range (against our advice) over two years ago.
3) You haven’t checked on / updated those settings in the past two years.

In other words, literally only a handful of people will be affected by this change. Nonetheless, we wanted to have at least some public warning before moving forward. The few people affected know a lot about DNS and went to the trouble to make a custom setup at another provider; they’ll know exactly what this post means and what to do about it. So, if you have to ask “does this affect me?” the answer is almost certainly no.
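For anyone who does want to double-check, the test is mechanical: resolve the hostname your third-party DNS serves and see whether any A record falls inside 64.238.220.0/23. Here is a quick sketch using Python’s standard library; the function names are invented for this example.

```python
import ipaddress
import socket

# The /23 covers 64.238.220.0 through 64.238.221.255.
DEPRECATED_BLOCK = ipaddress.ip_network("64.238.220.0/23")

def in_deprecated_block(ip):
    """Return True if the given IPv4 address string is in the block."""
    return ipaddress.ip_address(ip) in DEPRECATED_BLOCK

def hostname_affected(hostname):
    """Resolve a hostname's A records and check each against the block."""
    try:
        infos = socket.getaddrinfo(hostname, None, socket.AF_INET)
    except socket.gaierror:
        return False  # doesn't resolve at all
    return any(in_deprecated_block(info[4][0]) for info in infos)
```

Running `hostname_affected()` against your own domain tells you in a second whether you need to act.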

]]>
https://blog.nearlyfreespeech.net/2010/11/16/removing-deprecated-ip-block/feed/ 2
Brief Network Maintenance July 20-22 https://blog.nearlyfreespeech.net/2010/07/20/brief-network-maintenance-july-20-22/ https://blog.nearlyfreespeech.net/2010/07/20/brief-network-maintenance-july-20-22/#comments Tue, 20 Jul 2010 04:38:35 +0000 http://blog.nearlyfreespeech.net/?p=196 This is just a quick announcement about some upcoming network maintenance.

Due to our load-balancing capabilities, most of this will be done with no disruption to our services. There are a few exceptions, though. We’ll be doing maintenance over the next few days in the early morning hours (between 1am and 5am US Eastern time, 5am and 9am UTC), during which the following services will be briefly disrupted (about 5-10 minutes each):

Tuesday Morning
– www.nearlyfreespeech.net
– members.nearlyfreespeech.net
– various error pages
– Member MySQL Node m9

Wednesday Morning
– Our email (e.g. support@nearlyfreespeech.net)
– Outbound mail sent from CGI scripts
– Member ssh & FTP
– phpMyAdmin server
– Email forwarding
– Member MySQL Node m4
– Member MySQL Node m6
– Member MySQL Node m7
– Member MySQL Node m8
– Member MySQL Node m23
– Member MySQL Node m25
– Most pools (a beta feature in limited testing — you know if you have one)

Thursday Morning
– Member MySQL Node m1
– Member MySQL Node m10
– Member MySQL Node m11
– Member MySQL Node m12
– Member MySQL Node m13
– Member MySQL Node m14
– Member MySQL Node m15
– Member MySQL Node m16
– Member MySQL Node m17
– Member MySQL Node m18
– Member MySQL Node m19
– Member MySQL Node m20
– Member MySQL Node m22
– Member MySQL Node m24
– Member MySQL Node m26

To reiterate, these disruptions will be very short (<10 minutes); you probably won't even notice them, but we wanted to let you know what's going on anyway. Although it's only a few minutes, we'll be changing a lot of stuff under the hood, which is an important step in helping us build a more reliable network. We wish it were possible to build a network with 100% availability, but we're still working on that. 🙂 In the meantime, this scheduled maintenance should definitely help us get a little closer. Thanks!

]]>
https://blog.nearlyfreespeech.net/2010/07/20/brief-network-maintenance-july-20-22/feed/ 1
File server “f1” replacement https://blog.nearlyfreespeech.net/2010/03/15/file-server-f1-replacement/ https://blog.nearlyfreespeech.net/2010/03/15/file-server-f1-replacement/#comments Mon, 15 Mar 2010 04:56:43 +0000 http://blog.nearlyfreespeech.net/?p=175 Our venerable old file server “f1” had some problems last month that left us with some doubt as to the viability of its redundant power supplies over the long term. Since then, we’ve been planning and preparing to migrate all the sites it handles to other, newer file servers.

That’s all been prepped now, and we’re going to automatically migrate everyone during the month of April. If you have affected sites, you can get a specific time for each site from our member interface, and the main sites page will mark any site scheduled for an upgrade with a star, so you can see at a glance which sites are affected.

There are two drawbacks to this upgrade.

The first is that the upgrade involves minor downtime since the files need to be transferred from one file server to the other, which requires log files and any dynamic data files be closed and held inactive while they’re moved. The downtime will be roughly proportional to the size of the site, so this is a great time to peek in and check for any files you’re not using or log files you want to trim; it’ll make the upgrade go faster and you’ll save money on storage.

In addition, while most sites will be moved transparently, sites that use absolute paths in .htaccess files (or in PHP scripts, although we don’t advise that) will need to be updated to work properly after the move. If this is applicable at all (I personally moved all my sites without needing to change anything), it should just take a few minutes to update.
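If you want to audit for this ahead of time, a simple scan for absolute-path-looking strings in your .htaccess files will flag the lines worth reviewing. This is a rough heuristic sketched purely for illustration (the path in the example is hypothetical, and the regex will also match URL paths, so expect some false positives):

```python
import re
from pathlib import Path

# Heuristic: anything that looks like a multi-segment absolute path
# preceded by whitespace. It also matches URL paths, so treat hits as
# lines to review, not definite problems.
ABS_PATH = re.compile(r'\s(/[A-Za-z0-9_.-]+(?:/[A-Za-z0-9_.-]+)+)')

def scan_htaccess_text(text):
    """Return (line_number, path) pairs for suspect lines in one file."""
    hits = []
    for lineno, line in enumerate(text.splitlines(), 1):
        if line.lstrip().startswith("#"):
            continue  # skip comments
        for match in ABS_PATH.finditer(line):
            hits.append((lineno, match.group(1)))
    return hits

def find_absolute_paths(root):
    """Scan every .htaccess under root; return (file, line, path) hits."""
    return [(str(f), lineno, path)
            for f in Path(root).rglob(".htaccess")
            for lineno, path in scan_htaccess_text(f.read_text())]
```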

For those two reasons, we understand that you may want some control over the upgrade schedule. To that end, you can log in and upgrade any of your eligible sites at any time between now and its scheduled date. You can also postpone a site’s upgrade if the scheduled time doesn’t work for you, within reason. You can’t postpone an upgrade past April 20, and postponements are for random intervals just so we don’t end up with an unmanageable number of sites to move on April 30th at 11pm.

The target file servers for this upgrade, f2 and f5, are a lot faster, have SSD-based L2 caches for busy files, superior network performance, and never require painfully long filesystem checks after an unclean shutdown, so we think people will be really happy with the change. Oh, and they have much newer and better redundant power supply setups!

We hope that the preparations we’ve made will make the transition as easy and painless as possible, and that everyone affected will agree that the end results are well worth having.

]]>
https://blog.nearlyfreespeech.net/2010/03/15/file-server-f1-replacement/feed/ 15
Scheduled Downtime for Friday, November 20 https://blog.nearlyfreespeech.net/2009/11/18/scheduled-downtime-for-friday-november-20/ https://blog.nearlyfreespeech.net/2009/11/18/scheduled-downtime-for-friday-november-20/#comments Wed, 18 Nov 2009 03:00:43 +0000 http://blog.nearlyfreespeech.net/?p=170 We have some facilities maintenance scheduled for this Friday. As part of this maintenance, we will need to physically move a handful of critical file and database servers between racks in our Phoenix datacenter. Since that equipment forms the heart of our hosting service, we’ll need to shut almost everything down briefly, just long enough to move it.

The maintenance window will be from 10am to 4pm MST (5pm to 11pm UTC) on Friday, November 20th, 2009.

This will affect web hosting, email forwarding, and “support services” (i.e. our sites, SSH, FTP) for all members.

We don’t expect the actual downtime to be anything close to the whole window. In an ideal case, it would take us about an hour to prep and an hour to take everything down, move it, and bring it back up. However, this is the real world, not the ideal one, so we’re giving ourselves some additional room to maneuver.

I’m sure there’s someone out there for whom this is a spectacularly inconvenient time, and to them we sincerely apologize. Any time we picked for maintenance of this sort would be bad for somebody. We did strive to pick a low-usage time when we could guarantee the manpower we needed.

We also would have liked to provide more notice, but up until this evening any announcement would have been pretty much content-free. (“We will be scheduling a maintenance downtime of unknown length at an unknown point in the future.”) As of right now, we have a schedule we believe can be met (the original proposed date has already passed, so any earlier specific announcement would have turned out to be wrong), and so we’re bringing it to you as quickly as we can.

We apologize again for the inconvenience. On Friday, as every day, we’ll all be working hard to bring you the best, most reliable hosting service we can. “NEITHER RAIN NOR SNOW NOR GLOM OF NIT.”

]]>
https://blog.nearlyfreespeech.net/2009/11/18/scheduled-downtime-for-friday-november-20/feed/ 12