Upcoming updates, upgrades, and maintenance

We have accumulated some housekeeping tasks that we’ll be taking care of over the next couple of months. They’re all necessary things to make sure our service keeps running at its best, and though we work hard to prevent these types of things from impacting services, occasionally they do intrude. As a result, we want to let everyone know what we’re up to and what the effects will be.

Retiring file server f2

We still have quite a few sites using the file server designated as “f2.” This is the oldest file server still in service, and although it has been a great performer for many years, it is reaching the end of its useful life. It is also one of two remaining file servers (and the only one that holds member site files) that has a single point of failure. Our newer file servers use different technology; they are faster (100% SSD), have no single points of failure, allow hardware maintenance while they are running, and allow us to make major changes (like adding capacity or rebalancing files) behind the scenes without you having to change the configuration of your site.

So, we are quite anxious to get rid of f2. We’ve been offering voluntary upgrades for some time now, but it’s time to move things along. We’ve set an upgrade date and time for every site on f2 in April. If you have a site on this file server, you can see your upgrade time in our member interface and, if it doesn’t suit you, upgrade at any earlier time or postpone it closer to the end of April.

Please note, the file server f2 is distinct from and has no relation to site storage nodes that contain the text fs2. If your site’s file storage tag contains fs2, you are not affected by this.

Migrating a site does entail placing it into maintenance mode briefly, for a period proportional to the size of the site. Beyond that it usually has no ill effects. Some sites do have complications, especially if they have hardcoded paths in their .htaccess files. After our system migrates your site, it will attempt to scan the site for affected files and send you an email listing them if it finds any. This isn’t 100% foolproof, but we previously did it for a lot more sites under considerably greater pressure with the f5 server, and problems were relatively few and far between.

Discontinuing PHP Flex

As part of our continued (slow) migration away from Apache 2.2, we will be discontinuing PHP Flex. PHP Flex refers to running PHP as a CGI script, which is a terrible way to do things. In the bad old days, it was useful in some cases for compatibility with PHP applications that didn’t work with safe_mode, if you didn’t mind the horrible performance. But, even in the bad old days, it mostly ended up being used not because it was necessary, but because it was easier than dealing with safe_mode.

These days, PHP safe_mode is long gone, so there’s no real reason to have PHP Flex anymore. Our new PHP types are highly compatible with (and much faster than) PHP Flex, and most people have already happily upgraded. However, there are still some stragglers out there and, as time goes by, they are starting to have problems. Those problems often completely go away simply by switching to a currently-supported version of PHP. Thus, we feel it’s time to phase out PHP Flex. In the month of April, we will auto-migrate PHP Flex sites (which mostly run PHP 5.3 and in some cases 5.2) to PHP 5.5.

MySQL software upgrades

We are currently working on both long-term and short-term upgrades for MySQL. In the short term, we need to perform a series of OS and MySQL server updates on existing MySQL processes to keep them up-to-date and secure. This will require either one or two brief downtimes for each MySQL node, typically about 5-10 minutes. We will be performing these updates throughout the month of March, and we will announce them on our network status feed (viewable on our site and Twitter).

In the long term, MariaDB 5.3 is getting a bit long in the tooth, so we are working to jump straight to MariaDB 10 and all its great new functionality, as well as offering better scalability and configuration flexibility. This is likely to be somewhat more resource intensive, and hence more expensive, so it will be optional for people who are perfectly happy with the way things are. (If you like your MySQL plan, you can keep it!) More on this as it gets closer to release.

Physical maintenance

We also need to do some maintenance on the power feeds to one of our server shelves. Ordinarily that isn’t an issue that affects our members, but in this case it’s being converted between 120V and 208V. Hypothetically that can be done while the equipment is running, but doing so entails a nonzero risk of death by electrocution and after careful consideration we’ve decided that none of the current field techs are expendable at this time. Also, it could burn down the datacenter. So, we’re going to go ahead and do it by the book, which means shutting it off.

That’s a few dozen CPU cores and hundreds of gigs of RAM we need to take offline for a little while. In a real disaster, our infrastructure could survive, but there would be a period of degraded service while things balance out on the remaining hardware. That period would be significantly longer and affect significantly more people than the actual maintenance. So, we feel our best course of action is just to shut it off for the few minutes it will take to rewire the power feeds. The service impact should be low, but will probably not be zero.

We want to complete the MySQL maintenance listed above first, so we are likely to do this toward the end of March. We will post updates on our network status feed with more precise timing as we get closer.

Realm upgrade reminder

We have finally finished rolling sites off of the dreaded “legacy” realms (freebsd6, freebsd72, and 2011Q4). Every site is now on a “color” realm. This means that, for people who have selected late realm upgrades for their site in our UI and who are currently running on the red realm, they will receive an automatic upgrade to violet in April, after quarterly realm rotation has occurred. Compatibility between the two is excellent and we anticipate very few problems.

That’s all for now. All in all, the upgrades and maintenance shouldn’t affect too many people, but we regret and apologize in advance for any problems they do cause. These steps are part of a process designed to eliminate some very old stuff that causes stuff like this to be intrusive. In other words, the goal is to do this maintenance is in large part so that the next time we do it, you’ll be even less likely to notice.

Thanks for reading!

6 Comments

RSS feed for comments on this post.

  1. > Our newer file servers use different technology; they are faster (100% SSD), have no single points of failure, allow hardware maintenance while they are running, and allow us to make major changes (like adding capacity or rebalancing files) behind the scenes without you having to change the configuration of your site.

    Octopus based, or proprietary?

    Comment by devicenull — March 4, 2015 #

  2. True. -jdw

    Comment by jdw — March 4, 2015 #

  3. Wouldn’t it make more sense to scan files for hardcoded paths before the upgrade, so as to allow us to preempt these problems?

    Comment by Daran — March 8, 2015 #

  4. Although I can see why you would think it would, for technical reasons it does not. It is much more scalable and efficient to run the check on the new file server than on the old one, especially while almost everything is still in RAM from being copied.

    However if you want to run a similar check on your own from the shell, it is very easy to do. It would look something like:

    find -P '/home/public/' -type f -size -1M -exec fgrep -l '/f2/example/' {} \;

    Where “example” is replaced with the short name of the site. This will check all reasonably-sized files for anything that looks like an absolute path and list any filenames that might contain one.

    -jdw

    Comment by jdw — March 8, 2015 #

  5. > Ordinarily that isn’t an issue that affects our members, but in this case it’s being converted between 120V and 208V. Hypothetically that can be done while the equipment is running, but doing so entails a nonzero risk of death by electrocution and after careful consideration we’ve decided that none of the current field techs are expendable at this time.

    As an electrical engineer, I am pleased that your priorities are in correct order. Live work is not justified for the sake of a web hosting service. I would rather a period of degraded service than a crispy electrician.

    Comment by Li-aung Yip — March 26, 2015 #

  6. We were really curious if live-plugging between the two voltages would have worked, but, well, nobody was dying to know. So, after holding that possibility in reserve as punishment for bad behavior for as long as possible, we did that a few days ago. We were able to balance around it after all and minimize disruption. Just a few more updates to go this month! (Mainly MySQL.)

    In all seriousness, yes. We love all of our members and their sites, but safety still comes first.

    -jdw

    Comment by jdw — March 26, 2015 #

Sorry, the comment form is closed at this time.

Entries Feed and comments Feed feeds. Valid XHTML and CSS.
Powered by WordPress. Hosted by NearlyFreeSpeech.NET.

NFSN