New offsite dead-drop backup service

It’s often hard to think about disaster planning. The thing about all disasters is that they’re really unlikely, but the consequences of winning the disaster lotto are, well, disastrous.

We don’t want anything horrible to happen to our service, we don’t expect anything horrible to happen, and (for the paranoid among us) we aren’t aware of anything horrible about to happen. Prudence simply dictates that we acknowledge that disasters are possible and take reasonable precautions to ensure that, were our disaster ticket to get punched, we would eventually be able to recover. (We’re talking about large-scale the-entire-datacenter-is-permanently-gone-or-unusable disasters here.)

For us, this means keeping heavily encrypted offsite copies of our key databases, all of our custom server source code, and a lot of configuration information. That’s all the stuff we would need to rebuild our service, member and account balance records from scratch. What it does not include is offsite copies of our members’ content. While we would love to be able to do that automatically, there’s so much of it that the expense would be considerable. We’ve chosen instead to allow people to choose for themselves whether they feel that level of additional protection (and cost) is justified. In a lot of cases, it probably won’t be, but it’s something we need to do for our data so we want you to have the option to do it for yours as well.

Therefore, we’ve entered into a relationship with highly-regarded backup provider rsync.net to offer an innovative (we think) kind of offsite backup service for hosted sites and MySQL processes.

Here’s how it works:

We set you up with a numbered “primary” user account on their system. This is a full-featured rsync.net account that you can use for whatever they normally allow. But our backup server gets a second “subaccount” that lives in a subdirectory of your rsync.net account. During our periods of lower bandwidth utilization, our system uses this subaccount to make automatic daily backups of the sites or MySQL processes that you designate. If you use the primary account for other purposes, you can set the permissions so that you can read the subaccount’s backup files but the subaccount can’t read yours.

Once you’re set up, we provide two different backup mechanisms. The recommended method uses a neat open-source tool called duplicity to make incremental backups. These backups are digitally signed by our backup server and we support (optional) encryption of the backup files using a GnuPG public key of your designation. You can retrieve the files we upload using your own account at any time without any dependency on us or our system. Since this is an incremental backup system, you can restore not only the most recent backup, but any successful backup from the preceding 14 days.

(If encrypted, the files will also be decryptable with our backup key, which is necessary for the “incremental” part to work. This is not an increased security risk, because if someone were hypothetically to gain access to our key, they would also have sufficient privileges to read your unencrypted files directly from our local servers. The goal of encryption in this case is only to ensure that a compromise restricted to rsync.net isn’t sufficient to compromise your files.)
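
To make the encrypted, incremental flow a little more concrete, here is a rough Python sketch that only builds duplicity command lines; it does not run anything. It is an illustration under stated assumptions, not a description of our actual backup scripts: the target URL, directory names, and key IDs are all hypothetical, and it covers only the encrypted case. The duplicity options used (--sign-key, repeated --encrypt-key, restore --time, and remove-older-than) are standard duplicity options.

```python
# Hypothetical values for illustration only; none of these come from the real service.
TARGET_URL = "sftp://1234@backup.example.invalid/nfsn/mysite"
SERVER_KEY = "AAAAAAAA"   # stands in for the backup server's GnuPG key ID
MEMBER_KEY = "BBBBBBBB"   # stands in for the member's GnuPG public key ID


def backup_args(source_dir, target_url, sign_key, encrypt_keys):
    """Build a duplicity backup command: signed, encrypted to every listed key."""
    args = ["duplicity", "--sign-key", sign_key]
    for key in encrypt_keys:
        # Repeating --encrypt-key makes the archive readable by each recipient.
        args += ["--encrypt-key", key]
    return args + [source_dir, target_url]


def restore_args(target_url, restore_dir, age="3D"):
    """Build a command restoring the backup as it stood `age` ago (3D = 3 days)."""
    return ["duplicity", "restore", "--time", age, target_url, restore_dir]


def prune_args(target_url, keep="14D"):
    """Build a command deleting backup sets older than the retention window."""
    return ["duplicity", "remove-older-than", keep, "--force", target_url]


if __name__ == "__main__":
    print(" ".join(backup_args("/home/public", TARGET_URL, SERVER_KEY,
                               [MEMBER_KEY, SERVER_KEY])))
    print(" ".join(restore_args(TARGET_URL, "/tmp/restored")))
    print(" ".join(prune_args(TARGET_URL)))
```

Listing both keys as recipients mirrors the point above: the member can decrypt without us, and the backup server can still read earlier backups when it computes the next increment.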

For those who prefer convenience and simplicity over maximum security and incremental backups, our system can also perform more traditional daily rsync backups instead. With this approach, a copy of your filesystem is simply mirrored to your backup account once per day.

It’s important to point out that to keep the number of “superuser” processes to a minimum, our backup process also runs with reduced privileges; it can only access files readable by the “me” or “web” groups. Files and directories that are not group accessible (those with 700 or 600 permissions) cannot be backed up. This typically includes the “private” directory unless you have changed the default permissions. We also exclude the site’s logs by default to save space, but this is optional.
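
If you want to check ahead of time which of your files fall under that rule, a small sketch like the following may help. It is a simplification and an assumption about how the rule plays out: it looks only at the group permission bits (plus the group search bit for directories) and does not check whether the owning group is actually “me” or “web”. The function names are made up for this example.

```python
import os
import stat


def group_can_back_up(mode):
    """Return True if the group permission bits allow reading this entry.

    `mode` is an st_mode value as returned by os.stat() / os.lstat().
    Directories also need the group execute (search) bit to be traversed.
    """
    if not mode & stat.S_IRGRP:
        return False
    if stat.S_ISDIR(mode) and not mode & stat.S_IXGRP:
        return False
    return True


def find_skipped(root):
    """Yield paths under `root` whose group bits would keep them out of a backup."""
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            path = os.path.join(dirpath, name)
            if not group_can_back_up(os.lstat(path).st_mode):
                yield path
```

For example, a file with 600 permissions or a directory with 700 permissions is reported as skipped, while 640 files and 750 directories pass.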

rsync.net, as you might expect, has flexible terms of service that sound a lot like something we might have written. Their respect for privacy is also evident, but it pretty much doesn’t matter because no one but us will know who a given numbered backup account belongs to, or what specific sites and MySQL process backups an account contains. This, combined with the use of one set of credentials to drop off the backups and another, independent set to pick them up, is the reason we’re quasi-seriously calling this a “dead drop” service.

For the backup accounts we provide, rsync.net storage billing is calculated daily in kilobytes and billed directly to the NearlyFreeSpeech.NET personal bandwidth account of your choice using our usual brand of pay-as-you-go calculation. The pricing for this service is 225,000 kilobyte-days per $0.01, which works out to be just under $1.40 per rsync.net-stored gigabyte per (30-day) month. Naturally there are no minimum commitments or filesystem quotas to deal with, and this price includes the cost for the daily backup transfers between their service and ours.
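
The arithmetic behind that figure can be checked with a few lines of Python. One assumption is baked in: a gigabyte is taken to be 1,048,576 kilobytes, which is the reading that produces “just under $1.40”. The names below are made up for the example.

```python
KB_DAYS_PER_CENT = 225_000   # from the pricing above: 225,000 kilobyte-days per $0.01
KB_PER_GB = 1024 * 1024      # assumption: binary gigabyte (1,048,576 KB)


def storage_cost_dollars(kilobytes, days):
    """Cost in dollars of storing `kilobytes` for `days` days."""
    cents = kilobytes * days / KB_DAYS_PER_CENT
    return cents / 100


if __name__ == "__main__":
    # One gigabyte held for a 30-day month.
    print(f"${storage_cost_dollars(KB_PER_GB, 30):.4f}")
```

Because billing is calculated daily, a gigabyte that sits there for only ten days costs a third of the monthly figure.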

To sign up for this service, let us know what site(s) and/or MySQL process(es) you want set up with it via a secure support request. Don’t forget to provide your GPG public key if you want the backups to be encrypted!

I imagine one of the first questions we’ll get is why we didn’t use Amazon S3 as the backend for this service instead, given that it’s “so much cheaper.” There are several reasons for this:

1) We don’t feel Amazon S3 shares our views on issues like customer privacy as well as rsync.net does.

2) rsync.net has been very willing to work with us to support a solution that meets all of our objectives; Amazon S3 couldn’t care less.

3) While the storage fee at Amazon S3 is much less, they charge a fee every single time you read or write that data, which we calculated was not conducive to keeping that data up-to-date, and a backup that is not up-to-date is worthless.

Concurrent with this new feature, we have made a couple of changes to our local backups so that they run more frequently. We have also taken this opportunity to streamline our storage accounting in anticipation of future changes in that area, while also fixing a bug that was causing some sites with large numbers of directories or very small files to under-report their actual storage usage.

Update: By default, you will get an account at the San Diego rsync.net location, which is the fastest for us. If you prefer, you can now request their Zurich location instead when you contact us to set it up, at the same price. We plan to offer their geo-redundant option at a (significantly) higher price at some point in the future, possibly as soon as next week.

Update: As of August 2009, this experimental feature is temporarily on hold for new signups while we figure out scaling and automation issues. When it returns for additional signups, it will be appropriately documented as a service on our redesigned public site.

9 Comments

  1. Oooooooooooooooooooooooooooooooh!

    [this is good]

    🙂 🙂 🙂 🙂 🙂

    Comment by Kris — May 8, 2008 #

  2. Let me say this is A W E S O M E 🙂

    1.40$/mo/gb (if correct) including transfer is really cheap.

    I don’t understand the comparison against S3 as this is a very different service, is not designed for backups but to provide cloud storage, and is really complicated and ugly for backup uses compared to the beauty of a standard rsync.

    I didn’t know about rsync.net, at the moment I’ve a bqbackup rsync account (similar service), but they’re going downhill in the past months with a lot of overselling and “no space left on device” errors, so I am planning to cancel them and trying rsync.net …

    If I understand correctly (English is not my first language, I live in southern europe, sorry :-]) you do provide a full rsync.net backup account (as a “reseller”), and I can use this account both for backing up my NFS hosting, AND my dedicated server via standard rsync over ssh account, and I will be billed at 1.40$gb/mo as a whole on my NFS account/credit both for MY backups, and MY NFS backups ?

    This would be great.

    Comment by G — May 8, 2008 #

  3. Sounds to me like you’ve thought this through pretty well. Let us know in a month or two if it’s working as well as you hoped.

    Comment by KC — May 9, 2008 #

  4. This is awesome!

    I have a small account with rsync.net with my own local backups, and they have indeed always struck me as taking a similar no-nonsense, don’t-get-in-my-way attitude.

    This is a great new feature. Thanks!

    Comment by ttuttle — May 9, 2008 #

  5. Wow! You guys are the best and you did the best choice you could! I have a few questions though.

    So, I get a fully featured rsync.net account, right? E.g. I can backup my local machine with the same account? And NFSN has a separate subaccount…

    Continuing with this, I don’t have to pay for a minimum 3 GB quota as I would have to if I opened an account separately with them?

    I pay $1.40 per rsync.net-stored gigabyte per (30-day) month *for everything* — NFSN backed data plus my data, right?

    I don’t have a quota, I can scale up or down, I only pay for what I use?

    Basically, if I opt-in for this, I get a *regular* rsync.net account that is cheaper, has no minimum or maximum quotas and I can use it for everything as if I opened an account directly with them, am I right?

    Regards,
    – Sime

    You are right. -jdw

    Comment by Sime — May 9, 2008 #

  6. What an awesome idea for a service! Great job!

    I actually looked at rsync.net awhile back to recommend to friends/family who don’t make backups, but was put off by their “minimum” order size of 3 GB. That’s WAY more than casual users need. I’m glad that NFSN was able to get around that restriction.

    Comment by Douglas Muth — May 12, 2008 #

  7. Amazon S3 is an excellent backup utility as duplicity can talk S3’s native language and can have incremental backups encrypted and uploaded automatically. However, it sounds like rsync.net is a far better match.

    Comment by Jacob Torrey — May 14, 2008 #

  8. Great Innovation. I wish hosts back here in Australia did that. Its most prob time for me to kick start my NFS account and use up some of my domain names.

    Comment by Pyrmont — May 15, 2008 #

  9. Can you clarify exactly what sort of failures *are* protected against by the default NFSN hosting?

    I had been assuming that there is sufficient, say, RAID redundancy to protect my data against a single hard-drive failure, or against the failure of a single content server, and I suppose I had even assumed there were some kind of content backups in reserve against some kind of large failure.

    It seems from your comment the last sentence is not correct, at least in terms of *off-site* backups, so it would be nice to be clear on what we are relying on if we decide not to sign up for this service.

    For detailed information about filesystem redundancy and backup procedures, you need look no further than our FAQ. -jdw

    Comment by jaoswald — May 17, 2008 #
