In 2011 I made a major change in my website hosting business. Up to that point, we had owned and operated our own servers. Finally, though, we reached the crossover point where it became cheaper to retire our gear and go “into the cloud.” (A wonderful euphemism, this “cloud.” AWS really does operate such a thing, but every hosting provider sells it, even though it’s usually just virtual private servers.)
Eventually we settled on a hybrid solution for hosting our client sites. One group that needed the highest security level was moved to a VPS at a local cloud computing company. Others, which needed an older version of PHP, were moved to a VPS at a volume hosting provider (1&1). Lastly, the rest of the clients went to a dedicated server managed by yet another volume hosting provider (HostGator). In the end, we wound up saving hundreds of dollars a month.
When we ran our own rack space, it was easy to dedicate a server just to the task of being an external hard drive to receive the backup tar files. This was an ideal arrangement: the backup files were not on the same hard drive as the website files (insulating the site from drive failures), yet they sat on a server on the same gigabit switch (so retrieval would be as rapid as possible if needed). However, now that we were paying for servers and drive space, it was hard to justify buying servers just to hold those backup tar files.
So I tried a lot of different solutions to the problem. A number of the volume hosting vendors will sell you website hosting space with unlimited traffic and unlimited disk space. That sounded perfect: just grab one of those accounts and have Plesk send the backup container files to the website hosting space via FTP as we had done before. But it didn’t quite work out that way.
First, those unlimited accounts of course aren’t. In reality, I ran into all kinds of limitations: certain file types weren’t allowed on the servers, or maximum file sizes were capped. And in general, hosting providers discourage using their accounts for parking files. So I tried another approach: cloud-based storage.
After an extensive search, I ultimately settled on iozeta (a division of LiveDrive). They had several advantages over the many other cloud-backup/storage vendors: 1) they offered FTP access to their storage (many of the other vendors did not), and 2) I could get up to 2TB of storage if I wanted that much. And there were no file type or size limits either! So I started weekly backups for the various client accounts on the dedicated server, staggered across the days of the week.
Yet after a couple of weeks, things were not going as well as I had hoped. For whatever reason, FTP transfers were failing. As of this writing, I’m still trying to get an answer as to where the failure lies. It could have been network congestion causing FTP to time out, or perhaps a glitch on their side. Either way, because uploads to their server behave like transactions, it’s either pass or fail. The old trick of going back and restarting a partially completed FTP transfer just won’t work. So what was happening was that I was accumulating a logjam of Plesk backup tar files in the “local repository.”
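For the curious, that old resume trick relies on the FTP REST command: ask the server how much of the file already landed, then restart the upload from that offset. Here’s a minimal sketch in Python of the offset logic (the helper name is mine, purely for illustration):

```python
# Sketch of the classic FTP resume trick that transactional uploads defeat.
# With a cooperative server, Python's ftplib can resume via FTP.size() plus
# the rest= argument to storbinary(); this helper just picks the offset.

def resume_offset(local_size, remote_size):
    """Resume from the end of the partial remote copy, or start over.

    remote_size is None when the file does not exist remotely; a remote
    copy larger than the local file is treated as junk and overwritten.
    """
    if remote_size is None or remote_size > local_size:
        return 0
    return remote_size

# In a real transfer you would do roughly:
#   offset = resume_offset(local_size, ftp.size(name))
#   with open(path, "rb") as fp:
#       fp.seek(offset)
#       ftp.storbinary("STOR " + name, fp, rest=offset)
```

With pass-or-fail uploads like iozeta’s, though, a partial file presumably never survives on the remote side, so the offset is always zero and you’re back to retransmitting the whole thing.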
Side-note: Here’s one of my bones to pick with Parallels. Their management of backup files is pretty weak. You can download a backup set. You can upload a backup set. You can transfer a backup set from the remote repository to the local one. But you can’t manually instruct Plesk to transfer a backup set from the local repository to the remote one. So, if you’re having failures like the ones I was having, you’re screwed. Those backups just sit there eating up valuable disk space, threatening to take your server down if you run out of room. This really needs to be fixed!
So, faced with that situation, I felt the only choice left was to write software to get around the problem. My strategy was this: stop sending the backups to the remote iozeta server and instead always send them to the local repository, but lower the backup set retention to one. That way, I can at least plan for a known maximum amount of disk space consumed. I then wrote some software in PHP that, on a nightly basis, compares the local repository against the remote one on iozeta; if a local file doesn’t exist on the remote site, it transfers it via FTP. When I changed the backup retention settings, I also switched to multi-part backup container files of 255MB each. I was shooting from the hip on the size, but my gut tells me that’s probably reasonable in terms of transmission reliability. Finally, I set up a remote retention policy of 28 days: any file on the remote server older than that gets deleted. That way I can manage my available space on the remote server.
I’ve been running my software on a nightly basis via a cron job for a couple of weeks now. There are still some transmission failures. However, in reviewing the audit trail logs my process generates, any files that get skipped the first time around usually get loaded on the second pass. And since I run my backups per customer on a staggered weekly basis, there’s plenty of time to get everything copied out before the following week’s backup resets the files. Overall, I now have some peace of mind that, in the face of a catastrophic server failure, I’ve still got something fairly current to fall back on.