As a systems administrator, I really should have known better. Yesterday, the machine (“dexter”) that hosts leyton.org, and a number of other sites and domains, died a painful death. Its power supply (PSU) blew. Not content with simply taking itself out, it actually knocked out the entire power strip to which it was connected, taking down a further seven machines. The power strips here are not your cheap and cheerful B&Q white strip, but ones which allow power to be turned off remotely - very useful when the machines in question are hundreds of miles away in some data centre in London’s Docklands. But not something that can be replaced all that quickly or easily.

I should explain that about 18 months ago, I wanted a dedicated box of my own. I was fed up of constrained hosted systems and invariably missing software. I run website systems for a living, so figured I could just as well do it myself. Initially it was also for so-far unprogressed “projects”, but I also figured the hundreds of pounds I spend each year on hosting my various sites and domains could be just as well directed at a dedicated box, especially if I could enlist a few friends to help share the cost. But that comes with implications too - like this - of having to worry about failures, security and scaling issues.

But I digress. For a few hours I figured the outage was just “network trouble” (it’s a running joke in many IT departments that most problems are, ultimately, the network group’s problem; it’s therefore appropriate that most network folk I know have something of a thick skin). Clearly the problem wasn’t going to sort itself out, so I dug out my emergency contact details, and gave my ISP a ring. A helpful chap by the name of Rob was on the line, and explained the nature of the problem. An ETA of 3pm seemed great all things considered.

3pm came and went, and by 5pm I was worrying that my system had died rather more seriously. Another call and an operator was dispatched in the remote and anonymous data centre to attach a keyboard and monitor (aka ‘KVM’) so we could inspect what error messages awaited us. Probably just the file system needing some single-user mode love. More time passed, and eventually my phone rang to explain the PSU had blown and was the cause of the earlier problems. Of course, I was a bit puzzled they hadn’t checked that all the attached boxes had come up again after replacing the power strip, but no matter. I was now more worried that the failing PSU had taken out the contents of the system - something PSUs have a habit of doing. Large jolts of unexpected amperage that take out the supplying power strip could just as well take out internal components… including my hard disk with all the data.

Thankfully it wasn’t the case. My hard disk was swapped into a spare chassis, and dexter returned to business at about 8.30pm last night. Of course, when faced with a dead system you start to ask “when was the last backup”, and in my case it was Quite Some Time Back. Automating backups was always “on the list”, but never got very far: it’s a busman’s holiday to be tweaking a system, given I do this day in, day out.

But I should really have known better. Especially as the server was hosting friends’ websites. Losing my site and data, and all these weblog posts, really would have been a pain. But to have lost friends’ content would have been something worse altogether.

So given the extremely close shave I experienced, I’ve today spent a few hours automating database and website backups, and setting up automated jobs to pull the content down to my Mac on a regular basis. The joy that is Leopard’s Time Machine means the backups are archived too. So dexter can die a horrific and painful death now, and I can confidently restore my site on a new machine easily.
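
For anyone wanting a starting point, here is a minimal sketch of the sort of job that could do the archiving half of this. It is not the script running on dexter: the function name `make_backup`, its parameters and the `keep=7` default are all made up for illustration. It bundles one directory into a timestamped tarball and prunes older copies. Dumping the database and pulling the archives down to another machine are left out.

```python
import tarfile
import time
from pathlib import Path


def make_backup(source_dir, backup_dir, keep=7, stamp=None):
    """Archive source_dir into backup_dir, keeping only the newest `keep` archives.

    All names and defaults here are illustrative assumptions.
    """
    source = Path(source_dir).resolve()
    dest = Path(backup_dir)
    dest.mkdir(parents=True, exist_ok=True)

    # A sortable timestamp means filename order is also chronological order.
    stamp = stamp or time.strftime("%Y%m%d-%H%M%S")
    archive = dest / f"{source.name}-{stamp}.tar.gz"

    # Store paths relative to the directory name, not the absolute path.
    with tarfile.open(archive, "w:gz") as tar:
        tar.add(source, arcname=source.name)

    # Prune: oldest first, delete everything beyond the newest `keep`.
    archives = sorted(dest.glob(f"{source.name}-*.tar.gz"))
    excess = max(len(archives) - keep, 0)
    for old in archives[:excess]:
        old.unlink()

    return archive
```

Something like `make_backup("/path/to/site", "/path/to/backups")` (both paths hypothetical) could then be run from a scheduled job, so nobody has to remember to do it.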

A close shave certainly, but to good ends.

So, dear reader, if you’ve not done a backup recently yourself, well, take my experience as a useful warning and get on with it now, or at least check your hosting provider does it. Try to automate it so you don’t need to worry about it in future.

One Response to “When did you last backup? A cautionary tale”

  1. Roger Darlington Says:

    Good advice, Richard. It could have been a sad start to 2008 - but hopefully you’ll have a great year including the BIG day of the wedding.
