• May 14, 2019

Backups and Disaster Recovery – “Site Health” Checklist

Vaughan

Good backups are an essential first line of defence against a multitude of issues. Computer hardware can be replaced, but your critical data is unique to your business and is either irreplaceable or, at the very least, difficult to recreate.

What makes a “good backup”?

A “good backup” is a recent backup that has all the data (which can include applications as well as information) in a form that can be easily accessed and restored if needed.

This means that backups need to be performed regularly (typically at least daily). The value of a backup is significantly diminished if it is not current, or near current, relative to the point in time that you need to restore.

The key items in relation to backups are:

  • the backup application itself – its capabilities and features (if you would like recommendations for your environment … please contact us)
  • automated monitoring of backup jobs
    • you need to know that every system is being backed up at least daily
    • you need to know that the backup jobs are actually being started
    • if there are any failures, you need to be notified so that they can be investigated and resolved
    • in other words, what you’re interested in is the “exceptions” (i.e. jobs that did not start for some reason, and jobs that completed with an error)
    • if you don’t have automated monitoring of your backup jobs – we can help!
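The exception-based monitoring described above can be sketched in a few lines of Python. This is only an illustration, not the interface of any particular backup product: the `JobRecord` structure, the `find_exceptions` function, the system names and the 24-hour threshold are all assumptions made for the example. It reports only the systems that need attention: jobs that never started, backups older than a day, and jobs that completed with an error.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass
class JobRecord:
    """Hypothetical record of the most recent backup job for one system."""
    system: str
    started_at: Optional[datetime]  # None if the job never started
    succeeded: bool = False


def find_exceptions(expected_systems, records, now, max_age=timedelta(hours=24)):
    """Return {system: reason} for every system that needs attention."""
    latest = {r.system: r for r in records}  # assumes one record per system
    exceptions = {}
    for system in expected_systems:
        record = latest.get(system)
        if record is None or record.started_at is None:
            exceptions[system] = "job not started"
        elif now - record.started_at > max_age:
            exceptions[system] = "no backup in the last day"
        elif not record.succeeded:
            exceptions[system] = "job completed with an error"
    return exceptions


# Illustrative data only; the system names are made up.
now = datetime(2019, 5, 14, 9, 0)
records = [
    JobRecord("fileserver", now - timedelta(hours=8), succeeded=True),
    JobRecord("mailserver", now - timedelta(hours=8), succeeded=False),
    JobRecord("workstation-07", now - timedelta(days=3), succeeded=True),
]
expected = ["fileserver", "mailserver", "workstation-07", "branch-pc-01"]

report = find_exceptions(expected, records, now)
for system, reason in report.items():
    print(f"{system}: {reason}")
```

Checking against a list of expected systems, rather than only the jobs that ran, is what catches the jobs that silently never started.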

Servers

  • When most people think about backups, they probably think ‘Servers’, and that is entirely appropriate. Servers typically store data for users, so they need regular and reliable backups in case of an issue (hardware failure, user error, virus etc.).

Workstations

  • However, backups are also appropriate for workstations. These days USB drives are inexpensive and well suited to the task, both in their physical size (portable) and their capacity.
  • Users don’t always save documents to designated folders or network drives. If important documents are lost or corrupted, local workstation backups are ideal in this scenario.
  • Another situation where local workstation backups can save time and money is in the case where a user has noticed some ‘strange’ behaviour (e.g. some application not performing as expected).
    • If the behaviour cannot be resolved in 15 to 30 minutes, and the user can advise that it started a few days ago, regular backups let you restore the workstation to a point before the issue commenced and resolve it in around an hour.
  • Local backups are also particularly useful when users are based at a branch office.
    • Rather than having to return a system to Head office for a rebuild (which will typically take a few days), if you have a recent backup from a point in time when the system was working, you can restore the system at the Branch office and have it working again in around an hour.
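Choosing which backup to restore, when a user reports that a problem started a few days ago, amounts to picking the most recent backup taken before the issue began. A minimal sketch, assuming nightly backups and a hypothetical helper named `pick_restore_point` (the dates are made up for the example):

```python
from datetime import datetime


def pick_restore_point(backup_times, issue_started):
    """Return the most recent backup taken before the issue began, or None."""
    candidates = [t for t in backup_times if t < issue_started]
    return max(candidates) if candidates else None


# Illustrative data: nightly backups at 22:00 from 8 May to 13 May.
backups = [datetime(2019, 5, day, 22, 0) for day in range(8, 14)]
issue_started = datetime(2019, 5, 11, 10, 30)

restore_point = pick_restore_point(backups, issue_started)
print(restore_point)
```

If no backup predates the issue, the function returns `None`, which is the case where a rebuild may be unavoidable.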

Off-site

  • In the case of a disaster at your premises, you need to have a copy of server backups off-site.
  • As with backup monitoring, the process for getting your backups off-site should be automated. Automated processes are not reliant on any individual (i.e. automated processes keep working regardless of who is on vacation).
  • Each off-site backup should be verified at least weekly to confirm the integrity of the images (if an image is intact, it should be restorable).
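One generic way to check image integrity is to compare a checksum of each off-site image with the checksum recorded when the backup was taken. The sketch below assumes such a record exists as a simple mapping of image path to SHA-256 digest. The names `sha256_of` and `verify_images` are hypothetical, and many backup applications provide their own verification feature that should be preferred where available.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1024 * 1024):
    """Compute the SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_images(manifest):
    """Return the image paths that are missing or fail the checksum.

    manifest maps each image path to the checksum recorded at backup
    time (an assumed format for this example).
    """
    failures = []
    for path, expected in manifest.items():
        if not Path(path).is_file() or sha256_of(path) != expected:
            failures.append(path)
    return failures
```

An empty result means every listed image is present and unchanged. A matching checksum shows the image has not been altered since it was taken; it does not replace an actual test restore.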

Disaster Recovery strategy

  • Off-site backups need to be tested periodically, even if the off-site images are being verified.
  • Frequently the off-site recovery environment is different from the on-premises equipment. It is only by performing an off-site restore that you can identify any potential issues in the restore process. It is far better to resolve any restore issues at your leisure than under the pressure of a ‘live’ disaster recovery scenario.
  • If you have a backup system like the one we covered in this article https://zen.net.au/affordable-business-continuity-for-smes/ then you can perform a test restore in around 15 minutes at your convenience.
    • This is the ideal scenario, particularly for medium-sized organisations where the cost of downtime, both in lost productivity and in loss of reputation due to disruption of service to customers, would be significant.
  • However, smaller organisations that may not have the budget for a dedicated Data Centre solution still need the ability to restore their servers in the event of a disaster.
    • We can provide an alternative solution that will provide for the recovery of servers, typically in 24 to 48 hours.
    • In this scenario, we recommend off-site restores be performed every 6 months.
