Introduction to Exchange Server 2010 Backup and Recovery

Backup and recovery (including for Exchange Server 2010) is one of the least interesting topics in IT.  I’ve never met anyone who got into IT to become a backup and recovery expert.

Note: I’ve met some great backup and recovery experts; I just haven’t met any who said that’s what they wanted to become when they started in IT.

Backups tend to be all steak and no sizzle, and that lack of sizzle too often means a lack of attention and financial support for an effective backup system. The best backup systems are often in businesses that have previously suffered a serious data loss, or those where a legislative requirement for data protection and retention exists. Even those aren’t guarantees: I’ve met plenty of customers who have lost important data and still won’t invest in their backups.

When it comes to Exchange servers, backup and recovery are no less critical than for any other part of the business, because so much of business these days runs through email. Email is the most important communication tool in most businesses, and the home of many of their most critical documents and records.

In short, as an Exchange Server administrator you need to make sure your servers are being backed up correctly.

So let’s kick things off by taking a look at some of the fundamentals of Exchange Server backup and recovery.

General Backup and Recovery Terminology

Throughout this series I am going to be repeating some of the same terminology, so it will help to become familiar with it if you are not already.

Backup Types

  • Full Backup – a complete copy of the data being backed up.  In the context of Exchange Server 2010 this also truncates the transaction logs for databases.
  • Incremental Backup – a partial copy of the data being backed up.  Contains all of the changes to the data since the last Full or Incremental backup.  When Full + Incremental backups are used a restore operation requires the last Full backup plus each of the subsequent Incremental backups.
  • Differential Backup – a partial copy of the data being backed up. Contains all of the changes to the data since the last Full backup and, unlike the other backup types, does not mark the data as having been backed up. Each Differential backup is therefore cumulative, which means that a restore operation only requires the last Full plus the last Differential backup.

Each of the backup types makes a trade-off between backup and recovery speed. Full backups are the easiest and fastest to restore from but take the longest to back up, whereas Incremental backups are usually the fastest to back up but require more effort and time to restore from. Differential backups sit in between: each one grows larger until the next Full backup, but a restore only ever needs two backup sets.
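
To make the restore requirements concrete, here is a minimal Python sketch that works out which backups a restore would need. The `Backup` class and `restore_chain` function are hypothetical names invented for this illustration and do not correspond to any Exchange or Windows Server Backup API. It also assumes a schedule uses either Incremental or Differential backups between Fulls, not a mix of both.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class Backup:
    kind: str            # "full", "incremental" or "differential"
    taken_at: datetime


def restore_chain(backups):
    """Return, in restore order, the backups needed to reach the newest backup."""
    ordered = sorted(backups, key=lambda b: b.taken_at)
    full_positions = [i for i, b in enumerate(ordered) if b.kind == "full"]
    if not full_positions:
        raise ValueError("a restore needs at least one full backup")

    last_full = full_positions[-1]
    later = ordered[last_full + 1:]      # everything taken after the last full
    kinds = {b.kind for b in later}
    if len(kinds) > 1:
        # Assumption of this sketch: partial backup types are not mixed.
        raise ValueError("this sketch does not handle mixed partial backup types")

    if kinds == {"differential"}:
        # Differentials are cumulative, so only the newest one is needed.
        return [ordered[last_full], later[-1]]
    # Incrementals must all be restored, oldest first.
    return [ordered[last_full], *later]


# Sunday full backup followed by three nightly partial backups.
full = Backup("full", datetime(2024, 1, 7, 22))
incrementals = [Backup("incremental", datetime(2024, 1, d, 22)) for d in (8, 9, 10)]
differentials = [Backup("differential", datetime(2024, 1, d, 22)) for d in (8, 9, 10)]

print(len(restore_chain([full, *incrementals])))   # full plus every incremental
print(len(restore_chain([full, *differentials])))  # full plus the newest differential
```

The longer the Incremental chain gets, the more backup sets have to be available and readable at restore time, which is the "more effort and time" cost described above.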

Backup Storage

  • Tape – magnetic tape backup storage comes in many different formats. It used to be the most cost-effective and portable media for storing backups, but these days disk can be more practical in some scenarios.
  • Disk – large capacity hard disk storage is more affordable and portable these days than in years past and has many advantages over traditional tape backups.
  • Cloud – this refers to an off-site, externally hosted backup service that is used for remote backup storage.  The cloud storage may be a mix of tape and disk depending on the service that is being used.
  • Online – backup storage that is immediately accessible, such as a disk array connected to the backup server.
  • Offline – backup storage that is on-premises but is not immediately accessible without human interaction, for example tapes that have been removed from the tape drive.
  • Offsite – backup storage that is kept away from the primary site, either at an alternate physical location for the business (e.g. a school with two separate campuses) or with an offsite storage company.

Again, each storage type makes a trade-off between convenience and protection. Online disk storage is the easiest for backup and restore but carries the highest risk of data loss if there is a disaster in the data center itself, such as a fire or flood. Offsite backup storage is safe from such disasters but adds to the restore time because the media must first be transported back from offsite.

Backup Planning and Management

  • RPO – the Recovery Point Objective is the point in time to which you are aiming to recover data. The RPO defines how much data loss the business is willing to tolerate, and so it plays an important part in designing a backup solution, particularly the scheduling of backups to meet the RPO requirements.
  • RTO – the Recovery Time Objective is the amount of time in which a recovery must take place after a disaster has occurred.  Again this plays an important part in designing backup solutions to ensure that the correct infrastructure is in place to facilitate that speed.
  • Backup Window – this is the time each day in which backup operations are able to be run.  For most businesses this is overnight, outside of their core business hours.  However depending on the RPO it may be necessary to run backups during business hours as well.
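
As a rough illustration of how RPO and RTO feed into a backup design, the following Python sketch checks a proposed schedule against both objectives. All of the function names and figures are hypothetical and chosen only for this example. The sketch assumes that a failure can lose everything written since the last backup, which ignores any transaction logs that survive the failure, and that restore throughput is constant.

```python
from datetime import timedelta


def meets_rpo(backup_interval, rpo):
    """Worst case, a failure just before the next backup loses a full interval."""
    return backup_interval <= rpo


def estimated_restore_time(data_gb, restore_rate_gb_per_hour,
                           media_retrieval=timedelta(0)):
    """Time to fetch the media (e.g. from offsite) plus time to copy the data back."""
    copy_time = timedelta(hours=data_gb / restore_rate_gb_per_hour)
    return media_retrieval + copy_time


def meets_rto(data_gb, restore_rate_gb_per_hour, rto,
              media_retrieval=timedelta(0)):
    total = estimated_restore_time(data_gb, restore_rate_gb_per_hour, media_retrieval)
    return total <= rto


# Hypothetical business requirements: lose at most 4 hours of data,
# and be running again within 8 hours.
rpo = timedelta(hours=4)
rto = timedelta(hours=8)

print(meets_rpo(timedelta(hours=24), rpo))  # nightly backups only
print(meets_rpo(timedelta(hours=4), rpo))   # backups every four hours
print(meets_rto(500, 100, rto, media_retrieval=timedelta(hours=2)))
```

A nightly-only schedule fails the 4 hour RPO in this example, which is exactly the situation where backups have to run inside business hours as well as in the overnight backup window.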

Other Terminology

  • Bare-Metal – this refers to a type of backup that makes it possible to recover the server and its data in their entirety from a single backup.
  • System State – this refers to a collection of data on a Windows Server that includes various services and configuration information that relate to its particular role, such as the Registry, boot files, Active Directory database (for Domain Controllers), cluster service information, IIS metabase, and other system files.

Exchange Server Backup and Recovery Concepts

Exchange Server 2010 itself has some specific backup and recovery concepts that Exchange Server administrators need to understand.

  • VSS – the Volume Shadow Copy Service is a backup framework included with Windows Server operating systems and used by server products such as Exchange Server 2010. VSS-based backup is the only supported backup technology for Exchange Server 2010, unlike previous versions that also supported a streaming backup API.
  • Active/Passive Databases – Exchange Server 2010 introduced a new high availability concept called Database Availability Groups (DAGs). A DAG is a group of up to 16 Mailbox servers that host multiple copies of each database. Only one copy of each database is “active” at any one time; the remainder are considered “passive”.
  • Recovery Databases – this is a special database that can be used as a target for a mailbox database restore operation, allowing the administrator to mount the restored database and extract the required data from it into an active database or a PST file.
  • Database Portability – the ability for Exchange Server 2010 to mount databases that have been copied or restored from other Mailbox servers.  This simplifies restore scenarios in which the original server is not available.
  • Dial Tone Portability – the ability for Exchange Server 2010 to mount a temporary database with empty mailboxes for end users to continue to send and receive email while restore operations are taking place in the background.
  • Log Truncation – all database operations are logged to transaction logs on the Mailbox server.  The logs can be used to recover information written since the last backup was taken if there is a database failure. When a database has been backed up all of the transaction logs that are no longer required for recovery are removed (truncated) from the server.
  • Circular Logging – when this is enabled the database transaction logs are automatically truncated by the server once the database operations are written from memory to the database itself.  When circular logging is enabled the transaction logs are no longer useful for restoring data in the event of a database failure.
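
The difference between log truncation and circular logging can be sketched with a toy model in Python. This is purely illustrative: `ToyMailboxDatabase` is a hypothetical class, not an Exchange API, and it simplifies heavily by truncating every log at backup time and by keeping only the current log when circular logging is enabled.

```python
class ToyMailboxDatabase:
    """Toy model of transaction log retention; not how Exchange implements it."""

    def __init__(self, circular_logging=False):
        self.circular_logging = circular_logging
        self.logs_on_disk = []        # log generation numbers still present
        self._next_generation = 1

    def write_transactions(self, log_count):
        """Simulate enough activity to fill `log_count` new log files."""
        for _ in range(log_count):
            self.logs_on_disk.append(self._next_generation)
            self._next_generation += 1
            if self.circular_logging:
                # Simplification: older logs are discarded as soon as they
                # are committed, leaving only the current log.
                self.logs_on_disk = self.logs_on_disk[-1:]

    def full_backup(self):
        """Simulate a successful full backup and return how many logs it truncated."""
        truncated = len(self.logs_on_disk)
        self.logs_on_disk = []
        return truncated

    def can_replay_since_backup(self):
        """True if logs exist that could recover data written since the last backup."""
        return not self.circular_logging and bool(self.logs_on_disk)


normal = ToyMailboxDatabase()
normal.write_transactions(5)
print(normal.full_backup())              # logs removed by the backup
normal.write_transactions(3)
print(normal.can_replay_since_backup())  # logs written since the backup remain

circular = ToyMailboxDatabase(circular_logging=True)
circular.write_transactions(5)
print(circular.can_replay_since_backup())
```

In the model, as on a real server, logs keep accumulating on disk until a backup succeeds, so a failing backup job shows up as growing log volume usage. With circular logging that growth never happens, at the cost of only being able to recover to the point of the last backup.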