6 Types of Backups for Cloud Storage
Passing CompTIA Cloud+ requires in-depth knowledge of the exam objectives. In this post, we will cover Objective 3.3: 6 Types of Backups for Cloud Storage. A backup strategy is often an afterthought, but it is important to include backups in post-provisioning tasks. A backup can save you the work of rebuilding a server. When you hand over a ready server and another team's install goes wrong, a backup or snapshot makes it easy to restore to the last known good state. Backups can be a huge lifesaver.
To understand backup needs, you first need to understand the business requirements that drive them. Do you have long data retention needs, or do you simply need to be able to recover quickly? Do you have any regulatory compliance requirements or client service level agreements that dictate restore times or retention periods?
Which option to choose depends entirely on the business needs, so that is where we start. Once the business needs have been determined, we must understand which option meets them.
1. Snapshots/Redirect-on-Write
Snapshots that use redirect-on-write are a neat technology that can minimize the performance impact of snapshots compared to other techniques such as copy-on-write. Essentially, the storage provider keeps a table that lists the location of each block. Each entry is a pointer: it records where the block lives but does not contain the actual data. When a block needs to be modified, the new data is written to a new location and the pointer is simply updated to that location. Copy-on-write requires more work: first the original block is read (read operation), then a copy of it is made (write operation), and finally the original is overwritten with the new data (write operation).
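The difference in I/O cost described above can be sketched with a toy model. This is only an illustration, not how any particular storage product is implemented; the class names, the in-memory lists, and the read/write counters are all assumptions made for the example.

```python
class CopyOnWriteVolume:
    """Toy model: the original block is copied aside before it is overwritten."""

    def __init__(self, blocks):
        self.blocks = list(blocks)   # live data, modified in place
        self.snapshot_area = {}      # block index -> preserved original
        self.reads = 0
        self.writes = 0

    def write(self, index, data):
        # Only the first overwrite after the snapshot needs the extra copy.
        if index not in self.snapshot_area:
            original = self.blocks[index]         # read the original block
            self.reads += 1
            self.snapshot_area[index] = original  # write the preserved copy
            self.writes += 1
        self.blocks[index] = data                 # overwrite in place
        self.writes += 1

    def read_snapshot(self, index):
        return self.snapshot_area.get(index, self.blocks[index])


class RedirectOnWriteVolume:
    """Toy model: new data goes to a new location and a pointer is updated."""

    def __init__(self, blocks):
        self.store = list(blocks)                   # block store, append-only here
        self.live = list(range(len(self.store)))    # pointer table for the live volume
        self.snapshot = list(self.live)             # frozen pointer table
        self.reads = 0
        self.writes = 0

    def write(self, index, data):
        self.store.append(data)                     # single write to a new location
        self.writes += 1
        self.live[index] = len(self.store) - 1      # pointer update only

    def read_live(self, index):
        return self.store[self.live[index]]

    def read_snapshot(self, index):
        return self.store[self.snapshot[index]]


cow = CopyOnWriteVolume(["a", "b", "c"])
row = RedirectOnWriteVolume(["a", "b", "c"])
cow.write(1, "B")
row.write(1, "B")
print(cow.reads, cow.writes)  # 1 2
print(row.reads, row.writes)  # 0 1
```

In both models the snapshot still returns the old value `"b"` for block 1; the difference is how many operations the write itself cost.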
A great use case for this is to snapshot prior to an upgrade. This can be an operating system upgrade or an upgrade of any number of software or application packages. The snapshot allows for a quick rollback if an issue happens during the upgrade or shortly after. Snapshots usually revert quickly and take up a minimal amount of space. The longer a snapshot exists, though, the more space it will consume as more changes accumulate. Some environments do better with long-standing snapshots than others. If long-lived snapshots affect performance in your environment, it is imperative to make yourself aware of it. For example, some hypervisor vendors recommend against keeping a hypervisor snapshot around longer than a few days, while cloud providers and many SAN vendors have no such performance recommendation or issue.
2. Cloning
Cloning is a backup option that has a few use cases. Many of them revolve around duplicating systems to spin up new ones or around speedy recovery. Cloning tends to consume the most storage because it produces an identical copy, typically in the original format. Not only is the data copied, but the metadata about it is copied as well. This is done for speed. In this case you are not only backing up the existing instance but also creating a new instance from it. It is a copy and restore in one action.
In most cases, this is used to duplicate a virtual machine to spin up multiple instances. In some cases it’s used as a temporary backup for a quick restore if something goes wrong with the source. A great use case is to clone a production server into a sandbox environment in order to test an upgrade. A common phrase is "there is no system like production". Lower-level DEV, QA, and UAT systems can often complete an upgrade just fine, but certain quirks tend to exist only in production, and a clone lets you test scenarios that would otherwise surface only there.
3. Full Backup
A full backup is one of the most traditional types. Much like a clone, it is a complete copy in which the contents of the data are identical. How and where it is stored may differ, though. For example, a clone may exist side by side with the source, in the same location and the same format, whereas a backup is traditionally saved in the backup software's native format so that compression and deduplication can be used. A backup is also usually stored on a different medium, such as tape, virtual tape, or files on a hard drive.
Many system administrators breathe a sigh of relief when they have provisioned the system and completed their full backup. It’s a last known good restore point in case anything happens.
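To make the point about a native backup format concrete, here is a minimal sketch of block-level deduplication plus compression. It is not any vendor's format; the function names, the fixed-size blocks, and the in-memory dictionary standing in for the backup medium are assumptions for illustration.

```python
import hashlib
import zlib


def backup_blocks(blocks):
    """Store each distinct block once, compressed, keyed by its SHA-256 digest.

    Returns (store, recipe): `store` maps digest -> compressed bytes, and
    `recipe` lists the digests in original order so the data can be rebuilt.
    """
    store = {}
    recipe = []
    for block in blocks:
        digest = hashlib.sha256(block).hexdigest()
        if digest not in store:              # deduplication: skip repeats
            store[digest] = zlib.compress(block)
        recipe.append(digest)
    return store, recipe


def restore_blocks(store, recipe):
    """Rebuild the original block sequence from the store and the recipe."""
    return [zlib.decompress(store[digest]) for digest in recipe]


source = [b"A" * 4096, b"B" * 4096, b"A" * 4096]
store, recipe = backup_blocks(source)
print(len(source), "blocks in,", len(store), "unique blocks stored")
```

A clone would keep all three blocks as they are; the backup format here keeps two compressed blocks and a short recipe, at the cost of needing a restore step before the data is usable.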
4. Differential Backup
A differential backup is simply a backup that locates the changes since the last full backup and saves those. It typically takes much less time to back up only these changes, although on a restore the full backup has to be restored first and then the differential applied on top. This may sound like an incremental, but the key distinction is that a differential checks for changes since the last full backup only, not since any other backup type. So even if an incremental or another differential has run since the last full, the most recent differential still backs up everything changed since that full.
Because differentials track changes since the last full backup, they are typically spread throughout the week to minimize the number of restores needed to reach the most current state. We'll talk more about that in the incremental section.
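A small sketch may help show why a restore needs only the full plus the latest differential. The file names, the dictionary model of a file system, and the helper `changed_since` are hypothetical, and deletions are ignored to keep the example short.

```python
def changed_since(baseline, current):
    """Return the entries in `current` that are new or differ from `baseline`."""
    return {name: data for name, data in current.items()
            if baseline.get(name) != data}


# State of a tiny file system on three days (name -> contents).
sunday_full = {"a.txt": "v1", "b.txt": "v1", "c.txt": "v1"}
monday = {**sunday_full, "a.txt": "v2"}
tuesday = {**monday, "b.txt": "v2"}

# Every differential is computed against the last FULL backup.
diff_monday = changed_since(sunday_full, monday)
diff_tuesday = changed_since(sunday_full, tuesday)

print(diff_monday)   # {'a.txt': 'v2'}
print(diff_tuesday)  # {'a.txt': 'v2', 'b.txt': 'v2'}

# Restoring Tuesday needs only the full and the most recent differential.
restored = {**sunday_full, **diff_tuesday}
print(restored == tuesday)  # True
```

Note that Tuesday's differential repeats Monday's change to `a.txt`. That repetition is why differentials grow over the week, and also why Monday's differential can be skipped on restore.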
5. Incremental Backup
Incremental backups are much like differential backups, except that they capture changes since the most recent backup of any type, whether full, differential, or incremental. For example, if you run a full backup on Sunday and then incrementals until the next Sunday, you would have to restore the full and then every incremental up to the point of restore. While this provides the fastest backups, restoring can take much more time. Sometimes that tradeoff is worth it to decrease server load.
The choice between incremental and differential depends on which matters more: decreased server load and shorter backup windows, or quicker restores. With databases, for example, the server may be sized to run backups regularly while restores need to be quick, so a backup schedule with more differentials in the mix may help meet any SLAs for restore times. On the other hand, if the SLA for restoring data is fairly relaxed or best effort, incrementals may be more helpful for reducing backup load.
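The restore chain for incrementals can be sketched as follows. As before, this is a toy model: the dictionaries stand in for file system state, the helper names are made up for the example, and deletions are not handled.

```python
def changed_since(previous, current):
    """Return the entries in `current` that are new or differ from `previous`."""
    return {name: data for name, data in current.items()
            if previous.get(name) != data}


def restore(full, incrementals):
    """Apply a full backup, then every incremental in the order it was taken."""
    state = dict(full)
    for increment in incrementals:
        state.update(increment)
    return state


sunday_full = {"a": 1, "b": 1, "c": 1}
monday = {**sunday_full, "a": 2}
tuesday = {**monday, "b": 2}
wednesday = {**tuesday, "a": 3}

# Each incremental is computed against the PREVIOUS backup, not the full.
days = [sunday_full, monday, tuesday, wednesday]
incrementals = [changed_since(prev, cur) for prev, cur in zip(days, days[1:])]
print(incrementals)  # [{'a': 2}, {'b': 2}, {'a': 3}]

# The whole chain is needed to reach Wednesday's state.
print(restore(sunday_full, incrementals) == wednesday)  # True

# Skipping Tuesday's incremental silently loses the change to "b".
broken = restore(sunday_full, [incrementals[0], incrementals[2]])
print(broken == wednesday)  # False
```

Each incremental is small, which is what keeps the backup window short, but every one of them is a link the restore depends on.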
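One way to weigh the tradeoff is to count how many backups a restore would touch under a given schedule. The sketch below assumes the definitions used in this post (a differential covers everything since the last full, an incremental covers everything since the most recent backup of any type); the function name and the string labels are hypothetical.

```python
def restore_chain(history):
    """Return the positions in `history` that must be restored, in order.

    `history` is a chronological list of "full", "differential", or
    "incremental" labels. The chain is the last full, then the latest
    differential taken after it (if any), then every later incremental.
    """
    fulls = [i for i, kind in enumerate(history) if kind == "full"]
    if not fulls:
        raise ValueError("no full backup in history")
    last_full = fulls[-1]
    chain = [last_full]

    later = range(last_full + 1, len(history))
    differentials = [i for i in later if history[i] == "differential"]
    start = last_full
    if differentials:
        start = differentials[-1]   # it supersedes everything since the full
        chain.append(start)

    chain.extend(i for i in range(start + 1, len(history))
                 if history[i] == "incremental")
    return chain


week_of_incrementals = ["full"] + ["incremental"] * 6
week_of_differentials = ["full"] + ["differential"] * 6
mixed = ["full", "incremental", "incremental", "differential", "incremental"]

print(len(restore_chain(week_of_incrementals)))   # 7
print(len(restore_chain(week_of_differentials)))  # 2
print(restore_chain(mixed))                       # [0, 3, 4]
```

Under these assumptions, a mid-week differential in an otherwise incremental schedule shortens the restore chain without paying the cost of a differential every night.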
6. Change Block/Delta Tracking
Change block or delta tracking is a fairly common technique. It is a feature that helps tools performing incremental or differential backups, so they do not have to work out what has changed. The changes are already tracked, so the backup software can simply read the list of changed items and get to work backing them up. Otherwise, it may have to traverse the data and answer the same question for every item: has this already been backed up?
This is less of a backup type and more of a feature of backups. Many of the backup types above can use it, and it usually allows for quicker backup times. The underlying storage provider has to support this type of tracking, though.
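The idea can be sketched with a toy volume that records which blocks were written. Real storage layers track changes in their own ways; the class name, the set used as the change list, and the method names here are assumptions for illustration only.

```python
class TrackedVolume:
    """Toy block device that remembers which blocks were written."""

    def __init__(self, num_blocks, block_size=4):
        self.blocks = [bytes(block_size) for _ in range(num_blocks)]
        self.changed = set()          # indexes written since the last backup

    def write(self, index, data):
        self.blocks[index] = data
        self.changed.add(index)       # tracking happens at write time

    def take_changed_blocks(self):
        """Return {index: data} for changed blocks and reset the tracking."""
        delta = {i: self.blocks[i] for i in sorted(self.changed)}
        self.changed.clear()
        return delta


volume = TrackedVolume(num_blocks=1000)
volume.write(3, b"aaaa")
volume.write(7, b"bbbb")
volume.write(3, b"cccc")              # same block again, still one entry

delta = volume.take_changed_blocks()
print(sorted(delta))                  # [3, 7]
print(volume.take_changed_blocks())   # {}
```

The backup reads 2 blocks instead of scanning all 1000, which is where the shorter backup times come from.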
When choosing a backup strategy or technology, the business needs have to be determined first. These are usually driven by service level agreements or business continuity requirements such as recovery point objectives and recovery time objectives. Those determinations dictate how much data loss is acceptable and how quickly the data needs to be restored after an event.
Once the business requirements are laid out and the backup strategy is implemented, it is extremely important to document the process. Weeks or years later, someone may question the backup strategy and wonder why differentials were included along with incrementals, or why differentials weren't used at all. The documentation should cover not just the what but also the why, so that the decision points and the information that guided them are recorded as well.