By David Chapman
7 Types of Updates in the Cloud
Updates are an important part of the lifecycle of everything in IT. Sometimes they resolve security issues; other times they are bug fixes or feature enhancements. We like to think that things run themselves forever, particularly in cloud environments, but everything requires periodic updates. When architecting an environment, it is important to understand the types of updates and their impact. This is a key area of focus on CompTIA's Cloud+ certification exam.
1. Hotfix: Fix a Rare or Edge Case Bug Quickly
Hotfixes are typically updates that address a specific issue. Vendors may create one as an interim solution until the fix can be included in the next cumulative update. Hotfixes are usually applied live to hot, or running, systems. Because of the nature of the issues they address, they can be released outside of regular release cycles. For example, Microsoft releases patches on the second Tuesday of every month, but hotfixes are often released outside of this window. They may later be included in the normal patching release cycle.
There can be risk in deploying a hotfix, so a systems administrator has to weigh the risk against the benefit when deciding whether to deploy. A hotfix may exist, but if the risk is too great for the business, the administrator may elect to wait.
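As a rough illustration of the release window mentioned above, the following Python sketch computes the second Tuesday of a month and flags a release date that falls outside it. The function names `second_tuesday` and `is_out_of_band` are hypothetical helpers invented for this example; they are not part of any vendor tooling.

```python
import calendar
from datetime import date


def second_tuesday(year, month):
    """Return the date of the second Tuesday of the given month."""
    first_weekday = date(year, month, 1).weekday()  # Monday is 0
    # Days from the 1st of the month to the first Tuesday.
    offset = (calendar.TUESDAY - first_weekday) % 7
    return date(year, month, 1 + offset + 7)


def is_out_of_band(release_date):
    """True if a release does not land on that month's second Tuesday."""
    return release_date != second_tuesday(release_date.year, release_date.month)
```

A release dated `date(2024, 1, 9)` falls on the regular cycle in this sketch, while one dated `date(2024, 1, 17)` would be treated as out of band, which is where a hotfix typically sits.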
2. Patch: Keep Things Updated and Secure
A patch can be much like a hotfix. In its purest sense, however, a patch modifies an existing item rather than replacing it: instead of replacing an entire file or set of files, only the specific sections that need updating are changed. The main difference from a hotfix is that a patch typically follows the defined release cycle. If the issue being corrected has no immediate urgency or great impact, vendors will opt to release the fix in the next patching cycle.
There are pros and cons to patching. It can take a long time for a patch to be released, but the process is much more heavily scrutinized. Hotfixes can cause unintended side effects because they are "quick and dirty" fixes. Patches, meanwhile, usually follow the typical release engineering process, which involves quality assurance and regression testing. Often, once hotfixes have been thoroughly tested, they make it into the patch release cycle for general availability.
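The idea of changing only the sections that differ can be sketched with Python's standard `difflib` module, which produces a unified diff between two versions of a file. The file name `app.conf` and its settings are made up for illustration, and real vendor patch formats vary widely.

```python
import difflib

# Two versions of a hypothetical configuration file, as lists of lines.
original = ["timeout = 30\n", "retries = 3\n", "tls = off\n"]
updated = ["timeout = 30\n", "retries = 3\n", "tls = on\n"]


def make_patch(old_lines, new_lines, name="app.conf"):
    """Return a unified diff describing how to turn old_lines into new_lines."""
    return list(difflib.unified_diff(old_lines, new_lines, fromfile=name, tofile=name))


def changed_lines(patch):
    """Keep only added/removed lines, skipping the '---' and '+++' file headers."""
    return [
        line for line in patch
        if line.startswith(("+", "-")) and not line.startswith(("+++", "---"))
    ]


patch = make_patch(original, updated)
print("".join(patch))
```

Only the `tls` line appears as a removal and an addition; the unchanged lines are carried as context rather than replaced.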
3. Version Update: New Features and Major Bug Fixes
Version updates can apply to quite a few things. For example, for a Platform-as-a-Service offering using Java, there may be a Java version update from 1.7 to 1.8. In the simplest sense, a version update is an update that causes the version number to increase. Some operating systems allow in-place upgrades. Linux is usually very good about this: minor and, in some distributions, major version upgrades can be done in place.
When performing version updates, it is important to read the release notes, as functionality or features can change. Bugs are usually fixed, but the larger the version change, the more that is likely to change "under the hood." In major version changes, features are sometimes deprecated or removed. In other cases, new features are implemented.
For example, new versions of OpenSSL add new TLS cipher suites, whereas older, less secure ciphers and suites get deprecated or removed over time. This can cause TLS negotiation to fail with peers that support only the older ciphers, so it is very important to keep an eye on potential effects.
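To make "the version number increases" concrete, here is a minimal Python sketch that compares two dotted version strings and labels the size of the jump. It assumes a `major.minor.patch` numbering scheme, which is an assumption of this example and not something every vendor follows; Java 1.8, for instance, is marketed as Java 8 and treated as a major release. The function names are hypothetical.

```python
def parse_version(text):
    """Turn '2.4.1' into (2, 4, 1), padding missing parts with zeros."""
    parts = [int(part) for part in text.split(".")]
    return tuple(parts + [0] * (3 - len(parts)))


def classify_update(old, new):
    """Label an update as 'major', 'minor', 'patch', or 'none'.

    Assumes major.minor.patch numbering (an assumption for this sketch).
    """
    old_v, new_v = parse_version(old), parse_version(new)
    if new_v <= old_v:
        return "none"  # same version or a downgrade
    if new_v[0] != old_v[0]:
        return "major"
    if new_v[1] != old_v[1]:
        return "minor"
    return "patch"
```

A label like this can feed a simple policy, such as requiring a release-notes review before anything classified as "major" is applied.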
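One way to keep an eye on cipher changes is to list what the local TLS library enables before and after an update. The sketch below uses Python's standard `ssl` module, which reports the OpenSSL build it is linked against; the helper names and the idea of a "required" cipher list are assumptions made for illustration.

```python
import ssl


def enabled_cipher_names(context=None):
    """Return the sorted names of ciphers enabled in a TLS context."""
    ctx = context or ssl.create_default_context()
    return sorted({cipher["name"] for cipher in ctx.get_ciphers()})


def missing_ciphers(required, context=None):
    """Return the ciphers from 'required' that the context does not enable."""
    available = set(enabled_cipher_names(context))
    return [name for name in required if name not in available]


# Shows which OpenSSL build this Python interpreter is linked against.
print(ssl.OPENSSL_VERSION)
```

Running `missing_ciphers` with the list of ciphers a legacy client depends on, before and after an upgrade, gives an early hint that negotiation with that client may start failing.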
4. Rollback: The Backup Plan When Things Go Wrong
All updates carry some level of risk. Some risks are known; others are not. The way to mitigate these risks is to have a rollback process or procedure in place. There are a number of rollback techniques available, particularly in cloud environments. Every update plan should have a documented rollback procedure. In some cases, rolling back is as simple as uninstalling the most recent update.
Other platforms have a specific rollback option when applying one of the above methods. Sometimes, reverting to more traditional methods is warranted. Don't put all of your eggs in one basket with a rollback plan, though. If you have the time and resources to prepare multiple options, plan for them and hope you never have to use them.
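Having several rollback options can be thought of as an ordered list of fallbacks that are tried until one succeeds. The Python sketch below models that idea only; the option names and the callables are hypothetical placeholders, and a real plan would call the platform's or provider's own tooling.

```python
def roll_back(options):
    """Try each (name, action) pair in order until one succeeds.

    Each action is a callable returning True on success. An action that
    raises is treated as failed and the next option is tried.
    Returns (name_of_successful_option_or_None, list_of_attempted_names).
    """
    attempted = []
    for name, action in options:
        attempted.append(name)
        try:
            if action():
                return name, attempted
        except Exception:
            continue  # this option failed; fall through to the next one
    return None, attempted
```

A plan might list a platform rollback first, then a snapshot restore, then a restore from backup, so that the least disruptive option is attempted before the more drastic ones.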
5. Platform Rollback: The Simple Rollback
Many Platform-as-a-Service (PaaS) offerings have simple one- or two-click update options. Keep an eye out for rollback or downgrade functionality. This is the simplest way to return to normal if an upgrade goes south. Platforms and services that offer it make a systems administrator's or DevOps engineer's life much easier, because they orchestrate much of the heavy lifting involved in bringing the environment back to the consistent state it was in before the update or upgrade. An easy rollback option is one of the huge benefits of using a managed platform for updates and upgrades.
6. Snapshots: The Quick and Dirty Rollback
With snapshots, a point-in-time capture is taken of a particular instance so that it can be fully restored to that state. Whether you use automated snapshots or take manual snapshots before performing an update, they can offer a quick restore to a previous state. It is important to understand, though, that restoring to a previous state will remove any new data generated on the object in question. If this is a virtual machine (VM), for example, any logs or data written to disk since the snapshot will likely be lost.
Not all snapshots are created equal, though. Some capture only what has been written to disk. This can leave databases in an inconsistent state because some of the data is still in RAM. To mitigate this, some snapshot features include system RAM in the snapshot to capture the complete system state. There are also hooks in the operating system that tell applications to flush any pending writes to disk before the snapshot is taken. Volume Shadow Copy Service (VSS) on Windows is one of these. It supports writers for SQL Server and many other applications, which flush their data to disk before the snapshot.
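The data-loss effect of restoring a snapshot can be shown with a toy in-memory model. In this Python sketch the "instance" is just a dictionary and the snapshot is a deep copy of it; both are stand-ins invented for illustration and bear no relation to how a hypervisor or cloud provider implements snapshots.

```python
import copy


def take_snapshot(instance):
    """Capture the full state of the instance at this point in time."""
    return copy.deepcopy(instance)


def restore(snapshot):
    """Return a fresh copy of the captured state, discarding later changes."""
    return copy.deepcopy(snapshot)


# A toy "instance": software version plus data written to disk.
instance = {"version": "1.0", "logs": ["boot ok"]}
before_update = take_snapshot(instance)

# Apply the update; the system keeps writing data afterwards.
instance["version"] = "2.0"
instance["logs"].append("order 1001 received")

# Roll back: the version is restored, and the newer log entry is gone too.
instance = restore(before_update)
print(instance)
```

The restore brings back version 1.0, but the log entry written after the snapshot disappears along with the update.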
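The gap between data held in memory and data on disk can be sketched at small scale in Python. Here a plain file copy stands in for a disk-only snapshot, and an application-level write buffer stands in for data still in RAM; both are simplifications assumed for this example, and the file names are made up.

```python
import os
import shutil
import tempfile


def snapshot(src, dst):
    """Copy the on-disk contents of src to dst and return the copy's size."""
    shutil.copyfile(src, dst)
    return os.path.getsize(dst)


def demo():
    """Return the snapshot size before and after flushing buffered writes."""
    with tempfile.TemporaryDirectory() as workdir:
        live = os.path.join(workdir, "data.log")
        handle = open(live, "wb", buffering=8192)
        try:
            # This small write sits in the in-memory buffer for now.
            handle.write(b"order-1001 committed\n")
            unflushed = snapshot(live, os.path.join(workdir, "snap1"))

            # Quiesce: push buffered data to the OS, then to the disk.
            handle.flush()
            os.fsync(handle.fileno())
            flushed = snapshot(live, os.path.join(workdir, "snap2"))
        finally:
            handle.close()
    return unflushed, flushed


unflushed_size, flushed_size = demo()
print(unflushed_size, flushed_size)
```

The first copy misses the buffered record, while the copy taken after the flush contains it, which is the same reason snapshot tools ask applications to flush before capturing a volume.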
7. Backups: The Catch-All for Rollbacks
When all else is lost, a good backup allows you to restore to a known state. As with snapshots, restoring an entire instance or object can cause data loss. Many backup tools let you restore granularly, recovering only parts of the instance, which may help mitigate this. Another mitigation is to restore the backup into a new instance so that it does not replace the currently running one. In this scenario, data written since the backup can be copied from the running instance to the restored one.
Backups and restores are usually a catch-all or last-resort option, though, because there is a greater risk of data loss if your backups are not running continuously. Many opt for daily backups, and some for hourly. There are other, better alternatives, but if you've made it to this option, those have likely failed you.
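The link between backup frequency and potential data loss comes down to simple arithmetic: everything written since the last completed backup is at risk. The Python sketch below assumes backups run at a fixed interval starting from a known time and that the failure happens after the first backup; the function names and dates are illustrative only.

```python
from datetime import datetime, timedelta


def last_backup_before(first_backup, interval, failure):
    """Return the time of the most recent scheduled backup before a failure.

    Assumes fixed-interval backups and failure >= first_backup.
    """
    completed = (failure - first_backup) // interval  # whole intervals elapsed
    return first_backup + completed * interval


def data_loss_window(first_backup, interval, failure):
    """Return how much time's worth of data a restore would lose."""
    return failure - last_backup_before(first_backup, interval, failure)


start = datetime(2024, 1, 1, 0, 0)
failure = datetime(2024, 1, 3, 17, 30)
print(data_loss_window(start, timedelta(days=1), failure))
print(data_loss_window(start, timedelta(hours=1), failure))
```

With these example times, a daily schedule puts seventeen and a half hours of data at risk, while an hourly schedule reduces that to thirty minutes.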
When architecting your cloud environment, remember that updating it is an essential part of maintenance. Sometimes updates can be planned in advance. Other times, emergency hotfixes will need to be applied. Ensure that you understand any downtime required to apply patches, updates and upgrades in your environment. Updates can go wrong, so a backout or rollback plan needs to be in place as part of a risk mitigation plan.
Don't settle on just one rollback plan. Be aware of all of your rollback options, and pick a few that will cause the least disruption to your networks and end users.