The IT “playing field” is essentially leveled with public cloud, because it brings world-class IT management and IT service management (ITSM) capabilities to all. Release management is a good example of this.
It’s hard not to get the feeling, when using a leading public cloud service provider, that at your fingertips you control a large, complex supercomputer that straddles the globe. It’s packed with advanced features, available even to those with the smallest of IT budgets.
Public cloud delivers enterprise-grade features (many would say “better than enterprise-grade”) to everyone from large enterprises down to mom-and-pop shops. I’m talking about features and capabilities that used to be the sole province of billion-dollar IT budgets in a handful of the largest global banks.
For instance, the leading public cloud service provider in question has made top-quality release processes democratically available to everyone (in the cloud). In this blog, I want to drill down into how companies can benefit from such capabilities to improve their release management.
Included in this are world-class continuous integration and delivery tools that enable everyone to use advanced release management techniques such as “canary releases” – “a technique to reduce the risk of introducing a new software version in production by slowly rolling out the change to a small subset of users before rolling it out to the entire infrastructure and making it available to everybody.”
If you’re interested, and sorry if you aren’t, the name “canary release” comes from the old underground coal mining practice of taking a canary into the mine. If it stops singing and falls onto the floor of the cage, then you know that there are poisonous gases around, gases that endanger the human workers, and it’s time to get out quickly. The canary dies but many people survive.
Please read on for three canary release opportunities.
The canary release technique helps to reduce the impact of negative network changes by gradually rolling out the changes. If an issue with the new release is detected during the rollout, then it can be rolled back, and only a subset of the traffic will have been impacted.
This is different from a “blue/green” release (or deployment), which is much more “big bang” (still with easy rollback, though!). With blue/green you build a new environment side by side with the current live environment and switch over to make the new environment live. And switch back to roll back if needed.
The blue/green switchover can be done using DNS alias/CNAME records, and so can the canary release. The difference with the canary release is that you only direct a subset of traffic to the new environment by using weighted DNS records. Say, send 5% to the new environment… check if all is well, then maybe increase it… until 100% is on the new environment. At all times, you have the old environment there to fall back on.
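To make the weighted idea concrete, here is a minimal Python sketch. It is a simulation only, not any DNS provider's API: the function names `pick_environment` and `ramp`, the step values, and the `is_healthy` callback are all hypothetical, chosen for illustration.

```python
import random


def pick_environment(canary_weight, rng):
    """Simulate one weighted lookup: "new" for about canary_weight percent, else "old"."""
    return "new" if rng.random() * 100 < canary_weight else "old"


def ramp(steps, is_healthy):
    """Walk through the weight steps (percentages sent to the new environment).

    is_healthy(weight) is a hypothetical check run after each shift.
    Returns the final weight, or 0 if any step fails (everything back on old).
    """
    for weight in steps:
        if not is_healthy(weight):
            return 0  # roll back: the old environment takes all traffic again
    return steps[-1]


if __name__ == "__main__":
    rng = random.Random(42)
    sample = [pick_environment(5, rng) for _ in range(20000)]
    print("share on new:", sample.count("new") / len(sample))
    print("final weight:", ramp([5, 25, 50, 100], lambda weight: True))
```

In a real setup the weight change would be an update to the weighted records, and the health check would look at error rates or similar signals from the new environment.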
Using DNS to change workloads can be a bit unpredictable because, well, the Internet. And users. And maybe even grumpy cats.
DNS changes can take some time to propagate across the Internet, so while you might have flicked your DNS switch, it might be five minutes or twenty-four hours before internet service providers and their users refresh their DNS caches.
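The delay comes from caching: a resolver keeps the answer it has until the record's time-to-live (TTL) runs out. The toy resolver below sketches that behaviour. The class name `CachingResolver`, the hostname, and the 300-second TTL are assumptions for illustration, and real resolvers are more complicated (some do not honour the TTL strictly).

```python
class CachingResolver:
    """Toy resolver that reuses a cached answer until its TTL (in seconds) expires."""

    def __init__(self, authoritative):
        self._authoritative = authoritative  # name -> (target, ttl_seconds)
        self._cache = {}                     # name -> (target, expires_at)

    def resolve(self, name, now):
        cached = self._cache.get(name)
        if cached is not None and now < cached[1]:
            return cached[0]  # still fresh, so the authoritative change is not seen
        target, ttl = self._authoritative[name]
        self._cache[name] = (target, now + ttl)
        return target


if __name__ == "__main__":
    records = {"app.example.com": ("blue", 300)}
    resolver = CachingResolver(records)
    print(resolver.resolve("app.example.com", now=0))
    records["app.example.com"] = ("green", 300)  # flick the DNS switch
    print(resolver.resolve("app.example.com", now=100))
    print(resolver.resolve("app.example.com", now=300))
```

A shorter TTL narrows the window in which users still land on the old environment, at the cost of more lookups.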
In advanced public clouds, it’s possible to use a different technique, one that introduces the new release by adding new servers behind the live load balancer. This can be done very cleverly by changing the launch configuration for an auto scaling group, then increasing the group’s desired instance count by one so that a single instance of the new release is launched and starts receiving traffic.
If all is okay, you can use the auto scaling group to rotate the old instances out of service and add in the new ones.
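The sequence is easier to follow as a toy model. The sketch below is an in-memory stand-in, not a real cloud SDK: `ScalingGroup`, `start_canary`, and `rotate_out_old` are hypothetical names, and an "instance" is just the version string it was launched from.

```python
from dataclasses import dataclass, field


@dataclass
class ScalingGroup:
    """Toy in-memory stand-in for an auto scaling group (not a real cloud API)."""
    launch_version: str
    desired: int
    instances: list = field(default_factory=list)

    def reconcile(self):
        """Launch from the current launch configuration, or scale in, to match desired."""
        while len(self.instances) < self.desired:
            self.instances.append(self.launch_version)
        while len(self.instances) > self.desired:
            self.instances.pop(0)  # remove the oldest instance first


def start_canary(group, new_version):
    """Point the launch configuration at the new release and add one instance."""
    group.launch_version = new_version
    group.desired += 1
    group.reconcile()


def rotate_out_old(group):
    """Terminate old instances one at a time; the group replaces each with the new release."""
    while any(v != group.launch_version for v in group.instances):
        old = next(v for v in group.instances if v != group.launch_version)
        group.instances.remove(old)
        group.reconcile()


if __name__ == "__main__":
    group = ScalingGroup(launch_version="v1", desired=3)
    group.reconcile()
    start_canary(group, "v2")
    print(group.instances)  # three old instances plus one canary
    rotate_out_old(group)
    print(group.instances)
```

Replacing one instance at a time keeps capacity steady during the rotation. If the canary misbehaves instead, the reverse move is to restore the old launch configuration and terminate the new instance.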
It’s also possible to do all of this from within the application and not have to “mess about” with cloud infrastructure. With the “feature toggle” technique, the application has been configured to change system behavior without changing code or cloud resources.
It’s possible to do static feature toggles, where toggles are turned on/off in the source code and changing the live system requires a release, or dynamic toggles, where the code does a lookup to instance metadata or an external key/value store (sometimes called a toggle router) to work out what to do.
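A dynamic toggle can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: the plain dictionary stands in for an external key/value store, and `ToggleRouter`, `greet`, and the `new_greeting` key are hypothetical names.

```python
class ToggleRouter:
    """Looks up toggles at call time in a key/value store (a dict stands in here)."""

    def __init__(self, store):
        self._store = store

    def is_enabled(self, name, default=False):
        return bool(self._store.get(name, default))


def greet(user, toggles):
    """Both code paths ship in the same release; the toggle picks one at runtime."""
    if toggles.is_enabled("new_greeting"):
        return f"Welcome back, {user}!"
    return f"Hello, {user}."


if __name__ == "__main__":
    store = {}  # assumption: in production this would be an external store
    toggles = ToggleRouter(store)
    print(greet("Ada", toggles))
    store["new_greeting"] = True  # flip the toggle, no release needed
    print(greet("Ada", toggles))
```

Defaulting to off when the key is missing means a lookup that finds nothing falls back to the existing behaviour.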
Feature toggles are used for more than just releases, for example for live experiments on operational behavior and performance. They are normally temporary in nature. This is like popping up canaries on demand and it’s very powerful… but like all powerful techniques, it comes with danger! Feature toggles that live on in code risk triggering unexpected and undocumented behaviors, so they should be used with care and removed once they have served their purpose.
ITSM change, configuration, and release management processes can be supercharged in the public cloud because advanced techniques such as canary release are common practice. It’s yet another benefit of public cloud – it’s programmable and integrated, allowing for sophisticated techniques to be employed to reduce the risk in releasing new code to production.