What’s it like to lead a cloud migration through a global pandemic?
One CIO, a military veteran, compares surviving COVID-19’s effect on his business to captaining a ship that is both sinking and on fire. “If you panic,” he tells me, “you’ll either burn, drown, or starve to death. The only way to survive is through strategic effort — let the ship sink enough to put out the fire, then patch the hole in the hull while you row toward shore.”
Over the past few weeks, I spoke with a number of CIOs and other technology leaders to understand what they’ve learned about cloud cost optimization through the pandemic. They’ve been through the fire and the flood, and in many cases have emerged stronger and more agile than before. Here are some of their insights.
Understand your spend before you make changes
Gary Watkins, CIO of International Markets at KAR Global, was just weeks from early retirement when the pandemic hit — but he sprang into action anyway.
“We followed a four-step plan to right-size our cloud costs,” he says: “Identify, measure, account, and optimize.”
First, Watkins and his team took inventory of resource usage across all their cloud environments to make sure they knew what they were paying for.
Next, they implemented additional monitoring to understand the usage patterns of those resources. (Does that Elasticsearch cluster in development really need to be so large?)
Third, they established governance and tagging to make sure they understood who owned which resources and why.
And finally, they optimized costs where it made sense. “The basic model is just to be conscientious about what we’re using,” says Watkins. “What can we shut down, what can we turn off on the weekends?”
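As a rough illustration of the "account" and "optimize" steps, the sketch below audits a hypothetical inventory for missing ownership tags and estimates what switching non-production resources off on weekends might save. The `Resource` fields, the tag names, and the hourly rates are all assumptions made for the example, not KAR's actual tooling or numbers.

```python
from dataclasses import dataclass, field

# Hypothetical inventory record; field names and tag keys are illustrative only.
@dataclass
class Resource:
    name: str
    hourly_cost: float                      # assumed on-demand rate in dollars
    tags: dict = field(default_factory=dict)

REQUIRED_TAGS = ("owner", "environment")    # assumed governance policy
WEEKEND_HOURS_PER_WEEK = 48                 # Saturday and Sunday

def missing_tags(resources):
    """Map each resource name to the required tags it lacks."""
    report = {}
    for r in resources:
        absent = [t for t in REQUIRED_TAGS if t not in r.tags]
        if absent:
            report[r.name] = absent
    return report

def weekend_savings(resources, environments=("dev", "staging")):
    """Weekly dollars saved by stopping the given environments on weekends."""
    return sum(
        r.hourly_cost * WEEKEND_HOURS_PER_WEEK
        for r in resources
        if r.tags.get("environment") in environments
    )

# Made-up inventory for demonstration.
inventory = [
    Resource("search-dev", 2.0, {"owner": "team-a", "environment": "dev"}),
    Resource("api-prod", 1.5, {"owner": "team-b", "environment": "prod"}),
    Resource("mystery-box", 0.5),           # nobody tagged this one
]

print(missing_tags(inventory))
print(weekend_savings(inventory))
```

Untagged resources are excluded from the savings estimate on purpose: until someone owns them, there is no safe way to know whether they can be switched off.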
Shawn Justice, Watkins’ successor, points out the reason so many organizations fail to implement proper cloud financial management: “It’s simply easier not to do it. Anytime you have non-functional requirements like error handling, monitoring, whatever – it’s not customer facing, and it takes time to implement. But when everyone is focused on cost savings, it’s much easier to make the case for investing that time.”
In the end, Justice says KAR was able to achieve significant savings on their cloud bill by going after the low-hanging fruit, cleaning up after developers through automation and situational awareness. “We just turned things off and listened for the screams.”
Growth requires cost management too
While quarantine has slowed growth in some parts of the economy, others have seen unexpected rises in demand. Healthcare startup Chartspan experienced a massive traffic spike this spring, just as they were moving on-prem workloads into the cloud.
CTO David Lowry jokes that his main strategy is to make sure the CFO doesn’t yell at him, but he’s also learned that the cloud can easily scale faster than your wallet.
“Small configuration issues, not immediately obvious, can lead to an outsize impact on your cloud bill at scale,” he says, citing a recent incident where increased traffic caused a backend cloud job to download thousands of large, duplicate files. The duplication had never been a problem before, but under pandemic load it ran up terabytes of data transfer charges.
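One common guard against this kind of waste is to remember what has already been fetched. The sketch below is a minimal illustration, not Chartspan's actual fix: `fetch` is a stand-in for whatever really downloads a file, and keying the cache on the URL assumes that identical URLs mean identical content.

```python
from typing import Callable, Dict, Iterable

def fetch_once(urls: Iterable[str], fetch: Callable[[str], bytes]) -> Dict[str, bytes]:
    """Download each distinct URL a single time, however often it is requested."""
    cache: Dict[str, bytes] = {}
    for url in urls:
        if url not in cache:           # skip files we already hold
            cache[url] = fetch(url)
    return cache

# Demonstration with a fake downloader that records every call it receives.
calls = []

def fake_fetch(url: str) -> bytes:
    calls.append(url)
    return b"x" * 10                   # pretend payload

requested = ["a.bin", "b.bin", "a.bin", "a.bin", "b.bin"]
files = fetch_once(requested, fake_fetch)
print(len(requested), "requests,", len(calls), "downloads")
```

At low traffic the difference between five requests and two downloads is invisible; multiplied by thousands of large files, it is the difference Lowry describes.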
Lowry and his team are now using Amazon CloudWatch dashboards to keep a proactive eye on spend, and migrating toward more cloud-native architectures as well.
“If you can run a scheduled job on AWS Fargate instead of a dedicated instance, certain costs just go away,” he says. “And now you can focus on other expenses — like compliance and data transfer — that are more challenging to control.”
Strategic workloads are rushing to cloud
They may be trying to keep a lid on spend at the moment, but the leaders I spoke with almost unanimously expect their cloud adoption to grow in the months and years to come. Across the industry, some 50% of businesses are currently accelerating their cloud strategy, trying to get ahead of a potential second wave of coronavirus.
Even though KAR was already far down the path of cloud maturity, Shawn Justice says the pandemic has sped up his digital transformation. “We’re always asking ourselves now: how do we digitize paperwork and other costly manual and physical processes? We believe the cloud is the right place for all our strategic workloads because it saves us time and money in the long run.”
With so many cloud providers and so many service offerings, it might seem appealing to architect solutions that mix and match among them, saving a bit of money on compute here or storage there.
But Randy Young, VP of cloud infrastructure at Mitel, says that approach is likely to backfire. “Learning two different sets of cloud services incurs its own costs in time and talent,” he says. “When you can write your automation and tooling for one standard set of services, you gain economy of scale across your business.”
Not only that, he points out, but going “all-in” with one cloud provider can lead to significant discounts, something you’re unlikely to get if you attempt to play multiple providers off each other. Even so, evaluate any upfront commitment carefully, so that you don’t lock into spend that fails to track unforeseen changes in usage.
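The trade-off behind an upfront commitment can be made concrete with a little arithmetic. The sketch below compares a pay-as-you-go bill with a committed one; the 30% discount and the dollar figures are placeholder assumptions, not quotes from any provider, and real commitment terms vary.

```python
def on_demand_cost(usage: float) -> float:
    """Pay-as-you-go: you pay for exactly what you use (in on-demand dollars)."""
    return usage

def committed_cost(usage: float, commitment: float, discount: float) -> float:
    """Committed spend: the commitment is billed at a discount whether or not
    it is used; any usage beyond it is billed at the on-demand rate."""
    overage = max(0.0, usage - commitment)
    return commitment * (1 - discount) + overage

def break_even_usage(commitment: float, discount: float) -> float:
    """Usage level below which the commitment costs more than paying on demand."""
    return commitment * (1 - discount)

commitment, discount = 100_000.0, 0.30    # hypothetical monthly figures
for usage in (120_000.0, 100_000.0, 60_000.0):
    print(usage, on_demand_cost(usage), committed_cost(usage, commitment, discount))

print("break-even usage:", break_even_usage(commitment, discount))
```

Under these assumed numbers the commitment pays off only while usage stays above the break-even level. If demand drops below it, the discounted plan becomes the more expensive one.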
The future: a big red button
The cloud makes it easy to provision new resources, and theoretically just as easy to shut them down … but inertia and a lack of consistent governance mean that cutting cloud costs can feel like going on a snipe hunt.
Ryan Hughes, VP of cloud at Nationwide, envisions a bolder path. “The future of cloud cost management,” he says, “is a big red button that can unilaterally shut down 40% of your cloud capacity in an emergency.”
This simply isn’t the way traditional IT organizations think — but moving your infrastructure commitments to pay-as-you-go does provide unusual flexibility for those prepared to take advantage, particularly for industries like oil and gas where profitability is closely tied to fast-changing external factors.
Hughes calls for deeper insights into what systems are truly essential. “If times get really tough, do you need all your development and staging environments? What production systems could you shut down to reduce overhead? What cascading impacts would that have?” He suggests that the mechanism could be as simple as an “essential” or “non-essential” tag placed on resources.
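A minimal sketch of that tagging idea follows. It only plans a shutdown over an in-memory inventory: the `criticality` field, the resource records, and the capacity units are assumptions for illustration. A real version would call a cloud provider's API and would have to account for the cascading impacts Hughes mentions.

```python
from dataclasses import dataclass

@dataclass
class Resource:
    name: str
    capacity_units: float    # assumed common measure, e.g. vCPUs
    criticality: str         # "essential" or "non-essential" (hypothetical tag)

def red_button_plan(resources):
    """Return the resources to stop and the fraction of capacity that frees."""
    to_stop = [r for r in resources if r.criticality == "non-essential"]
    total = sum(r.capacity_units for r in resources)
    shed = sum(r.capacity_units for r in to_stop)
    fraction = shed / total if total else 0.0
    return to_stop, fraction

# Made-up fleet for demonstration.
fleet = [
    Resource("checkout-prod", 40, "essential"),
    Resource("reporting-prod", 20, "essential"),
    Resource("staging", 25, "non-essential"),
    Resource("dev-sandbox", 15, "non-essential"),
]

stopped, fraction = red_button_plan(fleet)
print([r.name for r in stopped], f"{fraction:.0%}")
```

Producing the plan before anything is switched off lets a team check in calmer times how much capacity the button would actually shed, and which owners would be affected.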
Cost optimization is everyone’s job
Gary Watkins knew that his four-step plan for cost optimization at KAR would only be successful if he could get the entire organization onboard.
“We tried to get in front and outmuscle the problem from a leadership perspective,” he says. “We had to get buy-in from the devs, analysts, admins — right down to scrum masters and product owners. We scheduled several sessions around cost awareness.”
Once the product teams were aligned, Watkins and Justice went to their vendor management office and asked them to up their game around financial management practices, providing insights on identified resources and measuring where cloud spend was highest. Then they involved the central cloud architecture team. “We asked them to step back, assess everything, and give us their perspective on what we could re-architect.”
Randy Young is following a similar approach at Mitel with his Cloud Center of Excellence. It’s a cross-functional team with representation from infrastructure, product management, architecture and development, and executive sponsors. They meet biweekly to review a number of topics that index the maturity of the cloud organization, including cost optimization.
“Internally, we’ve had to force folks to get uncomfortable,” says Young. “We use online training, lunch-and-learns, and mentoring to help people understand that we are leaving the old IT model behind. As a central team, we are about enablement.”
Ultimately, Watkins says that although we’re living through unprecedented times, the coronavirus pandemic is just underscoring the cloud fundamentals everyone needs to master.
“We teach our people to question everything,” he says. “Everyone should know how to ask the cost-effectiveness question — whether or not they self-describe as ‘technical’. And during times of duress or uncertainty, that’s all the more reason to challenge and seek clarity.”