Leaving the Cloud—Mastering Costs to Avoid Becoming a Cloud Casualty
Most companies have put in place measures to optimize cloud billing and monitor usage. Both are important, but many companies are missing a third component, one that could significantly reduce costs and rescue over-budget cloud programs. In this post we review cost management measures, with emphasis on that third, missing discipline.
Joe Bastante
6/12/2024 · 7 min read
You've probably read stories in the news about companies moving their workloads out of the public cloud and back on-premises. X, formerly Twitter, garnered much attention toward the end of last year when they claimed they had attained a 60% cost reduction, with exiting the cloud being a key part of their strategy. An InfoWorld article citing survey data from Citrix reported that almost half of the respondents said cloud costs exceeded their expectations.
I am not making a case against the cloud. Rather, when companies quit the cloud, cost is usually the reason. Furthermore, while numbers vary, many cloud experts claim that about 30% of public cloud costs are waste on average. I believe the number is likely higher since some wasted resources are not self-evident. For example, it's easy to determine that a virtual machine is overprovisioned, but it's not easy to determine whether the countless files being added to the cloud by staff are all needed.
In this article, I'll cover the essential elements for managing cloud costs. The first section will address cloud pricing and utilization management. I won't go too deep here, assuming it's more familiar, but let me know if you'd like more detail. The second section will discuss the less commonly practiced discipline of establishing a cloud usage policy and integrating controls into supporting processes. It will highlight some of the big and commonly overlooked causes of excessive costs.
Cloud Pricing and Utilization Management
In this context, cloud pricing means maximizing value from cloud pricing discounts and programs. Utilization management refers to measures to track cloud usage and eliminate waste. Let's begin with the main components of cloud pricing:
Discounts via an Enterprise Agreement: In establishing an enterprise agreement with cloud vendors, discounts are normally included and influenced by the anticipated total cloud spend.
Private Rate Commitments: Cloud providers often offer additional discounts on a client-by-client basis, normally in response to minimum spend commitments. Cloud vendors are motivated by spend commitments, particularly when they grow each year.
Migration Programs: Cloud providers may offer additional savings for workloads that are newly migrated to their cloud. For example, AWS has a Migration Acceleration Program for this purpose. Discounts are not permanent but are temporarily offered to incentivize moving new workloads from on-premises to the cloud.
Savings Plans: This is another avenue for committing to spend on specific types of resources, bringing additional savings compared to on-demand resources.
Reserved Instances: Reserved instances are specific resources reserved for an extended period, normally one to three years, which offer additional cost reductions relative to on-demand resources.
Spot or Low-Priority Instances: While the name varies, these are instances that can be taken away while in use, but this inconvenience comes with a significant discount. Applications must be able to handle the interruption and grab a new instance when preempted. Technically this isn't a savings program, but it is a savings opportunity.
Complaining: OK, this is an odd category, but I once bitterly complained to our cloud vendor about the network costs they charge for accessing data in the cloud. Following my tirade, the cloud vendor offered to adjust the costs. The moral of the story is not to assume that prices are immutable. Demand what you think is fair.
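To make the trade-off behind commitment-based discounts such as savings plans and reserved instances concrete, here is a minimal Python sketch. The hourly rate and discount are made-up figures, not any vendor's actual pricing, and the function names are hypothetical. It assumes a commitment is billed for every hour of the year whether or not the resource runs, while on-demand is billed only for hours used.

```python
HOURS_PER_YEAR = 8760  # 365 days * 24 hours


def on_demand_cost(hourly_rate: float, utilization: float) -> float:
    """Yearly cost when paying only for the fraction of hours actually used."""
    return hourly_rate * HOURS_PER_YEAR * utilization


def committed_cost(hourly_rate: float, discount: float) -> float:
    """Yearly cost of a commitment: every hour is billed, at a discounted rate."""
    return hourly_rate * (1 - discount) * HOURS_PER_YEAR


def break_even_utilization(discount: float) -> float:
    """Utilization above which the commitment is cheaper than on-demand."""
    return 1 - discount


# Illustrative figures only (assumed, not real pricing).
rate = 0.10       # on-demand price per hour
discount = 0.40   # discount offered for a one-year commitment

for utilization in (0.5, 0.9):
    cheaper = (
        "commitment"
        if committed_cost(rate, discount) < on_demand_cost(rate, utilization)
        else "on-demand"
    )
    print(f"utilization {utilization:.0%}: {cheaper} is cheaper")
```

Under these assumptions, a 40% discount only pays off when the resource runs more than 60% of the time. For workloads that run less than that, on-demand (or spot) capacity is the cheaper choice, which is why commitments should follow utilization analysis rather than precede it.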
Companies ought to employ most or all of the strategies above. They must also actively manage utilization. Below are common strategies for minimizing costs through utilization management. All items below assume usage is actively monitored using cloud vendor cost analysis tools, third-party tools, custom tools, or likely a combination of these.
Rightsizing: For some resource types, such as virtual machines and databases, overprovisioning is relatively easy to spot. If you allocate larger machines or more resources than are needed, you’re overpaying, and those resources need to be rightsized.
Business Need Justification: Should the servers, databases, storage, services, etc., even be running? Is there a business justification? This type of waste is harder to quantify but essential to control. A process is needed to purge and prevent unnecessary accumulation of resources.
Storage Lifecycle Management: Public cloud vendors offer many tiers of storage, some at very low cost for infrequently accessed data. Notable savings can come from managing the data lifecycle and transitioning data to lower-cost tiers as it ages. Keep an eye out for backups and system snapshots, which can grow costs quickly if not managed.
Latest Hardware: Cloud vendors continuously upgrade their hardware, for example, processors and storage. Since newer hardware is normally more efficient to run, cloud customers will often pay the lowest prices if they use the latest equipment. An ongoing program is needed to plan and manage these transitions.
Automated Quiescing: Many resources can be shut down during evenings and weekends, for example, development or sandbox regions. Automated tools can be used to shut down these resources during off hours. This approach depends on reliable resource tagging.
Autoscaling: Effective cloud usage is predicated on the idea of autoscaling, or scaling resources up or down just in time to avoid under- or overprovisioning. Adding support for autoscaling often requires rearchitecting the system, which leads to the next utilization management technique.
Rearchitecting: Some applications and systems are simply not designed to use the cloud efficiently. Such applications need to be rearchitected to benefit from cloud features and run cost efficiently. Cloud principles and governance are needed to ensure that applications are only migrated to the cloud when appropriate and with the appropriate architecture.
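As an illustration of automated quiescing and why it depends on reliable tagging, here is a minimal Python sketch of the decision logic. The tag key `environment`, the eligible values, and the business hours are all assumptions chosen for the example. A real implementation would call the cloud vendor's API to list and stop resources, which is omitted here.

```python
from datetime import datetime
from typing import Mapping

# Hypothetical tag convention: only these environments may be shut down.
QUIESCE_ENVIRONMENTS = {"dev", "sandbox"}
BUSINESS_START_HOUR = 7   # assumed start of the working day (local time)
BUSINESS_END_HOUR = 19    # assumed end of the working day (local time)


def is_off_hours(now: datetime) -> bool:
    """True on weekends and outside business hours on weekdays."""
    if now.weekday() >= 5:  # Monday is 0, so 5 and 6 are Saturday and Sunday
        return True
    return not (BUSINESS_START_HOUR <= now.hour < BUSINESS_END_HOUR)


def should_shut_down(tags: Mapping[str, str], now: datetime) -> bool:
    """Shut down only resources explicitly tagged as eligible, during off hours.

    Untagged or mistagged resources are left running, so missing tags mean
    missed savings rather than an accidental production outage.
    """
    environment = tags.get("environment", "").lower()
    return environment in QUIESCE_ENVIRONMENTS and is_off_hours(now)
```

Note the design choice: the function fails safe. A resource without an `environment` tag is never stopped, so the savings from this technique are only as good as the tagging discipline behind it.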
The Missing Discipline
I’d like to share an insight I wish I had many years ago when I began leading a large cloud deployment. Cloud costs must be viewed from two perspectives. The first, which was the topic of the previous section, is all about ensuring competitive cloud pricing and appropriate utilization. The second perspective is about a company’s cloud philosophy and how that philosophy will change work processes and controls. I know that needs explaining, so allow me to drill in.
Let’s view cloud deployment philosophies on a continuum. On one extreme end, the cloud can be managed just like on-premises infrastructure and platforms. Let’s call this the status quo extreme. In the status quo case, only a centralized group can deploy cloud resources and assign permissions. Budget is confirmed before deploying resources, and project members are given very limited and specific permissions. From a cost perspective, this option generally makes it easiest to show an apples-to-apples cost comparison between cloud and on-premises because the underlying ways of working stay the same even if the technology is different.
I can almost hear many readers questioning the value of moving to the public cloud under the status quo option, which brings us to the other extreme. We’ll call it the maximum entitlements extreme. In this case, work transformation, agility, and innovation are emphasized. As a result, workers are given full access to their own environments and have permission to spin up whatever they need. I’ve noticed new cloud deployments tend to lean more toward this extreme. Unanticipated cloud cost growth can be significant in this option. Importantly, a cloud deployment following this approach can’t really be compared financially to an on-premises model, as the two are wildly different. Giving cloud users permission to create new resources at any time increases total costs, even if per-resource costs are lower in the cloud than on-premises, because the number of resources grows. Deployments leaning toward this philosophy can be expensive, and as costs climb, companies wrongly conclude that the cloud itself is too expensive. A more appropriate view would be that the new ways of working in the cloud are creating unanticipated costs and it's necessary to revisit the cloud policy and associated work processes.
So then, the question isn’t merely whether the cloud is economical, but whether the policies and ways of working are deliberate, well understood, and factored into the financial model. To wrap this up, let me suggest a few areas that must be addressed to more fully attain reasonable and predictable cloud costs.
Cloud Use Policy: A cloud use policy can be a very broad document including topics such as security, governance, controls, etc. Of particular interest is the policy regarding entitlements given to cloud users. For example, are developers given their own accounts, subscriptions, or environments? Are they allowed to spin up resources? If so, what resources can they create? Are their environments permanent or only allowed for a predefined timeframe and then automatically purged? These are just examples of policy questions and content, though I don’t often see these topics clearly specified.
Delivery Process Integration: Many organizations use rearview mirror tactics to manage cloud costs. In other words, they focus on cloud billing information and utilization data, which are historical. Effective cost management requires cost controls to be integrated with project, product, and change processes such that costs are understood, approved, and planned before cloud resources are allocated. For example, during a project, the solution design can be reviewed and used to produce a cloud cost estimate for the project. This estimate can be approved with the budget allocated prior to spinning up resources in the cloud. This contrasts with a typical approach where cloud resources are added in an uncontrolled manner, creating overruns and surprises.
Budget Monitoring: Once budgets have been established as described above, they can be codified within the cloud platform so that automated alerts are created should deployments exceed budgeted amounts. This alerting feature is offered by public cloud vendors and obviously is only relevant if budgets are established in advance.
Centralized Spend Optimization: Companies often centralize management of cloud pricing and contracting but decentralize utilization management, delegating it to business units or departments. Since cloud resources are tagged with cost centers or other financial identifiers, the idea is that business units should be accountable for managing their own spend. In my experience, cloud utilization management and optimization entail more complexity and specialized skills than can be expected of all departments. While spend management is a partnership, companies tend to leave money on the table when they expect every department to become cloud efficiency experts.
Resource Allocation for Cloud Hygiene: The following scenario plays out repeatedly across companies. Changes to an application or system are needed to maintain favorable cloud costs, but application owners are too busy to participate in changes. For example, an application may benefit from being moved to the latest instance type, but team resources have no time allocated for this work. Resource allocation processes must account for time allocations needed to maintain a cost-effective and well-run cloud.
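The budget monitoring idea above can be sketched in a few lines of Python. This is only an illustration of the logic, not any cloud vendor's budgeting API: the `Budget` type, the cost center names, and the 80% warning threshold are assumptions. In practice the vendor's built-in budget alerts would do this work, fed by budgets approved through the delivery process.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class Budget:
    cost_center: str         # financial identifier carried on resource tags
    approved_monthly: float  # amount approved before resources were allocated


def budget_alerts(
    budgets: List[Budget],
    spend_by_cost_center: Dict[str, float],
    warn_ratio: float = 0.8,  # assumed threshold for an early warning
) -> List[Tuple[str, str]]:
    """Compare month-to-date spend with approved budgets and return alerts."""
    alerts = []
    for budget in budgets:
        spend = spend_by_cost_center.get(budget.cost_center, 0.0)
        if spend > budget.approved_monthly:
            alerts.append((budget.cost_center, "over budget"))
        elif spend >= warn_ratio * budget.approved_monthly:
            alerts.append((budget.cost_center, "approaching budget"))

    # Spend that was never budgeted is the "uncontrolled" case described above.
    budgeted = {budget.cost_center for budget in budgets}
    for cost_center in sorted(spend_by_cost_center):
        if cost_center not in budgeted:
            alerts.append((cost_center, "no approved budget"))
    return alerts
```

The last loop is the part that rearview-mirror reporting tends to miss: spend under a cost center with no approved budget is flagged regardless of how small it is, because it signals resources that bypassed the approval process.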
In summary, maintaining cloud cost efficiency requires more than just negotiating favorable pricing and monitoring usage. It requires a deliberate and clear policy, adaptation of delivery, change, and resourcing processes, and a skilled and empowered centralized team driving continuous efficiencies. Active management of cloud pricing and utilization is important. Yet, efficiency will not be attained until a clear policy has been established and work processes have been adjusted to enable efficient implementation of the policy.
I hope you found this post informative. Reach out to me if you have questions or feedback.

