At the beginning of your cloud migration journey? A focus on building five fundamentals — training, connectivity, security, monitoring, and cost management — can go a long way toward helping you overcome the obstacles that slow and disrupt change.
In my years working in IT consulting and services, and recently in AWS, I have seen a common inertia that works against change — FUD. Fear, uncertainty, doubt.
FUD is a human fallacy, one that cannot be solved in a post like this. But I will offer some learnings that might help reduce the inertia, so that you can spend less energy on proving why a change cannot be done and more energy on building, and thus changing.
In the context of this blog, change refers to migrating to cloud. Why am I using the words “migrating to cloud” and “change” interchangeably? Because, like a forcing function, migrating to cloud will bring change to some properties of your organization. It has to.
If I sound opinionated, then I am. That opinion stems from seeing too many organizations resisting this change and miserably failing to reap the benefits of the cloud.
Also, don’t get too hung up on the word “migration”. It may not necessarily mean forgoing complete reliance on your beloved on-premises IT infrastructure. How much reliance you can forgo is predicated on what your motivation to migrate to the cloud is.
For the reader, a clarification before getting into the steps of change: there is no “easy” button to migrate to the cloud; any cloud. You migrate your production workloads to cloud not because it is easy to do, but because it is the right thing to do.
To catch you up, in case you’re coming out of a long cryo-sleep: running a business on cloud has moved past the “proof of concept” stage. There are thousands of organizations from every imaginable vertical segment running businesses on cloud.
For some, that may still not be enough justification to migrate to the cloud. Others truly do not know how to operate on cloud or how to organize themselves to get the best out of this change. Whatever your situation, I recommend performing due diligence to identify what your business motivations are to be on cloud. Doing this exercise should not cost you an arm and a leg.
This due diligence, if done correctly, will form strong fundamentals that will pay forward. In many cases, these motivations should come straight out of your organizational goals or business strategy.
I do not believe there is an absolute framework or questionnaire to help identify your motivation. However, there is one rule you need to follow — do not identify a tech to form fit your motivation. Your motivation should be tech agnostic, focusing only on what you want your business to achieve. Not how.
Ask questions like: “Do we want to reach more customers in a shorter time?”, “Do we want to add features to our services or products quicker?”, “Do we want to reduce our cost of operations?”, “Do we want to increase the resilience of our operations?”, etc. Depending on business unit or geo, or however else your organization may be internally sliced, you will most likely get differing responses reflecting what is important to each slice.
At the highest level of your leadership, these answers will need to be stacked to yield the final ranking of motivations in line with the org’s business goals. If you haven’t guessed it already, doing this phase in lockstep with your business and IT leadership is of paramount importance. I cannot stress enough the importance of bringing the village together at this stage. It can help you avoid a number of issues down the line, like “the great stall”, “pockets of change”, and most importantly, frustrated souls.
If you’re unbiased in your due diligence but still do not identify any motivation, then exit(). Please. Do not throw yourself into the predicament of “Hey, let’s go to cloud because everyone else is”.
Pilot your cloud migration fundamentals
You will find that the technical details below are indexed on AWS, the technology that I am familiar with. Other cloud providers may have equivalent services of their own.
To be clear, this is not a pilot of whether cloud will work for your business. If you’re past the motivation phase, it will. I’m calling it a pilot partly because I do not know a better phrasing and partly because I recommend not undertaking full-blown projects unless you have built certain fundamentals: training, connectivity, security, monitoring, and cost management. These fundamentals are valid irrespective of your motivation.
Cloud fluency training
Training: Remember FUD? A number of organizations face FUD headwinds because they don’t spend time training their employees and leaders. Arguably, this fundamental is the most important and the most often overlooked. I have written another piece on how to build cloud fluency for everyone. See, the thing is, cloud has shaken up how you build. This is because cloud providers like AWS have invested so much capital in building elasticity, redundancy, resilience, security, and global reach that these have become table stakes. Accordingly, customers who build on AWS expect these primitives as defaults and are no longer hamstrung for innovation.
If you adopt cloud thinking that it is just servers hosted elsewhere, you need to work on this fundamental. ACG, Linux Academy, and your cloud provider (e.g., AWS training) will have training courses and certifications. Build a training plan intentionally, buy training hours, and reimburse employees who pass certification exams. Do not cheap out.
Cloud data migration
Connectivity: You need to connect to the cloud. The type of connection you will use between your on-premises IT infrastructure and the cloud will morph over time. Connectivity to the cloud is generally a function of 1) type of workload (i.e., applications and data) you will be running on cloud vs on-premises, 2) minimum data transfer rate/throughput you need (also dependent on the workload or the use case), 3) geographic spread of users or customers, 4) existing network protocols, network devices, and their deployment, and 5) tolerance on where your data traverses (i.e., over the internet or exclusively on the pipes of your cloud provider).
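To make the throughput factor concrete, here is a rough back-of-the-envelope sketch. It estimates how long a bulk data move would take over a given link. The 80% utilization default and the 50 TB dataset are assumptions picked purely for illustration; plug in your own numbers.

```python
def transfer_days(data_terabytes: float, link_mbps: float, utilization: float = 0.8) -> float:
    """Estimate the days needed to move a dataset over a network link.

    utilization is the assumed fraction of nominal bandwidth actually achieved.
    """
    if link_mbps <= 0 or not 0 < utilization <= 1:
        raise ValueError("link_mbps must be positive and utilization in (0, 1]")
    bits = data_terabytes * 1e12 * 8              # decimal terabytes to bits
    effective_bps = link_mbps * 1e6 * utilization  # usable bits per second
    return bits / effective_bps / 86_400           # seconds to days


if __name__ == "__main__":
    # Hypothetical 50 TB dataset over three link sizes.
    for mbps in (100, 1_000, 10_000):
        print(f"{mbps:>6} Mbps: {transfer_days(50, mbps):.1f} days for 50 TB")
```

Under these assumptions, 50 TB over a 100 Mbps link takes roughly 58 days, which is the kind of number that tends to settle the "internet VPN or dedicated connection" debate quickly.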
Training your network engineering team on connecting to the cloud is an investment you should make early. You will have employees who will jump to the conclusion that embracing cloud = their exit. Training them early can help calm their fears and bring them around to seeing cloud as an opportunity rather than a threat.
Additionally, ideas such as the programmable network and the separation of the network control plane from the data plane have advanced further on cloud. Remember that networking is the fundamental fabric which, when done correctly, can orchestrate the coexistence of your on-premises IT with your cloud resources. Investing in the people who build and operate the network is important.
Cloud migration security
Security: If you’re running a business, you need to be serious about security. It helps to view security fundamentals through the lens of operation, prevention, detection, and response. The good thing with cloud is that you don’t need to worry about security “of” the cloud, i.e., physical security, infrastructure security, host operating system, and virtualization. AWS calls this the shared responsibility model, in which AWS takes care of security “of” the cloud and the customer is responsible for security “in” the cloud. A customer’s responsibility for security “in” the cloud differs based on the AWS service(s) they choose to use.
In addition, cloud providers all have an access management tool, like the AWS Identity and Access Management (IAM) service, which customers use to set up permissions. Typically, you will already have an on-premises identity store (like a Microsoft Active Directory) which you will want to extend for cloud use through identity federation. All cloud providers have established design patterns for connecting (or “trusting”) your on-premises identity store to your cloud resources. These will form the primitives of operation and prevention.
Note that the concept of an account is pervasive in the cloud world. Within AWS, an account is an envelope within which everything happens. AWS has developed a set of best practice (recommended) guardrails for customers to use when vending accounts. These best practices are wrapped in a service called AWS Control Tower. Control Tower lets you vend accounts while conforming to these guardrails as you choose.
It is a good idea to review these guardrails and build your account vending workflow right from the get-go. Start with the minimum permissions and add as needed as you grow. These guardrails form the basis of prevention and detection. At this time, review AWS Organizations too. As the name suggests, this service allows you to organize your multiple org units, apply policies at org unit level, attach accounts to Amazon GuardDuty as members for continuous monitoring, and create a separation of concerns (dev concern, test concern, security/cleanroom concern, etc.).
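As an illustration of "start with the minimum permissions", here is a minimal sketch that builds an IAM policy document granting read-only access to a single S3 bucket and nothing else. The bucket name and the statement IDs are hypothetical; the `Version`/`Statement`/`Effect`/`Action`/`Resource` layout is the standard IAM policy structure.

```python
import json


def read_only_bucket_policy(bucket_name: str) -> dict:
    """Build an IAM policy document that allows reading from one bucket only."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "ListOneBucket",
                "Effect": "Allow",
                "Action": ["s3:ListBucket"],
                "Resource": [f"arn:aws:s3:::{bucket_name}"],
            },
            {
                "Sid": "ReadObjectsInOneBucket",
                "Effect": "Allow",
                "Action": ["s3:GetObject"],
                "Resource": [f"arn:aws:s3:::{bucket_name}/*"],
            },
        ],
    }


if __name__ == "__main__":
    # "example-pilot-bucket" is a hypothetical name used only for illustration.
    print(json.dumps(read_only_bucket_policy("example-pilot-bucket"), indent=2))
```

Anything not explicitly allowed is denied by default, so widening access later means adding a statement rather than hunting for one to remove.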
You’ll also need to be able to respond to security events. Security is not the airplane that you build as you fly. You need to pre-build the capability to perform forensics in a secure environment using runbooks, IAM, and AWS Security Hub.
One last thing, whatever you do within this fundamental pillar, please AUTOMATE. When it comes to security, humans may have the best intentions, but humans do not have mechanisms. Automation has mechanism. Automation can support scale. Automation can trust but verify. Security (and by extension compliance) is done best when you codify. Security by Design (SbD) is your friend here. AWS has a defined posture for SbD. Following SbD, 1) you define what “secure” looks like to you, 2) you codify those standards using infra-as-code (like CloudFormation or Terraform), 3) you harden your infrastructure, accounts, and systems by deploying only that code and allowing use of only curated services (by using something like AWS Service Catalog), and 4) you continuously monitor for drift of your deployed environment from your security baseline (using something like AWS Config).
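To show what step 4 means in the abstract, here is a toy sketch of drift detection: compare a codified baseline against what is actually deployed and report the differences. This is not how AWS Config is implemented or called; the setting names and values are hypothetical, and a real setup would pull the deployed state from your cloud provider.

```python
from typing import Any


def find_drift(baseline: dict[str, Any], deployed: dict[str, Any]) -> dict[str, tuple[Any, Any]]:
    """Return settings whose deployed value differs from the baseline.

    Each entry maps a setting name to (expected, actual). A setting missing
    from the deployed environment is reported with actual set to None.
    """
    return {
        key: (expected, deployed.get(key))
        for key, expected in baseline.items()
        if deployed.get(key) != expected
    }


if __name__ == "__main__":
    # Hypothetical baseline and observed settings, for illustration only.
    baseline = {"bucket_encryption": True, "public_access_blocked": True, "mfa_required": True}
    deployed = {"bucket_encryption": True, "public_access_blocked": False}
    for name, (expected, actual) in find_drift(baseline, deployed).items():
        print(f"DRIFT {name}: expected {expected!r}, found {actual!r}")
```

The point of the sketch is the shape of the mechanism: the baseline lives in code, and the comparison runs on a schedule rather than when someone remembers to look.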
Good luck doing that manually.
Monitoring: Monitoring influences how quickly you can preempt issues and resolve them (fancy lingo: MTTR — mean time to resolution). Till you have transitioned into DevOps (simply put, equitable ownership between those who build and those who support), you will likely have a separate NOC and a SOC team. Traditionally these teams sit outside of dev teams in smoky backrooms, “speak” to each other only when there’s an issue in the middle of the night, or when there is pizza on the floor. If you do not change that approach, you will stymie innovation and stop scaling after a while. But ultimately that’s your prerogative. However, if you start the transition to DevOps using that forcing function of change I spoke about earlier, monitoring should be as sexy as development.
Monitoring overlays a few areas of cloud and is integrated with many of the services you may use (Amazon CloudWatch, for example, is integrated with close to 99 services as of this writing). During the development of the fundamentals, you should likely index on infrastructure, security, and cost management monitoring. When you first set up infrastructure monitoring, you can choose the default cloud provider metrics. Within AWS there are about 296 default CloudWatch metrics at the time of this writing across Elastic Load Balancer, EC2, Elastic Block Store, S3, and more services. During the early days of cloud migration, these metrics should be sufficient to get you started. I alluded to security monitoring above. As for cost management monitoring within CloudWatch, you can get a view of estimated charges by service, but that alone is not sufficient. I will expand on cost monitoring next.
Note that with CloudWatch you will be able to enable cross-account, cross-region monitoring, which gives you the ability to create a single pane of glass for monitoring without the need to switch accounts. (Regions in AWS are physical locations around the world where AWS clusters data centers. Regions help create cellular infrastructure and, among other things, have built-in redundancy to overcome failures. Regions are a topic of their own, maybe for later.) CloudWatch also gives you the ability to automatically respond to events based on alarms using rules. This ability can significantly reduce your operational complexity in the production environment.
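As a sketch of what an alarm on a default metric can look like, the snippet below builds the arguments for a CloudWatch alarm on EC2 CPU utilization and hands them to boto3's `put_metric_alarm`. The instance ID, SNS topic ARN, 80% threshold, and alarm name are all hypothetical. The function that calls AWS is defined but not invoked, since it needs boto3 and credentials.

```python
def cpu_alarm_params(instance_id: str, action_arn: str, threshold_percent: float = 80.0) -> dict:
    """Build keyword arguments for CloudWatch's put_metric_alarm call."""
    return {
        "AlarmName": f"high-cpu-{instance_id}",
        "Namespace": "AWS/EC2",
        "MetricName": "CPUUtilization",
        "Dimensions": [{"Name": "InstanceId", "Value": instance_id}],
        "Statistic": "Average",
        "Period": 300,                 # seconds per datapoint
        "EvaluationPeriods": 3,        # consecutive periods that must breach
        "Threshold": threshold_percent,
        "ComparisonOperator": "GreaterThanThreshold",
        "AlarmActions": [action_arn],  # for example, an SNS topic to notify
    }


def create_alarm(params: dict) -> None:
    """Create the alarm. Requires boto3 and AWS credentials, so it is not run here."""
    import boto3  # imported lazily so the sketch loads without boto3 installed

    boto3.client("cloudwatch").put_metric_alarm(**params)


if __name__ == "__main__":
    # Hypothetical instance ID and SNS topic ARN, for illustration only.
    params = cpu_alarm_params(
        "i-0123456789abcdef0",
        "arn:aws:sns:us-east-1:111122223333:ops-alerts",
    )
    print(params["AlarmName"], params["Threshold"])
```

Keeping the parameters in a plain function makes it easy to stamp out the same alarm for every instance from your infra-as-code pipeline instead of clicking through the console.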
Cloud migration cost management
Cost management: Monitoring and controlling the costs of your cloud resources must be within the top two or three things you do when you migrate to cloud. Spinning up a cloud environment is made easy by making procurement equal for everyone: a click-through online agreement. This section does not deal with what Enterprise Agreement you may enter into with your cloud provider and what benefits you may receive from that agreement, but know that every provider has one.
I will focus on cost management that you need to do irrespective of what agreement you are on. Cost management can be looked at through the lens of cost control, cost analytics, and reporting. If you’re on AWS, sign up to use AWS Organizations. It is a free service that not only gives you many benefits of multi-account security and governance described above, but also signs you up for consolidated billing.
From a cost control standpoint, use AWS Budgets. Budgets allow you to set cost budgets by account (or for about 28 services) to notify you when the charges reach a set threshold amount within a monthly billing period. You can also set usage budgets based on set usage types, about 24 of them across EC2, S3, and DynamoDB. When creating a cost or usage budget you can set an action (called a Budget Action) to take automatic measures when your budget threshold is exceeded. Although the kinds of actions you can take are limited at this point, Budget Actions let you attach Service Control Policies (SCPs) to a specific OU or account, which can be a powerful way to stop the bleeding of cost by, as an example, denying permissions to put new objects in S3 buckets.
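Here is a small sketch of the two ideas in that paragraph: checking spend against budget thresholds, and the kind of SCP document you might attach when a threshold is crossed. The budget figures, the thresholds, and the statement ID are hypothetical, and the threshold check is plain arithmetic for illustration, not a call to the AWS Budgets service.

```python
import json


def crossed_thresholds(budget: float, actual_spend: float, thresholds_percent: list[float]) -> list[float]:
    """Return the alert thresholds (as percentages of budget) that spend has reached."""
    if budget <= 0:
        raise ValueError("budget must be positive")
    percent_used = actual_spend / budget * 100
    return [t for t in sorted(thresholds_percent) if percent_used >= t]


def deny_s3_puts_scp() -> dict:
    """Build a service control policy document that denies new S3 object uploads."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "DenyNewS3Objects",
                "Effect": "Deny",
                "Action": "s3:PutObject",
                "Resource": "*",
            }
        ],
    }


if __name__ == "__main__":
    # Hypothetical figures: a 1,000 USD monthly budget with 850 USD spent so far.
    hit = crossed_thresholds(1000.0, 850.0, [50, 80, 100])
    print("Thresholds reached:", hit)
    if 80 in hit:
        print(json.dumps(deny_s3_puts_scp(), indent=2))
```

A blanket deny like this is a blunt instrument, so it is worth deciding in advance which accounts or OUs can tolerate it (sandboxes, usually) and which cannot (production, usually).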
Use AWS Cost Explorer (CE) to analyze cost and usage, covering the cost analytics perspective. CE can provide you a view into reservations (for instance-based compute resources like EC2, RDS, Redshift, Elasticsearch, etc.) and rightsizing recommendations. It also lets you view the utilization of purchased reservations and how much of your actual eligible usage those purchased reservations cover. Recommendations are forward-looking, while Utilization and Coverage are after the fact.
Cost Allocation Tags (CATs) within AWS are an additional feature that I recommend using and, in fact, automating through the use of infra-as-code. A CAT is simply a key-value pair, defined by you, that is treated like metadata for your AWS resources. These tags propagate through emitted usage and end up in your cost and usage analytics (Cost Explorer, Budgets, Cost and Usage Reports) so that you can view and analyze cost and usage in a human-understandable way. CATs also greatly reduce the complexity of chargebacks, if that is your thing. You should focus on using reactive tag governance too, to tag resources that somehow slip past the initial proactive tagging. AWS Config, Tag Editor, and the Resource Tagging API can be used for reactive tagging.
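To illustrate why tags pay off, here is a toy sketch of reactive tag governance and a tag-based chargeback. The required tag keys and the inventory records are hypothetical; in practice the data would come from something like the tagging tools named above or your cost reports.

```python
from collections import defaultdict

REQUIRED_TAGS = {"CostCenter", "Environment"}  # hypothetical tag keys


def missing_tags(resource: dict) -> set[str]:
    """Return the required tag keys that a resource does not carry."""
    return REQUIRED_TAGS - set(resource.get("tags", {}))


def cost_by_tag(resources: list[dict], tag_key: str) -> dict[str, float]:
    """Sum cost per value of one tag; untagged spend is grouped under 'untagged'."""
    totals: dict[str, float] = defaultdict(float)
    for resource in resources:
        totals[resource.get("tags", {}).get(tag_key, "untagged")] += resource["cost"]
    return dict(totals)


if __name__ == "__main__":
    # Hypothetical inventory records; real data would come from your tooling.
    inventory = [
        {"id": "vm-1", "cost": 120.0, "tags": {"CostCenter": "retail", "Environment": "prod"}},
        {"id": "vm-2", "cost": 30.0, "tags": {"CostCenter": "retail"}},
        {"id": "db-1", "cost": 75.5, "tags": {}},
    ]
    for item in inventory:
        if missing_tags(item):
            print(item["id"], "is missing", sorted(missing_tags(item)))
    print(cost_by_tag(inventory, "CostCenter"))
```

The "untagged" bucket is the number to watch: the larger it is, the less your chargeback report can be trusted.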
Finally, AWS Cost and Usage Reports (CUR) let you view cost and usage data in the most comprehensive form. Do me a favor: please do not try to work with the CUR CSV at the end of the month. Integrate with Amazon Athena instead, so that you can query the data.
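As a sketch of what querying CUR through Athena might look like, the snippet below builds a SQL query that totals cost per service for one month and submits it with boto3. The table name, database, and S3 output location are hypothetical. The column names (`line_item_product_code`, `line_item_unblended_cost`, `line_item_usage_start_date`) are what I would expect from the CUR-to-Athena setup, but treat them as assumptions and check them against your own table. The inputs are interpolated directly into the SQL, so only pass values you control.

```python
def monthly_cost_by_service_sql(table: str, month_start: str, next_month_start: str) -> str:
    """Build an Athena SQL query that totals unblended cost per service for one month.

    Dates are ISO strings such as "2024-01-01". Column names are assumed to follow
    the CUR-to-Athena naming; adjust them to match your table.
    """
    return (
        "SELECT line_item_product_code AS service, "
        "SUM(line_item_unblended_cost) AS cost "
        f"FROM {table} "
        f"WHERE line_item_usage_start_date >= TIMESTAMP '{month_start} 00:00:00' "
        f"AND line_item_usage_start_date < TIMESTAMP '{next_month_start} 00:00:00' "
        "GROUP BY line_item_product_code "
        "ORDER BY cost DESC"
    )


def run_query(sql: str, database: str, output_s3_uri: str) -> str:
    """Submit the query to Athena and return its execution ID.

    Requires boto3 and AWS credentials, so it is not called in this sketch.
    """
    import boto3  # imported lazily so the sketch loads without boto3 installed

    response = boto3.client("athena").start_query_execution(
        QueryString=sql,
        QueryExecutionContext={"Database": database},
        ResultConfiguration={"OutputLocation": output_s3_uri},
    )
    return response["QueryExecutionId"]


if __name__ == "__main__":
    # "cur_db.cur_table" is a hypothetical table name.
    print(monthly_cost_by_service_sql("cur_db.cur_table", "2024-01-01", "2024-02-01"))
```

Once a query like this exists, it can run on a schedule and feed a dashboard, which beats opening a spreadsheet on the last day of the month.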
There are additional resources that can help you with indirect cost management, like AWS Trusted Advisor, which can provide reports on cost optimization opportunities. There are various other ways to stop the waste of AWS resources and to keep you from overprovisioning, but those are mainly relevant for specific workloads and may not be items specific to the fundamentals pilot: Reservations and Savings Plans are great ways to get deeper discounts off of on-demand rates for certain types of services. EC2 Spot Instances are often overlooked, but if your workload can tolerate faults, then Spot Instances can save you up to 90% off on-demand rates. If you believe in artistry in science, then check out AWS Cost Anomaly Detection, which detects anomalies using ML through contextualized monitors you create and alerts you before you get sticker shock. S3 can analyze access patterns and use lifecycle policies to move infrequently used objects to colder storage, yielding a lower cost of ownership of content.
If it seems too much work to migrate to cloud, well it is, if you do it right. Also, in your path to migration you will reach milestones that will present unique challenges and opportunities for learning. This blog focuses on the first milestone — to get started. As you reach further milestones, new technical and business decisions will need to be made.