Whether you’re just getting started in software engineering or you’ve been around a long time, you’ve probably at least heard the terms “Infrastructure as Code” and “Terraform” mentioned. But what are they, and why are they important?
Terraform is an Infrastructure as Code (IaC) tool that allows engineers to define their software infrastructure in code. While the idea of “code” may not be novel to engineers, the ability to provision infrastructure this way is a powerful abstraction that enables managing large distributed systems at scale.
In this article, we’ll take a look at what Infrastructure as Code and Terraform are, how they can help you in your work as a developer, and how you can get started using them.
What is Infrastructure as Code?
Infrastructure as Code is a way of defining and managing your infrastructure using code, rather than manual processes like clicking through a UI or using the command line. This means that you can manage your infrastructure in the same way that you manage your application code – with version control, automation, and collaboration. In other words, infrastructure as code is a way of making your infrastructure more like software.
Historically, Infrastructure as Code has seen many iterations, starting with configuration management tools like CFEngine, Chef, Puppet, Ansible, and Salt. Newer tools like CloudFormation and Terraform take a declarative approach and focus on the actual provisioning of resources, as opposed to the configuration of existing ones. The newest generation of tools focuses on using the capabilities of existing imperative programming languages. AWS and Terraform both provide Cloud Development Kits (CDKs), and Pulumi is also a popular option for provisioning infrastructure with traditional software tools.
What is Terraform?
Terraform is a tool for provisioning, managing, and deploying infrastructure resources. It is an open-source tool written in Go and created by HashiCorp. With Terraform, you can manage infrastructure for your applications across multiple cloud providers – AWS, Azure, GCP, etc. – using a single tool.
To get started with Terraform, developers simply need to download the Terraform binary, choose which provider/platform they’ll be working with, and create some boilerplate configuration for that provider. From there, they can start writing infrastructure code.
One of the key features of Terraform is its declarative syntax. This means that you define what your desired end state is, and Terraform figures out the best way to achieve that. Compared to the imperative workflow of traditional programming languages, this can be a bit of a shift in mindset – but it enables managing infrastructure deployments at scale without the steep learning curve typical of software development.
The typical workflow for provisioning resources with Terraform is as follows:
- Some Terraform configuration is written, including the provider definition.
- The working directory (this is called the root module) is initialized with `terraform init`.
- Provider plugins are downloaded.
- The command `terraform plan` is run in the root module, generating a proposed plan to provision resources.
- If the plan is acceptable, the `terraform apply` command is run and resources are provisioned.
Terraform keeps track of the resources it manages in a state file. This file is essentially a large JSON data structure that maps the resources in your configuration to the real infrastructure objects they represent. Terraform compares the state against your configuration to work out proposed changes, and against live resources to detect out-of-band changes made outside of the Terraform configuration. New Terraform users typically maintain a local state file on their workstation or laptop. At scale, with multiple engineers managing infrastructure, the state is typically broken down into multiple files and stored remotely using services like Amazon S3.
Terraform is also “idempotent”, which means that repeated plan/apply cycles will not trigger a re-deploy of resources; only changes to the existing state will be reflected in a new plan and apply invocation.
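To make the plan/apply cycle and idempotency more concrete, here is a minimal Python sketch. It is a toy model, not how Terraform is implemented: the `plan` and `apply` functions, the resource names, and their attributes are all hypothetical. It only illustrates the idea that a plan is the difference between the desired configuration and the recorded state.

```python
# Toy model of a plan/apply cycle. This is NOT Terraform's implementation;
# every resource name and attribute below is hypothetical.

def plan(desired: dict, state: dict) -> dict:
    """Compare the desired configuration with the recorded state."""
    return {
        "create": sorted(k for k in desired if k not in state),
        "update": sorted(k for k in desired if k in state and desired[k] != state[k]),
        "delete": sorted(k for k in state if k not in desired),
    }


def apply(desired: dict, state: dict) -> dict:
    """Carry out the plan and return the new state."""
    changes = plan(desired, state)
    new_state = dict(state)
    for name in changes["create"] + changes["update"]:
        new_state[name] = desired[name]
    for name in changes["delete"]:
        del new_state[name]
    return new_state


desired = {
    "bucket.artifacts": {"versioning": True},
    "dns.frontend": {"ttl": 300},
}

state = apply(desired, {})          # first apply: both resources are created
second_plan = plan(desired, state)  # same configuration again: nothing to do

desired["dns.frontend"] = {"ttl": 60}
third_plan = plan(desired, state)   # only the changed resource shows up

print(second_plan)
print(third_plan)
```

Running the same configuration twice produces an empty second plan, which is the behavior the word “idempotent” describes here.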
Now that you’ve learned a little bit about what Terraform is and what it does, we can start to explore the “why” of Terraform and the benefits it provides.
What are the benefits of Terraform?
Utilizing Infrastructure as Code to manage and deploy infrastructure resources unlocks several benefits for developers and engineers. Making Terraform your tool of choice for IaC confers additional benefits that allow developers to leverage easy-to-use tools to manage complex software architecture.
1. Terraform configuration is written using a declarative paradigm.
The best way to understand declarative code is to compare it to the more familiar pattern of imperative logic that most modern programming languages use.
Imagine a basic Python program that takes a series of numbers as input from the user, prints each number out in ascending order, then returns the sum of all the inputs. A programmer needs to write the code in a specific, logical order, or the program will fail. If the program attempts to sum all the numbers before the input is received, it is likely to generate some kind of exception or error. If the programmer wants the numbers to be sorted into the correct order, that will need to occur before they are printed as output.
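A minimal sketch of such a program might look like the following. The function names are illustrative, and a hard-coded string stands in for the user’s input so the example runs without a prompt.

```python
def parse_numbers(raw: str) -> list[float]:
    """Step 1: turn input such as "3 1 2" into a list of numbers."""
    return [float(token) for token in raw.split()]


def print_ascending(numbers: list[float]) -> list[float]:
    """Step 2: sort first, then print. Printing before sorting would
    show the numbers in the wrong order."""
    ordered = sorted(numbers)
    for number in ordered:
        print(number)
    return ordered


def total(numbers: list[float]) -> float:
    """Step 3: sum the inputs. This can only run once they exist."""
    return sum(numbers)


raw = "3 1 2"  # stand-in for input("Enter numbers: ")
numbers = parse_numbers(raw)
ordered = print_ascending(numbers)
result = total(numbers)
print("Sum:", result)
```

Swapping any of the last four statements, for example calling `total` before `parse_numbers`, breaks the program. That ordering burden is what the imperative style places on the programmer.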
In each part of the program, the programmer has to define both the logic and the order in which it executes. In contrast, declarative syntax means the programmer only needs to “declare” the desired end state. The Terraform binary, playing a role similar to a compiler here, determines the order of operations needed to reach the end state described in the configuration. This is especially helpful when dealing with cloud provider APIs, as many cloud resources depend on other foundational resources existing before they can be created.
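One way to picture how a tool can derive an order of operations from declared dependencies is a topological sort. The sketch below uses Python’s standard `graphlib` module. The resource names are hypothetical, and this illustrates the general technique rather than Terraform’s internals.

```python
from graphlib import TopologicalSorter

# Each hypothetical resource maps to the set of resources it depends on.
# The dictionary order is deliberately "wrong" to show it does not matter.
dependencies = {
    "instance": {"subnet", "security_group"},
    "subnet": {"vpc"},
    "security_group": {"vpc"},
    "vpc": set(),
}

# static_order() yields every resource after all of its dependencies.
creation_order = list(TopologicalSorter(dependencies).static_order())
print(creation_order)
```

The author only declares what depends on what, and the sort guarantees that the `vpc` is handled before anything that needs it. The relative order of `subnet` and `security_group` is left open because neither depends on the other.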
2. HashiCorp Configuration Language (HCL) is a Domain Specific Language (DSL).
Domain-specific languages are designed with a specific use case in mind, specialized to handle the requirements and constraints of a specific application or program domain.
If the utility of a DSL doesn’t seem immediately obvious, consider one of the most famous and widely used DSLs: HTML. HTML is a markup language that focuses on the domain of hypertext (read: the web). HTML does away with complex program logic and syntax and focuses on the specific use case of content presentation.
In the case of Terraform, developers only need to learn a minimal amount of HCL syntax before they can be productive. As a DSL, HCL makes development easier and more efficient by abstracting away the complexity inherent in general-purpose languages. That’s not to say that Terraform can’t express more complex logic: advanced users can use newer constructs such as `for` expressions and conditional (if/then) logic.
HCL is generally described as JSON-compatible: it shares a similar structure and can be written in an equivalent JSON form, but it also has additional features beyond the scope of JSON.
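As a small illustration of the relationship with JSON, Terraform also accepts configuration written as JSON in files ending in `.tf.json`. The Python sketch below builds such a document for a single S3 bucket. The resource label `artifacts` and the bucket name are hypothetical, and the equivalent HCL is shown in a comment for comparison.

```python
import json

# Roughly equivalent HCL (hypothetical names):
#
#   resource "aws_s3_bucket" "artifacts" {
#     bucket = "example-deployment-artifacts"
#   }
config = {
    "resource": {
        "aws_s3_bucket": {
            "artifacts": {
                "bucket": "example-deployment-artifacts",
            }
        }
    }
}

# The text that could be saved as, for example, main.tf.json.
document = json.dumps(config, indent=2)
print(document)
```

The nesting of resource type, resource name, and arguments is the same in both forms. HCL adds conveniences such as comments and less punctuation on top.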
3. Terraform is widely adopted
Terraform is generally considered the industry standard when it comes to Infrastructure as Code tooling. It isn’t always good advice to go with the herd, but when it comes to technology implementation, choosing a tool with a large community, solid support base, and multi-year longevity is critically important. No one wants to have to explain to stakeholders and customers that the application is down because the code or platform is no longer supported!
The other benefit of wide adoption is the collective, shared knowledge of the community. Best practices are developed and shared, and an ecosystem of supporting tools and documentation can be built. A great example of the power of community support is the awesome-terraform repository on GitHub, a curated list of tools, libraries, documentation, blogs, and more.
4. Terraform enables immutability
Immutable infrastructure makes managing complex distributed systems easier and safer and allows them to scale much more reliably. What exactly is meant by “immutable infrastructure”, and how does Terraform enable it?
Consider a hypothetical scenario: a developer needs to deploy some changes to fix a production application. They create their changes locally, run some tests and linting to validate the changes and check syntax, and now they’re ready to deploy. However, to get their code to run in the staging environment, they have to change some configuration to point to the staging database. Then they need to give the QA team access to the staging servers. Everything manages to check out in the staging environment, but when deploying to production, disaster strikes: the application stops serving traffic and an outage occurs.
Was the original code bad? Or was it the changes made in staging? Because the lines between environments and stages were blurred, it’s nearly impossible to tell. Mutable infrastructure means changes can occur at any point in the lifecycle of an application or its infrastructure.
With immutable infrastructure, build, release, and deploy stages are kept separate. Once code changes are built, they are stamped with an immutable release tag. Further changes or fixes result in a new tag being generated. Developers and engineers know that a change that was made in a local development environment is the same change across different environments and deployment stages. The 12-factor app framework highlights this pattern in factor V.
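The idea of an immutable release tag can be sketched in a few lines of Python. This is an illustration only: the `Release` class, the `build` function, and the tag format are hypothetical and are not part of Terraform.

```python
import hashlib
from dataclasses import dataclass, FrozenInstanceError


@dataclass(frozen=True)
class Release:
    """A built artifact plus the tag it was stamped with."""
    artifact: bytes
    tag: str


def build(artifact: bytes) -> Release:
    """Stamp the artifact with a tag derived from its content."""
    digest = hashlib.sha256(artifact).hexdigest()[:12]
    return Release(artifact=artifact, tag=f"release-{digest}")


v1 = build(b"application code")

# Patching an existing release in place (say, in staging) is rejected.
try:
    v1.artifact = b"hotfix applied in staging"
    mutated = True
except FrozenInstanceError:
    mutated = False

# A fix goes through the build stage again and gets a new tag.
v2 = build(b"application code with fix")
print(v1.tag, v2.tag, mutated)
```

Because the tag is derived from the content, the same tag always refers to the same artifact in every environment, and any change is visible as a different tag.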
5. Terraform is modular
Modularity is an important feature in a variety of languages and systems. Abstracting logic and resources behind simple interfaces is one of the best ways to manage complexity at scale. As Terraform deployments grow more complex, developers can consider employing modules to encapsulate various resources in a reusable package.
A common use case is providing other development teams with Kubernetes clusters for testing and development. Normally, provisioning a Kubernetes cluster in a cloud provider like AWS requires a lot of boilerplate configuration and resources, even when using managed services. With modules, that configuration can be hidden behind a basic configuration interface. Developers can import the module, specify whatever inputs the module author has provided, and they can provision a complete stack without having to duplicate effort needlessly.
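As a loose analogy, a module behaves much like a function: a few inputs go in, and a complete set of resources comes out. The Python sketch below is only that analogy. The `kubernetes_cluster` function, its inputs, and the resources it returns are all hypothetical and far simpler than a real cluster.

```python
def kubernetes_cluster(name: str, node_count: int = 3) -> dict:
    """Hypothetical 'module': hides the boilerplate behind two inputs."""
    resources = {
        f"{name}-network": {"type": "network"},
        f"{name}-control-plane": {"type": "cluster", "network": f"{name}-network"},
    }
    for index in range(node_count):
        resources[f"{name}-node-{index}"] = {
            "type": "node",
            "cluster": f"{name}-control-plane",
        }
    return resources


# Two teams reuse the same definition with different inputs.
team_a = kubernetes_cluster("team-a")
team_b = kubernetes_cluster("team-b", node_count=5)
print(len(team_a), len(team_b))
```

Each team gets a full stack without copying the underlying definitions, and a fix to the shared definition reaches every consumer.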
How can I use Terraform?
Although it’s simple to get started, Terraform is a very powerful tool for provisioning infrastructure. The key for new users is to start small with basic configuration, and work towards fully automating their infrastructure as code.
New developers should start with something simple: a couple of basic resources without complex dependency chains. You don’t need to start by trying to manage your entire production application environment on day one. Try using Terraform to manage the S3 bucket where deployment artifacts are stored, or Google DNS records for a frontend website.
Once you’re comfortable, you can start to iterate and grow your Terraform usage. One of the first steps is to move from local state to a remote state file with locking. It’s virtually impossible to safely manage larger Terraform deployments with multiple users without the state having a locking mechanism. When a Terraform plan or apply is running, locks prevent other users from making changes that could result in an inconsistent state or corruption.
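The general idea behind state locking can be sketched with an exclusive lock file. This is a simplified illustration, not the mechanism Terraform or any particular remote backend uses, and the lock file name is hypothetical.

```python
import os
import tempfile
from contextlib import contextmanager


@contextmanager
def state_lock(lock_path: str):
    """Hold an exclusive lock for the duration of a plan or apply."""
    try:
        # O_CREAT | O_EXCL fails atomically if the file already exists.
        fd = os.open(lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    except FileExistsError:
        raise RuntimeError("state is locked by another run") from None
    try:
        yield
    finally:
        os.close(fd)
        os.remove(lock_path)  # release the lock


lock_path = os.path.join(tempfile.mkdtemp(), "demo-state.lock")  # hypothetical

with state_lock(lock_path):  # first user starts an apply
    try:
        with state_lock(lock_path):  # second user tries at the same time
            second_run_blocked = False
    except RuntimeError:
        second_run_blocked = True

lock_released = not os.path.exists(lock_path)
print(second_run_blocked, lock_released)
```

The second run is refused while the first holds the lock, and the lock disappears once the first run finishes, so later runs can proceed against a consistent state.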
With multiple users now contributing Terraform configuration, you can start to expand usage to cover more and more of your infrastructure resources, including networking, security, CDN, and more. As your infrastructure increases in complexity, consider encapsulating certain segments of your architecture in modules.
Finally, your team can start to use Terraform to provision resources as part of your CI/CD strategy. CI/CD is a complex topic that we won’t cover here, but GitLab and GitHub both provide great batteries-included solutions for deployment automation that can be used with Terraform. HashiCorp provides solid documentation specifically targeted at users who are considering automating their Terraform deployments.
Why provision infrastructure with clicks and manual processes while writing application code? Infrastructure as Code tools like Terraform mean that infrastructure configuration can be brought into the same development processes, allowing for testing, standardization, and scalability. The modern Infrastructure as Code ecosystem has a broad variety of resources for learning and getting started.
Want to learn more about Terraform?
Why not try studying for the HashiCorp Certified: Terraform Associate certification? Pluralsight offers an excellent cert prep course which teaches you everything you need to know about Terraform, including how to use it effectively in your own projects. It’s a great way to learn about Terraform, even if you don’t end up taking the certification exam.
Some other great Terraform resources to check out
- The Ultimate Terraform Commands Cheat Sheet
- Common Terraform Commands
- How to troubleshoot 5 common Terraform errors
- How to use Terraform outputs and inputs
- How to install Terraform on Windows, macOS and Ubuntu
- Getting Started with Terraform Cloud
- Implementing Terraform with AWS
- Implementing Terraform on Microsoft Azure
- Implementing Terraform with Google Cloud Platform
- Creating a Terraform Configuration for Multi-Cloud Use
- Creating an Azure Resource with Terraform
- Provision an AWS resource with Terraform
- Terraform Deep Dive
- Modifying AWS Terraform configurations
- Ansible vs. Terraform: Fight!
About the Author
Mike Vanbuskirk is a Lead DevOps engineer and technical content creator. He’s worked with some of the largest cloud, e-commerce, and CDN platforms in the world. His current focus is cloud-first architecture and serverless infrastructure.