Terraform Best Practices Series - Lessons from the Battlefield: Part 2
Strategic Insights from the Terraform Trenches: Part 2
This article is a continuation of Part 1 of the Terraform Best Practices Series.
Throughout my nearly five years in DevOps and infrastructure, I've been deep in the trenches of Terraform, crafting cloud infrastructure for startups and sprawling enterprises alike. From rookie missteps to triumphant victories, I've had a front-row seat to the power of Terraform's best practices.
In the early days, I learned the hard way that hasty changes can lead to production hiccups. But these experiences, though challenging, became invaluable lessons. They drove home the importance of meticulous planning and thorough testing before hitting that apply button. Here is my list of the top Terraform best practices I have learned through experience and from others.
Table of Contents
Terraform module practices
Configuring providers/backends in modules
Exposing labels as vars
Exposing outputs for resources
Inline submodules
Resources in root modules
Importing infrastructure
Dynamic blocks
Variables/naming validations
Variable separations
Using locals
Flexible modules
Git hooks
Using examples in folder
Module referencing
Testing & Governance
Starting small: static testing
Testing with Terratest
Policy as Code: OPA
Additional practices
Baking VMs
Leveraging open-source tools
Terraform code structures
6. Terraform module practices
6.1 - Don't Configure Providers or Backend in Modules
Avoid configuring providers or backends within modules; these configurations are best managed at the root level, so that consumers of the module retain full control over provider versions, credentials, and state storage.
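As a minimal sketch (the module name and backend settings here are illustrative), the module only pins the provider versions it supports, while the root configuration owns the provider and backend blocks:
# modules/network/versions.tf — constrain versions only, no provider config
terraform {
  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = ">= 4.0"
    }
  }
}

# Root configuration — providers and the backend live here
provider "aws" {
  region = "us-west-2"
}

terraform {
  backend "s3" {
    bucket = "my-terraform-state" # illustrative bucket name
    key    = "network/terraform.tfstate"
    region = "us-west-2"
  }
}

module "network" {
  source = "./modules/network"
}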
6.2 - Expose Labels as a Variable
When using labels to tag resources, expose labels as variables to allow customization and consistency across your infrastructure.
variable "instance_tags" {
type = map(string)
default = {
Environment = "Production"
Application = "Web"
}
}
resource "aws_instance" "example_instance" {
# ...
tags = var.instance_tags
}
6.3 - Expose Outputs for All Resources
Provide outputs for all significant resources in your modules to allow downstream consumers to access important information.
output "distribution_domain_name" {
value = module.cloudfront-distribution.domain_name
description = "Domain Name of the CloudFront Distribution"
}
output "distribution_arn" {
value = module.cloudfront-distribution.arn
description = "ARN of the CloudFront Distribution"
}
output "instance_id" {
description = "ID of the instance"
value = aws_instance.example_instance.id
}
6.4 - Use Inline Submodules for Complex Logic
For complex logic within your Terraform modules, consider using inline submodules to maintain code readability and organization.
module "complex_logic" {
source = "./submodules/complex_logic"
input_variable = var.some_value
}
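A typical layout keeps these submodules inside the parent module's own repository (the structure below is illustrative):
my-module/
├── main.tf
├── variables.tf
├── outputs.tf
└── submodules/
    └── complex_logic/
        ├── main.tf
        ├── variables.tf
        └── outputs.tf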
6.5 - Minimize the Number of Resources in Each Root Module
Keep root modules focused and concise by minimizing the number of resources they manage. This promotes modularity and simplifies updates.
# s3-module/main.tf
resource "aws_s3_bucket" "my_bucket" {
  bucket = "my-unique-bucket"
  # ...
}

# ec2-module/main.tf
resource "aws_instance" "my_instance" {
  ami           = "ami-12345678"
  instance_type = "t2.micro"
  # ...
}
The root module then simply composes the separated modules:
# main.tf
module "s3" {
  source = "./s3-module"
}

module "ec2" {
  source = "./ec2-module"
}
By organising your infrastructure this way, each module has a clear purpose, making it easier to understand, update, and manage. This approach fosters collaboration and allows teams to focus on specific areas without risking unintended changes to unrelated resources.
6.6 - Import existing infrastructure
When you're migrating to Terraform or managing existing resources, you might need to import those resources into your Terraform state so you can manage them with Terraform going forward. Write a matching resource block first, then run terraform import with the resource's import ID (for an Elastic Beanstalk environment this is the environment ID, not its name):
resource "aws_elastic_beanstalk_environment" "existing_env" {
  name        = "my-existing-environment"
  application = "my-existing-application" # required attribute; value is illustrative
}

# Import the existing environment using its environment ID
terraform import aws_elastic_beanstalk_environment.existing_env e-abcd1234
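On Terraform 1.5 and newer, you can alternatively declare the import in configuration, which lets terraform plan preview it before the state changes (the environment ID below is illustrative):
import {
  to = aws_elastic_beanstalk_environment.existing_env
  id = "e-abcd1234"
}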
6.7 - Utilise dynamic blocks
Dynamic Blocks in Terraform allow you to create multiple instances of a nested block within a resource or module, based on dynamic input data. This provides flexibility and reduces code duplication.
Let's explore how to use dynamic blocks with an AWS example:
Suppose you want to create multiple AWS security group rules for different ports. Instead of repeating the ingress block, you can use a dynamic block to achieve this more efficiently.
# main.tf
provider "aws" {
  region = "us-west-2"
}

variable "ingress_rules" {
  type = list(object({
    from_port   = number
    to_port     = number
    protocol    = string
    cidr_blocks = list(string)
  }))
}

resource "aws_security_group" "example_sg" {
  name_prefix = "example-sg-"
  description = "Example Security Group"

  # A dynamic block must be nested inside the resource it generates blocks for
  dynamic "ingress" {
    for_each = var.ingress_rules
    content {
      from_port   = ingress.value.from_port
      to_port     = ingress.value.to_port
      protocol    = ingress.value.protocol
      cidr_blocks = ingress.value.cidr_blocks
    }
  }
}
And provide the rules in your .tfvars file:
# terraform.tfvars
ingress_rules = [
  {
    from_port   = 80
    to_port     = 80
    protocol    = "tcp"
    cidr_blocks = ["0.0.0.0/0"]
  },
  {
    from_port   = 443
    to_port     = 443
    protocol    = "tcp"
    cidr_blocks = ["0.0.0.0/0"]
  }
]
6.8 - Utilise variable & naming validations
Variable validations in Terraform allow you to enforce constraints on input variables to ensure that they meet specific criteria. This helps prevent incorrect configurations and enhances the reliability of your infrastructure code.
Let's see how you can use variable validations with an AWS example:
Suppose you want to ensure that a variable representing the instance type is chosen from a specific list of allowed values.
# main.tf
provider "aws" {
  region = "us-west-2"
}

variable "instance_type" {
  description = "Type of EC2 instance"
  type        = string

  validation {
    # Terraform has no "in" operator; use contains(). The allowed list is
    # inlined because validation blocks cannot reference locals on Terraform
    # versions before 1.9.
    condition     = contains(["t2.micro", "m5.large"], var.instance_type)
    error_message = "Invalid instance type. Choose either t2.micro or m5.large."
  }
}

resource "aws_instance" "example_instance" {
  ami           = "ami-0123456789abcdef0"
  instance_type = var.instance_type

  tags = {
    Name = "ExampleInstance"
  }
}
Now, when someone tries to use an invalid instance type, Terraform will provide a clear error message indicating the allowed options.
Error: Invalid instance type. Choose either t2.micro or m5.large.
An example which uses regex validation:
# modules/s3_bucket/variables.tf
variable "bucket_name" {
  description = "Name of the S3 bucket"
  type        = string

  validation {
    condition     = can(regex("^my-bucket-", var.bucket_name))
    error_message = "Bucket name must start with 'my-bucket-'."
  }
}

# modules/s3_bucket/main.tf
resource "aws_s3_bucket" "my_bucket" {
  bucket = var.bucket_name
  # Other bucket configuration...
}
6.9 - Keeping variables separated
Following on from the above, and for better readability, separate required and optional variables in your variables.tf with a comment:
#### Required Variables ####
variable "region" {
  description = "The AWS region where resources will be created."
  type        = string
}

variable "vpc_cidr_block" {
  description = "The CIDR block for the VPC."
  type        = string
}

#### Optional Variables ####
variable "instance_type" {
  description = "The EC2 instance type. Defaults to t2.micro."
  type        = string
  default     = "t2.micro"
}

variable "subnet_count" {
  description = "The number of subnets to create. Defaults to 3."
  type        = number
  default     = 3
}
In this example, instance_type and subnet_count are optional variables. The description field provides a brief explanation of what each variable represents, and the default field specifies the value used when the variable is not explicitly set when running Terraform. This separation of optional variables from required ones enhances the readability and maintainability of your Terraform configuration.
6.10 - Using Locals appropriately
Locals are a way to assign intermediate values or complex expressions to a named value. This can help improve the readability of your code, avoid duplication, and simplify complex configurations.
There are times when locals are used in the wrong places. You can keep locals in their own file, but I prefer to keep them close to the infrastructure code they are used for. Locals are useful in cases like the following:
When concatenating variables for names:
locals {
  sg_name = "sg-${var.environment}-${var.app_name}"
}

resource "aws_security_group" "example_sg" {
  name        = local.sg_name
  description = "Security group for ${var.app_name}"
  vpc_id      = aws_vpc.example_vpc.id
}
When applying built-in functions:
locals {
  # S3 bucket names must be lowercase, so normalise the name here
  transform_name = lower("My-Resource-Name")
}

resource "aws_s3_bucket" "example_bucket" {
  bucket = local.transform_name
  # other configurations...
}
When using conditionals:
locals {
  create_bucket = true
}

resource "aws_s3_bucket" "example_bucket" {
  count         = local.create_bucket ? 1 : 0
  bucket        = "my-example-bucket"
  force_destroy = true
}
When combining functions, expressions, and other resources:
locals {
  instance_suffixes = [
    for i in range(3) : random_string.example_suffix[i].result
  ]
}

resource "aws_instance" "example_instance" {
  count         = 3
  ami           = "ami-0c55b159cbfafe1f0"
  instance_type = "t2.micro"

  tags = {
    Name = "instance-${count.index + 1}-${local.instance_suffixes[count.index]}"
  }
}

resource "random_string" "example_suffix" {
  count   = 3
  length  = 4
  upper   = false
  special = false
  numeric = true # "number" was deprecated in favour of "numeric" in the random provider
}
6.11 - Use flexible modules with multiple optional inputs:
Creating flexible and reusable modules in Terraform involves designing them with multiple optional inputs. This allows users to customize the module's behaviour while maintaining a clear and consistent interface.
Here's how you can achieve this using this example:
Suppose you're creating a module to provision an AWS S3 bucket, and you want to give users the flexibility to customise various aspects of the bucket, such as its name, access control, and versioning.
# modules/s3_bucket/main.tf
variable "bucket_name" {
  description = "Name of the S3 bucket"
  type        = string
  default     = "my-s3-bucket"
}

variable "acl" {
  description = "Access control list for the bucket"
  type        = string
  default     = "private"
}

variable "versioning_enabled" {
  description = "Enable versioning for the bucket"
  type        = bool
  default     = false
}

resource "aws_s3_bucket" "example_bucket" {
  bucket = var.bucket_name
}

# Since AWS provider v4, ACLs and versioning are managed through
# dedicated resources rather than inline arguments
resource "aws_s3_bucket_acl" "example_bucket" {
  bucket = aws_s3_bucket.example_bucket.id
  acl    = var.acl
}

resource "aws_s3_bucket_versioning" "example_bucket" {
  bucket = aws_s3_bucket.example_bucket.id

  versioning_configuration {
    status = var.versioning_enabled ? "Enabled" : "Suspended"
  }
}

# main.tf
module "custom_s3_bucket" {
  source             = "./modules/s3_bucket"
  bucket_name        = "my-custom-bucket"
  acl                = "public-read"
  versioning_enabled = true
}
6.12 - Use Git hooks:
Use pre-commit hooks so you don't forget to format code or regenerate documentation. Some good examples are:
Pre-commit hooks by Anton Babenko, which include terraform fmt, Checkov, linting, docs generation, and more!
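Assuming you have added a .pre-commit-config.yaml that references those hooks, enabling them takes a couple of commands (a sketch; the framework can also be installed via Homebrew or your package manager):
# Install the pre-commit framework and enable the hooks for this repository
pip install pre-commit
pre-commit install

# Run every hook against all files once
pre-commit run --all-files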
Or you can make your own:
Create a file named .git/hooks/pre-commit and make it executable with chmod +x .git/hooks/pre-commit
#!/bin/sh
# Run terraform fmt on all staged Terraform files
terraform_fmt_files=$(git diff --cached --name-only --diff-filter=ACM | grep '\.tf$')

if [ -n "$terraform_fmt_files" ]; then
  echo "Running 'terraform fmt' on the following files:"
  echo "$terraform_fmt_files"
  terraform fmt -check=true $terraform_fmt_files
fi
Now, when you try to commit changes, the pre-commit hook will automatically run terraform fmt on your Terraform files and check whether they are properly formatted. If any files are not, the commit is rejected and you'll need to fix the formatting before committing.
6.13 - Use the examples folder in your modules
Always have an examples folder inside your modules repository. Some of the benefits of having an examples directory:
Clear Usage Demonstrations: Users of your module may not be familiar with its capabilities. By providing concrete usage examples, you make it easier for them to understand how to configure and use your module effectively.
Testing and Validation: You can use the examples as part of your testing process. Ensuring that your module works with the provided examples helps maintain module quality and reliability.
Reference for Variations: Examples can show variations of usage, such as different parameter configurations. This provides users with a starting point to customize configurations based on their specific requirements.
Real-World Context: Examples showcase how your module can be used in real-world scenarios. This helps users see how your module fits into their own use cases, making it more relevant and applicable.
Example of using the “examples” directory in your repo:
my-module/
├── main.tf
├── variables.tf
├── outputs.tf
├── README.md
├── examples/
│ ├── basic-usage/
│ │ ├── main.tf
│ │ ├── variables.tf
│ │ ├── terraform.tfvars
│ ├── complete/
│ │ ├── main.tf
│ │ ├── variables.tf
│ │ ├── terraform.tfvars
│ ├── ...
6.14 - Reference modules via a versioning mechanism
When working with modules in Terraform, it's important to establish a versioning mechanism that ensures consistency and predictability in your infrastructure deployments. This practice involves referencing modules by specific versions to avoid unexpected changes and maintain control over updates. Note that the version argument only works for modules from a registry; it isn't supported for local paths.
module "my_vpc" {
  source  = "terraform-aws-modules/vpc/aws" # a registry module
  version = "1.2.0"                         # specify the desired version

  # Other module inputs...
}
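For modules pulled straight from Git, the equivalent is pinning a tag or commit with ref (the repository URL below is illustrative):
module "my_vpc" {
  # Pin to a release tag so upgrades are always a deliberate change
  source = "git::https://github.com/example-org/terraform-aws-vpc.git?ref=v1.2.0"
}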
7. Testing and Governance
Testing Terraform configurations is crucial for reliable infrastructure:
7.1 - Start Small
Begin with less expensive testing methods like static analysis and module integration tests before full end-to-end testing. 🧪
# Format and validate Terraform code
terraform fmt -recursive # rewrite files into the canonical format, including subdirectories
terraform validate       # catch syntax and reference errors early
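Beyond the built-in commands, open-source static analysers catch provider-specific mistakes and security issues before anything is deployed (assuming tflint and Checkov are installed locally):
# Lint for provider-specific errors and style issues
tflint --recursive

# Scan for common security and compliance misconfigurations
checkov -d .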
7.2 - Using Terratest
Terratest is a Golang library that provides patterns and helper functions for testing your infrastructure across Terraform, Packer, Docker, K8s, AWS, GCP, and more!
Here's a simple example of using Terratest to test an AWS EC2 instance creation:
package test

import (
	"testing"

	"github.com/gruntwork-io/terratest/modules/terraform"
	"github.com/stretchr/testify/assert"
)

func TestTerraformEC2Instance(t *testing.T) {
	t.Parallel()

	terraformOptions := terraform.WithDefaultRetryableErrors(t, &terraform.Options{
		TerraformDir: "../examples/aws-ec2-instance",
		Vars: map[string]interface{}{
			"instance_type": "t2.micro",
		},
	})

	// Ensure the test infrastructure is destroyed even if assertions fail
	defer terraform.Destroy(t, terraformOptions)

	terraform.InitAndApply(t, terraformOptions)

	// Outputs are returned as strings, so assert the ID is non-empty
	instanceID := terraform.Output(t, terraformOptions, "instance_id")
	assert.NotEmpty(t, instanceID)
}
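Terratest suites are plain Go tests, so you run them with go test; give them a generous timeout, since provisioning real infrastructure usually exceeds Go's 10-minute default:
# Run the tests with an extended timeout
go test -v -timeout 30m ./...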
7.3 - Use Policy as Code:
This is an essential approach to enforce governance and security in your cloud infrastructure. Open Policy Agent (OPA) is a popular tool that allows you to define and enforce policies as code. It helps ensure that your infrastructure configurations adhere to your organization's security and compliance standards.
How you can integrate OPA into your Terraform workflow:
Define policies with OPA. When conftest parses a .tf file, its contents are exposed as input.resource.<type>.<name>:
# policy/deny_instance_types.rego
package main

denied_instance_types := {"t2.nano", "m1.small"}

deny[msg] {
  instance := input.resource.aws_instance[name]
  denied_instance_types[instance.instance_type]
  msg := sprintf("aws_instance.%s uses denied instance type %s", [name, instance.instance_type])
}
Integrate OPA with Terraform:
# Install conftest
brew install conftest # For macOS
# Run OPA checks against Terraform code
conftest test path/to/your/terraform/code
Example Terraform code for the policy to evaluate:
# main.tf
provider "aws" {
  region = "us-west-2"
}

resource "aws_instance" "example_instance" {
  ami           = "ami-12345678"
  instance_type = "t2.micro" # allowed; "t2.nano" would be denied by the policy

  tags = {
    Name = "ExampleInstance"
  }
}
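conftest can also evaluate a rendered plan, which captures the final values Terraform will actually apply; note that policies must then be written against the plan's JSON structure rather than raw HCL:
# Render the plan as JSON and run the policies against it
terraform plan -out=tfplan.binary
terraform show -json tfplan.binary > tfplan.json
conftest test tfplan.json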
By introducing Policy as Code with OPA, you can ensure that your Terraform configurations align with your organization's policies and best practices. This approach enhances the security and compliance of your cloud infrastructure deployments.
8. Additional practices
8.1 - Bake Virtual Machine images:
Baking virtual machine images is a recommended practice for AWS deployments. By using a tool like Packer, you can create pre-baked machine images with all the necessary software and configurations. This reduces the startup time and ensures consistent environments when Terraform launches instances using these images.
Example Packer template for building an AWS AMI:
{
  "builders": [
    {
      "type": "amazon-ebs",
      "region": "us-west-2",
      "source_ami": "ami-0123456789abcdef0",
      "instance_type": "t2.micro",
      "ssh_username": "ec2-user",
      "ami_name": "my-server-{{timestamp}}"
    }
  ]
}
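To close the loop, Terraform can look up the most recent image Packer baked and launch instances from it (a sketch that assumes the ami_name pattern from the template above):
# Find the most recent AMI produced by the Packer template
data "aws_ami" "baked" {
  most_recent = true
  owners      = ["self"]

  filter {
    name   = "name"
    values = ["my-server-*"]
  }
}

resource "aws_instance" "from_baked_ami" {
  ami           = data.aws_ami.baked.id
  instance_type = "t2.micro"
}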
8.2 - Use open-source tools that work alongside Terraform:
There are many free and paid tools out there that assist with daily Terraform usage. Here are some I have come across, and there are plenty more not on this list.
Spacelift - Terraform orchestrator
Atlantis - Workflow for collaborating on Terraform projects
Pre-commit Terraform - Pre-commit git hooks for automation
Checkov - for static testing
terraform-docs - Generate docs from modules
tfenv/tfswitch - Manage your Terraform versions
8.3 - Terraform code structures:
Having worked across different organisations and projects, these are the structures I have seen and would recommend, ranging from small startups to large enterprises:
Small: Few resources, no external dependencies. Single AWS account. Single region. Single environment.
Medium-sized org: A few AWS accounts and environments; off-the-shelf infrastructure modules from GitHub and various other sources.
Large-corp: Many AWS accounts, many regions, an urgent need to reduce copy-paste, custom infrastructure and homemade modules, heavy usage of compositions.
Enterprise: Several providers (AWS, GCP, Azure, K8s) and multi-cloud deployments.
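As a hedged illustration (the directory names are mine), a layout I have seen work well from the medium stage onwards separates shared modules from per-environment compositions:
terraform-infra/
├── modules/              # shared, reusable building blocks
│   ├── vpc/
│   └── ecs-service/
└── environments/
    ├── dev/
    │   ├── main.tf       # composes modules for the dev account
    │   └── backend.tf
    └── prod/
        ├── main.tf
        └── backend.tf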
Conclusion
If you have made it this far, then well done to you!
Sharing my journey is a way of acknowledging the impact Terraform has had on my growth. From a novice tinkering with code to a seasoned engineer architecting cloud solutions, every phase has been a step toward mastery. As I share some of the best practices, looking back at the miles covered, I see that my journey with Terraform is far from over. Every line of code written, every module crafted, contributes to a bigger picture. The goal is a reliable, scalable, and secure infrastructure that serves its purpose seamlessly.