Infrastructure Provisioning Tools
Modern teams manage infrastructure as code to deliver faster and with fewer surprises.

When I first stood up a production service, I manually clicked through a cloud console, chose instance types, set up security groups, and hoped I remembered every click for the next environment. The first time I missed a firewall rule during a 2 a.m. deploy, I learned the hard way that manual steps do not scale. Infrastructure provisioning tools solve that problem by making environments predictable, repeatable, and reviewable. They turn infrastructure into something you can version, test, and share, just like application code.
You may have heard debates like: Is Terraform better than Pulumi? Do we need a declarative tool or an imperative one? What about AWS CloudFormation, CDK, or Ansible? Each tool has strengths, and the best choice depends on your team’s skills, your cloud footprint, and your operational constraints. In this post, I will walk through the landscape, show practical examples, and share real-world tradeoffs so you can decide confidently.
Where provisioning tools fit today
Infrastructure provisioning tools sit between your application code and the cloud. They define compute, networking, storage, identities, and policies. In modern teams, these tools are core to DevOps and platform engineering. They are used by backend developers spinning up ephemeral test environments, platform teams building internal developer platforms, and SREs managing reliability at scale.
Declarative tools like Terraform, CloudFormation, and Pulumi are dominant for cloud resources. Configuration management tools like Ansible, Chef, or Puppet are more common for configuring operating systems and services after provisioning. In practice, teams often combine them: Terraform to create networks and VMs, Ansible to configure software on those VMs.
A high-level comparison helps frame choices:
- Declarative tools (Terraform, CloudFormation, Pulumi): You describe the desired state, and the tool reconciles reality to match. They excel at cloud APIs, drift detection, and controlled apply cycles.
- Imperative tools (shell scripts, direct SDK calls): You write code that orchestrates changes step by step. They are flexible and great for complex logic, but you must manage diffs and idempotency yourself. CDK and Pulumi blur this line: you write imperative code, but it synthesizes a declarative desired state that the engine reconciles.
- Configuration management (Ansible, Chef, Puppet): They target host-level configuration and long-lived systems. They pair well with provisioners for OS settings, package installs, and service setup.
- Specialized tools: Crossplane extends the Kubernetes API to manage cloud resources; Terragrunt helps keep Terraform code DRY; Atlantis adds CI/CD workflows for Terraform; OpenTofu is an open-source fork of Terraform.
In real-world projects, you will see:
- Ephemeral environments spun up per pull request for integration tests.
- Multi-account strategies with shared networking, isolated workloads, and strict IAM.
- Module catalogs maintained by platform teams to standardize best practices.
- Compliance-as-code using policy engines like Open Policy Agent (OPA) or cloud-native policies.
If you are starting, Terraform or CloudFormation are safe bets in AWS-centric shops. If your team wants to program infrastructure with the same language as your app, Pulumi or CDK may be more approachable. If you need to configure many VMs or on-prem servers, Ansible remains a pragmatic choice.
Core concepts and practical examples
At the heart of provisioning is a simple loop: define desired state, plan changes, apply changes, and verify. The best tools make this loop visible and safe.
Declarative state and plan/apply
Terraform uses HCL to declare resources. The plan step shows what will change before you apply. This reduces surprises and supports approvals in CI.
Example: A minimal Terraform project that provisions an AWS S3 bucket with server-side encryption.
Folder structure:
terraform-s3/
├── main.tf
├── variables.tf
├── outputs.tf
└── terraform.tfvars
Code (main.tf):
provider "aws" {
  region = var.aws_region
}

resource "aws_s3_bucket" "logs" {
  bucket = var.bucket_name

  tags = {
    Environment = var.env_tag
    ManagedBy   = "terraform"
  }
}

resource "aws_s3_bucket_server_side_encryption_configuration" "logs" {
  bucket = aws_s3_bucket.logs.id

  rule {
    apply_server_side_encryption_by_default {
      sse_algorithm = "AES256"
    }
  }
}

resource "aws_s3_bucket_versioning" "logs" {
  bucket = aws_s3_bucket.logs.id

  versioning_configuration {
    status = "Enabled"
  }
}
Code (variables.tf):
variable "aws_region" {
  type        = string
  description = "AWS region for resources"
}

variable "bucket_name" {
  type        = string
  description = "Globally unique S3 bucket name"
}

variable "env_tag" {
  type        = string
  description = "Environment tag"
  default     = "dev"
}
Code (outputs.tf):
output "bucket_arn" {
  description = "ARN of the created S3 bucket"
  value       = aws_s3_bucket.logs.arn
}

output "bucket_name" {
  description = "Name of the created S3 bucket"
  value       = aws_s3_bucket.logs.id
}
Code (terraform.tfvars):
aws_region = "us-east-1"
bucket_name = "acme-logs-12345"
env_tag = "dev"
Workflow commands:
terraform init
terraform plan -out=tfplan
terraform apply tfplan
This small example shows the plan/apply mental model. You can add a remote backend (like S3 with DynamoDB for state locking) and CI integration for production use.
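A remote backend block is a small addition to the Terraform configuration. As a minimal sketch, assuming a state bucket and DynamoDB lock table have already been created (the names here are illustrative):

```hcl
terraform {
  backend "s3" {
    bucket         = "acme-terraform-state"   # pre-created state bucket (assumed name)
    key            = "terraform-s3/terraform.tfstate"
    region         = "us-east-1"
    encrypt        = true
    dynamodb_table = "acme-terraform-locks"   # pre-created lock table (assumed name)
  }
}
```

With this in place, every `terraform init` points at shared state, and the DynamoDB table prevents two people from applying at the same time.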
Modules for reuse
Modules let you package resources and enforce standards. In a platform team, you might publish a “secure-s3” module with encryption, versioning, and lifecycle rules.
Example module usage:
module "app_logs" {
source = "./modules/secure-s3"
bucket_name = "acme-app-logs-${var.env}"
env = var.env
}
Inside modules/secure-s3/main.tf, you encapsulate the resources shown above and add optional features like public access blocks and logging.
Idempotency and drift detection
Declarative tools converge state repeatedly. Running apply twice on the same configuration should not change anything unless the underlying config or resource attributes changed. This idempotency is key for automation. Drift detection helps catch manual changes made outside the tooling, allowing teams to remediate or accept drift intentionally.
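The converge behavior can be sketched as a tiny desired-state diff loop. This is illustrative only (the types and resource keys are made up, not any real provider API), but it captures why a second apply is a no-op:

```typescript
// Sketch of the declarative reconcile loop: diff desired vs. actual state,
// emit a plan, apply it. Keys stand in for resource addresses.
type State = Record<string, string>;

interface Plan { create: string[]; update: string[]; delete: string[]; }

function plan(desired: State, actual: State): Plan {
  const p: Plan = { create: [], update: [], delete: [] };
  for (const k of Object.keys(desired)) {
    if (!(k in actual)) p.create.push(k);
    else if (actual[k] !== desired[k]) p.update.push(k); // drift detected
  }
  for (const k of Object.keys(actual)) {
    if (!(k in desired)) p.delete.push(k);
  }
  return p;
}

function apply(desired: State, actual: State, p: Plan): State {
  const next: State = { ...actual };
  for (const k of [...p.create, ...p.update]) next[k] = desired[k];
  for (const k of p.delete) delete next[k];
  return next;
}

// First apply converges; a second plan against the result is empty (idempotency).
const desired: State = { "aws_s3_bucket.logs": "AES256" };
let actual: State = {};
const first = plan(desired, actual);
actual = apply(desired, actual, first);
const second = plan(desired, actual);
console.log(first.create.length, second.create.length); // 1 0
```

Real tools add dependency ordering, provider calls, and state persistence on top of this loop, but the mental model is the same.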
Cross-language infrastructure as code
Pulumi and CDK let you write infrastructure using general-purpose languages. That can be appealing if your team wants type safety, reusable libraries, or complex logic. For example, Pulumi with TypeScript lets you reuse shared configuration and utility functions from your app codebase. The tradeoff is that you need to manage dependency versions and think like a developer (unit tests, packaging) as well as an operator (plan/preview, safe applies).
Real-world patterns and code
Let’s explore a realistic setup combining provisioning and configuration for a simple web service. We will provision AWS networking and an EC2 instance, then configure it using Ansible. This is a common pattern for teams starting with cloud VMs before moving to containers.
Terraform for infrastructure
Folder structure:
web-app-infra/
├── terraform/
│ ├── main.tf
│ ├── variables.tf
│ ├── outputs.tf
│ └── terraform.tfvars
└── ansible/
├── inventory.ini
└── web.yml
Terraform (terraform/main.tf) provisions a VPC, subnet, security group, and EC2:
provider "aws" {
  region = var.aws_region
}

data "aws_availability_zones" "available" {
  state = "available"
}

resource "aws_vpc" "main" {
  cidr_block = var.vpc_cidr

  tags = {
    Name = "web-app-vpc"
  }
}

resource "aws_subnet" "public" {
  vpc_id                  = aws_vpc.main.id
  cidr_block              = var.subnet_cidr
  availability_zone       = data.aws_availability_zones.available.names[0]
  map_public_ip_on_launch = true

  tags = {
    Name = "web-app-subnet"
  }
}

resource "aws_internet_gateway" "igw" {
  vpc_id = aws_vpc.main.id

  tags = {
    Name = "web-app-igw"
  }
}

resource "aws_route_table" "public" {
  vpc_id = aws_vpc.main.id

  route {
    cidr_block = "0.0.0.0/0"
    gateway_id = aws_internet_gateway.igw.id
  }

  tags = {
    Name = "web-app-rt"
  }
}

resource "aws_route_table_association" "public" {
  subnet_id      = aws_subnet.public.id
  route_table_id = aws_route_table.public.id
}

resource "aws_security_group" "web" {
  name        = "web-app-sg"
  description = "Allow HTTP and SSH"
  vpc_id      = aws_vpc.main.id

  ingress {
    description = "HTTP"
    from_port   = 80
    to_port     = 80
    protocol    = "tcp"
    cidr_blocks = ["0.0.0.0/0"]
  }

  ingress {
    description = "SSH"
    from_port   = 22
    to_port     = 22
    protocol    = "tcp"
    cidr_blocks = ["0.0.0.0/0"] # tighten in production
  }

  egress {
    from_port   = 0
    to_port     = 0
    protocol    = "-1"
    cidr_blocks = ["0.0.0.0/0"]
  }

  tags = {
    Name = "web-app-sg"
  }
}

resource "aws_instance" "web" {
  ami                    = var.ami_id
  instance_type          = var.instance_type
  subnet_id              = aws_subnet.public.id
  vpc_security_group_ids = [aws_security_group.web.id]

  tags = {
    Name = "web-app-instance"
  }

  # You could use a provisioner here, but it's often better to use Ansible.
  # Provisioners should be a last resort due to lifecycle quirks.
}
Terraform (terraform/variables.tf):
variable "aws_region" {
  type    = string
  default = "us-east-1"
}

variable "vpc_cidr" {
  type    = string
  default = "10.0.0.0/16"
}

variable "subnet_cidr" {
  type    = string
  default = "10.0.1.0/24"
}

variable "ami_id" {
  type = string
}

variable "instance_type" {
  type    = string
  default = "t3.micro"
}
Terraform (terraform/outputs.tf):
output "public_ip" {
  value = aws_instance.web.public_ip
}

output "security_group_id" {
  value = aws_security_group.web.id
}
Terraform (terraform/terraform.tfvars):
ami_id = "ami-0c55b159cbfafe1f0" # Amazon Linux 2 in us-east-1; update for your region
instance_type = "t3.micro"
Key decisions in this snippet:
- We avoid using Terraform provisioners for config management. Provisioners can be brittle and trigger unnecessary recreation. Instead, we output the public IP and feed it to Ansible.
- We keep networking simple, but in production you would add private subnets, NAT gateways, and more restrictive security rules.
Ansible for configuration
Ansible (ansible/inventory.ini):
[web]
web-host ansible_host=PUBLIC_IP ansible_user=ec2-user
[web:vars]
ansible_ssh_private_key_file=~/.ssh/aws-key.pem
ansible_ssh_common_args='-o StrictHostKeyChecking=no'
Ansible (ansible/web.yml):
---
- name: Configure web server
  hosts: web
  become: true
  tasks:
    - name: Install packages
      yum:
        name:
          - httpd
          - git
        state: present

    - name: Enable and start httpd
      systemd:
        name: httpd
        enabled: true
        state: started

    - name: Deploy index.html
      copy:
        content: |
          <html><body><h1>Hello from Ansible!</h1></body></html>
        dest: /var/www/html/index.html
        mode: '0644'
Workflow:
- Provision infrastructure with Terraform, capture the public_ip output.
- Update inventory.ini with the actual IP.
- Run Ansible:
ansible-playbook -i inventory.ini web.yml
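The handoff between the two tools is easy to script. A minimal glue sketch, assuming the folder layout above and an `-auto-approve` apply (in real use you would review a plan first):

```shell
#!/usr/bin/env sh
# Provision with Terraform, then feed the instance IP into the Ansible inventory.
cd terraform
terraform init
terraform apply -auto-approve          # prefer a reviewed plan in production
IP=$(terraform output -raw public_ip)  # -raw strips the surrounding quotes
cd ../ansible
sed "s/PUBLIC_IP/$IP/" inventory.ini > inventory.generated.ini
ansible-playbook -i inventory.generated.ini web.yml
```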
This is a simple but real pattern. Teams extend it with:
- Dynamic inventory generation (AWS inventory plugin).
- More tasks for TLS (Let’s Encrypt), monitoring agents, and security hardening.
- Image baking (Packer) for faster boots and fewer configuration steps at runtime.
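As a sketch of the first extension, the AWS dynamic inventory plugin replaces the static `inventory.ini`. This example assumes the `Name` tag set by the Terraform configuration above; file name and grouping are illustrative:

```yaml
# ansible/inventory.aws_ec2.yml -- discovers hosts instead of hardcoding IPs
plugin: amazon.aws.aws_ec2
regions:
  - us-east-1
filters:
  tag:Name: web-app-instance   # matches the tag from the Terraform example
keyed_groups:
  - key: tags.Name
    prefix: tag
```

You would then run `ansible-playbook -i inventory.aws_ec2.yml web.yml` and skip the manual IP step entirely.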
Crossplane for Kubernetes-style provisioning
If your team lives in Kubernetes, Crossplane can manage cloud resources using Kubernetes APIs. You define a composite resource, and Crossplane reconciles it to AWS/GCP/Azure resources. This is powerful in GitOps workflows because you can use kubectl and Argo CD for both apps and infra.
Example concept:
- Create an S3 bucket through a Crossplane Composition.
- Apply a Claim to request the bucket, and Crossplane creates the underlying provider resources.
You can explore Crossplane docs at https://www.crossplane.io/. It pairs well with policy engines like OPA Gatekeeper to enforce naming, regions, or encryption standards.
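To make the concept concrete, a Claim might look like the following sketch. Everything here is hypothetical: the API group, kind, and parameters come from an XRD and Composition that your platform team would define first.

```yaml
# A hypothetical Claim requesting a bucket from a platform-defined Composition.
apiVersion: storage.acme.example/v1alpha1   # group defined by your XRD (assumed)
kind: BucketClaim                            # kind defined by your XRD (assumed)
metadata:
  name: app-logs
spec:
  parameters:
    region: us-east-1
    encrypted: true
  compositionSelector:
    matchLabels:
      tier: standard   # routes the claim to a matching Composition
```

Developers apply the Claim with kubectl (or merge it via GitOps), and Crossplane reconciles the underlying provider resources.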
Pulumi for type-safe infrastructure
Pulumi shines when your team wants to share configuration or use TypeScript/Python to encode complex logic. A small example (TypeScript) that creates an S3 bucket with tags:
import * as pulumi from "@pulumi/pulumi";
import * as aws from "@pulumi/aws";

const config = new pulumi.Config();
const env = config.require("env");
const bucketName = `acme-logs-${env}`;

const logsBucket = new aws.s3.Bucket("logs", {
    bucket: bucketName,
    tags: {
        Environment: env,
        ManagedBy: "pulumi",
    },
});

export const bucketArn = logsBucket.arn;
The Pulumi CLI manages state and previews. Teams often like the ability to use classes, functions, and loops to generate many resources without YAML sprawl.
Terraform + Terragrunt for large-scale projects
Terragrunt helps keep Terraform code DRY and manage remote state configuration across many environments. It is especially useful when you have a module catalog and multiple accounts.
Example terragrunt.hcl:
remote_state {
  backend = "s3"

  config = {
    bucket         = "acme-terraform-state"
    key            = "envs/${path_relative_to_include()}/terraform.tfstate"
    region         = "us-east-1"
    encrypt        = true
    dynamodb_table = "acme-terraform-locks"
  }
}

inputs = {
  env = path_relative_to_include()
}
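Each environment directory then gets a small terragrunt.hcl that includes the root configuration and points at a module. The paths and input values in this sketch are illustrative:

```hcl
# envs/dev/app/terragrunt.hcl (illustrative paths)
include {
  path = find_in_parent_folders()   # pulls in the root remote_state config
}

terraform {
  source = "../../modules//secure-s3"
}

inputs = {
  bucket_name = "acme-app-logs-dev"
}
```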
While Terragrunt adds another layer, it reduces duplication and enforces consistent state handling.
Honest evaluation: strengths, weaknesses, and tradeoffs
No tool is perfect. Your decision should be guided by team skills, compliance requirements, and operational maturity.
Declarative tools (Terraform, CloudFormation, Pulumi)
Strengths:
- Clear plan/apply flow reduces surprises.
- Drift detection and predictable state management.
- Strong ecosystems: modules, providers, and community patterns.
Weaknesses:
- State file handling can be tricky (remote backends, locking).
- Provider and version mismatches can break workflows.
- Complex conditionals can become hard to read in HCL or require advanced patterns.
Tradeoffs:
- Use Terraform if you want cloud-agnostic tooling and broad provider support.
- Use CloudFormation if you are all-in on AWS and want native integration (especially with AWS Organizations and Control Tower).
- Use Pulumi or CDK if you prefer programming languages for logic and reuse.
Configuration management (Ansible, Chef, Puppet)
Strengths:
- Excellent for OS-level configuration and remediation.
- Agentless (Ansible) or flexible agent-based (Chef/Puppet) options.
- Works well for on-prem and hybrid setups.
Weaknesses:
- Not ideal for managing cloud APIs and ephemeral resources (use a provisioning tool like Terraform for those instead).
- Can be slow for large fleets without proper architecture.
- Idempotency and variable management need discipline.
GitOps and Kubernetes integration (Crossplane, Argo CD)
Strengths:
- Unified control plane (Kubernetes) for apps and infra.
- Policy-driven governance (OPA, Kyverno).
- Auditability via Git history.
Weaknesses:
- Additional complexity and operational overhead.
- Requires Kubernetes expertise.
- Cloud-specific features may lag providers.
When to choose what
- Early-stage startups with one cloud: Terraform or CloudFormation.
- Multi-cloud or multi-language teams: Terraform or Pulumi.
- Heavy Kubernetes + GitOps: Crossplane + Argo CD.
- Hybrid/on-prem + OS config: Ansible layered over Terraform-provisioned hosts.
- Large AWS organization: CloudFormation or Terraform with Terragrunt.
Personal experience: learning curves, mistakes, and lessons
I learned some lessons the hard way and hope you can skip the pain.
- The first time I used Terraform without remote state, I committed the state file to Git. It was fine until someone applied from a different machine, causing a conflict and a tense morning. Since then, I always use remote backends with locking (S3 + DynamoDB for AWS). The Terraform docs on remote state are clear and worth reading: https://developer.hashicorp.com/terraform/language/settings/backends/configuration.
- In one project, we tried to use Terraform provisioners to install software. It felt convenient until an AMI update caused repeated reinstalls and resource recreation. Migrating to Ansible stabilized our deployments and reduced risk. The general guidance is to treat provisioners as a last resort.
- Pulumi’s programming model was a big win for a team that already used TypeScript for services. Reusing internal libraries for naming conventions and tagging was valuable. But we had to get serious about dependency management and unit testing to avoid surprises.
- Ansible variables can explode in complexity. A simple role with clear defaults and minimal overrides avoided configuration drift. We used ansible-lint and ansible-doc to keep playbooks healthy.
- Crossplane made sense in a Kubernetes-heavy org. The biggest hurdle was the learning curve for composite resources and providers. Start small: manage one bucket or secret, then grow to full compositions.
Getting started: workflow and mental models
Provisioning tools reward a deliberate workflow. Here is a mental model that works across tools.
- Define the environment layout: accounts, regions, networking, and naming conventions.
- Choose a tool chain that fits your team: Terraform for cloud, Ansible for OS config, and GitOps for Kubernetes.
- Adopt a repository structure that scales:
infra/
├── modules/
│ ├── networking/
│ └── storage/
├── environments/
│ ├── dev/
│ ├── stage/
│ └── prod/
├── platform/
│ ├── policies/
│ └── scripts/
└── docs/
- Implement a CI pipeline: plan on pull requests, apply on merge to main with approvals. Use tools like Atlantis for Terraform or GitHub Actions for Pulumi.
- Secure your state: remote backends with encryption and locking; minimal IAM permissions; secrets in a vault (not in Git).
- Test and validate: static analysis (tfsec, checkov), policy checks (OPA), and integration tests (run a plan in CI).
- Plan for drift: schedule periodic plans to detect manual changes; document exceptions.
Example GitHub Actions snippet concept for Terraform (not a full file, but a pattern):
- On PR: terraform init, terraform fmt -check, terraform validate, terraform plan.
- On merge: terraform apply (with a manual approval step).
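As a concrete sketch of that pattern in GitHub Actions (the workflow name, working directory, and the merge-gating condition are assumptions; a production setup would put apply behind a protected environment with required reviewers):

```yaml
# .github/workflows/terraform.yml -- sketch, not a complete production workflow
name: terraform
on:
  pull_request:
  push:
    branches: [main]
jobs:
  terraform:
    runs-on: ubuntu-latest
    defaults:
      run:
        working-directory: terraform
    steps:
      - uses: actions/checkout@v4
      - uses: hashicorp/setup-terraform@v3
      - run: terraform init -input=false
      - run: terraform fmt -check
      - run: terraform validate
      - run: terraform plan -input=false
      # Apply only on merge to main; gate this with an environment approval.
      - if: github.ref == 'refs/heads/main'
        run: terraform apply -auto-approve -input=false
```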
In practice, be conservative with apply automation. Production changes should require human review for cost and risk.
What makes provisioning tools stand out
- Predictable change cycles: Plan and review reduce incidents.
- Reusability: Modules and libraries standardize best practices across teams.
- Auditability: Version-controlled definitions with Git history support compliance.
- Integration: CI/CD, policy engines, and observability tools fit naturally.
Developer experience varies:
- Terraform: HCL is readable; tooling like terraform fmt and tflint helps.
- Pulumi/CDK: Your IDE’s autocomplete and type system catch mistakes early.
- Ansible: YAML is accessible; ansible-lint and molecule improve confidence.
Outcomes matter most:
- Faster environment creation and teardown.
- Fewer configuration drift incidents.
- Clear ownership through code reviews and module catalogs.
Free learning resources
- Terraform documentation: https://developer.hashicorp.com/terraform/docs
- Practical for understanding providers, backends, and modules.
- Pulumi documentation: https://www.pulumi.com/docs/
- Great for learning infrastructure programming patterns in TypeScript, Python, Go, etc.
- AWS CloudFormation documentation: https://docs.aws.amazon.com/cloudformation/
- Useful for AWS-native teams and Control Tower integrations.
- Ansible documentation: https://docs.ansible.com/
- Covers playbooks, roles, and best practices for configuration management.
- Crossplane documentation: https://www.crossplane.io/docs/
- Helps with Kubernetes-native infrastructure and compositions.
- Open Policy Agent: https://www.openpolicyagent.org/
- Learn policy-as-code for guardrails and compliance.
- Terraform remote state: https://developer.hashicorp.com/terraform/language/settings/backends/configuration
- Essential reading for state management and locking.
- Terragrunt documentation: https://terragrunt.gruntwork.io/
- Useful for scaling Terraform across many environments.
Summary: who should use what and why
Infrastructure provisioning tools are a cornerstone for any team building reliable, repeatable systems. If you want to standardize cloud environments and reduce risk, pick a tool that matches your stack and invest in good workflows.
- Choose Terraform if you want a cloud-agnostic, declarative tool with a rich ecosystem.
- Choose CloudFormation if you are all-in on AWS and want deep platform integration.
- Choose Pulumi or CDK if your team prefers programming languages for infrastructure logic and reuse.
- Choose Ansible for OS-level configuration or when managing hybrid infrastructure.
- Choose Crossplane if you have a strong Kubernetes practice and want GitOps for both apps and infra.
If you are in a small team with limited time and a single cloud, start with Terraform or CloudFormation. If you have a developer-heavy team that values types and code reuse, Pulumi or CDK may feel more natural. If you already run Kubernetes at scale, Crossplane is worth a serious look.
A grounded takeaway: start small, choose one tool that fits your current context, and iterate. Treat infrastructure as code seriously, enforce code review, and invest in a remote state backend. The payoff is faster delivery, fewer outages, and a platform that grows with you rather than slowing you down.




