AWS Infrastructure as Code: Complete Guide to CloudFormation, CDK, and Terraform

Infrastructure as Code is the operating model that turns AWS from a collection of hand-built resources into a repeatable platform. The goal is not just faster provisioning. The goal is to make infrastructure changes reviewable, testable, reversible, and consistent across environments.

This guide is part of the Daily DevOps infrastructure automation hub. The companion repository for implementation examples is aws-infrastructure-automation-toolkit.

On AWS, most teams end up choosing between three primary approaches:

CloudFormation for native AWS coverage and direct service integration
AWS CDK for reusable infrastructure written in programming languages
Terraform for a mature declarative workflow and multi-cloud portability

The best choice depends on team skills, governance requirements, existing estate, and how much abstraction the platform needs. This guide covers how to choose, how to structure an implementation, and how to avoid the problems that make infrastructure automation brittle.

What Infrastructure as Code Changes

Infrastructure as Code replaces console-driven infrastructure management with committed source files, peer review, automated validation, and controlled deployment pipelines. A VPC, IAM role, ECS service, Lambda function, RDS cluster, Route 53 record, or CloudWatch alarm should be represented in source control instead of only existing as state inside an AWS account.

That shift creates several operational advantages:

Teams can review infrastructure changes before they reach production.
Environments can be recreated from known source instead of tribal knowledge.
Security controls can be applied consistently across accounts and regions.
Drift becomes detectable instead of invisible.
Rollbacks and disaster recovery plans become more concrete.

IaC does not remove the need for AWS expertise. It makes that expertise explicit and reusable.

CloudFormation: AWS-Native Infrastructure

CloudFormation is AWS’s native infrastructure provisioning engine. Every CloudFormation stack is managed directly by AWS, and new AWS services commonly support CloudFormation early in their lifecycle.

CloudFormation is a strong fit when:

The estate is AWS-only.
The team wants first-party AWS support.
Change sets and stack events are important operational primitives.
StackSets are needed for multi-account rollout.
The organization prefers not to manage an external state engine.

A small CloudFormation stack can be straightforward:

AWSTemplateFormatVersion: "2010-09-09"
Description: Basic application security group

Parameters:
  VpcId:
    Type: AWS::EC2::VPC::Id
  Environment:
    Type: String
    AllowedValues: [dev, staging, prod]

Resources:
  AppSecurityGroup:
    Type: AWS::EC2::SecurityGroup
    Properties:
      GroupDescription: !Sub "${Environment} application security group"
      VpcId: !Ref VpcId
      SecurityGroupEgress:
        - IpProtocol: tcp
          FromPort: 443
          ToPort: 443
          CidrIp: 0.0.0.0/0
      Tags:
        - Key: Environment
          Value: !Ref Environment
        - Key: ManagedBy
          Value: CloudFormation

The challenge is scale. Large YAML templates can become difficult to test, refactor, and review. Nested stacks help, but they need clear ownership boundaries. If a team keeps adding unrelated resources to a single stack because it is convenient, the template turns into a sprawling monolith instead of a clean platform model.

AWS CDK: Infrastructure With Reusable Constructs

AWS CDK synthesizes CloudFormation templates from code written in languages such as TypeScript, Python, Java, C#, and Go. It is especially useful when infrastructure patterns need reusable abstractions.

CDK is a strong fit when:

Application teams already work in a supported programming language.
The platform needs reusable constructs across many services.
Infrastructure tests should run with normal unit testing tools.
The team wants higher-level AWS defaults without hand-writing every resource.
CloudFormation remains the desired deployment engine.

A CDK stack can express a complete network layout with little boilerplate:

import * as cdk from "aws-cdk-lib";
import * as ec2 from "aws-cdk-lib/aws-ec2";
import { Construct } from "constructs";

export class NetworkStack extends cdk.Stack {
  public readonly vpc: ec2.Vpc;

  constructor(scope: Construct, id: string, props?: cdk.StackProps) {
    super(scope, id, props);

    this.vpc = new ec2.Vpc(this, "Vpc", {
      maxAzs: 2,
      natGateways: 1,
      subnetConfiguration: [
        { name: "Public", subnetType: ec2.SubnetType.PUBLIC, cidrMask: 24 },
        { name: "Private", subnetType: ec2.SubnetType.PRIVATE_WITH_EGRESS, cidrMask: 24 },
        { name: "Database", subnetType: ec2.SubnetType.PRIVATE_ISOLATED, cidrMask: 28 }
      ]
    });

    cdk.Tags.of(this).add("ManagedBy", "CDK");
  }
}

CDK’s advantage is not only fewer lines. It lets a platform team publish internal constructs such as StandardVpc, PrivateService, EncryptedBucket, or AuditedQueue. Those constructs can encode tagging, encryption, logging, alarm, and access-control defaults.

The risk is over-abstraction. If a construct hides too much, application teams cannot understand what will be deployed. Keep constructs small, documented, and testable.
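
The balance between useful defaults and over-abstraction can be sketched without any CDK dependency. The following is a minimal illustration of the idea, not CDK's API: `BucketSettings`, `PLATFORM_DEFAULTS`, and `resolveBucketSettings` are hypothetical names, and the rule that encryption cannot be switched off is an assumed platform policy.

```typescript
// Hypothetical settings a platform construct such as EncryptedBucket might manage.
interface BucketSettings {
  encrypted: boolean;
  accessLogging: boolean;
  tags: Record<string, string>;
}

// Assumed platform defaults; a real construct would document these for its users.
const PLATFORM_DEFAULTS: BucketSettings = {
  encrypted: true,
  accessLogging: true,
  tags: { ManagedBy: "CDK" },
};

// Merge caller overrides onto the defaults and return the fully resolved
// settings, so a reviewer can see exactly what would be deployed.
function resolveBucketSettings(overrides: Partial<BucketSettings> = {}): BucketSettings {
  if (overrides.encrypted === false) {
    // Assumed policy: encryption is a guardrail, not a preference.
    throw new Error("encryption cannot be disabled on a platform bucket");
  }
  return {
    ...PLATFORM_DEFAULTS,
    ...overrides,
    // Tags are merged rather than replaced, so the default ManagedBy tag
    // is kept when callers add other tags.
    tags: { ...PLATFORM_DEFAULTS.tags, ...(overrides.tags ?? {}) },
  };
}

const resolved = resolveBucketSettings({ accessLogging: false, tags: { Team: "billing" } });
console.log(JSON.stringify(resolved));
```

Returning the resolved settings, rather than applying them silently, is one way to keep a construct inspectable: the defaults stay in one documented place and the result can be asserted in a unit test.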

Terraform: Declarative Workflow and Multi-Cloud Reach

Terraform is widely used for AWS because it has a mature provider ecosystem, a clear plan/apply workflow, and a module model that many teams already understand.

Terraform is a strong fit when:

The organization uses more than one cloud provider.
Teams already have Terraform skills and module conventions.
Infrastructure state should be managed outside CloudFormation.
Third-party providers are part of the platform.
A plan file review step is important to the deployment process.

A simple AWS VPC pattern in Terraform looks like this:

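# Inputs referenced by the resources below.
variable "aws_region" {
  type = string
}

variable "project_name" {
  type = string
}

variable "environment" {
  type = string
}

variable "vpc_cidr" {
  type = string
}

# Map of subnet name to its CIDR block and availability zone.
variable "private_subnets" {
  type = map(object({
    cidr = string
    az   = string
  }))
}

provider "aws" {
  region = var.aws_region
}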

resource "aws_vpc" "main" {
  cidr_block           = var.vpc_cidr
  enable_dns_support   = true
  enable_dns_hostnames = true

  tags = {
    Name        = "${var.project_name}-${var.environment}-vpc"
    Environment = var.environment
    ManagedBy   = "Terraform"
  }
}

resource "aws_subnet" "private" {
  for_each = var.private_subnets

  vpc_id            = aws_vpc.main.id
  cidr_block        = each.value.cidr
  availability_zone = each.value.az

  tags = {
    Name        = "${var.project_name}-${var.environment}-${each.key}"
    Environment = var.environment
    Tier        = "private"
  }
}

Terraform’s main operational responsibility is state. Use remote state, lock state during updates, limit who can mutate state, and create backup procedures. Treat state as production data, because it is.

CloudFormation vs CDK vs Terraform

Decision Area	CloudFormation	AWS CDK	Terraform
Deployment engine	CloudFormation (AWS-managed)	CloudFormation, from synthesized templates	Terraform CLI and providers
Best for	AWS-only native stacks	Reusable AWS platform patterns	Multi-cloud and module workflows
Language	YAML or JSON	TypeScript, Python, Java, C#, Go	HCL
Testing model	Template validation and change sets	Unit tests, snapshot tests, synth validation	Validate, plan, policy checks
State model	AWS-managed stacks	AWS-managed stacks	Remote Terraform state
Main risk	Large hard-to-review templates	Over-abstracted constructs	State drift or state access mistakes

There is no universal winner. Many mature AWS environments use more than one tool. A platform team might use CloudFormation StackSets for account baselines, CDK for application infrastructure, and Terraform for DNS, SaaS providers, or multi-cloud shared services.

The important rule is to avoid tool sprawl without ownership. Every IaC tool in the estate needs a clear purpose, code standard, state model, and deployment path.

Implementation Roadmap

Start with the resources that create the most operational pain, not necessarily the resources that are easiest to model.

Phase 1: Inventory and Ownership

Create an inventory of existing AWS resources by account, region, application, and owner. Tagging gaps should be fixed before or during migration. Infrastructure without an owner is difficult to migrate safely.

Useful inventory inputs include:

AWS Config resource inventory
Cost and Usage Report tags
CloudFormation stack lists
Terraform state files
Load balancer target groups and listener rules
Route 53 hosted zones
IAM roles and policies
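
Once inventory data is collected, finding tagging gaps is mostly a filtering exercise. The sketch below assumes the inventory has already been flattened into simple records; the `InventoryItem` shape, the `REQUIRED_TAGS` list, and the sample ARNs are all illustrative assumptions rather than an AWS export format.

```typescript
// Hypothetical inventory record, e.g. flattened from an AWS Config export.
interface InventoryItem {
  arn: string;
  account: string;
  region: string;
  tags: Record<string, string>;
}

// Assumed tagging standard: every resource needs these keys with non-empty values.
const REQUIRED_TAGS = ["Owner", "Environment"];

// Return each resource that is missing at least one required tag,
// together with the names of the missing tags.
function findTaggingGaps(items: InventoryItem[]): { arn: string; missing: string[] }[] {
  return items
    .map((item) => ({
      arn: item.arn,
      missing: REQUIRED_TAGS.filter((key) => !(item.tags[key] ?? "").trim()),
    }))
    .filter((gap) => gap.missing.length > 0);
}

// Illustrative data only; the ARNs and account IDs are made up.
const sampleInventory: InventoryItem[] = [
  {
    arn: "arn:aws:sqs:us-east-1:111111111111:orders",
    account: "111111111111",
    region: "us-east-1",
    tags: { Owner: "billing", Environment: "prod" },
  },
  {
    arn: "arn:aws:sqs:us-east-1:111111111111:legacy",
    account: "111111111111",
    region: "us-east-1",
    tags: { Environment: "prod" },
  },
];

console.log(findTaggingGaps(sampleInventory));
```

A report like this gives the migration a worklist: resources with no owner can be chased down before anyone tries to import or replace them.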

Phase 2: Foundation Stacks

Move stable shared infrastructure first:

VPCs, subnets, route tables, NAT gateways, and endpoints
IAM permission boundaries and deployment roles
KMS keys and logging buckets
Security groups with well-understood consumers
AWS Config, CloudTrail, GuardDuty, and baseline alarms

These layers should change slowly and have stricter review rules than application infrastructure.

Phase 3: Application Infrastructure

After the foundation is stable, migrate application-owned resources:

ECS services and task definitions
Lambda functions and event sources
RDS, DynamoDB, SQS, SNS, EventBridge, and S3 resources
CloudWatch dashboards and alarms
Route 53 records and certificates

Keep application stacks small enough that a service team can reason about them during review.

Phase 4: Policy, Testing, and Release Controls

IaC should be validated before deployment. A practical pipeline usually includes:

Formatting checks
Static validation
Security scanning
Policy checks for encryption, public access, and IAM scope
Plan or change-set review
Deployment to a non-production environment
Post-deployment smoke checks

The exact tools vary, but the control points should be consistent.
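
The control points above can be modeled as an ordered list of gates that stops at the first failure. This is a rough sketch under stated assumptions: the `Gate` interface, `runGates`, and the stubbed gate results are hypothetical, and a real pipeline would invoke the chosen tools instead of returning constants.

```typescript
// One control point in the pipeline. `run` returns true when the gate passes.
interface Gate {
  name: string;
  run: () => boolean;
}

interface PipelineResult {
  passed: string[];
  failedAt: string | null;
}

// Run gates in order and stop at the first failure, so later stages
// (such as deployment) never run against a change that failed a check.
function runGates(gates: Gate[]): PipelineResult {
  const passed: string[] = [];
  for (const gate of gates) {
    if (!gate.run()) {
      return { passed, failedAt: gate.name };
    }
    passed.push(gate.name);
  }
  return { passed, failedAt: null };
}

// Stubbed gates for illustration; real ones would shell out to the chosen tools.
const exampleGates: Gate[] = [
  { name: "format", run: () => true },
  { name: "validate", run: () => true },
  { name: "security-scan", run: () => false },
  { name: "deploy-nonprod", run: () => true },
];

console.log(runGates(exampleGates));
```

Keeping the gate names stable across CloudFormation, CDK, and Terraform pipelines is what makes the control points consistent even when the underlying tools differ.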

Security and Governance

Infrastructure automation can make security better or worse. It makes good defaults repeatable, but it also lets a bad pattern spread quickly.

Set baseline rules early:

No secrets in source code, variables, templates, or state outputs.
Encryption is the default for storage, queues, databases, and logs.
Public network paths require explicit review.
IAM policies should be scoped to actions, resources, and conditions.
Production deployments require a separate role from development deployments.
CloudTrail and Config should cover every account and region in scope.

Use policy-as-code where possible. Examples include CloudFormation Guard, Checkov, tfsec, Open Policy Agent, IAM Access Analyzer, and AWS Config custom rules.
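
To show what a policy check does at its core, here is a small dependency-free sketch. It is not the rule syntax of any of the tools named above: the `PlannedResource` shape, the rule names, and the sample addresses are assumptions made for the example.

```typescript
// Simplified view of a planned resource. Real plan and template formats differ;
// this shape is an assumption made for the example.
interface PlannedResource {
  address: string;
  encrypted: boolean;
  publiclyAccessible: boolean;
}

interface Violation {
  address: string;
  rule: string;
}

// Each rule returns true when the resource complies.
const RULES: { name: string; complies: (r: PlannedResource) => boolean }[] = [
  { name: "encryption-required", complies: (r) => r.encrypted },
  { name: "no-public-access", complies: (r) => !r.publiclyAccessible },
];

// Evaluate every rule against every resource and collect the failures.
function evaluatePolicies(resources: PlannedResource[]): Violation[] {
  const violations: Violation[] = [];
  for (const resource of resources) {
    for (const rule of RULES) {
      if (!rule.complies(resource)) {
        violations.push({ address: resource.address, rule: rule.name });
      }
    }
  }
  return violations;
}

// Illustrative resources only.
const plannedResources: PlannedResource[] = [
  { address: "bucket.logs", encrypted: true, publiclyAccessible: false },
  { address: "bucket.uploads", encrypted: false, publiclyAccessible: true },
];

console.log(evaluatePolicies(plannedResources));
```

A pipeline would typically fail the build when the returned list is non-empty and print the violations for the reviewer.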

Testing Infrastructure Code

Infrastructure tests should catch expensive mistakes before AWS does.

For CloudFormation:

aws cloudformation validate-template --template-body file://template.yaml
aws cloudformation create-change-set \
  --stack-name app-prod \
  --change-set-name app-prod-review \
  --template-body file://template.yaml

For CDK:

npm test
npx cdk synth
npx cdk diff

For Terraform:

terraform fmt -check
terraform validate
terraform plan -out tfplan

Testing should also include runtime validation. If a deployment creates an ALB route, call the route. If it creates an SQS event flow, publish a test event. If it changes IAM, verify the intended principal can perform the intended action and cannot perform adjacent actions.
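
The IAM case can be expressed as a table of expectations checked against a decision function. The sketch below is one possible shape, with hypothetical names throughout (`AccessExpectation`, `AccessDecider`, `findAccessMismatches`); the decision function is injected and stubbed here, where a real check might be backed by something like the IAM policy simulator.

```typescript
// An expectation about what a principal may or may not do after a deployment.
interface AccessExpectation {
  principal: string;
  action: string;
  resource: string;
  shouldBeAllowed: boolean;
}

// Injected decision function; a stand-in for a real authorization check.
type AccessDecider = (principal: string, action: string, resource: string) => boolean;

// Return the expectations that did not hold: either an intended action was
// denied or an adjacent action was unexpectedly allowed.
function findAccessMismatches(
  expectations: AccessExpectation[],
  isAllowed: AccessDecider,
): AccessExpectation[] {
  return expectations.filter(
    (e) => isAllowed(e.principal, e.action, e.resource) !== e.shouldBeAllowed,
  );
}

// Illustrative expectations: the role may send messages but must not delete them.
const expectations: AccessExpectation[] = [
  { principal: "role/orders-writer", action: "sqs:SendMessage", resource: "queue/orders", shouldBeAllowed: true },
  { principal: "role/orders-writer", action: "sqs:DeleteMessage", resource: "queue/orders", shouldBeAllowed: false },
];

// Stub that is too permissive: it allows every sqs: action for the role.
const permissiveStub: AccessDecider = (_principal, action) => action.indexOf("sqs:") === 0;

console.log(findAccessMismatches(expectations, permissiveStub));
```

Checking the negative case matters as much as the positive one: a policy that grants the intended action through a wildcard passes the first expectation and fails the second.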

Migration Patterns

Avoid one giant migration. Use one of these patterns instead.

Import Existing Resources

Import is useful when resources are already correct and need to be brought under IaC management. Use it carefully. Importing a resource does not automatically mean the code accurately describes every important setting.
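
One way to check an import is to compare the settings declared in code with the settings read back from the live resource. The following sketch assumes both sides have already been reduced to flat key-value maps; `Settings`, `diffSettings`, and the sample database values are hypothetical and do not reflect any tool's output format.

```typescript
type SettingValue = string | number | boolean;
type Settings = Record<string, SettingValue>;

interface SettingDiff {
  key: string;
  declared: SettingValue | undefined;
  actual: SettingValue | undefined;
}

// Compare the settings declared in code with those read from the live
// resource, reporting keys that differ or exist on only one side.
function diffSettings(declared: Settings, actual: Settings): SettingDiff[] {
  const extraKeys = Object.keys(actual).filter((key) => !(key in declared));
  const keys = Object.keys(declared).concat(extraKeys).sort();
  return keys
    .filter((key) => declared[key] !== actual[key])
    .map((key) => ({ key, declared: declared[key], actual: actual[key] }));
}

// Illustrative values only: the code understates backup retention and
// omits a protection flag that is set on the live resource.
const declaredDb: Settings = { engine: "postgres", multiAz: true, backupRetentionDays: 7 };
const actualDb: Settings = {
  engine: "postgres",
  multiAz: true,
  backupRetentionDays: 1,
  deletionProtection: true,
};

console.log(diffSettings(declaredDb, actualDb));
```

An empty result is the signal that the code describes the resource as it actually exists; anything else should be resolved before the next apply or stack update changes production.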

Replace During Planned Change

Some resources are easier to recreate during a larger change window. For example, a non-critical queue, alarm, or dashboard may be simpler to replace than import.

Wrap Around Existing Infrastructure

Create IaC for new surrounding resources first, then migrate the older center of gravity later. This is common with DNS, observability, and deployment pipelines.

Blue-Green Infrastructure

For high-risk components, build the new infrastructure in parallel, shift traffic, validate behavior, and then retire the old stack. This is slower but safer.

Repository Structure

A workable repository layout keeps shared modules separate from environment configuration:

infrastructure/
  modules/
    network/
    ecs-service/
    rds-postgres/
  environments/
    dev/
    staging/
    prod/
  policies/
  scripts/
  docs/

For CDK, keep constructs separated from deployed stacks:

packages/
  constructs/
    standard-vpc/
    private-service/
apps/
  platform-network/
  billing-service/
  reporting-service/

The layout matters less than the ownership. A repository should make it obvious who owns a stack, how it is deployed, and how to validate it.

Companion Implementation Repositories

The implementation examples for this guide are organized by tool. Use them as starting points for module boundaries, naming conventions, review checklists, and deployment structure.

Practical Adoption Checklist

Use this checklist before treating IaC as production-ready:

Every production resource has an owner and environment tag.
Shared infrastructure and application infrastructure are in separate stacks or modules.
Production state and deployment roles are locked down.
Every change produces a reviewed plan, diff, or change set.
Security scanning runs before deployment.
Rollback steps are documented for each critical stack.
Drift detection is scheduled.
Post-deployment checks run automatically.
Documentation explains how to create, update, and destroy non-production environments.

Infrastructure as Code is not a one-time migration project. It is the foundation for operating AWS with discipline: small changes, clear ownership, repeatable deployment, and evidence that the system behaves the way the code says it should.
