Managing High Traffic Applications with AWS Elastic Load Balancer and Terraform
Mukami
Day 5 of the 30-Day Terraform Challenge - and today was the day I graduated from "it works on my machine" to "it works even if half my machines are on fire."
Remember Day 4? I was celebrating my cluster like a proud parent at a kindergarten graduation. Cute, but naive. Today, I strapped a rocket booster to that cluster and turned it into something that can actually handle real traffic.
Let me tell you about the Application Load Balancer (ALB), Terraform state, and why I now understand what my DevOps friends have been losing sleep over.
Part 1: The ALB - Your Traffic Cop
Yesterday, I had a cluster. Multiple instances, auto-scaling, the works. But there was one problem: no one was directing traffic.
Without a load balancer, my cluster was like a restaurant with multiple chefs but no waiters. Customers (HTTP requests) would show up and... knock on random doors? Get lost? Probably just hit the first instance they found and hope for the best.
Enter the Application Load Balancer: the smoothest traffic cop you've ever seen.
What I Built:
```
Internet
   ↓
[ALB]                ← Listens on port 80, has a fancy DNS name
   ↓
[Target Group]       ← Checks which instances are healthy
   ↓
[Auto Scaling Group] ← Manages 2-5 instances
   ↓
[EC2 Instances]      ← Actually serving the web pages
```
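In Terraform, that chain maps onto three resources wired together. Here's a minimal sketch (the `web` names and the subnet/VPC variables are placeholders, not my exact config):

```hcl
# The ALB itself, fronted by its own security group
resource "aws_lb" "web" {
  name               = "dev-web-alb"
  load_balancer_type = "application"
  security_groups    = [aws_security_group.alb_sg.id]
  subnets            = var.public_subnet_ids
}

# Target group: where traffic goes, plus the health check rules
resource "aws_lb_target_group" "web" {
  name     = "dev-web-tg"
  port     = 80
  protocol = "HTTP"
  vpc_id   = var.vpc_id

  health_check {
    path    = "/"
    matcher = "200"
  }
}

# Listener: "anything arriving on port 80 forwards to the target group"
resource "aws_lb_listener" "http" {
  load_balancer_arn = aws_lb.web.arn
  port              = 80
  protocol          = "HTTP"

  default_action {
    type             = "forward"
    target_group_arn = aws_lb_target_group.web.arn
  }
}
```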
The Secret Sauce: Security Group Chaining
This was the "aha!" moment. Instead of letting instances accept traffic from anywhere (which is what I did on Day 3), I now have:
```hcl
# ALB Security Group - Welcomes everyone
ingress {
  from_port   = 80
  to_port     = 80
  protocol    = "tcp"
  cidr_blocks = ["0.0.0.0/0"] # Come one, come all!
}

# Instance Security Group - Super selective
ingress {
  from_port       = 80
  to_port         = 80
  protocol        = "tcp"
  security_groups = [aws_security_group.alb_sg.id] # ONLY the ALB can talk to me
}
```
This means:
- Users can only reach the ALB
- Instances are invisible to the outside world
- Even if someone finds an instance IP, they can't access it directly
It's like having a nightclub where:
- The bouncer (ALB) is at the door with a guest list
- The party rooms (instances) only let the bouncer in
- Nobody can sneak in through the back door
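Spelled out as full resources, the chaining looks roughly like this (a sketch; the names and `var.vpc_id` reference are placeholders):

```hcl
resource "aws_security_group" "alb_sg" {
  name   = "alb-sg"
  vpc_id = var.vpc_id

  # The bouncer: accepts HTTP from anywhere
  ingress {
    from_port   = 80
    to_port     = 80
    protocol    = "tcp"
    cidr_blocks = ["0.0.0.0/0"]
  }

  egress {
    from_port   = 0
    to_port     = 0
    protocol    = "-1"
    cidr_blocks = ["0.0.0.0/0"]
  }
}

resource "aws_security_group" "instance_sg" {
  name   = "instance-sg"
  vpc_id = var.vpc_id

  # The party room: only traffic from the ALB's security group gets in
  ingress {
    from_port       = 80
    to_port         = 80
    protocol        = "tcp"
    security_groups = [aws_security_group.alb_sg.id]
  }

  egress {
    from_port   = 0
    to_port     = 0
    protocol    = "-1"
    cidr_blocks = ["0.0.0.0/0"]
  }
}
```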
Part 2: Terraform State - The Source of Truth
While the ALB was cool, the real mind-blowing part was understanding Terraform State.
Think of the state file (terraform.tfstate) as Terraform's diary. It remembers:
- What resources it created
- What IDs AWS assigned them
- What IP addresses they have
- What dependencies exist between them
Without the state file, Terraform would be like Dory from Finding Nemo: constantly forgetting what it just did.
Experiment 1: I Tried to Break It (Intentionally)
I opened terraform.tfstate and changed an instance type from t3.micro to t3.small. Then I ran terraform plan.
What Terraform said:
```
~ aws_launch_template.web
    instance_type: "t3.small" => "t3.micro"

Plan: 0 to add, 1 to change, 0 to destroy.
```
Terraform immediately noticed the discrepancy and planned to change it back. The state file is the source of truth for what exists, but my code is the source of truth for what should exist. When they disagree, Terraform fixes reality to match my code.
Lesson learned: Never manually edit state files. That's like editing your own diary while someone else is reading it: chaos will ensue.
Experiment 2: Drift Detection
I went into the AWS Console and manually changed a tag on an instance from Environment: dev to Environment: prod. Then I ran terraform plan.
Terraform's response:
```
~ aws_autoscaling_group.web
    tag.1.value: "prod" => "dev"

Plan: 0 to add, 1 to change, 0 to destroy.
```
This is drift detection: Terraform noticed that someone (me, in the console) had changed infrastructure outside of Terraform. And it planned to fix it.
Why this matters: If your team makes manual changes to AWS, Terraform will overwrite them. That's why you MUST use Terraform as the single source of truth. Otherwise, you're playing a game of "who changed what" that nobody wins.
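If there are attributes you genuinely want managed outside Terraform (say, tags that another tool stamps on), you can tell Terraform to stop fighting over them with a `lifecycle` block. A minimal sketch (resource and variables are illustrative):

```hcl
resource "aws_instance" "web" {
  ami           = var.ami_id
  instance_type = "t3.micro"

  # Don't revert tag changes made outside Terraform.
  # Use sparingly: anything listed here is invisible to drift detection.
  lifecycle {
    ignore_changes = [tags]
  }
}
```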
Part 3: Why State Files Don't Belong in Git
I learned that committing terraform.tfstate to Git is like storing your passwords in a public Google Doc:
- Secrets everywhere: state files contain plaintext passwords, access keys, and sensitive data
- Merge conflicts from hell: two people running terraform apply = one corrupted state file
- No locks: Git doesn't prevent two people from applying at the same time
- Bloated repos: state files get HUGE over time
The Solution: Remote State + Locking
Production teams use:
- S3 bucket โ Stores the state file remotely (secure, versioned)
- DynamoDB table โ Provides locking so only one person can run Terraform at a time
```hcl
terraform {
  backend "s3" {
    bucket         = "my-terraform-state"
    key            = "prod/terraform.tfstate"
    region         = "us-east-1"
    dynamodb_table = "terraform-locks" # ← The magic lock
    encrypt        = true
  }
}
```
With this setup, when I run terraform apply:
- DynamoDB creates a lock
- If my teammate tries to run apply, they get: Error: Error acquiring the state lock
- After I finish, the lock releases

No corruption. No conflicts. No tears.
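One wrinkle: the S3 bucket and DynamoDB table have to exist before the backend can use them, a chicken-and-egg problem usually solved by creating them once in a separate bootstrap config. Roughly (names are placeholders; Terraform's DynamoDB locking only needs a `LockID` string hash key):

```hcl
# State bucket, with versioning so you can recover older state files
resource "aws_s3_bucket" "tf_state" {
  bucket = "my-terraform-state"
}

resource "aws_s3_bucket_versioning" "tf_state" {
  bucket = aws_s3_bucket.tf_state.id
  versioning_configuration {
    status = "Enabled"
  }
}

# Lock table: the S3 backend writes a "LockID" item while a run is in progress
resource "aws_dynamodb_table" "tf_locks" {
  name         = "terraform-locks"
  billing_mode = "PAY_PER_REQUEST"
  hash_key     = "LockID"

  attribute {
    name = "LockID"
    type = "S"
  }
}
```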
Part 4: The "It's Alive!" Moment
After deploying my enhanced cluster with the ALB, I ran:
```shell
terraform output alb_dns_name
# dev-web-alb-1234567890.eu-north-1.elb.amazonaws.com
```
I opened my browser, hit the URL, and saw the webpage. Then I refreshed. Different instance ID. Refreshed again. Another instance. Refreshed 20 times โ each time, the load balancer sent me to a different server.
Then I did the ultimate test: I went to AWS Console and terminated one of the instances.
Result: The website stayed up. The Auto Scaling Group immediately launched a replacement. The ALB automatically stopped sending traffic to the dead instance.
Zero downtime. Zero manual intervention. Just pure, unadulterated infrastructure doing its job.
I felt like a wizard.
Part 5: The Terraform Block Cheat Sheet
Here's my growing reference of every block type I've used so far:
| Block Type | Purpose | When to Use | Example |
|---|---|---|---|
| provider | Configures cloud provider | Once per provider at the root | provider "aws" { region = "us-east-1" } |
| resource | Creates infrastructure | Every piece of infrastructure | resource "aws_instance" "web" {} |
| variable | Makes config reusable | To avoid hardcoding values | variable "instance_type" {} |
| output | Exposes values after apply | For IPs, DNS names, IDs | output "alb_dns" { value = aws_lb.web.dns_name } |
| data | Queries existing resources | To fetch dynamic info like AZs | data "aws_availability_zones" "available" {} |
| terraform | Configures Terraform behavior | At the start for version/backend | terraform { required_version = ">= 1.0" } |
| locals | Defines reusable values | For expressions used multiple times | locals { common_tags = { Project = "MyApp" } } |
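Several of these blocks usually appear together. A small sketch of how `variable`, `data`, `locals`, and `output` interact (values are illustrative):

```hcl
variable "environment" {
  type    = string
  default = "dev"
}

# Query AWS for the AZs available in the current region
data "aws_availability_zones" "available" {
  state = "available"
}

# Reusable values built from variables, referenced as local.common_tags
locals {
  common_tags = {
    Project     = "MyApp"
    Environment = var.environment
  }
}

# Surfaced after apply via `terraform output az_names`
output "az_names" {
  value = data.aws_availability_zones.available.names
}
```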
What Actually Broke (And How I Fixed It)
Challenge 1: Health Check Failures
Error: Instances marked unhealthy, being replaced
Fix: Added health_check_grace_period = 300 to give instances 5 minutes to boot before the ALB starts judging them.
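In the Auto Scaling Group, the relevant settings look roughly like this (a sketch, not my full ASG config):

```hcl
resource "aws_autoscaling_group" "web" {
  # ...existing ASG config...

  health_check_type         = "ELB" # trust the ALB's health checks, not just EC2 status checks
  health_check_grace_period = 300   # five minutes to boot before being judged
}
```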
Challenge 2: State Lock Stuck
Error: Error acquiring the state lock
Fix: Someone (me) had crashed Terraform mid-run, leaving a stale lock behind. Had to clear it with terraform force-unlock <LOCK_ID> (you can also delete the lock item from the DynamoDB table directly).
Challenge 3: Instances Not Registering
Instances launched, ALB was there, but no traffic.
Fix: I forgot target_group_arns = [aws_lb_target_group.web.arn] in the Auto Scaling Group. Without this, the ASG never told the ALB about the instances.
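The one-line fix, in context (a sketch):

```hcl
resource "aws_autoscaling_group" "web" {
  # ...existing ASG config...

  # Registers every instance the ASG launches with the ALB's target group
  target_group_arns = [aws_lb_target_group.web.arn]
}
```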
The Bottom Line
Day 5 taught me two things:
- ALBs make clusters useful: without load balancing, multiple instances are just expensive paperweights.
- State management separates pros from beginners: understanding state files, drift detection, and remote backends is what makes you someone who can be trusted with production infrastructure.
I started today thinking "load balancers are just fancy routers." I'm ending today with a newfound respect for Terraform state and a slight fear of accidentally corrupting it.
Tomorrow: Probably more state magic. Maybe some modules. Definitely more coffee.
P.S. If you're wondering why I didn't set enable_deletion_protection = true on my ALB: I value my ability to destroy resources without going through AWS support. Some lessons are learned by reading documentation. Others are learned by accidentally running terraform destroy on production. I choose the former.