Managing High Traffic Applications with AWS Elastic Load Balancer and Terraform
Mukami
Day 5 of the 30-Day Terraform Challenge - and today was the day I graduated from "it works on my machine" to "it works even if half my machines are on fire."
Remember Day 4? I was celebrating my cluster like a proud parent at a kindergarten graduation. Cute, but naive. Today, I strapped a rocket booster to that cluster and turned it into something that can actually handle real traffic.
Let me tell you about the Application Load Balancer (ALB), Terraform state, and why I now understand what my DevOps friends have been losing sleep over.
Part 1: The ALB - Your Traffic Cop
Yesterday, I had a cluster. Multiple instances, auto-scaling, the works. But there was one problem: no one was directing traffic.
Without a load balancer, my cluster was like a restaurant with multiple chefs but no waiters. Customers (HTTP requests) would show up and... knock on random doors? Get lost? Probably just hit the first instance they found and hope for the best.
Enter the Application Load Balancer: the smoothest traffic cop you've ever seen.
What I Built:
```
Internet
   ↓
[ALB]                ← Listens on port 80, has a fancy DNS name
   ↓
[Target Group]       ← Checks which instances are healthy
   ↓
[Auto Scaling Group] ← Manages 2-5 instances
   ↓
[EC2 Instances]      ← Actually serving the web pages
```
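In Terraform, that chain maps onto three resources wired together. Here's a minimal sketch (the `web` names and the subnet/VPC variables are placeholders, not my exact config):

```hcl
# The ALB itself, fronted by its own security group
resource "aws_lb" "web" {
  name               = "dev-web-alb"
  load_balancer_type = "application"
  security_groups    = [aws_security_group.alb_sg.id]
  subnets            = var.public_subnet_ids
}

# Target group: where traffic goes, plus the health check rules
resource "aws_lb_target_group" "web" {
  name     = "dev-web-tg"
  port     = 80
  protocol = "HTTP"
  vpc_id   = var.vpc_id

  health_check {
    path    = "/"
    matcher = "200"
  }
}

# Listener: "anything arriving on port 80 forwards to the target group"
resource "aws_lb_listener" "http" {
  load_balancer_arn = aws_lb.web.arn
  port              = 80
  protocol          = "HTTP"

  default_action {
    type             = "forward"
    target_group_arn = aws_lb_target_group.web.arn
  }
}
```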
The Secret Sauce: Security Group Chaining
This was the "aha!" moment. Instead of letting instances accept traffic from anywhere (which is what I did on Day 3), I now have:
```hcl
# ALB Security Group - Welcomes everyone
ingress {
  from_port   = 80
  to_port     = 80
  protocol    = "tcp"
  cidr_blocks = ["0.0.0.0/0"] # Come one, come all!
}

# Instance Security Group - Super selective
ingress {
  from_port       = 80
  to_port         = 80
  protocol        = "tcp"
  security_groups = [aws_security_group.alb_sg.id] # ONLY the ALB can talk to me
}
```
This means:
- Users can only reach the ALB
- Instances are invisible to the outside world
- Even if someone finds an instance IP, they can't access it directly
It's like having a nightclub where:
- The bouncer (ALB) is at the door with a guest list
- The party rooms (instances) only let the bouncer in
- Nobody can sneak in through the back door
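Spelled out as full resources, the chaining looks roughly like this (a sketch; the names and `var.vpc_id` reference are placeholders):

```hcl
resource "aws_security_group" "alb_sg" {
  name   = "alb-sg"
  vpc_id = var.vpc_id

  # The bouncer: accepts HTTP from anywhere
  ingress {
    from_port   = 80
    to_port     = 80
    protocol    = "tcp"
    cidr_blocks = ["0.0.0.0/0"]
  }

  egress {
    from_port   = 0
    to_port     = 0
    protocol    = "-1"
    cidr_blocks = ["0.0.0.0/0"]
  }
}

resource "aws_security_group" "instance_sg" {
  name   = "instance-sg"
  vpc_id = var.vpc_id

  # The party room: only traffic from the ALB's security group gets in
  ingress {
    from_port       = 80
    to_port         = 80
    protocol        = "tcp"
    security_groups = [aws_security_group.alb_sg.id]
  }

  egress {
    from_port   = 0
    to_port     = 0
    protocol    = "-1"
    cidr_blocks = ["0.0.0.0/0"]
  }
}
```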
Part 2: Terraform State - The Source of Truth
While the ALB was cool, the real mind-blowing part was understanding Terraform State.
Think of the state file (terraform.tfstate) as Terraform's diary. It remembers:
- What resources it created
- What IDs AWS assigned them
- What IP addresses they have
- What dependencies exist between them
Without the state file, Terraform would be like Dory from Finding Nemo: constantly forgetting what it just did.
Experiment 1: I Tried to Break It (Intentionally)
I opened terraform.tfstate and changed an instance type from t3.micro to t3.small. Then I ran terraform plan.
What Terraform said:
```
~ aws_launch_template.web
    instance_type: "t3.small" => "t3.micro"

Plan: 0 to add, 1 to change, 0 to destroy.
```
Terraform immediately noticed the discrepancy and planned to change it back. The state file is the source of truth for what exists, but my code is the source of truth for what should exist. When they disagree, Terraform fixes reality to match my code.
Lesson learned: Never manually edit state files. That's like editing your own diary while someone else is reading it: chaos will ensue.
Experiment 2: Drift Detection
I went into the AWS Console and manually changed a tag on an instance from Environment: dev to Environment: prod. Then I ran terraform plan.
Terraform's response:
```
~ aws_autoscaling_group.web
    tag.1.value: "prod" => "dev"

Plan: 0 to add, 1 to change, 0 to destroy.
```
This is drift detection: Terraform noticed that someone (me, in the console) had changed infrastructure outside of Terraform. And it planned to fix it.
Why this matters: If your team makes manual changes to AWS, Terraform will overwrite them. That's why you MUST use Terraform as the single source of truth. Otherwise, you're playing a game of "who changed what" that nobody wins.
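If there are attributes you genuinely want managed outside Terraform (say, tags that another tool stamps on), you can tell Terraform to stop fighting over them with a `lifecycle` block. A minimal sketch (resource and variables are illustrative):

```hcl
resource "aws_instance" "web" {
  ami           = var.ami_id
  instance_type = "t3.micro"

  # Don't revert tag changes made outside Terraform.
  # Use sparingly: anything listed here is invisible to drift detection.
  lifecycle {
    ignore_changes = [tags]
  }
}
```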
Part 3: Why State Files Don't Belong in Git
I learned that committing terraform.tfstate to Git is like storing your passwords in a public Google Doc:
- Secrets everywhere: state files contain plaintext passwords, access keys, and sensitive data
- Merge conflicts from hell: two people running terraform apply = one corrupted state file
- No locks: Git doesn't prevent two people from applying at the same time
- Bloated repos: state files get HUGE over time
The Solution: Remote State + Locking
Production teams use:
- S3 bucket โ Stores the state file remotely (secure, versioned)
- DynamoDB table โ Provides locking so only one person can run Terraform at a time
```hcl
terraform {
  backend "s3" {
    bucket         = "my-terraform-state"
    key            = "prod/terraform.tfstate"
    region         = "us-east-1"
    dynamodb_table = "terraform-locks" # ← The magic lock
    encrypt        = true
  }
}
```
With this setup, when I run terraform apply:
- DynamoDB creates a lock
- If my teammate tries to run apply, they get: Error: Error acquiring the state lock
- After I finish, the lock releases

No corruption. No conflicts. No tears.
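One wrinkle: the S3 bucket and DynamoDB table have to exist before the backend can use them, a chicken-and-egg problem usually solved by creating them once in a separate bootstrap config. Roughly (names are placeholders; Terraform's DynamoDB locking only needs a `LockID` string hash key):

```hcl
# State bucket, with versioning so you can recover older state files
resource "aws_s3_bucket" "tf_state" {
  bucket = "my-terraform-state"
}

resource "aws_s3_bucket_versioning" "tf_state" {
  bucket = aws_s3_bucket.tf_state.id
  versioning_configuration {
    status = "Enabled"
  }
}

# Lock table: the S3 backend writes a "LockID" item while a run is in progress
resource "aws_dynamodb_table" "tf_locks" {
  name         = "terraform-locks"
  billing_mode = "PAY_PER_REQUEST"
  hash_key     = "LockID"

  attribute {
    name = "LockID"
    type = "S"
  }
}
```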
Part 4: The "It's Alive!" Moment
After deploying my enhanced cluster with the ALB, I ran:
```shell
terraform output alb_dns_name
# dev-web-alb-1234567890.eu-north-1.elb.amazonaws.com
```
I opened my browser, hit the URL, and saw the webpage. Then I refreshed. Different instance ID. Refreshed again. Another instance. Refreshed 20 times โ each time, the load balancer sent me to a different server.
Then I did the ultimate test: I went to AWS Console and terminated one of the instances.
Result: The website stayed up. The Auto Scaling Group immediately launched a replacement. The ALB automatically stopped sending traffic to the dead instance.
Zero downtime. Zero manual intervention. Just pure, unadulterated infrastructure doing its job.
I felt like a wizard.
Part 5: The Terraform Block Cheat Sheet
Here's my growing reference of every block type I've used so far:
| Block Type | Purpose | When to Use | Example |
|---|---|---|---|
| provider | Configures cloud provider | Once per provider at the root | provider "aws" { region = "us-east-1" } |
| resource | Creates infrastructure | Every piece of infrastructure | resource "aws_instance" "web" {} |
| variable | Makes config reusable | To avoid hardcoding values | variable "instance_type" {} |
| output | Exposes values after apply | For IPs, DNS names, IDs | output "alb_dns" { value = aws_lb.web.dns_name } |
| data | Queries existing resources | To fetch dynamic info like AZs | data "aws_availability_zones" "available" {} |
| terraform | Configures Terraform behavior | At the start for version/backend | terraform { required_version = ">= 1.0" } |
| locals | Defines reusable values | For expressions used multiple times | locals { common_tags = { Project = "MyApp" } } |
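Several of these blocks usually appear together. A small sketch of how `variable`, `data`, `locals`, and `output` interact (values are illustrative):

```hcl
variable "environment" {
  type    = string
  default = "dev"
}

# Query AWS for the AZs available in the current region
data "aws_availability_zones" "available" {
  state = "available"
}

# Reusable values built from variables, referenced as local.common_tags
locals {
  common_tags = {
    Project     = "MyApp"
    Environment = var.environment
  }
}

# Surfaced after apply via `terraform output az_names`
output "az_names" {
  value = data.aws_availability_zones.available.names
}
```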
What Actually Broke (And How I Fixed It)
Challenge 1: Health Check Failures
Error: Instances marked unhealthy, being replaced
Fix: Added health_check_grace_period = 300 to give instances 5 minutes to boot before the ALB starts judging them.
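In the Auto Scaling Group, the relevant settings look roughly like this (a sketch, not my full ASG config):

```hcl
resource "aws_autoscaling_group" "web" {
  # ...existing ASG config...

  health_check_type         = "ELB" # trust the ALB's health checks, not just EC2 status checks
  health_check_grace_period = 300   # five minutes to boot before being judged
}
```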
Challenge 2: State Lock Stuck
Error: Error acquiring the state lock
Fix: Someone (me) had crashed Terraform mid-run, leaving a stale lock behind. Had to clear it with terraform force-unlock <LOCK_ID> (you can also delete the lock item from the DynamoDB table directly).
Challenge 3: Instances Not Registering
Instances launched, ALB was there, but no traffic.
Fix: I forgot target_group_arns = [aws_lb_target_group.web.arn] in the Auto Scaling Group. Without this, the ASG never told the ALB about the instances.
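The one-line fix, in context (a sketch):

```hcl
resource "aws_autoscaling_group" "web" {
  # ...existing ASG config...

  # Registers every instance the ASG launches with the ALB's target group
  target_group_arns = [aws_lb_target_group.web.arn]
}
```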
The Bottom Line
Day 5 taught me two things:
- ALBs make clusters useful: without load balancing, multiple instances are just expensive paperweights.
- State management separates pros from beginners: understanding state files, drift detection, and remote backends is what makes you someone who can be trusted with production infrastructure.
I started today thinking "load balancers are just fancy routers." I'm ending today with a newfound respect for Terraform state and a slight fear of accidentally corrupting it.
Tomorrow: Probably more state magic. Maybe some modules. Definitely more coffee.
P.S. If you're wondering why I didn't set enable_deletion_protection = true on my ALB: I value my ability to destroy resources without going through AWS support. Some lessons are learned by reading documentation. Others are learned by accidentally running terraform destroy on production. I choose the former.