Mastering Terraform Data Sources with AWS (Real Hands-On Example)

Today’s Terraform session was one of the most practical so far.
I finally dove deep into Terraform Data Sources — and honestly, this is the exact moment when Terraform stops being “static IaC” and starts feeling smart, dynamic, and production-ready.
If variables let you inject values into Terraform, data sources let you read the current state of existing AWS resources at plan time and use it inside your configuration.
And today… I used data sources to look up a VPC, a subnet, a security group, and an AMI, then deployed an EC2 instance using only dynamically discovered resources.
Let’s walk through what I learned 👇
What Are Terraform Data Sources?
Data sources allow Terraform to read existing AWS resources instead of creating new ones.
This is extremely useful when:
✔ Your infra already exists
✔ You want to reference shared resources
✔ You want to avoid hardcoding IDs
✔ Teams work across multiple Terraform modules
Instead of manually copying IDs from AWS, data sources fetch them automatically.
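As a rough sketch of the general shape (this was not part of my session's code): a data source is declared with a data block and read back as data.TYPE.NAME.ATTRIBUTE. The example below uses aws_caller_identity and aws_availability_zones because they need no pre-existing tagged resources. The local names "current" and "available" and the output names are my own picks for illustration, and I'm assuming the AWS provider is already configured.

```hcl
# Assumes the AWS provider is already configured with valid credentials.

# Reads the account Terraform is currently authenticated against.
data "aws_caller_identity" "current" {}

# Reads the availability zones currently usable in the configured region.
data "aws_availability_zones" "available" {
  state = "available"
}

# Data source attributes are referenced as data.<TYPE>.<NAME>.<ATTRIBUTE>.
output "account_id" {
  value = data.aws_caller_identity.current.account_id
}

output "az_names" {
  value = data.aws_availability_zones.available.names
}
```

Nothing here creates infrastructure; a terraform plan on this alone should only show the two outputs.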
1. Fetching the VPC Dynamically
data "aws_vpc" "selected" {
  filter {
    name   = "tag:Name"
    values = ["vpc_for_data_source"]
  }
}
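This code looks for a VPC whose Name tag is vpc_for_data_source.
Terraform now knows the VPC ID without me typing anything manually.
One thing to keep in mind: the lookup must match exactly one VPC. If no VPC or more than one VPC carries that tag, the plan fails with an error.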
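The other lookups in this post each get an output, so a matching pair for the VPC might look like the sketch below. The output names are my own; cidr_block is one of the attributes the aws_vpc data source exports alongside id.

```hcl
# Expose what the VPC lookup resolved to, so it shows up after plan/apply.
output "vpc_id" {
  value = data.aws_vpc.selected.id
}

# The CIDR range of the discovered VPC, handy for a quick sanity check.
output "vpc_cidr" {
  value = data.aws_vpc.selected.cidr_block
}
```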
2. Fetching Security Group from the Same VPC
data "aws_security_group" "selected" {
  filter {
    name   = "tag:Name"
    values = ["terra"]
  }
  vpc_id = data.aws_vpc.selected.id
}
This ensures Terraform picks the correct SG in the correct VPC.
No confusion. No wrong-SG surprises.
Output:
output "aws_sg" {
  value = data.aws_security_group.selected.id
}
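A possible shorthand for the same lookup, as a sketch: the aws_security_group data source also accepts a tags argument, which matches on exact tag key/value pairs without a filter block. The local name "by_tags" is hypothetical, chosen so it does not clash with the block above.

```hcl
# Same security group lookup, written with the tags argument instead of a
# filter block. "by_tags" is an illustrative name, not from my session.
data "aws_security_group" "by_tags" {
  vpc_id = data.aws_vpc.selected.id

  tags = {
    Name = "terra"
  }
}
```

I find the filter form more flexible (it accepts several values), while the tags form reads a bit cleaner for a single exact match.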
3. Fetching Subnet Dynamically
data "aws_subnet" "test" {
  filter {
    name   = "tag:Name"
    values = ["terraform"]
  }
  vpc_id = data.aws_vpc.selected.id
}
Again, Terraform automatically grabs the subnet without any manual IDs.
Output:
output "subnet_id" {
  value = data.aws_subnet.test.id
}
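The singular aws_subnet data source expects exactly one match. If you want every subnet in the discovered VPC instead, the plural aws_subnets data source returns a list of IDs. This is a sketch I did not run in this session; the names "in_vpc" and "all_subnet_ids" are my own.

```hcl
# Collect the IDs of all subnets that belong to the discovered VPC.
data "aws_subnets" "in_vpc" {
  filter {
    name   = "vpc-id"
    values = [data.aws_vpc.selected.id]
  }
}

# ids is a list of subnet IDs; it may be empty if the VPC has no subnets.
output "all_subnet_ids" {
  value = data.aws_subnets.in_vpc.ids
}
```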
4. Fetching the Latest Ubuntu AMI Using Filters
This is my favorite part — no more hardcoding AMI IDs.
data "aws_ami" "example" {
  most_recent = true
  owners      = ["099720109477"] # Canonical
  filter {
    name   = "name"
    values = ["ubuntu/images/hvm-ssd/ubuntu-*-amd64-server-*"]
  }
  filter {
    name   = "virtualization-type"
    values = ["hvm"]
  }
}
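Terraform automatically picks the most recently created AMI that matches these filters.
Because the name pattern uses a wildcard for the release, that is the newest matching image from Canonical, not necessarily the newest Ubuntu release.
Perfect for automation workflows.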
Output:
output "ami_id" {
  value = data.aws_ami.example.id
}
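If you would rather stay on one Ubuntu release, the name filter can be narrowed. Here is a minimal sketch assuming Ubuntu 22.04 (jammy), which is my choice for illustration and not what I used above; the local name "ubuntu_jammy" is also hypothetical.

```hcl
# Same lookup, but the name pattern is pinned to a single release so that
# most_recent only chooses among Ubuntu 22.04 (jammy) images.
data "aws_ami" "ubuntu_jammy" {
  most_recent = true
  owners      = ["099720109477"] # Canonical

  filter {
    name   = "name"
    values = ["ubuntu/images/hvm-ssd/ubuntu-jammy-22.04-amd64-server-*"]
  }

  filter {
    name   = "virtualization-type"
    values = ["hvm"]
  }
}
```

Either way, keep in mind that when a newer matching image is published the resolved ID changes, and an instance that references it will be planned for replacement on the next apply.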
5. Launching an EC2 Instance Using Only Data Sources
Finally, I deployed an EC2 instance using zero hardcoded IDs.
resource "aws_instance" "testingdata_source" {
  ami                    = data.aws_ami.example.id
  instance_type          = "t2.micro"
  subnet_id              = data.aws_subnet.test.id
  vpc_security_group_ids = [data.aws_security_group.selected.id]
  tags = {
    Name = "TerraformInstance"
  }
}
This is true infrastructure automation:
✔ Dynamic AMI
✔ Dynamic VPC
✔ Dynamic Subnet
✔ Dynamic Security Group
You can run this code in any environment where the tag names match — no changes needed.
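If the tag names do differ between environments, one option is to combine a variable with the data source so the lookup key is injected instead of written into the code. A minimal sketch, where the variable name vpc_name and the local name "by_name" are my own assumptions:

```hcl
# The tag value to search for; the default is the name used in this post.
variable "vpc_name" {
  description = "Value of the Name tag on the VPC to look up."
  type        = string
  default     = "vpc_for_data_source"
}

# Same VPC lookup as before, but driven by the variable.
data "aws_vpc" "by_name" {
  filter {
    name   = "tag:Name"
    values = [var.vpc_name]
  }
}
```

You could then point the same code at another environment with something like terraform plan -var="vpc_name=staging-vpc" (staging-vpc is a made-up name here).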
Final Thoughts — Why Data Sources Matter
Today’s lesson taught me that real Terraform workflows are not just about creating resources. They're also about connecting Terraform to existing AWS environments.
Data Sources give you:
🔹 More reusable code
🔹 More reliable infra deployments
🔹 Zero hardcoding
🔹 Seamless multi-environment setups
🔹 Production-ready configurations
This was a major milestone in my Terraform journey — and it finally feels like everything is clicking.




