Mastering Terraform Data Sources with AWS (Real Hands-On Example)


Today’s Terraform session was one of the most practical so far.
I finally dove deep into Terraform Data Sources — and honestly, this is the exact moment when Terraform stops being “static IaC” and starts feeling smart, dynamic, and production-ready.

If variables let you inject values into Terraform,
data sources let you read the current state of existing AWS resources at plan time and use it inside your configuration.

And today… I used data sources to fetch VPCs, subnets, security groups, and AMIs — then deployed an EC2 instance using only dynamically discovered resources.

Let’s walk through what I learned 👇


What Are Terraform Data Sources?

Data sources allow Terraform to read existing AWS resources instead of creating new ones.

This is extremely useful when:

✔ Your infra already exists
✔ You want to reference shared resources
✔ You want to avoid hardcoding IDs
✔ Teams work across multiple Terraform modules

Instead of manually copying IDs from AWS, data sources fetch them automatically.
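
The snippets in the rest of this post assume an AWS provider is already configured. Here is a minimal sketch of that setup; the version constraint and the region are my assumptions, not something fixed by the examples, so adjust them to your account:

```hcl
terraform {
  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "~> 5.0" # assumed version constraint
    }
  }
}

provider "aws" {
  # Assumed region. Data sources only see resources in the region
  # (and account) this provider is configured for.
  region = "us-east-1"
}
```

If a lookup unexpectedly finds nothing, the provider's region is a good first thing to check.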


1. Fetching the VPC Dynamically

data "aws_vpc" "selected" {
  filter {
    name   = "tag:Name"
    values = ["vpc_for_data_source"]
  }
}

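This code looks up the VPC whose Name tag is vpc_for_data_source. The lookup must match exactly one VPC; Terraform reports an error if it finds none or more than one.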

Terraform now knows the VPC ID without me typing anything manually.
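
To confirm what the lookup found, the VPC's attributes can be exposed as outputs, in the same way as the other sections do. A small sketch; the output names vpc_id and vpc_cidr are just names I picked for illustration:

```hcl
# Hypothetical output names; the attributes come from the aws_vpc data source.
output "vpc_id" {
  value = data.aws_vpc.selected.id
}

output "vpc_cidr" {
  value = data.aws_vpc.selected.cidr_block
}
```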


2. Fetching Security Group from the Same VPC

data "aws_security_group" "selected" {
  filter {
    name   = "tag:Name"
    values = ["terra"]
  }
  vpc_id = data.aws_vpc.selected.id
}

Passing vpc_id restricts the search to that VPC, so a security group with the same Name tag in another VPC cannot be picked by mistake.
No confusion. No wrong-SG surprises.

Output:

output "aws_sg" {
  value = data.aws_security_group.selected.id
}

3. Fetching Subnet Dynamically

data "aws_subnet" "test" {
  filter {
    name   = "tag:Name"
    values = ["terraform"]
  }
  vpc_id = data.aws_vpc.selected.id
}

Again, Terraform automatically grabs the subnet without any manual IDs.

Output:

output "subnet_id" {
  value = data.aws_subnet.test.id
}
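
The aws_subnet data source above expects exactly one match. If you need every subnet in the VPC instead (for example, to spread instances across them), the plural aws_subnets data source returns a list of IDs. This is a sketch of that alternative, not something I used in this session; the names all_in_vpc and subnet_ids are hypothetical:

```hcl
# Alternative lookup: all subnets that belong to the discovered VPC.
data "aws_subnets" "all_in_vpc" {
  filter {
    name   = "vpc-id"
    values = [data.aws_vpc.selected.id]
  }
}

output "subnet_ids" {
  # A list of subnet IDs rather than a single ID.
  value = data.aws_subnets.all_in_vpc.ids
}
```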

4. Fetching the Latest Ubuntu AMI Using Filters

This is my favorite part — no more hardcoding AMI IDs.

data "aws_ami" "example" {
  most_recent = true
  owners      = ["099720109477"]  # Canonical

  filter {
    name   = "name"
    values = ["ubuntu/images/hvm-ssd/ubuntu-*-amd64-server-*"]
  }

  filter {
    name   = "virtualization-type"
    values = ["hvm"]
  }
}

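With most_recent = true, Terraform picks the most recently created image whose name matches the pattern. Because the pattern uses ubuntu-*, that can be any Ubuntu release, not necessarily the newest LTS.
This suits automation workflows, but keep in mind that a newly published image changes the AMI ID, and a changed ami makes Terraform plan to replace the instance.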

Output:

output "ami_id" {
  value = data.aws_ami.example.id
}
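
If you want the latest build of one specific release rather than whatever image was published most recently, narrowing the name filter is one way to do it. A sketch that assumes Ubuntu 22.04 (jammy) is the release you want; the data source name ubuntu_jammy is hypothetical:

```hcl
data "aws_ami" "ubuntu_jammy" {
  most_recent = true
  owners      = ["099720109477"] # Canonical

  filter {
    name = "name"
    # Pinned to the 22.04 (jammy) series instead of ubuntu-*.
    values = ["ubuntu/images/hvm-ssd/ubuntu-jammy-22.04-amd64-server-*"]
  }

  filter {
    name   = "virtualization-type"
    values = ["hvm"]
  }
}
```

The instance would then reference data.aws_ami.ubuntu_jammy.id in its ami argument.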

5. Launching an EC2 Instance Using Only Data Sources

Finally, I deployed an EC2 instance using zero hardcoded IDs.

resource "aws_instance" "testingdata_source" {
  ami                    = data.aws_ami.example.id
  instance_type          = "t2.micro"
  subnet_id              = data.aws_subnet.test.id
  vpc_security_group_ids = [data.aws_security_group.selected.id]

  tags = {
    Name = "TerraformInstance"
  }
}

This is true infrastructure automation:

✔ Dynamic AMI
✔ Dynamic VPC
✔ Dynamic Subnet
✔ Dynamic Security Group

You can run this code in any account or region where resources with these tag names exist, as long as each lookup matches exactly one resource. No ID changes are needed.
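
When the tag names differ between environments, the tag value itself can become a variable so the same code still works. A minimal sketch for the VPC lookup; the variable name vpc_name is hypothetical, and this block would replace the earlier aws_vpc data source rather than sit next to it:

```hcl
variable "vpc_name" {
  description = "Value of the Name tag on the VPC to look up"
  type        = string
  default     = "vpc_for_data_source"
}

data "aws_vpc" "selected" {
  filter {
    name   = "tag:Name"
    values = [var.vpc_name] # tag value now comes from the variable
  }
}
```

Each environment can then supply its own value, for example through a .tfvars file or with -var on the command line. The same pattern would apply to the subnet and security group lookups.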


Final Thoughts — Why Data Sources Matter

Today’s lesson taught me that real Terraform workflows are not just about "creating resources"…
They're about connecting Terraform to existing AWS environments.

Data Sources give you:

🔹 More reusable code
🔹 More reliable infra deployments
🔹 No hardcoded resource IDs
🔹 Seamless multi-environment setups
🔹 Production-ready configurations

This was a major milestone in my Terraform journey — and it finally feels like everything is clicking.
