1. How does Terraform handle state, and what are the best practices for managing Terraform state in a team? Terraform maintains state to track the resources it manages, storing it in a file (terraform.tfstate). Best practices for managing the state file in a team include:
- use remote backends such as AWS S3 (with DynamoDB or S3 native locking) or Terraform Cloud (a minimal sketch follows this list)
- enable state locking to prevent concurrent modifications
- use workspaces or a multi-environment setup for different environments
- encrypt state files, especially if they contain sensitive data
- be careful when using terraform state commands
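A minimal backend sketch, assuming an S3 bucket and DynamoDB lock table you would create beforehand (names and region are placeholders):

```hcl
terraform {
  backend "s3" {
    bucket         = "my-terraform-state"     # placeholder bucket name
    key            = "prod/terraform.tfstate"
    region         = "us-east-1"              # placeholder region
    dynamodb_table = "terraform-lock"         # placeholder lock table
    encrypt        = true                     # encrypt state at rest
  }
}
```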
2. Explain Terraform's dependency resolution mechanism.
- Implicit dependencies: based on how resources reference each other
- Explicit dependencies: by using the depends_on argument to force a specific order of execution (see the sketch below)
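A minimal sketch with placeholder names and IDs, showing both kinds of dependency:

```hcl
resource "aws_subnet" "example" {
  vpc_id     = "vpc-1234"        # placeholder
  cidr_block = "10.0.1.0/24"
}

resource "aws_instance" "example" {
  ami           = "ami-1234"     # placeholder
  instance_type = "t2.micro"
  subnet_id     = aws_subnet.example.id  # implicit dependency via reference

  depends_on = [aws_subnet.example]      # explicit dependency (redundant here, shown for illustration)
}
```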
3. What happens when you manually delete a resource that Terraform manages? If a resource is deleted manually outside of Terraform:
- running terraform plan will detect that the resource is missing
- running terraform apply will recreate the resource
- if the resource was removed from the .tf file and terraform apply is run, Terraform won't manage it anymore
4. How do you manage secrets in Terraform?
- Environment variables: use TF_VAR_<variable_name> for sensitive values
- Terraform Vault provider: store secrets in HashiCorp Vault
- AWS Secrets Manager / Azure Key Vault: retrieve secrets dynamically
- Sensitive variables: mark variables as sensitive in Terraform (see the sketch below)
- do not commit the terraform.tfstate file to version control, as it may contain secrets
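A minimal sketch of a sensitive variable (the name is a placeholder):

```hcl
variable "db_password" {
  type      = string
  sensitive = true   # redacted in plan/apply output (but still stored in state)
}
```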
5. What are Terraform Workspaces, and how do they differ from Modules?
- Workspaces allow you to manage multiple instances of a Terraform configuration within the same backend:

```
terraform workspace new dev
terraform workspace select dev
```

- Modules are reusable Terraform configurations that help with abstraction and code organization.
6. How does Terraform handle drift detection?
Terraform detects configuration drift by running terraform plan. If the actual state differs from the expected state, Terraform highlights the drift and prompts an update.
- to prevent drift:
  - implement CI/CD checks
  - use terraform state list to inspect current resources
7. How does Terraform's for_each differ from count?
- count is index-based (count.index), useful for simple lists. for_each works with sets and maps, allowing dynamic key-value associations.
- example using count:

```hcl
resource "aws_instance" "example" {
  count         = 3
  ami           = "ami-1234567890"
  instance_type = "t2.micro"
}
```

- example using for_each:

```hcl
resource "aws_instance" "example" {
  for_each      = toset(["dev", "qa", "prod"])
  ami           = "ami-1234"
  instance_type = "t2.micro"
  tags          = { Name = each.key }
}
```
8. What is the purpose of terraform refresh?
terraform refresh updates the state file with the real-world state but does not apply changes. (The standalone command is deprecated in favor of the -refresh-only flag.)
9. What is terraform import and how do you use it?
terraform import brings existing infrastructure into Terraform state without modifying the resources themselves. After importing, update the .tf file to match the real-world configuration.
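For example (hypothetical resource address and instance ID):

```
terraform import aws_instance.example i-0abc1234def567890
```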
10. How do you use the terraform taint command?
terraform taint marks a resource for recreation in the next terraform apply. (It is deprecated in favor of the -replace flag.)
11. Explain the idfference between terraform destory and terraform apply -destroy?
terraform destory: Destory all resources in the state fileterraform apply -destroy: also destroy resources but allows for additional plan checks before applying
12. How can you handle cross-account deployments in Terraform?
- use multiple AWS profiles
- use Terraform providers with different aliases (a sketch follows)
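A sketch assuming a cross-account IAM role already exists; the account ID, role name, and bucket name are placeholders:

```hcl
provider "aws" {
  region = "us-east-1"
}

provider "aws" {
  alias  = "prod"
  region = "us-east-1"
  assume_role {
    role_arn = "arn:aws:iam::111122223333:role/terraform-deploy"  # placeholder role
  }
}

resource "aws_s3_bucket" "cross_account" {
  provider = aws.prod               # deployed into the other account
  bucket   = "example-prod-bucket"  # placeholder
}
```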
13. What is the purpose of a Terraform data source?
- data sources allow Terraform to fetch existing data without creating new resources (see the sketch below)
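A minimal sketch: look up the latest Amazon Linux 2 AMI instead of hardcoding an ID:

```hcl
data "aws_ami" "amazon_linux" {
  most_recent = true
  owners      = ["amazon"]

  filter {
    name   = "name"
    values = ["amzn2-ami-hvm-*-x86_64-gp2"]
  }
}

resource "aws_instance" "example" {
  ami           = data.aws_ami.amazon_linux.id  # resolved at plan time
  instance_type = "t2.micro"
}
```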
14. How does Terraform handle provider versioning? Terraform allows version constraints for providers:

```hcl
terraform {
  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "~> 4.0"
    }
  }
}
```

- use terraform providers to check installed versions.
15. How do you optimize Terraform performance for large infrastructures?
- enable parallelism (terraform apply -parallelism=10)
- use modules to break down the configuration
- use caching for remote states
- use state locking to prevent concurrency issues
- use the -target flag to apply changes to specific resources
16. You need to update an EC2 instance's AMI ID without downtime. Terraform wants to destroy and recreate the instance. How do you avoid downtime?
- use create_before_destroy in lifecycle rules to ensure the new instance is created before the old one is deleted:

```hcl
resource "aws_instance" "example" {
  ami           = var.ami_id
  instance_type = "t3.micro"
  lifecycle {
    create_before_destroy = true
  }
}
```

17. You need to deploy an S3 bucket in ap-south-1 and an EC2 instance in us-east-1 using the same Terraform configuration. How do you achieve this?
- use multiple provider configurations:

```hcl
provider "aws" {
  alias  = "ap-south"
  region = "ap-south-1"
}

provider "aws" {
  alias  = "us-east"
  region = "us-east-1"
}

resource "aws_s3_bucket" "example" {
  provider = aws.ap-south   # bucket in ap-south-1
  bucket   = "my-bucket"
}

resource "aws_instance" "example" {
  provider      = aws.us-east   # instance in us-east-1
  ami           = "ami-1234"
  instance_type = "t2.micro"
}
```
18. Your Terraform-managed AWS infrastructure was modified manually by another team. Terraform does not show changes, but AWS console does. How do you detect and correct this?
- run terraform plan -refresh-only to detect drift without making changes
- use terraform state list to inspect tracked resources
- if any resource is missing, re-import it
- if necessary, run terraform apply to restore the expected configuration
19. Your company wants to ensure that only t2.micro instances are used to control AWS costs. How do you enforce this in Terraform?
- use a Terraform validation rule in the variables.tf file to restrict instance types
- example:

```hcl
variable "instance_type" {
  description = "AWS EC2 instance type"
  type        = string
  validation {
    condition     = contains(["t2.micro"], var.instance_type)
    error_message = "Only t2.micro is allowed."
  }
}
```
20. You want to prevent the accidental deletion of a production RDS database managed by Terraform. How do you enforce this?
- use a prevent_destroy lifecycle rule:

```hcl
resource "aws_db_instance" "production_db" {
  identifier     = "prod-db"
  engine         = "mysql"
  instance_class = "db.t3.large"
  lifecycle {
    prevent_destroy = true
  }
}
```
21. Your team is using Terraform with remote state in S3. A team member's Terraform run failed, leaving the state locked. How do you resolve this issue?
Terraform automatically locks the state when the S3 backend is configured with a DynamoDB lock table. If a lock persists, run terraform force-unlock <LOCK_ID>.
22. Your terraform apply modified resources incorrectly, causing an outage. How can you quickly roll back to the previous state?
- if you have a previous state file stored remotely, restore it. Back up the current state first, then push the known-good version (filenames are illustrative):

```
terraform state pull > backup.tfstate
terraform state push previous.tfstate
```

- revert the incorrect code and run terraform apply again
23. Your infrastructure requires different EC2 instance types based on environment. How can you dynamically assign instance types in a Terraform module?
- we can use a map inside variables.tf:

```hcl
variable "instance_type_map" {
  type = map(string)
  default = {
    dev  = "t2.micro"
    prod = "t3.large"
  }
}

resource "aws_instance" "example" {
  ami           = "ami-12234"
  instance_type = var.instance_type_map[var.environment]
}
```

- now define the environment variable in the code to select the correct instance type
24. Your terraform state file is growing too large, causing slow performance. How can you manage it efficiently?
- use Terraform state splitting: separate resources into multiple state files using different workspaces or backends
- enable Terraform state locking: store state in S3 with DynamoDB to prevent concurrency issues

```hcl
terraform {
  backend "s3" {
    bucket         = "my-terraform-state"
    key            = "prod/terraform.tfstate"
    region         = "ap-south-1"
    dynamodb_table = "terraform-lock"
  }
}
```
25. You updated an EC2 instance type, but Terraform wants to destroy and recreate it instead of modifying it in-place. How do you prevent this?
- use the ignore_changes lifecycle rule to keep the existing resource:

```hcl
resource "aws_instance" "example" {
  ami           = "ami-1233"
  instance_type = "t2.micro"
  lifecycle {
    ignore_changes = [instance_type]
  }
}
```
26. How do you target a specific resource for deployment?
- Use the -target flag with terraform apply or terraform plan.
terraform apply -target=aws_instance.example
27. What are the lifecycle rules in terraform?
- use prevent_destroy for critical infrastructure (avoids accidental deletions)
- use create_before_destroy for zero-downtime updates (safer updates for live infra)
- use ignore_changes sparingly (can cause drift if overused)
TERRAFORM COMMANDS
| Command | Description |
|---|---|
| `terraform init` | Initializes the working directory (downloads providers, sets up backend, etc.) |
| `terraform validate` | Validates the syntax of your configuration files |
| `terraform fmt` | Formats Terraform files to canonical style |
| `terraform version` | Displays the Terraform version installed |
| `terraform plan` | Shows what Terraform will do (preview changes) |
| `terraform apply` | Applies the changes required to reach the desired state |
| `terraform plan -refresh-only` | Checks drift without overwriting the state file |
| `terraform apply -refresh-only` | Updates the state file with new changes |
| `terraform plan/apply -replace="aws_s3.example"` | Marks a resource for replacement (successor to terraform taint) |
| `terraform apply -auto-approve` | Skips interactive approval during apply |
| `terraform refresh` | Updates the state file with real infrastructure data (deprecated; not recommended) |
| `terraform destroy` | Destroys all resources defined in the config |
| `terraform destroy -target=resource_type.name` | Destroys a specific resource only |
| `terraform state list` | Lists all resources in the current state |
| `terraform state show <resource>` | Shows detailed state for a specific resource |
| `terraform state rm <resource>` | Removes a resource from state (but not from infra) |
| `terraform state mv <source> <dest>` | Moves items in state from one name/module to another |
| `terraform output` | Displays all outputs after apply |
| `terraform output <name>` | Displays a specific output variable |
| `terraform apply -var='key=value'` | Passes a variable via CLI |
| `terraform apply -var-file="dev.tfvars"` | Passes multiple variables using a file |
| `terraform get` | Downloads and installs modules for the config |
| `terraform init -upgrade` | Re-initializes and upgrades modules & providers |
| `terraform workspace list` | Lists all available workspaces |
| `terraform workspace new <name>` | Creates a new workspace |
| `terraform workspace select <name>` | Switches to a different workspace |
| `terraform workspace delete <name>` | Deletes a workspace |
| `TF_LOG=DEBUG terraform plan` | Enables detailed logging |
| `TF_LOG_PATH=log.txt` | Saves log output to a file |
| `terraform import <resource> <id>` | Brings an existing resource into Terraform control |
| `terraform taint <resource>` | Marks a resource for recreation during next apply |
| `terraform untaint <resource>` | Cancels the taint |
| `-auto-approve` | Skips prompts for approval |
| `-compact-warnings` | Removes extra warnings |
| `-lock=false` | Disables state locking (not recommended) |
| `-target=resource.name` | Targets specific resource(s) only |
rough
how to import existing resources in terraform?
- old way: manually write down the resource we want to import, then run the terraform import command

main.tf

```hcl
resource "aws_s3_bucket" "import-example" {
  bucket = "testbkrandom12q34r23498fjnd90osfyha90psdf"
}
```

```
terraform import aws_s3_bucket.import-example testbkrandom12q34r23498fjnd90osfyha90psdf
```

- new way: write your resource and use the import block documented in the Terraform registry

main.tf

```hcl
resource "aws_s3_bucket" "import-example" {
  bucket = "testbkrandom12q34r23498fjnd90osfyha90psdf"
}

import {
  to = aws_s3_bucket.import-example
  id = "testbkrandom12q34r23498fjnd90osfyha90psdf"
}
```

- terraform plan, apply
- done
backend and locking
- previously we would use DynamoDB to lock state, but now that S3 native locking exists, DynamoDB locking will be deprecated and HashiCorp suggests moving to S3 native locking
- it works by creating a lock file in the same path as terraform.tfstate while someone is running Terraform; if another person tries to run changes at the same time, Terraform fetches the state from the S3 backend, sees the lock file there, and denies the new request; once the lock is removed, someone else can run commands

```hcl
terraform {
  backend "s3" {
    bucket       = "luffysenpaiterraformbucket"
    key          = "myterra/terraform.tfstate"
    region       = "ap-south-1"
    profile      = "super"
    encrypt      = true
    use_lockfile = true
  }
}
```

- the use_lockfile option enables the S3 native locking feature
- the encrypt flag allows S3 to encrypt data at rest; a custom KMS key can be provided
- force unlock: terraform force-unlock <LOCK_ID>
terraform lifecycle
- normally, due to resource/API limitations, if Terraform can't modify a resource in place for new changes, it deletes it first and creates a new resource by default; we can modify this behaviour using the lifecycle block
- create_before_destroy: the new replacement object is created first, then the old resource is deleted
- ignore_changes: ignore changes that occur in this resource, useful for auto scaling groups and other dynamic resources
- prevent_destroy: the resource cannot be destroyed as long as this line is in the code; for critical resources and safety against accidental deletion

```hcl
resource "aws_s3_bucket" "import-example" {
  bucket = "testbkrandom12q34r23498fjnd90osfyha90psdf"
  lifecycle {
    prevent_destroy = true
  }
}
```

variables vs locals
- the major difference is that variable values can be given externally, from the CLI or tfvars etc., while locals are fixed and can only be overwritten by changing the configuration itself
- variables are for input we want to change, while locals are reusable values we don't want to change
- so if I have a value assigned via a variable block, it can be modified by passing -var="key=value" or TF_VAR_key=value
- but if I define this value as a local, it won't be changed via these; we simply have to modify the locals block itself (see the sketch below)
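A minimal sketch of the difference (names are illustrative):

```hcl
variable "environment" {
  type    = string
  default = "dev"   # can be overridden externally: -var="environment=prod" or TF_VAR_environment=prod
}

locals {
  # fixed within the configuration; changing it requires editing this block
  name_prefix = "myapp-${var.environment}"
}
```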
drift
- when someone manually modifies Terraform-managed infra from the console or elsewhere, you get drift
- first detect the drift: terraform plan -refresh-only
- it compares the infra and the state and outputs the changes; it won't update the terraform.tfstate file though
- if the changes have to be reverted, simply rerun the current Terraform configuration
- if the changes are meant to be managed by Terraform from now on:
  - terraform apply -refresh-only -> this updates the terraform.tfstate file with the new changes
  - but our code is still the old one, so we update the code by modifying or importing the new resource
  - now we have an up-to-date state file and an up-to-date config, so simply run terraform apply and make sure there is nothing to be created or destroyed
- terraform refresh is a deprecated command, replaced by the -refresh-only flag, because it used to overwrite the state file
multi environment deployment
- there are multiple ways to achieve it:
  - separate directory per environment
  - same codebase but different tfvars for each environment
  - modules + separate root module for each env
- for multiple modules sharing values, all I have to do is put an output block for that value, then use it via module.<module_name>.<output_name> (see the sketch below)
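A minimal sketch of sharing a value between modules via an output (paths and names are placeholders):

```hcl
# modules/vpc/output.tf
output "vpc_id" {
  value = aws_vpc.this.id   # assumes a vpc resource named "this" inside the module
}

# envs/dev/main.tf
module "vpc" {
  source = "../../modules/vpc"
}

module "ec2" {
  source = "../../modules/ec2"
  vpc_id = module.vpc.vpc_id   # consume the vpc module's output
}
```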
```
modules/
  vpc/
    main.tf
    output.tf
  ec2/
  s3/
envs/
  dev/
    main.tf   # uses modules with dev vars
  staging/
    main.tf
  prod/
    main.tf
```

data types
- data types are categorized into primitive (basic) and complex data types
- primitive: string, number, bool
- complex: list, map, set
- string = a double-quoted sentence "like this"
- number = 15
- bool = true/false
- list
  - lists are represented by a pair of [] and are comma-separated

```hcl
variable "subnets" {
  type    = list(string)
  default = ["subnet-123", "subnet-456", "subnet-789"]
}
```

- sets
  - a collection of unique values that have no secondary identifiers or ordering
  - Terraform does not support direct access to set values through indexing; convert a set to a list to access values by index

```hcl
variable "zones" {
  type    = set(string)
  default = ["us-east-1a", "us-east-1b", "us-east-1a"]   # duplicates are removed
}
```

- maps
  - a key-value pair dictionary

```hcl
variable "instance_types" {
  type = map(string)
  default = {
    dev  = "t2.micro"
    prod = "t2.large"
  }
}
```

- object
  - a map with fixed keys and a predefined type per key

```hcl
variable "app_config" {
  type = object({
    name        = string
    replicas    = number
    enable_logs = bool
  })
  default = {
    name        = "myapp"
    replicas    = 3
    enable_logs = true
  }
}
```
taint
- terraform taint marks a resource for recreation the next time we run terraform apply
- terraform taint is deprecated: it marked the resource in the state file permanently and would recreate it at the next terraform apply
- the terraform plan/apply -replace="resource" flag is the new way; it does not permanently mark the state file
- with the old method, if you tainted something but forgot to apply, your teammates might accidentally destroy something later
- -replace is one-shot (see the example below)
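For example (hypothetical resource address):

```
terraform apply -replace="aws_instance.example"
```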
multi cloud
- say you want to use multiple cloud providers like AWS, GCP, and Azure in the same configuration
- just give each provider block and start creating resources; Terraform automatically picks which provider to use for which resource using the resource prefix, like aws_s3 or google_storage etc.
- if we have multiple regions from the same provider, we can use an alias:

```hcl
provider "aws" {
  region = "us-east-1"
}

provider "aws" {
  alias  = "west"
  region = "us-west-2"
}

resource "aws_instance" "west" {
  provider      = aws.west   # uses the aliased us-west-2 provider
  ami           = "ami-123456"
  instance_type = "t2.micro"
}
```

count, for_each, dynamic block
- count is simple: it creates multiple instances of something based on indexing for lists; if the order changes, Terraform will recreate everything
- for_each is used for maps or sets, where it does not use indexing, so even if we change the order Terraform won't see this as a change and try to recreate everything
- a dynamic block is used when a configuration is nested, like security group rules (see the sketch below)
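A minimal sketch of a dynamic block generating nested ingress rules; the ports and CIDR are placeholders:

```hcl
variable "ingress_ports" {
  type    = list(number)
  default = [22, 80, 443]
}

resource "aws_security_group" "example" {
  name = "example-sg"

  # generates one ingress block per port in the list
  dynamic "ingress" {
    for_each = var.ingress_ports
    content {
      from_port   = ingress.value
      to_port     = ingress.value
      protocol    = "tcp"
      cidr_blocks = ["0.0.0.0/0"]   # placeholder; restrict in real configs
    }
  }
}
```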