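Workspaces and Multi-Account Management
Core question: development, testing, and production should be fully isolated from one another. How do you manage several environments from a single codebase without letting them affect each other?
Two Multi-Environment Strategies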
graph TB
subgraph "Option A: Workspace (simple)"
WS_DEV[Workspace: dev]
WS_STG[Workspace: staging]
WS_PRD[Workspace: production]
CODE1[Same codebase] --> WS_DEV
CODE1 --> WS_STG
CODE1 --> WS_PRD
end
subgraph "Option B: Separate directories (recommended)"
DIR_DEV[environments/dev/]
DIR_STG[environments/staging/]
DIR_PRD[environments/production/]
MOD["Shared modules<br/>modules/"] --> DIR_DEV
MOD --> DIR_STG
MOD --> DIR_PRD
end
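| Dimension | Workspace | Separate directories |
|---|---|---|
| State isolation | ✅ Separate state per workspace | ✅ Separate state per directory |
| Code differences | ❌ Differences between environments are hard to maintain | ✅ Each environment can define different resources |
| Operational risk | ⚠️ Easy to run a command against the wrong workspace after switching | ✅ Directory isolation makes mistakes less likely |
| Best suited to | Fully homogeneous environments | Production that differs noticeably from non-production |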
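Using Workspaces
# List all workspaces (the current one is marked with *)
terraform workspace list
# * default
# dev
# staging
# production
# Create a workspace (this also switches to it), then switch to an existing one
terraform workspace new staging
terraform workspace select production
# Show the current workspace
terraform workspace show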
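To make the state isolation concrete, it can help to look at where a remote backend stores the state of each workspace. The following is a minimal sketch that assumes an S3 backend; the bucket name, key, and prefix are hypothetical and do not come from the original setup.

```hcl
# backend.tf (sketch; bucket, key, and prefix are illustrative assumptions)
terraform {
  backend "s3" {
    bucket = "myapp-terraform-state"
    key    = "myapp/terraform.tfstate"
    region = "ap-southeast-1"

    # Non-default workspaces are stored under
    #   <workspace_key_prefix>/<workspace name>/<key>
    # e.g. workspaces/staging/myapp/terraform.tfstate
    # The default workspace uses <key> directly.
    workspace_key_prefix = "workspaces"
  }
}
```

All workspaces share this one backend configuration, which is part of why switching workspaces by mistake is easy: nothing in the directory itself tells you which state you are about to change.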
Using the Workspace Name in Code
# Derive settings from the current workspace
locals {
  env = terraform.workspace # "dev" / "staging" / "production"

  # The index lookup fails on any workspace not listed here, including "default"
  instance_type = {
    dev        = "t3.small"
    staging    = "t3.medium"
    production = "t3.large"
  }[terraform.workspace]

  is_prod = terraform.workspace == "production"
}

resource "aws_instance" "web" {
  count         = local.is_prod ? 3 : 1
  instance_type = local.instance_type
  # ami and other arguments omitted for brevity
}
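Because the lookup above depends on the workspace name, running in an unexpected workspace (for example `default`) produces a fairly cryptic index error. One possible guard, sketched below, uses a fallback value plus an explicit precondition so the failure message is clear. The `terraform_data` resource requires Terraform 1.4 or later; the names `instance_types`, `known_workspace`, and `workspace_guard` are illustrative, not part of the original code.

```hcl
locals {
  instance_types = {
    dev        = "t3.small"
    staging    = "t3.medium"
    production = "t3.large"
  }

  # True only when the current workspace is one of the keys above
  known_workspace = contains(keys(local.instance_types), terraform.workspace)

  # lookup() returns the fallback instead of raising an index error,
  # so evaluation reaches the precondition below
  instance_type = lookup(local.instance_types, terraform.workspace, "t3.small")
}

resource "terraform_data" "workspace_guard" {
  lifecycle {
    precondition {
      condition     = local.known_workspace
      error_message = "Unknown workspace '${terraform.workspace}'. Expected dev, staging, or production."
    }
  }
}
```

With this in place, a plan in the `default` workspace should stop at the precondition with the message above rather than silently using the fallback size.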
Separate-Directory Layout (Recommended for Production)
infra/
├── modules/                      # shared modules
│   ├── vpc/
│   ├── eks/
│   └── rds/
└── environments/
    ├── dev/
    │   ├── main.tf               # calls the modules
    │   ├── variables.tf
    │   ├── outputs.tf
    │   ├── backend.tf            # separate state path
    │   └── dev.auto.tfvars
    ├── staging/
    │   ├── main.tf
    │   ├── backend.tf
    │   └── staging.auto.tfvars
    └── production/
        ├── main.tf
        ├── backend.tf
        └── production.auto.tfvars
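Each environment's `backend.tf` is what gives it an independent state path. As a rough illustration, assuming an S3 backend (the bucket name and key layout here are hypothetical), the production file might look like this:

```hcl
# environments/production/backend.tf (sketch; bucket and key are assumptions)
terraform {
  backend "s3" {
    bucket  = "myapp-terraform-state"
    key     = "environments/production/terraform.tfstate"
    region  = "ap-southeast-1"
    encrypt = true
  }
}
```

The dev and staging directories would carry the same block with a different `key` (or, in a multi-account setup, a different bucket), so a command run in one directory cannot touch another environment's state.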
# environments/production/main.tf
module "vpc" {
  source = "../../modules/vpc"

  name = "prod-myapp"
  cidr = var.vpc_cidr

  # Production: 3 availability zones, with NAT Gateway
  azs             = ["ap-southeast-1a", "ap-southeast-1b", "ap-southeast-1c"]
  private_subnets = ["10.0.1.0/24", "10.0.2.0/24", "10.0.3.0/24"]
  public_subnets  = ["10.0.101.0/24", "10.0.102.0/24", "10.0.103.0/24"]

  enable_nat_gateway = true
  tags               = local.common_tags
}

# environments/dev/main.tf
module "vpc" {
  source = "../../modules/vpc"

  name = "dev-myapp"
  cidr = var.vpc_cidr

  # Dev: 1 availability zone, no NAT Gateway to save cost
  # Subnets must fall inside var.vpc_cidr (10.1.0.0/16 in dev.auto.tfvars)
  azs             = ["ap-southeast-1a"]
  private_subnets = ["10.1.1.0/24"]
  public_subnets  = ["10.1.101.0/24"]

  enable_nat_gateway = false
  tags               = local.common_tags
}
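Both `main.tf` files reference `var.vpc_cidr` and `local.common_tags` without showing where they are defined. A minimal sketch of those definitions follows; the tag keys and the `Project` value are assumptions for illustration, while `env` and `vpc_cidr` match the names used in the tfvars files.

```hcl
# environments/<env>/variables.tf (sketch)
variable "env" {
  type        = string
  description = "Environment name, e.g. dev, staging, or production"
}

variable "vpc_cidr" {
  type        = string
  description = "CIDR block for this environment's VPC"

  validation {
    # cidrhost() errors on malformed input; can() converts that error to false
    condition     = can(cidrhost(var.vpc_cidr, 0))
    error_message = "vpc_cidr must be a valid CIDR block such as 10.1.0.0/16."
  }
}

locals {
  # Tag keys and the Project value are illustrative assumptions
  common_tags = {
    Environment = var.env
    Project     = "myapp"
    ManagedBy   = "terraform"
  }
}
```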
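Multi-Account Management
Production best practice: give each environment its own AWS account, and have Terraform reach each one by assuming an IAM role in that account.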
graph TB
MGMT["Management account<br/>(Terraform execution account)"]
DEV["Development account<br/>(Account ID: 111...)"]
STG["Test account<br/>(Account ID: 222...)"]
PRD["Production account<br/>(Account ID: 333...)"]
MGMT -->|assume role| DEV
MGMT -->|assume role| STG
MGMT -->|assume role| PRD
Configuring Cross-Account Assume Role
# environments/production/provider.tf
provider "aws" {
  region = var.region

  assume_role {
    role_arn     = "arn:aws:iam::${var.prod_account_id}:role/TerraformExecutionRole"
    session_name = "Terraform-Production"
    external_id  = var.external_id # guards against the "confused deputy" problem
  }
}
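The provider block relies on three input variables. A possible set of declarations is sketched below; the validation rule and the `sensitive` flag are suggestions rather than part of the original configuration.

```hcl
# environments/production/variables.tf (sketch)
variable "region" {
  type = string
}

variable "prod_account_id" {
  type        = string
  description = "12-digit ID of the target AWS account"

  validation {
    # Keep the ID as a string so leading zeros are preserved
    condition     = can(regex("^[0-9]{12}$", var.prod_account_id))
    error_message = "prod_account_id must be a 12-digit AWS account ID."
  }
}

variable "external_id" {
  type        = string
  description = "External ID expected by the target role's trust policy"
  sensitive   = true # redacts the value in plan and apply output
}
```

A value such as `external_id` can be supplied through the `TF_VAR_external_id` environment variable instead of being committed in a tfvars file.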
// Create an IAM role in the production account that trusts the management account
// Trust policy (target account)
{
  "Version": "2012-10-17",
  "Statement": [{
    "Effect": "Allow",
    "Principal": {
      "AWS": "arn:aws:iam::MGMT_ACCOUNT_ID:role/TerraformCI"
    },
    "Action": "sts:AssumeRole",
    "Condition": {
      "StringEquals": {
        "sts:ExternalId": "my-external-id-secret"
      }
    }
  }]
}
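The same trust policy can also be managed in Terraform rather than as hand-written JSON. The sketch below assumes a separate bootstrap configuration that runs with credentials for the target account; the variable names `mgmt_account_id` and `external_id` and the resource name are hypothetical.

```hcl
# Bootstrap configuration applied inside the target (production) account (sketch)
variable "mgmt_account_id" {
  type = string
}

variable "external_id" {
  type      = string
  sensitive = true
}

resource "aws_iam_role" "terraform_execution" {
  name = "TerraformExecutionRole"

  # jsonencode() renders the same trust policy shown above
  assume_role_policy = jsonencode({
    Version = "2012-10-17"
    Statement = [{
      Effect = "Allow"
      Principal = {
        AWS = "arn:aws:iam::${var.mgmt_account_id}:role/TerraformCI"
      }
      Action = "sts:AssumeRole"
      Condition = {
        StringEquals = {
          "sts:ExternalId" = var.external_id
        }
      }
    }]
  })
}
```

The role still needs permission policies attached before Terraform can manage anything with it; those are left out here because they depend on what each environment contains.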
Multiple Provider Aliases
Provider aliases let a single Terraform configuration manage resources in several regions or several accounts:
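# Primary region
provider "aws" {
  region = "ap-southeast-1"
}

# us-east-1 alias: ACM certificates used by CloudFront must be issued in us-east-1
provider "aws" {
  alias  = "us_east_1"
  region = "us-east-1"
}

# Select a provider on a specific resource
resource "aws_acm_certificate" "main" {
  provider          = aws.us_east_1 # CloudFront certificates must be in us-east-1
  domain_name       = "*.example.com"
  validation_method = "DNS"
}

resource "aws_cloudfront_distribution" "main" {
  # Uses the default provider (ap-southeast-1)
  # origin, default_cache_behavior, and other required blocks omitted for brevity
  viewer_certificate {
    acm_certificate_arn = aws_acm_certificate.main.arn
    ssl_support_method  = "sni-only" # required when acm_certificate_arn is set
  }
}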
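The example above covers the multi-region case. For the multi-account case, an alias can carry its own `assume_role` block. The following sketch assumes a staging account that exposes the same `TerraformExecutionRole`; the `staging_account_id` variable and the output name are hypothetical.

```hcl
variable "staging_account_id" {
  type = string
}

# Aliased provider that operates in a second account
provider "aws" {
  alias  = "staging"
  region = "ap-southeast-1"

  assume_role {
    role_arn     = "arn:aws:iam::${var.staging_account_id}:role/TerraformExecutionRole"
    session_name = "Terraform-Staging"
  }
}

# Quick check that the alias really resolves to the intended account
data "aws_caller_identity" "staging" {
  provider = aws.staging
}

output "staging_account_id_seen" {
  value = data.aws_caller_identity.staging.account_id
}
```

Comparing the output with the expected account ID is a cheap way to catch a mis-wired role ARN before any resources are created.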
In Practice: Multi-Environment Terraform in CI/CD
# .github/workflows/terraform.yml
name: Terraform

on:
  push:
    branches: [main]
    paths: ['infra/environments/**']
  pull_request:
    paths: ['infra/environments/**']

jobs:
  plan:
    runs-on: ubuntu-latest
    strategy:
      matrix:
        environment: [dev, staging, production]
    permissions:
      id-token: write # OIDC, so no long-lived access keys are needed
      contents: read
    steps:
      - uses: actions/checkout@v4

      # Install Terraform explicitly rather than relying on the runner image
      - uses: hashicorp/setup-terraform@v3

      - name: Configure AWS credentials (OIDC)
        uses: aws-actions/configure-aws-credentials@v4
        with:
          role-to-assume: arn:aws:iam::${{ vars.MGMT_ACCOUNT_ID }}:role/GitHubActionsRole
          aws-region: ap-southeast-1

      - name: Terraform Init
        working-directory: infra/environments/${{ matrix.environment }}
        run: terraform init

      - name: Terraform Plan
        working-directory: infra/environments/${{ matrix.environment }}
        # *.auto.tfvars files are loaded automatically; -var-file just makes the choice explicit
        run: |
          terraform plan \
            -var-file="${{ matrix.environment }}.auto.tfvars" \
            -out=tfplan
        env:
          TF_VAR_db_password: ${{ secrets[format('{0}_DB_PASSWORD', matrix.environment)] }}

      - name: Terraform Apply (main branch only)
        if: github.ref == 'refs/heads/main'
        working-directory: infra/environments/${{ matrix.environment }}
        run: terraform apply tfplan
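The workflow assumes that `GitHubActionsRole` in the management account already trusts GitHub's OIDC provider. One way that trust might be expressed in Terraform is sketched below. It assumes the IAM OIDC provider for `token.actions.githubusercontent.com` has already been created; `github_oidc_provider_arn` and `github_repository` are hypothetical variable names.

```hcl
variable "github_oidc_provider_arn" {
  type        = string
  description = "ARN of the existing IAM OIDC provider for GitHub Actions"
}

variable "github_repository" {
  type        = string
  description = "Repository allowed to assume the role, in org/repo form"
}

data "aws_iam_policy_document" "github_actions_trust" {
  statement {
    effect  = "Allow"
    actions = ["sts:AssumeRoleWithWebIdentity"]

    principals {
      type        = "Federated"
      identifiers = [var.github_oidc_provider_arn]
    }

    # Token must be issued for AWS STS
    condition {
      test     = "StringEquals"
      variable = "token.actions.githubusercontent.com:aud"
      values   = ["sts.amazonaws.com"]
    }

    # Restrict the role to workflows from one repository
    condition {
      test     = "StringLike"
      variable = "token.actions.githubusercontent.com:sub"
      values   = ["repo:${var.github_repository}:*"]
    }
  }
}

resource "aws_iam_role" "github_actions" {
  name               = "GitHubActionsRole"
  assume_role_policy = data.aws_iam_policy_document.github_actions_trust.json
}
```

Narrowing the `sub` pattern further, for example to the `main` branch, is a reasonable option for the role that is allowed to apply to production.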
Example Environment Configuration Files
# environments/production/production.auto.tfvars
env = "production"
region = "ap-southeast-1"
vpc_cidr = "10.0.0.0/16"
prod_account_id = "333456789012"
web_instance_count = 3
web_instance_type = "t3.large"
rds_instance_class = "db.r6g.large"
rds_multi_az = true
backup_retention = 30
# environments/dev/dev.auto.tfvars
env = "dev"
region = "ap-southeast-1"
vpc_cidr = "10.1.0.0/16"
prod_account_id = "111234567890"
web_instance_count = 1
web_instance_type = "t3.small"
rds_instance_class = "db.t3.micro"
rds_multi_az = false
backup_retention = 7
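Values in a tfvars file only take effect if matching variables are declared. The declarations below are a sketch of what the sizing-related ones could look like; the validation rules are suggestions, and `db_password` is included because the CI workflow passes it through `TF_VAR_db_password`.

```hcl
# environments/<env>/variables.tf (sketch of the sizing-related inputs)
variable "web_instance_count" {
  type = number

  validation {
    condition     = var.web_instance_count >= 1
    error_message = "web_instance_count must be at least 1."
  }
}

variable "web_instance_type" {
  type = string
}

variable "rds_instance_class" {
  type = string
}

variable "rds_multi_az" {
  type = bool
}

variable "backup_retention" {
  type        = number
  description = "RDS automated backup retention in days"

  validation {
    # RDS accepts 0 to 35 days for automated backups
    condition     = var.backup_retention >= 0 && var.backup_retention <= 35
    error_message = "backup_retention must be between 0 and 35 days."
  }
}

variable "db_password" {
  type      = string
  sensitive = true # supplied via TF_VAR_db_password, never via tfvars
}
```

Typed declarations let a bad value, such as `rds_multi_az = "yes"` or a retention of 90 days, fail during `terraform plan` instead of surfacing later as a provider error.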