A Typical Workflow: From Zero to Production
Core question: how exactly do the three tools cooperate? What does a complete "build environment → deploy application" flow look like?
The full workflow at a glance
flowchart TD
    GIT["Git repository<br/>(single source of truth)"]
    subgraph TF ["① Terraform: build infrastructure"]
        T1[tf init / plan / apply]
        T2[VPC + Subnet + SG]
        T3[EKS Cluster + Node Group]
        T4[RDS + ElastiCache + S3]
    end
    subgraph AN ["② Ansible: configure nodes (optional)"]
        A1[ansible-playbook]
        A2[Install system packages]
        A3[Configure monitoring agents]
        A4[Harden security baseline]
    end
    subgraph K8 ["③ Kubernetes: run applications"]
        K1[helm install / kubectl apply]
        K2[Deployment + Service]
        K3[Ingress + TLS]
        K4[HPA + PDB]
    end
    GIT --> TF
    TF --> AN
    AN --> K8
    GIT --> K8
Stage 1: Terraform builds the infrastructure
Project layout
infra/
├── main.tf          # root module entry point
├── variables.tf     # input variables
├── outputs.tf       # output values (consumed by Ansible / K8s)
├── backend.tf       # remote state configuration
└── modules/
    ├── vpc/         # VPC module
    ├── eks/         # EKS cluster module
    └── rds/         # database module
Execution flow
# 1. Initialize: download providers, configure the remote backend
terraform init
# 2. Preview the change plan (makes no modifications)
terraform plan -var-file=prod.tfvars
# 3. Review the plan output (in CI this can be auto-commented on the PR)
# Plan: 12 to add, 0 to change, 0 to destroy.
# 4. Apply the changes (-auto-approve skips the prompt; only safe once the plan has been reviewed)
terraform apply -var-file=prod.tfvars -auto-approve
# 5. Inspect the outputs (EKS endpoint, RDS endpoint, etc.)
terraform output
Example: a simplified main.tf creating a VPC + EKS
# backend.tf — remote state storage
terraform {
  backend "s3" {
    bucket         = "my-tf-state-prod"
    key            = "prod/terraform.tfstate"
    region         = "ap-southeast-1"
    dynamodb_table = "tf-state-lock"
    encrypt        = true
  }
}
# main.tf
module "vpc" {
  source          = "./modules/vpc"
  name            = "prod-vpc"
  cidr            = "10.0.0.0/16"
  azs             = ["ap-southeast-1a", "ap-southeast-1b", "ap-southeast-1c"]
  private_subnets = ["10.0.1.0/24", "10.0.2.0/24", "10.0.3.0/24"]
  public_subnets  = ["10.0.101.0/24", "10.0.102.0/24", "10.0.103.0/24"]
}

module "eks" {
  source          = "./modules/eks"
  cluster_name    = "prod-cluster"
  cluster_version = "1.29"
  vpc_id          = module.vpc.vpc_id
  subnet_ids      = module.vpc.private_subnets

  node_groups = {
    default = {
      instance_types = ["t3.medium"]
      min_size       = 2
      max_size       = 10
      desired_size   = 3
    }
  }
}
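A layout like the one above is easy to get subtly wrong: a subnet that falls outside the VPC CIDR or overlaps a sibling will only fail at apply time. A small pre-plan sanity check, using nothing but the Python standard library (this is not part of the modules, just an illustrative sketch):

```python
# Check that every subnet sits inside the VPC CIDR and that no two subnets
# overlap. Pure stdlib; mirrors the CIDRs used in the example main.tf.
import ipaddress
from itertools import combinations

def validate_cidr_layout(vpc_cidr: str, subnet_cidrs: list) -> list:
    """Return a list of problems found; an empty list means the layout is sound."""
    vpc = ipaddress.ip_network(vpc_cidr)
    subnets = [ipaddress.ip_network(c) for c in subnet_cidrs]
    problems = []
    for net in subnets:
        if not net.subnet_of(vpc):
            problems.append(f"{net} is outside {vpc}")
    for a, b in combinations(subnets, 2):
        if a.overlaps(b):
            problems.append(f"{a} overlaps {b}")
    return problems

if __name__ == "__main__":
    subnets = ["10.0.1.0/24", "10.0.2.0/24", "10.0.3.0/24",
               "10.0.101.0/24", "10.0.102.0/24", "10.0.103.0/24"]
    print(validate_cidr_layout("10.0.0.0/16", subnets))  # → []
```

The same check can of course be expressed natively as a Terraform `validation` block or a precondition, but a standalone script is easier to wire into a pre-commit hook.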
Stage 2: Ansible configures the nodes
Once Terraform has created the EC2 / EKS nodes, Ansible takes over OS-level configuration.
Typical use cases
| Scenario | Ansible tasks |
|---|---|
| Self-managed K8s cluster | Install containerd, kubelet, kubeadm |
| EKS self-managed nodes | Install monitoring agents (Datadog/CloudWatch) |
| Non-containerized services | Configure Nginx, PHP-FPM, Supervisor |
| Security baseline | Disable root SSH, configure UFW, install auditd |
Dynamic inventory: deriving hosts from Terraform outputs
# terraform output emits JSON; Ansible can read it dynamically
terraform output -json instance_ips > /tmp/hosts.json
# or use the terraform-inventory tool
ansible-playbook -i terraform-inventory site.yml
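A dynamic-inventory source is ultimately just a program that prints JSON in a fixed shape. A minimal sketch of that conversion, assuming a Terraform output named `instance_ips` and an `ubuntu` SSH user (neither is defined by this article's modules):

```python
# Convert the JSON emitted by `terraform output -json` into the inventory
# shape Ansible expects: group -> hosts, plus per-host vars under _meta.
# The output name `instance_ips` and the `ubuntu` user are assumptions.
import json

def build_inventory(tf_output: dict) -> dict:
    # `terraform output -json` (with no output name) wraps each output
    # as {"value": ..., "type": ...}; we only need the value here.
    ips = tf_output["instance_ips"]["value"]
    return {
        "k8s_nodes": {"hosts": ips},
        "_meta": {"hostvars": {ip: {"ansible_user": "ubuntu"} for ip in ips}},
    }

if __name__ == "__main__":
    sample = {"instance_ips": {"value": ["10.0.1.10", "10.0.1.11"]}}
    print(json.dumps(build_inventory(sample), indent=2))
```

A production-grade inventory script would also have to handle the `--list` and `--host` arguments Ansible passes when invoking it; this sketch only shows the data transformation.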
Example: K8s node-initialization playbook
# node-init.yml
- name: Initialize Kubernetes nodes
  hosts: k8s_nodes
  become: true
  tasks:
    - name: Install dependencies
      apt:
        name: [apt-transport-https, ca-certificates, curl]
        state: present
        update_cache: true
    - name: Add the Kubernetes APT repository
      apt_repository:
        repo: "deb https://pkgs.k8s.io/core:/stable:/v1.29/deb/ /"
        state: present
    - name: Install kubeadm / kubelet / kubectl
      apt:
        name: [kubeadm=1.29.*, kubelet=1.29.*, kubectl=1.29.*]
        state: present
    - name: Enable and start kubelet
      systemd:
        name: kubelet
        enabled: true
        state: started
    - name: Configure kernel parameters (br_netfilter)
      copy:
        dest: /etc/sysctl.d/k8s.conf
        content: |
          net.bridge.bridge-nf-call-iptables = 1
          net.bridge.bridge-nf-call-ip6tables = 1
          net.ipv4.ip_forward = 1
      notify: Apply sysctl
  handlers:
    - name: Apply sysctl
      command: sysctl --system
Stage 3: Kubernetes runs the applications
Once the cluster is ready, deploy applications with Helm or kubectl.
Deployment flow
# 1. Point kubectl at the new cluster
aws eks update-kubeconfig --region ap-southeast-1 --name prod-cluster
# 2. Verify node status
kubectl get nodes
# NAME                                            STATUS   ROLES    AGE   VERSION
# ip-10-0-1-100.ap-southeast-1.compute.internal   Ready    <none>   5m    v1.29.0
# 3. Install the Ingress controller
helm upgrade --install ingress-nginx ingress-nginx \
  --repo https://kubernetes.github.io/ingress-nginx \
  --namespace ingress-nginx --create-namespace
# 4. Install cert-manager (automated TLS)
helm upgrade --install cert-manager cert-manager \
  --repo https://charts.jetstack.io \
  --namespace cert-manager --create-namespace \
  --set installCRDs=true
# 5. Deploy your application
helm upgrade --install my-api ./charts/api \
  --namespace production --create-namespace \
  -f values/prod.yaml
Application Deployment example
# charts/api/templates/deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: {{ .Release.Name }}-api
  namespace: {{ .Values.namespace }}
spec:
  replicas: {{ .Values.replicaCount }}
  selector:
    matchLabels:
      app: {{ .Release.Name }}-api
  template:
    metadata:
      labels:
        app: {{ .Release.Name }}-api
    spec:
      containers:
        - name: api
          image: "{{ .Values.image.repository }}:{{ .Values.image.tag }}"
          ports:
            - containerPort: 3000
          env:
            - name: DATABASE_URL
              valueFrom:
                secretKeyRef:
                  name: app-secrets
                  key: database-url
          readinessProbe:
            httpGet:
              path: /healthz
              port: 3000
            initialDelaySeconds: 10
            periodSeconds: 5
          resources:
            requests:
              cpu: "100m"
              memory: "128Mi"
            limits:
              cpu: "500m"
              memory: "512Mi"
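The per-pod requests above, combined with the node group from Stage 1 (3 × t3.medium, i.e. 2 vCPU / 4 GiB each), bound how many replicas the cluster can schedule. A rough back-of-the-envelope sketch that deliberately ignores kube-reserved/system-reserved overhead, so real headroom is lower:

```python
# Rough capacity arithmetic: how many pods fit by CPU and memory requests
# alone, across the whole node group. Overheads (kubelet, system daemons,
# DaemonSets) are ignored, so this is an upper bound.

def max_schedulable_pods(nodes: int, node_cpu_m: int, node_mem_mi: int,
                         req_cpu_m: int, req_mem_mi: int) -> int:
    """Upper bound on schedulable pods given per-pod requests."""
    by_cpu = (nodes * node_cpu_m) // req_cpu_m
    by_mem = (nodes * node_mem_mi) // req_mem_mi
    return min(by_cpu, by_mem)

if __name__ == "__main__":
    # 3 nodes x 2000m CPU / 4096Mi RAM; requests: 100m CPU, 128Mi RAM
    print(max_schedulable_pods(3, 2000, 4096, 100, 128))  # → 60
```

Here CPU is the binding constraint (60 pods by CPU vs 96 by memory), which is worth knowing before setting the HPA's max replicas.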
Key interfaces between the three tools
graph LR
    TF["Terraform<br/>outputs.tf"] -->|"EKS endpoint<br/>RDS endpoint<br/>security-group IDs"| AN["Ansible<br/>group_vars/prod.yml"]
    TF -->|kubeconfig| K8["kubectl / Helm"]
    AN -->|"nodes ready"| K8
    K8 -->|"External IP"| MON["Monitoring / DNS"]
| Interface | Mechanism | Notes |
|---|---|---|
| Terraform → Ansible | terraform output -json → Ansible dynamic inventory | Passes IPs, endpoints, resource IDs |
| Terraform → Kubernetes | aws eks update-kubeconfig / terraform output kubeconfig | Passes cluster connection info |
| Ansible → Kubernetes | Nodes register themselves once initialization finishes | No explicit interface; handled via kubeadm join |
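The Terraform → Ansible row of the table is often implemented as a tiny rendering step that turns selected Terraform outputs into a group_vars file. A hedged sketch, where the output names (eks_endpoint, rds_endpoint, node_sg_id) are assumptions rather than names defined by this article's modules:

```python
# Render selected entries of `terraform output -json` into a
# group_vars/prod.yml fragment. Output names are hypothetical.
import json

def render_group_vars(tf_output: dict, keys: list) -> str:
    lines = []
    for key in keys:
        if key in tf_output:
            # each `terraform output -json` entry wraps the value as {"value": ...};
            # json.dumps gives us safely quoted scalars for the YAML fragment
            lines.append(f"{key}: {json.dumps(tf_output[key]['value'])}")
    return "\n".join(lines) + "\n"

if __name__ == "__main__":
    sample = {
        "eks_endpoint": {"value": "https://EXAMPLE.gr7.ap-southeast-1.eks.amazonaws.com"},
        "rds_endpoint": {"value": "prod-db.example.ap-southeast-1.rds.amazonaws.com:5432"},
    }
    print(render_group_vars(sample, ["eks_endpoint", "rds_endpoint", "node_sg_id"]))
```

Missing outputs are simply skipped, so the same script works across environments that expose different subsets of outputs.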
The full pipeline in CI/CD
# .github/workflows/infra.yml (simplified)
name: Infrastructure Pipeline
on:
  push:
    branches: [main]
    paths: ['infra/**']
jobs:
  terraform:
    runs-on: ubuntu-latest
    defaults:
      run:
        working-directory: infra
    steps:
      - uses: actions/checkout@v4
      - name: Terraform Plan
        run: |
          terraform init
          terraform plan -var-file=prod.tfvars -out=tfplan
      - name: Terraform Apply
        if: github.ref == 'refs/heads/main'
        run: terraform apply tfplan
  ansible:
    needs: terraform
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Run Ansible Playbook
        run: ansible-playbook -i inventory/prod.ini site.yml
  deploy:
    needs: ansible
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Helm Deploy
        run: |
          helm upgrade --install my-api ./charts/api \
            -f values/prod.yaml --wait
Common pitfalls
| Pitfall | Consequence | Fix |
|---|---|---|
| Terraform state kept locally | Concurrent runs overwrite each other; state gets lost | Migrate immediately to a remote S3 + DynamoDB backend |
| Non-idempotent Ansible playbooks | Re-runs corrupt the configuration | Prefer declarative modules (apt state=present); avoid shell: |
| K8s Secrets committed as plaintext YAML | Credentials leak into the Git repository | Use External Secrets Operator or sealed-secrets |
| No version pinning across the three tools | Environment drift; builds are unreproducible | Pin terraform/ansible/helm versions in CI |
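The version-pinning fix from the last row can be enforced as a small CI guard that fails fast on drift. A sketch, where the pinned version numbers are placeholders rather than recommendations:

```python
# Fail fast when installed tool versions drift from the pinned ones.
# In CI, the `installed` dict would be filled from e.g. `terraform version`,
# `ansible --version`, and `helm version` output; here it is passed in directly.
PINNED = {"terraform": "1.7.5", "ansible-core": "2.16.6", "helm": "3.14.4"}

def version_drift(installed: dict) -> list:
    """Return human-readable drift messages; empty means everything matches."""
    drift = []
    for tool, want in PINNED.items():
        have = installed.get(tool, "missing")
        if have != want:
            drift.append(f"{tool}: pinned {want}, found {have}")
    return drift

if __name__ == "__main__":
    ok = {"terraform": "1.7.5", "ansible-core": "2.16.6", "helm": "3.14.4"}
    print(version_drift(ok))                        # → []
    print(version_drift({**ok, "helm": "3.13.0"}))  # → ['helm: pinned 3.14.4, found 3.13.0']
```

Exiting non-zero when the returned list is non-empty turns this into a pipeline gate; tools like tfenv or asdf solve the installation side of the same problem.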
Next section: a decision tree for tool selection