
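Patroni Automatic Failover

A manual failover takes several minutes of hands-on work. Patroni automates leader election and promotion after the primary goes down; with the settings used in this chapter (ttl=30) the switch typically completes in roughly 30 to 40 seconds. Patroni is one of the most widely used high-availability solutions for PostgreSQL and runs in many production environments, including Zalando, GitLab, and Supabase.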

Patroni Architecture

┌────────────────────────────────────────────────┐
│     DCS (distributed configuration store)      │
│     etcd / Consul / ZooKeeper / Kubernetes     │
│ Cluster state, leader election, configuration  │
└───────────────────────┬────────────────────────┘
                        │ heartbeat + state sync
               ┌────────┴────────┐
               ▼                 ▼
          ┌─────────┐       ┌─────────┐
          │ Patroni │       │ Patroni │
          │ Node 1  │◄─────►│ Node 2  │  streaming replication
          │(primary)│       │(replica)│
          └─────────┘       └─────────┘
               ▲
               │ VIP (virtual IP) or HAProxy
               │ applications always connect to the same address
       Application layer
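
To make the role of the DCS concrete, here is a minimal in-memory sketch of a leader lease with a TTL. It is not Patroni's code and does not talk to etcd; `ToyDCS`, `try_acquire`, and `renew` are hypothetical names chosen for illustration. The idea it shows is that the leader key can only be taken when it is absent or expired, and only its current holder can extend it.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Lease:
    holder: str
    expires_at: float


class ToyDCS:
    """In-memory stand-in for a DCS leader key (illustration only)."""

    def __init__(self, ttl: float) -> None:
        self.ttl = ttl
        self.lease: Optional[Lease] = None

    def leader(self, now: float) -> Optional[str]:
        """Return the current leader, or None if the lease has expired."""
        if self.lease is not None and now < self.lease.expires_at:
            return self.lease.holder
        return None

    def try_acquire(self, node: str, now: float) -> bool:
        """Take the leader key only if nobody holds a live lease."""
        if self.leader(now) is not None:
            return False
        self.lease = Lease(node, now + self.ttl)
        return True

    def renew(self, node: str, now: float) -> bool:
        """Extend the lease; only the current holder may do this."""
        if self.leader(now) != node:
            return False
        self.lease = Lease(node, now + self.ttl)
        return True


dcs = ToyDCS(ttl=30)
dcs.try_acquire("node1", now=0)               # node1 becomes leader
dcs.renew("node1", now=10)                    # heartbeat: lease now ends at t=40
blocked = dcs.try_acquire("node2", now=35)    # lease is still live
promoted = dcs.try_acquire("node2", now=40)   # lease has expired, node2 wins
```

A real DCS performs the "create if absent" step atomically on the server side, which is what keeps two replicas from both believing they won.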

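Installing and Configuring Patroni

# Install with pip (Ubuntu shown here)
pip install 'patroni[etcd3]'     # etcd (v3 API) as the DCS, matching the etcd3 section in patroni.yml
# or
pip install 'patroni[consul]'    # Consul as the DCS
# Patroni also needs a PostgreSQL driver, e.g. pip install psycopg2-binary
# Alternatively, install the distribution package (Debian/Ubuntu)
apt-get install patroni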

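# /etc/patroni/patroni.yml (required on every node)
scope: my-postgres-cluster    # cluster name (identical on all nodes)
namespace: /service/          # key prefix in etcd
name: node1                   # this node's name (unique per node)

restapi:
  listen: 0.0.0.0:8008              # Patroni REST API address and port
  connect_address: 10.0.0.1:8008    # this node's IP

etcd3:
  hosts: 10.0.0.10:2379,10.0.0.11:2379,10.0.0.12:2379   # etcd cluster endpoints

bootstrap:
  # The dcs section is applied only when the cluster is first initialized;
  # change it later with patronictl edit-config.
  dcs:
    ttl: 30                 # lifetime of the leader lock (seconds)
    loop_wait: 10           # interval between HA loop runs / heartbeats (seconds)
    retry_timeout: 10       # DCS/PostgreSQL retry timeout; keep loop_wait + 2 * retry_timeout <= ttl
    maximum_lag_on_failover: 1048576   # max bytes a replica may lag and still be promoted (1 MB)
    postgresql:
      use_pg_rewind: true   # lets a former primary rejoin as a replica via pg_rewind
      parameters:
        wal_level: replica
        hot_standby: "on"
        max_wal_senders: 10
        max_replication_slots: 10
  initdb:
    - encoding: UTF8
    - locale: en_US.UTF-8
    - data-checksums        # enable data page checksums

postgresql:
  listen: 0.0.0.0:5432
  connect_address: 10.0.0.1:5432    # this node's IP
  data_dir: /var/lib/postgresql/16/main
  bin_dir: /usr/lib/postgresql/16/bin
  config_dir: /etc/postgresql/16/main
  authentication:
    replication:
      username: replicator
      password: replicator_password
    superuser:
      username: postgres
      password: postgres_password
  parameters:
    max_connections: 200
    shared_buffers: 4GB

tags:
  nofailover: false       # true would bar this node from being promoted
  noloadbalance: false    # true would make load balancers skip this node
  clonefrom: false        # true would mark this node as a preferred source for new replicas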
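
The three timing values in the `dcs` section are not independent. Patroni's documentation states the rule `loop_wait + 2 * retry_timeout <= ttl`, and the sketch below checks a set of values against it before they are rolled out. The function name `check_timing` is hypothetical, and the check covers only this one rule, not the rest of Patroni's own validation.

```python
from typing import List


def check_timing(ttl: int, loop_wait: int, retry_timeout: int) -> List[str]:
    """Return a list of problems; an empty list means the rule is satisfied."""
    problems = []
    budget = loop_wait + 2 * retry_timeout
    if budget > ttl:
        problems.append(
            f"loop_wait + 2 * retry_timeout = {budget} exceeds ttl = {ttl}"
        )
    return problems


ok = check_timing(ttl=30, loop_wait=10, retry_timeout=10)    # 10 + 20 <= 30
bad = check_timing(ttl=30, loop_wait=10, retry_timeout=30)   # 10 + 60 > 30
```

If the rule is violated, the leader may fail to renew its lock in time during a transient DCS outage, which can trigger an unnecessary failover.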

# Start Patroni and enable it at boot
systemctl start patroni
systemctl enable patroni
# Check the cluster status
patronictl -c /etc/patroni/patroni.yml list
# Example output:
# + Cluster: my-postgres-cluster (12345678901234567) -----------+
# | Member | Host          | Role    | State   | TL | Lag in MB |
# +--------+---------------+---------+---------+----+-----------+
# | node1  | 10.0.0.1:5432 | Leader  | running |  1 |           |
# | node2  | 10.0.0.2:5432 | Replica | running |  1 |         0 |
# | node3  | 10.0.0.3:5432 | Replica | running |  1 |         0 |
# +--------+---------------+---------+---------+----+-----------+
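
For monitoring, the same listing can be requested as JSON (`patronictl list --format json`) and filtered in a script. The sketch below assumes the JSON keys mirror the table columns (`Member`, `Role`, `Lag in MB`); that is an assumption, and the column set differs between Patroni versions, so check your own output first. `SAMPLE` is made-up data and `lagging_replicas` is a hypothetical helper.

```python
import json
from typing import List

# Made-up sample shaped like the table above (keys are an assumption).
SAMPLE = """[
  {"Member": "node1", "Host": "10.0.0.1:5432", "Role": "Leader", "State": "running", "TL": 1},
  {"Member": "node2", "Host": "10.0.0.2:5432", "Role": "Replica", "State": "running", "TL": 1, "Lag in MB": 0},
  {"Member": "node3", "Host": "10.0.0.3:5432", "Role": "Replica", "State": "running", "TL": 1, "Lag in MB": 48}
]"""


def lagging_replicas(list_json: str, max_lag_mb: float) -> List[str]:
    """Names of non-leader members whose lag is unknown or above the threshold."""
    flagged = []
    for member in json.loads(list_json):
        if member.get("Role") == "Leader":
            continue
        lag = member.get("Lag in MB")
        # A missing or non-numeric lag is treated as a problem, not as zero.
        if not isinstance(lag, (int, float)) or lag > max_lag_mb:
            flagged.append(member["Member"])
    return flagged


alerts = lagging_replicas(SAMPLE, max_lag_mb=16)
```

In a real check the JSON string would come from running `patronictl` (for example through `subprocess.run` with `capture_output=True`) rather than from an embedded sample.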

HAProxy Load Balancer Configuration

# /etc/haproxy/haproxy.cfg
frontend postgres_write
    bind *:5000                 # applications send writes to port 5000
    mode tcp
    default_backend postgres_primary

frontend postgres_read
    bind *:5001                 # applications send read-only traffic to port 5001
    mode tcp
    default_backend postgres_replicas

backend postgres_primary
    mode tcp
    option httpchk GET /primary     # Patroni health endpoint, 200 only on the leader (/master on older versions)
    http-check expect status 200
    # Check every 3 s and drop existing sessions when a node is marked down
    default-server inter 3s fall 3 rise 2 on-marked-down shutdown-sessions
    server node1 10.0.0.1:5432 check port 8008
    server node2 10.0.0.2:5432 check port 8008
    server node3 10.0.0.3:5432 check port 8008

backend postgres_replicas
    mode tcp
    balance roundrobin
    option httpchk GET /replica     # 200 only on healthy replicas
    http-check expect status 200
    server node1 10.0.0.1:5432 check port 8008
    server node2 10.0.0.2:5432 check port 8008
    server node3 10.0.0.3:5432 check port 8008
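
All three nodes are listed in both backends, so it is the health check alone that decides where traffic goes. The following sketch simulates that decision without any network calls. It assumes the leader answers 200 on `/primary` (`/master` on older Patroni versions), a running replica answers 200 on `/replica`, and anything else counts as down; `health_status` and `backend_members` are hypothetical helpers, not part of Patroni or HAProxy.

```python
from typing import Dict, List, Tuple


def health_status(endpoint: str, role: str, state: str = "running") -> int:
    """Status code a Patroni-like REST API is assumed to return for a probe."""
    if endpoint not in ("/primary", "/replica"):
        raise ValueError(f"endpoint not modeled in this sketch: {endpoint}")
    if state != "running":
        return 503  # stands in for any failed check, including no response
    wanted_role = "leader" if endpoint == "/primary" else "replica"
    return 200 if role == wanted_role else 503


def backend_members(endpoint: str, cluster: Dict[str, Tuple[str, str]]) -> List[str]:
    """Nodes HAProxy would keep in a backend that probes `endpoint`."""
    return sorted(
        name
        for name, (role, state) in cluster.items()
        if health_status(endpoint, role, state) == 200
    )


before = {
    "node1": ("leader", "running"),
    "node2": ("replica", "running"),
    "node3": ("replica", "running"),
}
# node1 has crashed and node2 has been promoted.
after = {
    "node1": ("replica", "stopped"),
    "node2": ("leader", "running"),
    "node3": ("replica", "running"),
}

write_before = backend_members("/primary", before)
write_after = backend_members("/primary", after)
read_after = backend_members("/replica", after)
```

Because exactly one node passes the primary check at any time, port 5000 always reaches the current leader without any change to the HAProxy configuration.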

Common Patroni Operations Commands

# Check the cluster status
patronictl -c /etc/patroni/patroni.yml list
# Planned switchover (scheduled maintenance)
patronictl -c /etc/patroni/patroni.yml switchover my-postgres-cluster
# Optional flags: --leader node1 --candidate node2 --scheduled "2024-03-22 02:00"
# (older releases call the first flag --master)
# Manual failover (emergencies, e.g. when the cluster has no healthy leader)
patronictl -c /etc/patroni/patroni.yml failover my-postgres-cluster
# Reload the configuration (no restart)
patronictl -c /etc/patroni/patroni.yml reload my-postgres-cluster
# Pause and resume automatic failover (maintenance window)
patronictl -c /etc/patroni/patroni.yml pause my-postgres-cluster
patronictl -c /etc/patroni/patroni.yml resume my-postgres-cluster
# Edit the dynamic configuration (stored in the DCS and picked up by all nodes)
patronictl -c /etc/patroni/patroni.yml edit-config my-postgres-cluster
# Show the failover/switchover history
patronictl -c /etc/patroni/patroni.yml history my-postgres-cluster
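
A paused cluster does not fail over automatically, so forgetting to run `resume` after maintenance is a real risk. One way to guard against that is to wrap the pair in a context manager. This is a sketch: `maintenance_window` is a hypothetical helper, the config path and cluster name are the ones used in this chapter, and the `run` parameter exists so the example can be exercised without a real cluster.

```python
import subprocess
from contextlib import contextmanager

CONFIG = "/etc/patroni/patroni.yml"
CLUSTER = "my-postgres-cluster"


@contextmanager
def maintenance_window(run=subprocess.run, config=CONFIG, cluster=CLUSTER):
    """Pause automatic failover for the with-block, then always resume."""
    base = ["patronictl", "-c", config]
    run(base + ["pause", cluster], check=True)
    try:
        yield
    finally:
        # Runs even if the maintenance step raised an exception.
        run(base + ["resume", cluster], check=True)


# Dry run with a recording stub instead of the real CLI.
recorded = []
with maintenance_window(run=lambda argv, check: recorded.append(argv)):
    pass  # e.g. OS patching or a PostgreSQL minor upgrade would go here
```

Calling `maintenance_window()` with no arguments would invoke the real `patronictl`, so try it in a test environment first.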

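The Failover Process

Normal operation:
  node1 (primary) renews its leader key in etcd every 10 seconds (loop_wait)
Primary failure (approximate timeline):
  T+0s:  node1 goes down and stops renewing the key
  T+30s: the leader lock in etcd expires (ttl=30, counted from the last renewal, so at most 30 s after the crash)
  T+30s: node2 and node3 race for the leader lock
  T+31s: node2 wins the race and promotes itself to primary
  T+31s: HAProxy health checks mark node1 down and node2 up in the primary backend
  T+35s: new application connections are routed to node2
Recovering the old primary:
  After node1 comes back:
  1. Patroni detects that the cluster already has a new primary
  2. It resynchronizes node1 as a replica (pg_rewind avoids a full base backup)
  3. node1 rejoins the cluster as a replica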
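
The timeline follows from a few settings, which makes it easy to estimate for other values. The sketch below computes when the leader lock expires and a rough bound on when traffic reaches the new primary. The promotion time and the HAProxy check interval and `rise` count are assumed example values, not measurements, and both function names are hypothetical.

```python
from typing import Tuple


def lock_expiry_window(ttl: float, loop_wait: float) -> Tuple[float, float]:
    """Earliest and latest lock expiry, in seconds after the crash.

    The leader renews the key every loop_wait seconds with a lifetime of ttl,
    so at the moment of the crash the key is between 0 and loop_wait seconds old.
    """
    return (ttl - loop_wait, ttl)


def estimate_failover(
    ttl: float,
    loop_wait: float,
    promote_s: float = 1.0,       # assumed time for the winner to promote
    check_inter_s: float = 3.0,   # assumed HAProxy check interval
    check_rise: int = 2,          # assumed successful checks before "up"
) -> Tuple[float, float]:
    """Lower and upper estimate until clients reach the new primary."""
    earliest, latest = lock_expiry_window(ttl, loop_wait)
    routing_delay = check_inter_s * check_rise
    return (earliest + promote_s, latest + promote_s + routing_delay)


window = lock_expiry_window(ttl=30, loop_wait=10)
estimate = estimate_failover(ttl=30, loop_wait=10)
```

Lowering `ttl` shortens the outage but makes the cluster more sensitive to brief network or DCS hiccups, so the two need to be balanced.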

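High-Availability Options in Managed Cloud Databases

If you would rather not operate Patroni yourself, managed cloud databases offer built-in high availability:
AWS RDS for PostgreSQL:
- Multi-AZ deployment: automatic failover, typically 60-120 seconds
- Read replicas: up to 15 per source instance (older documentation lists 5)
- RDS Proxy: connection pooling plus more transparent failover
AWS Aurora PostgreSQL:
- Failover typically under 30 seconds
- Up to 15 read replicas
- Global Database: cross-region replication
Google Cloud SQL:
- HA configuration: automatic failover
- Read replicas: cross-region supported
Supabase:
- Built on Patroni + etcd
- Automatic backups (check which plans include them, since coverage varies by tier)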

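Chapter Checklist

[ ] Set up a daily automated pg_dump backup and upload it to object storage
[ ] Test the backup restore procedure (verify_backup.sh)
[ ] Build primary/replica streaming replication in a test environment and verify lag monitoring
[ ] Decide whether you need Patroni (recommended for production applications with more than 50,000 DAU)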

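Next chapter: Connection Pooling, Performance Tuning, and Monitoring. Once the high-availability architecture is in place, connection management becomes the next bottleneck. PgBouncer can multiplex 1,000 application connections onto 50 database connections, which greatly reduces connection overhead.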