PostgreSQL High Availability: Patroni REST API [Translation]

Original article by 肥仔菌 肥叔菌

Patroni has a rich REST API. It is used by Patroni itself during the leader race, by the patronictl tool to perform failovers, switchovers, reinitializations, restarts, and reloads, by HAProxy or any other kind of load balancer to perform HTTP health checks, and it can also be used for monitoring. Below is the list of Patroni REST API endpoints.

Health check endpoints

For all health check GET requests, Patroni returns a JSON document with the status of the node, along with the HTTP status code. If you don't want or don't need the JSON document, consider using the OPTIONS method instead of GET.
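A minimal Python sketch of a status-only check using OPTIONS, standard library only. The helper names and the base URL http://localhost:8008 are assumptions made for illustration; only the HTTP status code is inspected, since no JSON body is needed.

```python
import urllib.error
import urllib.request


def build_check(base_url: str, path: str, method: str = "OPTIONS") -> urllib.request.Request:
    """Build a health check request; OPTIONS avoids fetching the JSON body."""
    return urllib.request.Request(base_url.rstrip("/") + path, method=method)


def status_code(request: urllib.request.Request, timeout: float = 2.0) -> int:
    """Send the request and return the HTTP status code."""
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as error:
        # urllib raises on non-2xx responses; the code is still what we want.
        return error.code


if __name__ == "__main__":
    # Only builds the request; call status_code(check) against a live node.
    check = build_check("http://localhost:8008", "/primary")
    print(check.get_method(), check.full_url)
```

Against a running node, status_code(check) == 200 would then be the health test a load balancer performs.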

The following requests to the Patroni REST API return HTTP status code 200 only when the Patroni node is running as the primary with the leader lock: GET /, GET /master, GET /primary, GET /read-write.
GET /standby-leader: returns HTTP status code 200 only when the Patroni node is running as the leader in a standby cluster.
GET /leader: returns HTTP status code 200 when the Patroni node has the leader lock. The major difference from the two previous endpoints is that it doesn't take into account whether PostgreSQL is running as the primary or as the standby_leader.
GET /replica: replica health check endpoint. It returns HTTP status code 200 only when the Patroni node is in the state running, the role is replica, and the noloadbalance tag is not set.
GET /replica?lag=<max-lag>: replica check endpoint. In addition to the checks performed by GET /replica, it also checks the replication lag and returns status code 200 only when the lag is below the specified value. For performance reasons, the key cluster.last_leader_operation from the DCS is used as the leader WAL position when computing the lag on the replica. max-lag can be specified in bytes (integer) or in human-readable values, e.g. 16kB, 64MB, 1GB. Examples: GET /replica?lag=1048576, GET /replica?lag=1024kB, GET /replica?lag=10MB, GET /replica?lag=1GB.
GET /replica?tag_key1=value1&tag_key2=value2: replica check endpoint. In addition, it checks the user-defined tags key1 and key2 and their respective values in the tags section of the YAML configuration. If a tag isn't defined for the instance, or if its value in the YAML configuration doesn't match the queried value, it returns HTTP status code 503. In the following requests, since they check for the leader or standby-leader status, Patroni does not apply any user-defined tags and they are ignored: GET /?tag_key1=value1&tag_key2=value2, GET /master?tag_key1=value1&tag_key2=value2, GET /leader?tag_key1=value1&tag_key2=value2, GET /primary?tag_key1=value1&tag_key2=value2, GET /read-write?tag_key1=value1&tag_key2=value2, GET /standby_leader?tag_key1=value1&tag_key2=value2, GET /standby-leader?tag_key1=value1&tag_key2=value2.

GET /read-only: like the GET /replica endpoint above, but also includes the primary.
GET /synchronous or GET /sync: returns HTTP status code 200 only when the Patroni node is running as a synchronous standby.
GET /asynchronous or GET /async: returns HTTP status code 200 only when the Patroni node is running as an asynchronous standby.
GET /asynchronous?lag=<max-lag> or GET /async?lag=<max-lag>: asynchronous standby check endpoint. In addition to the checks performed by asynchronous or async, it also checks the replication lag and returns status code 200 only when the lag is below the specified value. For performance reasons, the key cluster.last_leader_operation from the DCS is used as the leader WAL position when computing the lag on the replica. max-lag can be specified in bytes (integer) or in human-readable values, e.g. 16kB, 64MB, 1GB. Examples: GET /async?lag=1048576, GET /async?lag=1024kB, GET /async?lag=10MB, GET /async?lag=1GB.
GET /health: returns HTTP status code 200 only when PostgreSQL is up and running.
GET /liveness: always returns HTTP status code 200, which only indicates that Patroni is running. It can be used for livenessProbe.
GET /readiness: returns HTTP status code 200 when the Patroni node is running as the leader or when PostgreSQL is up and running. The endpoint can be used for readinessProbe when it is not possible to use Kubernetes endpoints for leader elections (OpenShift).
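The query-string conventions in the list above (lag=<max-lag> and tag_<name>=<value>) can be sketched as a small URL builder. This is an illustrative helper, not part of Patroni; the function name health_url and the base URL are assumptions.

```python
from typing import Mapping, Optional
from urllib.parse import urlencode


def health_url(base_url: str, endpoint: str,
               lag: Optional[str] = None,
               tags: Optional[Mapping[str, str]] = None) -> str:
    """Build a health check URL with optional lag and tag filters."""
    params = {}
    if lag is not None:
        # Bytes as an integer string ("1048576") or human-readable ("10MB").
        params["lag"] = lag
    for name, value in (tags or {}).items():
        # User-defined tags are passed as tag_<name>=<value>.
        params["tag_" + name] = value
    url = base_url.rstrip("/") + endpoint
    return (url + "?" + urlencode(params)) if params else url


if __name__ == "__main__":
    print(health_url("http://localhost:8008", "/replica", lag="10MB"))
    print(health_url("http://localhost:8008", "/replica",
                     tags={"key1": "value1", "key2": "value2"}))
```

Keeping the URL construction separate from the HTTP call makes it easy to generate the same check for HAProxy configuration or for a monitoring script.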

Both the readiness and liveness endpoints are very lightweight and do not execute any SQL. Probes should be configured so that they start failing at about the time the leader key expires. With the default ttl of 30s, example probes would look like:

readinessProbe:
  httpGet:
    scheme: HTTP
    path: /readiness
    port: 8008
  initialDelaySeconds: 3
  periodSeconds: 10
  timeoutSeconds: 5
  successThreshold: 1
  failureThreshold: 3
livenessProbe:
  httpGet:
    scheme: HTTP
    path: /liveness
    port: 8008
  initialDelaySeconds: 3
  periodSeconds: 10
  timeoutSeconds: 5
  successThreshold: 1
  failureThreshold: 3
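A rough way to check probe settings against ttl is to multiply the probe period by the failure threshold. The sketch below assumes, as a simplification, that a probe is marked failed after failureThreshold consecutive failed checks spaced periodSeconds apart, and it ignores request timeouts. The function name is hypothetical.

```python
def seconds_until_probe_fails(period_seconds: int, failure_threshold: int) -> int:
    """Approximate time until a probe is marked failed once checks start failing.

    Simplifying assumption: failure_threshold consecutive failures,
    one every period_seconds; request timeouts are not counted.
    """
    return period_seconds * failure_threshold


if __name__ == "__main__":
    ttl = 30  # default ttl in seconds, as stated in the text
    window = seconds_until_probe_fails(period_seconds=10, failure_threshold=3)
    print(window, window == ttl)
```

With the values from the example probes this gives 10 * 3 = 30 seconds, which matches the default ttl.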

Monitoring endpoint

GET /patroni is used by Patroni during the leader race. It can also be used by your monitoring system. The JSON document produced by this endpoint has the same structure as the JSON produced by the health check endpoints.

$ curl -s http://localhost:8008/patroni | jq .
{
  "state": "running",
  "postmaster_start_time": "2019-09-24 09:22:32.555 CEST",
  "role": "master",
  "server_version": 110005,
  "cluster_unlocked": false,
  "xlog": {
    "location": 25624640
  },
  "timeline": 3,
  "database_system_identifier": "6739877027151648096",
  "patroni": {
    "version": "1.6.0",
    "scope": "batman"
  }
}
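For monitoring, the document can be reduced to the few fields a dashboard or alert usually needs. The following sketch works on an already parsed document; the summarize function and its output format are illustrative assumptions, and the field names are taken from the sample output above.

```python
import json


def summarize(status: dict) -> str:
    """Summarize a GET /patroni JSON document in one line."""
    patroni = status.get("patroni", {})
    return "{}: role={} state={} timeline={} wal={}".format(
        patroni.get("scope"),
        status.get("role"),
        status.get("state"),
        status.get("timeline"),           # may be absent in some responses
        status.get("xlog", {}).get("location"),
    )


if __name__ == "__main__":
    sample = json.loads("""
    {"state": "running", "role": "master", "timeline": 3,
     "xlog": {"location": 25624640},
     "patroni": {"version": "1.6.0", "scope": "batman"}}
    """)
    print(summarize(sample))
```

Using .get throughout keeps the summary from raising when a field is missing from a given response.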

Cluster status endpoints

The GET /cluster endpoint generates a JSON document describing the current cluster topology and state:

$ curl -s http://localhost:8008/cluster | jq .
{
  "members": [
    {
      "name": "postgresql0",
      "host": "127.0.0.1",
      "port": 5432,
      "role": "leader",
      "state": "running",
      "api_url": "http://127.0.0.1:8008/patroni",
      "timeline": 5,
      "tags": {
        "clonefrom": true
      }
    },
    {
      "name": "postgresql1",
      "host": "127.0.0.1",
      "port": 5433,
      "role": "replica",
      "state": "running",
      "api_url": "http://127.0.0.1:8009/patroni",
      "timeline": 5,
      "tags": {
        "clonefrom": true
      },
      "lag": 0
    }
  ],
  "scheduled_switchover": {
    "at": "2019-09-24T10:36:00+02:00",
    "from": "postgresql0"
  }
}
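A short sketch of how a script might read this document: find the leader and list replicas whose reported lag exceeds a threshold. The function names are hypothetical, the field names come from the sample output, and max_lag is assumed to be in the same unit as the lag field.

```python
from typing import List, Optional


def find_leader(cluster: dict) -> Optional[str]:
    """Return the name of the member whose role is "leader", if any."""
    for member in cluster.get("members", []):
        if member.get("role") == "leader":
            return member.get("name")
    return None


def lagging_replicas(cluster: dict, max_lag: int) -> List[str]:
    """Return names of replicas whose reported lag exceeds max_lag."""
    names = []
    for member in cluster.get("members", []):
        lag = member.get("lag")
        # Skip members without a numeric lag (for example the leader).
        if member.get("role") == "replica" and isinstance(lag, int) and lag > max_lag:
            names.append(member["name"])
    return names


if __name__ == "__main__":
    sample = {"members": [
        {"name": "postgresql0", "role": "leader", "state": "running"},
        {"name": "postgresql1", "role": "replica", "state": "running", "lag": 0},
    ]}
    print(find_leader(sample), lagging_replicas(sample, max_lag=1048576))
```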

The GET /history endpoint provides a view of the history of cluster switchovers/failovers. The format is very similar to the content of the history files in the pg_wal directory. The only difference is the timestamp field, which shows when the new timeline was created.

$ curl -s http://localhost:8008/history | jq .
[
  [
    1,
    25623960,
    "no recovery target specified",
    "2019-09-23T16:57:57+02:00"
  ],
  [
    2,
    25624344,
    "no recovery target specified",
    "2019-09-24T09:22:33+02:00"
  ],
  [
    3,
    25624752,
    "no recovery target specified",
    "2019-09-24T09:26:15+02:00"
  ],
  [
    4,
    50331856,
    "no recovery target specified",
    "2019-09-24T09:35:52+02:00"
  ]
]
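Because each history entry is a positional array, giving the columns names makes them easier to work with. In the sketch below the field names (timeline, location, reason, created_at) are assumptions chosen for readability; the response itself does not name them.

```python
from datetime import datetime
from typing import List, NamedTuple


class TimelineSwitch(NamedTuple):
    timeline: int         # first column of a history row
    location: int         # second column, a WAL position as an integer
    reason: str           # third column, free-text reason
    created_at: datetime  # fourth column, when the new timeline was created


def parse_history(rows: List[list]) -> List[TimelineSwitch]:
    """Convert GET /history rows into named, typed records."""
    return [
        TimelineSwitch(row[0], row[1], row[2], datetime.fromisoformat(row[3]))
        for row in rows
    ]


if __name__ == "__main__":
    history = parse_history([
        [1, 25623960, "no recovery target specified", "2019-09-23T16:57:57+02:00"],
        [2, 25624344, "no recovery target specified", "2019-09-24T09:22:33+02:00"],
    ])
    for entry in history:
        print(entry.timeline, entry.location, entry.created_at.isoformat())
```

Parsing the timestamp into a timezone-aware datetime allows direct comparisons, such as measuring the time between two timeline switches.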

Config endpoint

GET /config: get the current version of the dynamic configuration:

$ curl -s localhost:8008/config | jq .
{
  "ttl": 30,
  "loop_wait": 10,
  "retry_timeout": 10,
  "maximum_lag_on_failover": 1048576,
  "postgresql": {
    "use_slots": true,
    "use_pg_rewind": true,
    "parameters": {
      "hot_standby": "on",
      "wal_log_hints": "on",
      "wal_level": "hot_standby",
      "max_wal_senders": 5,
      "max_replication_slots": 5,
      "max_connections": "100"
    }
  }
}

PATCH /config: change the existing configuration:

$ curl -s -XPATCH -d \
        '{"loop_wait":5,"ttl":20,"postgresql":{"parameters":{"max_connections":"101"}}}' \
        http://localhost:8008/config | jq .
{
  "ttl": 20,
  "loop_wait": 5,
  "maximum_lag_on_failover": 1048576,
  "retry_timeout": 10,
  "postgresql": {
    "use_slots": true,
    "use_pg_rewind": true,
    "parameters": {
      "hot_standby": "on",
      "wal_log_hints": "on",
      "wal_level": "hot_standby",
      "max_wal_senders": 5,
      "max_replication_slots": 5,
      "max_connections": "101"
    }
  }
}

The above REST API call patches the existing configuration and returns the new configuration.

Let's check that the node processed this configuration. First of all, it should start printing log lines every 5 seconds (loop_wait=5). The change of max_connections requires a restart, so the pending_restart flag should be exposed:

$ curl -s http://localhost:8008/patroni | jq .
{
  "pending_restart": true,
  "database_system_identifier": "6287881213849985952",
  "postmaster_start_time": "2016-06-13 13:13:05.211 CEST",
  "xlog": {
    "location": 2197818976
  },
  "patroni": {
    "scope": "batman",
    "version": "1.0"
  },
  "state": "running",
  "role": "master",
  "server_version": 90503
}

Removing parameters:
If you want to remove (reset) a setting, just patch it with null:

$ curl -s -XPATCH -d \
        '{"postgresql":{"parameters":{"max_connections":null}}}' \
        http://localhost:8008/config | jq .
{
  "ttl": 20,
  "loop_wait": 5,
  "retry_timeout": 10,
  "maximum_lag_on_failover": 1048576,
  "postgresql": {
    "use_slots": true,
    "use_pg_rewind": true,
    "parameters": {
      "hot_standby": "on",
      "unix_socket_directories": ".",
      "wal_level": "hot_standby",
      "wal_log_hints": "on",
      "max_wal_senders": 5,
      "max_replication_slots": 5
    }
  }
}

The above call removes postgresql.parameters.max_connections from the dynamic configuration.
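The merge behaviour visible in the PATCH examples (nested keys are merged, and null removes a key) can be modelled locally. This is a sketch of the semantics as the examples show them, not Patroni's actual implementation; apply_patch is a hypothetical helper.

```python
import copy


def apply_patch(config: dict, patch: dict) -> dict:
    """Return a copy of config with patch merged in; None removes a key."""
    result = copy.deepcopy(config)
    for key, value in patch.items():
        if value is None:
            # A null value removes (resets) the setting.
            result.pop(key, None)
        elif isinstance(value, dict):
            # Nested sections are merged key by key, not replaced wholesale.
            current = result.get(key)
            result[key] = apply_patch(current if isinstance(current, dict) else {}, value)
        else:
            result[key] = value
    return result


if __name__ == "__main__":
    config = {"ttl": 30, "loop_wait": 10,
              "postgresql": {"parameters": {"max_connections": "100"}}}
    step1 = apply_patch(config, {"loop_wait": 5, "ttl": 20,
                                 "postgresql": {"parameters": {"max_connections": "101"}}})
    step2 = apply_patch(step1, {"postgresql": {"parameters": {"max_connections": None}}})
    print(step1)
    print(step2)
```

Such a model can be handy for previewing the effect of a patch before sending it to a live cluster.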

PUT /config: it is also possible to unconditionally perform a full rewrite of the existing dynamic configuration:

$ curl -s -XPUT -d \
        '{"maximum_lag_on_failover":1048576,"retry_timeout":10,"postgresql":{"use_slots":true,"use_pg_rewind":true,"parameters":{"hot_standby":"on","wal_log_hints":"on","wal_level":"hot_standby","unix_socket_directories":".","max_wal_senders":5}},"loop_wait":3,"ttl":20}' \
        http://localhost:8008/config | jq .
{
  "ttl": 20,
  "maximum_lag_on_failover": 1048576,
  "retry_timeout": 10,
  "postgresql": {
    "use_slots": true,
    "parameters": {
      "hot_standby": "on",
      "unix_socket_directories": ".",
      "wal_level": "hot_standby",
      "wal_log_hints": "on",
      "max_wal_senders": 5
    },
    "use_pg_rewind": true
  },
  "loop_wait": 3
}
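The difference between the two write methods can be kept explicit in client code: PATCH merges into the existing configuration, PUT replaces it. The sketch below only builds the request. The config_request name is hypothetical, and the Content-Type header is an addition of this sketch that the curl examples do not set.

```python
import json
import urllib.request


def config_request(base_url: str, body: dict, replace: bool = False) -> urllib.request.Request:
    """Build a PATCH (merge) or PUT (full rewrite) request for /config."""
    return urllib.request.Request(
        base_url.rstrip("/") + "/config",
        data=json.dumps(body).encode("utf-8"),
        method="PUT" if replace else "PATCH",
        headers={"Content-Type": "application/json"},  # assumption, see lead-in
    )


if __name__ == "__main__":
    request = config_request("http://localhost:8008", {"loop_wait": 5, "ttl": 20})
    print(request.get_method(), request.full_url, request.data.decode("utf-8"))
    # To send it to a live node: urllib.request.urlopen(request)
```

Defaulting to PATCH is a deliberate choice here: an accidental PUT with a partial body would drop every setting that the body omits.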

Switchover and failover endpoints

POST /switchover or POST /failover. These endpoints are very similar to each other, with a couple of minor differences:

The failover endpoint allows a manual failover to be performed when there are no healthy nodes, but it does not allow you to schedule a switchover.
The switchover endpoint is the opposite. It works only when the cluster is healthy (there is a leader) and allows a switchover to be scheduled at a given time.

In the JSON body of the POST request you must specify at least the leader or candidate field, and optionally the scheduled_at field if you want to schedule a switchover at a specific time.

Example: perform a failover to a specific node:

$ curl -s http://localhost:8009/failover -XPOST -d '{"candidate":"postgresql1"}'
Successfully failed over to "postgresql1"

Example: schedule a switchover from the leader to any other healthy replica in the cluster at a specific time:

$ curl -s http://localhost:8008/switchover -XPOST -d \
        '{"leader":"postgresql0","scheduled_at":"2019-09-24T12:00+00"}'
Switchover scheduled

Depending on the situation, the request may finish with a different HTTP status code and body. Status code 200 is returned when the switchover or failover completed successfully. If the switchover was successfully scheduled, Patroni returns HTTP status code 202. If something went wrong, an error status code (one of 400, 412, or 503) is returned with some details in the response body. For more information, check the source code of the patroni/api.py:do_POST_failover() method.
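The request body rule and the status codes described above can be captured in two small helpers. This is a client-side sketch with hypothetical names; the outcome labels are this sketch's own wording, and the client-side validation only mirrors the rule that at least leader or candidate must be given.

```python
from typing import Optional

# Status codes as described in the text; the labels are illustrative.
OUTCOMES = {200: "completed", 202: "scheduled", 400: "error", 412: "error", 503: "error"}


def switchover_body(leader: Optional[str] = None,
                    candidate: Optional[str] = None,
                    scheduled_at: Optional[str] = None) -> dict:
    """Build the JSON body for POST /switchover or POST /failover."""
    if not leader and not candidate:
        raise ValueError("specify at least one of leader or candidate")
    body = {}
    if leader:
        body["leader"] = leader
    if candidate:
        body["candidate"] = candidate
    if scheduled_at:
        body["scheduled_at"] = scheduled_at  # only meaningful for switchover
    return body


def outcome(status_code: int) -> str:
    """Map an HTTP status code from the endpoint to a coarse outcome."""
    return OUTCOMES.get(status_code, "unexpected")


if __name__ == "__main__":
    print(switchover_body(leader="postgresql0", scheduled_at="2019-09-24T12:00+00"))
    print(outcome(202))
```

For error outcomes, the response body carries the details and should be logged alongside the status code.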

DELETE /switchover: delete the scheduled switchover.

The POST /switchover and POST /failover endpoints are used by patronictl switchover and patronictl failover, respectively. DELETE /switchover is used by patronictl flush <cluster-name> switchover.

Restart endpoint

POST /restart: you can restart Postgres on a specific node by performing the POST /restart call. In the JSON body of the POST request you can optionally specify some restart conditions:

restart_pending: boolean. If set to true, Patroni restarts PostgreSQL only when a restart is pending in order to apply changes in the PostgreSQL configuration.
role: perform the restart only if the current role of the node matches the role from the POST request.
postgres_version: perform the restart only if the current version of Postgres is lower than the one specified in the POST request.
timeout: how long to wait for PostgreSQL to start accepting connections. Overrides master_start_timeout.
schedule: timestamp with time zone; schedules the restart at some point in the future.
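Since every restart condition is optional, a client can build the body from only the conditions it actually wants to apply. The sketch below uses the field names listed above; the restart_body function is hypothetical and the value formats in the demo are illustrative.

```python
from typing import Any, Dict, Optional


def restart_body(restart_pending: Optional[bool] = None,
                 role: Optional[str] = None,
                 postgres_version: Optional[str] = None,
                 timeout: Optional[Any] = None,
                 schedule: Optional[str] = None) -> Dict[str, Any]:
    """Build the JSON body for POST /restart, keeping only given conditions."""
    conditions = {
        "restart_pending": restart_pending,
        "role": role,
        "postgres_version": postgres_version,
        "timeout": timeout,
        "schedule": schedule,
    }
    # Omitted conditions are left out entirely rather than sent as null.
    return {name: value for name, value in conditions.items() if value is not None}


if __name__ == "__main__":
    # Restart only replicas that have a pending restart.
    print(restart_body(restart_pending=True, role="replica"))
```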

DELETE /restart: delete the scheduled restart.

The POST /restart and DELETE /restart endpoints are used by patronictl restart and patronictl flush <cluster-name> restart, respectively.

Reload endpoint

The POST /reload call orders Patroni to re-read and apply the configuration file. This is equivalent to sending the SIGHUP signal to the Patroni process. If you changed Postgres parameters that require a restart (such as shared_buffers), you still have to restart Postgres explicitly, either by calling the POST /restart endpoint or with patronictl restart.

The reload endpoint is used by patronictl reload.

Reinitialize endpoint

POST /reinitialize: reinitialize the PostgreSQL data directory on the specified node. It is allowed to be executed only on replicas. Once called, it removes the data directory and starts pg_basebackup or some alternative replica creation method.

The call might fail if Patroni is in a loop trying to recover (restart) a failed Postgres. To overcome this problem, you can specify {"force":true} in the request body.

The reinitialize endpoint is used by patronictl reinit.
