enhance docs on clone and restore (#1592)
* enhance docs on clone and restore
* add chapter about upgrading the operator
* add section for standby clusters
* Update docs/administrator.md

Co-authored-by: Alexander Kukushkin <cyberdemn@gmail.com>
parent 1dd0cd9691
commit 7469efac88
docs/administrator.md
@@ -3,6 +3,21 @@
 Learn how to configure and manage the Postgres Operator in your Kubernetes (K8s)
 environment.
 
+## Upgrading the operator
+
+The Postgres Operator is upgraded by changing the Docker image within the
+deployment. Before doing so, it is recommended to check the release notes
+for new configuration options or changed behavior you might want to reflect
+in the ConfigMap or config CRD. For example, a new feature might be introduced
+that is enabled or disabled by default, and you may want to change that with
+the corresponding flag option.
+
+When using Helm, be aware that installing the new chart will not update the
+`Postgresql` and `OperatorConfiguration` CRDs. Make sure to update them
+beforehand with the manifests provided in the `crds` folder. Otherwise, you
+might face errors about new Postgres manifest or configuration options being
+unknown to the CRD schema validation.
+
 ## Minor and major version upgrade
 
 Minor version upgrades for PostgreSQL are handled via updating the Spilo Docker
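For orientation, the upgrade described above boils down to pointing the operator Deployment at a newer image. The following is only a sketch; the deployment name, labels and image tag are assumed examples, not values taken from this commit:

```yaml
# Sketch: operator Deployment with a bumped image tag
# (name, labels and tag are examples; adjust to your installation)
apiVersion: apps/v1
kind: Deployment
metadata:
  name: postgres-operator
spec:
  replicas: 1
  selector:
    matchLabels:
      name: postgres-operator
  template:
    metadata:
      labels:
        name: postgres-operator
    spec:
      serviceAccountName: postgres-operator
      containers:
        - name: postgres-operator
          # bump this tag to upgrade the operator
          image: registry.opensource.zalan.do/acid/postgres-operator:v1.7.0
```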
@@ -157,20 +172,26 @@ from numerous escape characters in the latter log entry, view it in CLI with
 `PodTemplate` used by the operator is yet to be updated with the default values
 used internally in K8s.
 
-The operator also support lazy updates of the Spilo image. That means the pod
-template of a PG cluster's stateful set is updated immediately with the new
-image, but no rolling update follows. This feature saves you a switchover - and
-hence downtime - when you know pods are re-started later anyway, for instance
-due to the node rotation. To force a rolling update, disable this mode by
-setting the `enable_lazy_spilo_upgrade` to `false` in the operator configuration
-and restart the operator pod. With the standard eager rolling updates the
-operator checks during Sync all pods run images specified in their respective
-statefulsets. The operator triggers a rolling upgrade for PG clusters that
-violate this condition.
-
-Changes in $SPILO\_CONFIGURATION under path bootstrap.dcs are ignored when
-StatefulSets are being compared, if there are changes under this path, they are
-applied through rest api interface and following restart of patroni instance
+The StatefulSet is replaced if the following properties change:
+- annotations
+- volumeClaimTemplates
+- template volumes
+
+The StatefulSet is replaced and a rolling update is triggered if the following
+properties differ between the old and new state:
+- container name, ports, image, resources, env, envFrom, securityContext and volumeMounts
+- template labels, annotations, service account, securityContext, affinity, priority class and termination grace period
+
+Note that changes in the `SPILO_CONFIGURATION` env variable under the
+`bootstrap.dcs` path are ignored for the diff. They will be applied through
+Patroni's REST API interface, following a restart of all instances.
+
+The operator also supports lazy updates of the Spilo image. In this case the
+StatefulSet is only updated, but no rolling update follows. This feature saves
+you a switchover - and hence downtime - when you know pods are re-started later
+anyway, for instance due to the node rotation. To force a rolling update,
+disable this mode by setting `enable_lazy_spilo_upgrade` to `false` in the
+operator configuration and restart the operator pod.
 
 ## Delete protection via annotations
 
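As a sketch of the lazy-update toggle mentioned in the last added paragraph, a ConfigMap-based operator configuration could carry the flag as shown below. The ConfigMap name is an assumed example, and the operator pod still needs a restart afterwards:

```yaml
# Sketch: disable lazy Spilo upgrades in a ConfigMap-based operator configuration
# (only the relevant key is shown; the ConfigMap name is an example)
apiVersion: v1
kind: ConfigMap
metadata:
  name: postgres-operator
data:
  enable_lazy_spilo_upgrade: "false"
```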
@@ -667,6 +688,12 @@ if it ends up in your specified WAL backup path:
 envdir "/run/etc/wal-e.d/env" /scripts/postgres_backup.sh "/home/postgres/pgdata/pgroot/data"
 ```
 
+You can also check if Spilo is able to find any backups:
+
+```bash
+envdir "/run/etc/wal-e.d/env" wal-g backup-list
+```
+
 Depending on the cloud storage provider different [environment variables](https://github.com/zalando/spilo/blob/master/ENVIRONMENT.rst)
 have to be set for Spilo. Not all of them are generated automatically by the
 operator by changing its configuration. In this case you have to use an
@@ -734,8 +761,15 @@ WALE_S3_ENDPOINT='https+path://s3.eu-central-1.amazonaws.com:443'
 WALE_S3_PREFIX=$WAL_S3_BUCKET/spilo/{WAL_BUCKET_SCOPE_PREFIX}{SCOPE}{WAL_BUCKET_SCOPE_SUFFIX}/wal/{PGVERSION}
 ```
 
-If the prefix is not specified Spilo will generate it from `WAL_S3_BUCKET`.
-When the `AWS_REGION` is set `AWS_ENDPOINT` and `WALE_S3_ENDPOINT` are
+The operator sets the prefix to an empty string so that Spilo will generate it
+from the configured `WAL_S3_BUCKET`.
+
+:warning: When you overwrite the configuration by defining `WAL_S3_BUCKET` in
+the [pod_environment_configmap](#custom-pod-environment-variables) you have
+to set `WAL_BUCKET_SCOPE_PREFIX = ""`, too. Otherwise Spilo will not find
+the physical backups on restore (next chapter).
+
+When the `AWS_REGION` is set, `AWS_ENDPOINT` and `WALE_S3_ENDPOINT` are
 generated automatically. `WALG_S3_PREFIX` is identical to `WALE_S3_PREFIX`.
 `SCOPE` is the Postgres cluster name.
 
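To illustrate the warning above, a pod environment ConfigMap that overrides the bucket might look roughly like the following. The bucket name is a placeholder, and the ConfigMap name follows the example used further below in this diff:

```yaml
# Sketch: overriding WAL_S3_BUCKET via the pod_environment_configmap
# (bucket name is a placeholder; note the empty scope prefix)
apiVersion: v1
kind: ConfigMap
metadata:
  name: postgres-pod-config
data:
  WAL_S3_BUCKET: "my-custom-wal-bucket"
  WAL_BUCKET_SCOPE_PREFIX: ""
```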
@@ -874,6 +908,36 @@ on one of the other running instances (preferably replicas if they do not lag
 behind). You can test restoring backups by [cloning](user.md#how-to-clone-an-existing-postgresql-cluster)
 clusters.
 
+If you need to provide a [custom clone environment](#custom-pod-environment-variables),
+copy the existing variables about your setup (backup location, prefix, access
+keys etc.) and prepend the `CLONE_` prefix to get them copied to the correct
+directory within Spilo.
+
+```yaml
+apiVersion: v1
+kind: ConfigMap
+metadata:
+  name: postgres-pod-config
+data:
+  AWS_REGION: "eu-west-1"
+  AWS_ACCESS_KEY_ID: "****"
+  AWS_SECRET_ACCESS_KEY: "****"
+  ...
+  CLONE_AWS_REGION: "eu-west-1"
+  CLONE_AWS_ACCESS_KEY_ID: "****"
+  CLONE_AWS_SECRET_ACCESS_KEY: "****"
+  ...
+```
+
+### Standby clusters
+
+The setup for [standby clusters](user.md#setting-up-a-standby-cluster) is very
+similar to cloning. At the moment, the operator only allows streaming from
+the S3 WAL archive of the master specified in the manifest. Like with cloning,
+if you are using [additional environment variables](#custom-pod-environment-variables)
+to access your backup location, you have to copy those variables and prepend the
+`STANDBY_` prefix for Spilo to find the backups and WAL files to stream.
+
 ## Logical backups
 
 The operator can manage K8s cron jobs to run logical backups (SQL dumps) of
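Mirroring the `CLONE_` example in the standby section added above, a standby setup could duplicate the same backup-access variables under the `STANDBY_` prefix. All values and the ConfigMap name are placeholders:

```yaml
# Sketch: the same credentials duplicated with the STANDBY_ prefix so a
# standby cluster can stream from the S3 WAL archive (values are placeholders)
apiVersion: v1
kind: ConfigMap
metadata:
  name: postgres-pod-config
data:
  AWS_REGION: "eu-west-1"
  AWS_ACCESS_KEY_ID: "****"
  AWS_SECRET_ACCESS_KEY: "****"
  STANDBY_AWS_REGION: "eu-west-1"
  STANDBY_AWS_ACCESS_KEY_ID: "****"
  STANDBY_AWS_SECRET_ACCESS_KEY: "****"
```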
docs/user.md
@@ -733,20 +733,21 @@ spec:
     uid: "efd12e58-5786-11e8-b5a7-06148230260c"
     cluster: "acid-batman"
     timestamp: "2017-12-19T12:40:33+01:00"
+    s3_wal_path: "s3://<bucketname>/spilo/<source_db_cluster>/<UID>/wal/<PGVERSION>"
 ```
 
 Here `cluster` is a name of a source cluster that is going to be cloned. A new
 cluster will be cloned from S3, using the latest backup before the `timestamp`.
 Note, that a time zone is required for `timestamp` in the format of +00:00 which
-is UTC. The `uid` field is also mandatory. The operator will use it to find a
-correct key inside an S3 bucket. You can find this field in the metadata of the
-source cluster:
+is UTC. You can specify the `s3_wal_path` of the source cluster or let the
+operator try to find it based on the configured `wal_[s3|gs]_bucket` and the
+specified `uid`. You can find the UID of the source cluster in its metadata:
 
 ```yaml
 apiVersion: acid.zalan.do/v1
 kind: postgresql
 metadata:
-  name: acid-test-cluster
+  name: acid-batman
   uid: efd12e58-5786-11e8-b5a7-06148230260c
 ```
 
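Putting these fields together, a complete clone manifest might look roughly like the sketch below; everything outside the `clone` section (the new cluster name, team, sizes, Postgres version) is an assumed example, not part of the documented change:

```yaml
# Sketch: cloning acid-batman into a new cluster; all values outside the
# clone section are assumed examples
apiVersion: acid.zalan.do/v1
kind: postgresql
metadata:
  name: acid-batman-clone
spec:
  teamId: "acid"
  numberOfInstances: 2
  volume:
    size: 5Gi
  postgresql:
    version: "13"
  clone:
    uid: "efd12e58-5786-11e8-b5a7-06148230260c"
    cluster: "acid-batman"
    timestamp: "2017-12-19T12:40:33+01:00"
```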
@@ -799,7 +800,7 @@ no statefulset will be created.
 ```yaml
 spec:
   standby:
-    s3_wal_path: "s3 bucket path to the master"
+    s3_wal_path: "s3://<bucketname>/spilo/<source_db_cluster>/<UID>/wal/<PGVERSION>"
 ```
 
 At the moment, the operator only allows to stream from the WAL archive of the