be more permissive with standbys (#842)
* be more permissive with standbys
* reflect feedback and updated docs
parent 7b94060d17
commit b997e3682f
				|  | @ -11,11 +11,11 @@ switchover (planned failover) of the master to the Pod with new minor version. | ||||||
| The switch should usually take less than 5 seconds, still clients have to | The switch should usually take less than 5 seconds, still clients have to | ||||||
| reconnect. | reconnect. | ||||||
| 
 | 
 | ||||||
| Major version upgrades are supported via [cloning](user.md#clone-directly). The | Major version upgrades are supported via [cloning](user.md#how-to-clone-an-existing-postgresql-cluster). | ||||||
| new cluster manifest must have a higher `version` string than the source cluster | The new cluster manifest must have a higher `version` string than the source | ||||||
| and will be created from a basebackup. Depending of the cluster size, downtime | cluster and will be created from a basebackup. Depending of the cluster size, | ||||||
| in this case can be significant as writes to the database should be stopped and | downtime in this case can be significant as writes to the database should be | ||||||
| all WAL files should be archived first before cloning is started. | stopped and all WAL files should be archived first before cloning is started. | ||||||
| 
 | 
 | ||||||
| Note, that simply changing the version string in the `postgresql` manifest does | Note, that simply changing the version string in the `postgresql` manifest does | ||||||
| not work at present and leads to errors. Neither Patroni nor Postgres Operator | not work at present and leads to errors. Neither Patroni nor Postgres Operator | ||||||
|  |  | ||||||
|  | @ -110,8 +110,10 @@ Those are top-level keys, containing both leaf keys and groups. | ||||||
| 
 | 
 | ||||||
| * **min_instances** | * **min_instances** | ||||||
|   operator will run at least the number of instances for any given Postgres |   operator will run at least the number of instances for any given Postgres | ||||||
|   cluster equal to the value of this parameter. When `-1` is specified, no |   cluster equal to the value of this parameter. Standby clusters can still run | ||||||
|   limits are applied. The default is `-1`. |   with `numberOfInstances: 1` as this is the [recommended setup](../user.md#setting-up-a-standby-cluster). | ||||||
|  |   When `-1` is specified for `min_instances`, no limits are applied. The default | ||||||
|  |   is `-1`. | ||||||
| 
 | 
 | ||||||
| * **resync_period** | * **resync_period** | ||||||
|   period between consecutive sync requests. The default is `30m`. |   period between consecutive sync requests. The default is `30m`. | ||||||
|  |  | ||||||
134 docs/user.md
							|  | @ -254,29 +254,22 @@ spec: | ||||||
| 
 | 
 | ||||||
| ## How to clone an existing PostgreSQL cluster | ## How to clone an existing PostgreSQL cluster | ||||||
| 
 | 
 | ||||||
| You can spin up a new cluster as a clone of the existing one, using a clone | You can spin up a new cluster as a clone of the existing one, using a `clone` | ||||||
| section in the spec. There are two options here: | section in the spec. There are two options here: | ||||||
| 
 | 
 | ||||||
| * Clone directly from a source cluster using `pg_basebackup` | * Clone from an S3 bucket (recommended) | ||||||
| * Clone from an S3 bucket | * Clone directly from a source cluster | ||||||
| 
 | 
 | ||||||
| ### Clone directly | Note, that cloning can also be used for [major version upgrades](administrator.md#minor-and-major-version-upgrade) | ||||||
| 
 | of PostgreSQL. | ||||||
| ```yaml |  | ||||||
| spec: |  | ||||||
|   clone: |  | ||||||
|     cluster: "acid-batman" |  | ||||||
| ``` |  | ||||||
| 
 |  | ||||||
| Here `cluster` is a name of a source cluster that is going to be cloned. The |  | ||||||
| cluster to clone is assumed to be running and the clone procedure invokes |  | ||||||
| `pg_basebackup` from it. The operator will setup the cluster to be cloned to |  | ||||||
| connect to the service of the source cluster by name (if the cluster is called |  | ||||||
| test, then the connection string will look like host=test port=5432), which |  | ||||||
| means that you can clone only from clusters within the same namespace. |  | ||||||
| 
 | 
 | ||||||
| ### Clone from S3 | ### Clone from S3 | ||||||
| 
 | 
 | ||||||
|  | Cloning from S3 has the advantage that there is no impact on your production | ||||||
|  | database. A new Postgres cluster is created by restoring the data of another | ||||||
|  | source cluster. If you create it in the same Kubernetes environment, use a | ||||||
|  | different name. | ||||||
|  | 
 | ||||||
| ```yaml | ```yaml | ||||||
| spec: | spec: | ||||||
|   clone: |   clone: | ||||||
|  | @ -287,7 +280,8 @@ spec: | ||||||
| 
 | 
 | ||||||
| Here `cluster` is a name of a source cluster that is going to be cloned. A new | Here `cluster` is a name of a source cluster that is going to be cloned. A new | ||||||
| cluster will be cloned from S3, using the latest backup before the `timestamp`. | cluster will be cloned from S3, using the latest backup before the `timestamp`. | ||||||
| In this case, `uid` field is also mandatory - operator will use it to find a | Note, that a time zone is required for `timestamp` in the format of +00:00 which | ||||||
|  | is UTC. The `uid` field is also mandatory. The operator will use it to find a | ||||||
| correct key inside an S3 bucket. You can find this field in the metadata of the | correct key inside an S3 bucket. You can find this field in the metadata of the | ||||||
| source cluster: | source cluster: | ||||||
| 
 | 
 | ||||||
|  | @ -299,9 +293,6 @@ metadata: | ||||||
|   uid: efd12e58-5786-11e8-b5a7-06148230260c |   uid: efd12e58-5786-11e8-b5a7-06148230260c | ||||||
| ``` | ``` | ||||||
| 
 | 
 | ||||||
| Note that timezone is required for `timestamp`. Otherwise, offset is relative |  | ||||||
| to UTC, see [RFC 3339 section 5.6) 3339 section 5.6](https://www.ietf.org/rfc/rfc3339.txt). |  | ||||||
| 
 |  | ||||||
| For non AWS S3 following settings can be set to support cloning from other S3 | For non AWS S3 following settings can be set to support cloning from other S3 | ||||||
| implementations: | implementations: | ||||||
| 
 | 
 | ||||||
|  | @ -317,14 +308,35 @@ spec: | ||||||
|     s3_force_path_style: true |     s3_force_path_style: true | ||||||
| ``` | ``` | ||||||
| 
 | 
 | ||||||
|  | ### Clone directly | ||||||
|  | 
 | ||||||
|  | Another way to get a fresh copy of your source DB cluster is via basebackup. To | ||||||
|  | use this feature simply leave out the timestamp field from the clone section. | ||||||
|  | The operator will connect to the service of the source cluster by name. If the | ||||||
|  | cluster is called test, then the connection string will look like host=test | ||||||
|  | port=5432, which means that you can clone only from clusters within the same | ||||||
|  | namespace. | ||||||
|  | 
 | ||||||
|  | ```yaml | ||||||
|  | spec: | ||||||
|  |   clone: | ||||||
|  |     cluster: "acid-batman" | ||||||
|  | ``` | ||||||
|  | 
 | ||||||
|  | Be aware that on a busy source database this can result in an elevated load! | ||||||
|  | 
 | ||||||
| ## Setting up a standby cluster | ## Setting up a standby cluster | ||||||
| 
 | 
 | ||||||
| Standby clusters are like normal cluster but they are streaming from a remote | Standby cluster is a [Patroni feature](https://github.com/zalando/patroni/blob/master/docs/replica_bootstrap.rst#standby-cluster) | ||||||
| cluster. As the first version of this feature, the only scenario covered by | that first clones a database, and keeps replicating changes afterwards. As the | ||||||
| operator is to stream from a WAL archive of the master. Following the more | replication is happening by the means of archived WAL files (stored on S3 or | ||||||
| popular infrastructure of using Amazon's S3 buckets, it is mentioned as | the equivalent of other cloud providers), the standby cluster can exist in a | ||||||
| `s3_wal_path` here. To start a cluster as standby add the following `standby` | different location than its source database. Unlike cloning, the PostgreSQL | ||||||
| section in the YAML file: | version between source and target cluster has to be the same. | ||||||
|  | 
 | ||||||
|  | To start a cluster as standby, add the following `standby` section in the YAML | ||||||
|  | file and specify the S3 bucket path. An empty path will result in an error and | ||||||
|  | no statefulset will be created. | ||||||
| 
 | 
 | ||||||
| ```yaml | ```yaml | ||||||
| spec: | spec: | ||||||
|  | @ -332,20 +344,62 @@ spec: | ||||||
|     s3_wal_path: "s3 bucket path to the master" |     s3_wal_path: "s3 bucket path to the master" | ||||||
| ``` | ``` | ||||||
| 
 | 
 | ||||||
| Things to note: | At the moment, the operator only allows to stream from the WAL archive of the | ||||||
|  | master. Thus, it is recommended to deploy standby clusters with only [one pod](../manifests/standby-manifest.yaml#L10). | ||||||
|  | You can raise the instance count when detaching. Note, that the same pod role | ||||||
|  | labels like for normal clusters are used: The standby leader is labeled as | ||||||
|  | `master`. | ||||||
| 
 | 
 | ||||||
| - An empty string in the `s3_wal_path` field of the standby cluster will result | ### Providing credentials of source cluster | ||||||
|   in an error and no statefulset will be created. | 
 | ||||||
| - Only one pod can be deployed for stand-by cluster. | A standby cluster is replicating the data (including users and passwords) from | ||||||
| - To manually promote the standby_cluster, use `patronictl` and remove config | the source database and is read-only. The system and application users (like | ||||||
|   entry. | standby, postgres etc.) all have a password that does not match the credentials | ||||||
| - There is no way to transform a non-standby cluster to a standby cluster | stored in secrets which are created by the operator. One solution is to create | ||||||
|   through the operator. Adding the standby section to the manifest of a running | secrets beforehand and paste in the credentials of the source cluster. | ||||||
|   Postgres cluster will have no effect. However, it can be done through Patroni | Otherwise, you will see errors in the Postgres logs saying users cannot log in | ||||||
|   by adding the [standby_cluster](https://github.com/zalando/patroni/blob/bd2c54581abb42a7d3a3da551edf0b8732eefd27/docs/replica_bootstrap.rst#standby-cluster) | and the operator logs will complain about not being able to sync resources. | ||||||
|   section using `patronictl edit-config`. Note that the transformed standby | This, however, can safely be ignored as it will be sorted out once the cluster | ||||||
|   cluster will not be doing any streaming. It will be in standby mode and allow | is detached from the source (and it’s still harmless if you don’t plan to). | ||||||
|   read-only transactions only. | 
 | ||||||
|  | You can also edit the secrets afterwards. Find them by: | ||||||
|  | 
 | ||||||
|  | ```bash | ||||||
|  | kubectl get secrets --all-namespaces | grep <postgres-cluster-name> | ||||||
|  | ``` | ||||||
|  | 
 | ||||||
|  | ### Promote the standby | ||||||
|  | 
 | ||||||
|  | One big advantage of standby clusters is that they can be promoted to a proper | ||||||
|  | database cluster. This means it will stop replicating changes from the source, | ||||||
|  | and start accepting writes itself. This mechanism makes it possible to move | ||||||
|  | databases from one place to another with minimal downtime. Currently, the | ||||||
|  | operator does not support promoting a standby cluster. It has to be done | ||||||
|  | manually using `patronictl edit-config` inside the postgres container of the | ||||||
|  | standby leader pod. Remove the following lines from the YAML structure and the | ||||||
|  | leader promotion happens immediately. Before doing so, make sure that the | ||||||
|  | standby is not behind the source database. | ||||||
|  | 
 | ||||||
|  | ```yaml | ||||||
|  | standby_cluster: | ||||||
|  |   create_replica_methods: | ||||||
|  |     - bootstrap_standby_with_wale | ||||||
|  |     - basebackup_fast_xlog | ||||||
|  |   restore_command: envdir "/home/postgres/etc/wal-e.d/env-standby" /scripts/restore_command.sh | ||||||
|  |      "%f" "%p" | ||||||
|  | ``` | ||||||
|  | 
 | ||||||
|  | Finally, remove the `standby` section from the postgres cluster manifest. | ||||||
|  | 
 | ||||||
|  | ### Turn a normal cluster into a standby | ||||||
|  | 
 | ||||||
|  | There is no way to transform a non-standby cluster to a standby cluster through | ||||||
|  | the operator. Adding the `standby` section to the manifest of a running | ||||||
|  | Postgres cluster will have no effect. But, as explained in the previous | ||||||
|  | paragraph it can be done manually through `patronictl edit-config`. This time, | ||||||
|  | by adding the `standby_cluster` section to the Patroni configuration. However, | ||||||
|  | the transformed standby cluster will not be doing any streaming. It will be in | ||||||
|  | standby mode and allow read-only transactions only. | ||||||
| 
 | 
 | ||||||
| ## Sidecar Support | ## Sidecar Support | ||||||
| 
 | 
 | ||||||
|  |  | ||||||
|  | @ -1048,11 +1048,13 @@ func (c *Cluster) getNumberOfInstances(spec *acidv1.PostgresSpec) int32 { | ||||||
| 	cur := spec.NumberOfInstances | 	cur := spec.NumberOfInstances | ||||||
| 	newcur := cur | 	newcur := cur | ||||||
| 
 | 
 | ||||||
| 	/* Limit the max number of pods to one, if this is standby-cluster */ |  | ||||||
| 	if spec.StandbyCluster != nil { | 	if spec.StandbyCluster != nil { | ||||||
| 		c.logger.Info("Standby cluster can have maximum of 1 pod") | 		if newcur == 1 { | ||||||
| 		min = 1 | 			min = newcur | ||||||
| 		max = 1 | 			max = newcur | ||||||
|  | 		} else { | ||||||
|  | 			c.logger.Warningf("operator only supports standby clusters with 1 pod") | ||||||
|  | 		} | ||||||
| 	} | 	} | ||||||
| 	if max >= 0 && newcur > max { | 	if max >= 0 && newcur > max { | ||||||
| 		newcur = max | 		newcur = max | ||||||
|  |  | ||||||