be more permissive with standbys
This commit is contained in:
		
							parent
							
								
									7b94060d17
								
							
						
					
					
						commit
						236fbbf2c6
					
				|  | @ -11,11 +11,11 @@ switchover (planned failover) of the master to the Pod with new minor version. | |||
| The switch should usually take less than 5 seconds, still clients have to | ||||
| reconnect. | ||||
| 
 | ||||
| Major version upgrades are supported via [cloning](user.md#clone-directly). The | ||||
| new cluster manifest must have a higher `version` string than the source cluster | ||||
| and will be created from a basebackup. Depending of the cluster size, downtime | ||||
| in this case can be significant as writes to the database should be stopped and | ||||
| all WAL files should be archived first before cloning is started. | ||||
| Major version upgrades are supported via [cloning](user.md#how-to-clone-an-existing-postgresql-cluster). | ||||
| The new cluster manifest must have a higher `version` string than the source | ||||
| cluster and will be created from a basebackup. Depending of the cluster size, | ||||
| downtime in this case can be significant as writes to the database should be | ||||
| stopped and all WAL files should be archived first before cloning is started. | ||||
| 
 | ||||
| Note, that simply changing the version string in the `postgresql` manifest does | ||||
| not work at present and leads to errors. Neither Patroni nor Postgres Operator | ||||
|  |  | |||
|  | @ -110,8 +110,9 @@ Those are top-level keys, containing both leaf keys and groups. | |||
| 
 | ||||
| * **min_instances** | ||||
|   operator will run at least the number of instances for any given Postgres | ||||
|   cluster equal to the value of this parameter. When `-1` is specified, no | ||||
|   limits are applied. The default is `-1`. | ||||
|   cluster equal to the value of this parameter, except for standby clusters | ||||
|   which run with one pod if `numberOfInstances` is set to 1. When `-1` is | ||||
|   specified for `min_instances`, no limits are applied. The default is `-1`. | ||||
| 
 | ||||
| * **resync_period** | ||||
|   period between consecutive sync requests. The default is `30m`. | ||||
|  |  | |||
							
								
								
									
										87
									
								
								docs/user.md
								
								
								
								
							
							
						
						
									
										87
									
								
								docs/user.md
								
								
								
								
							|  | @ -317,14 +317,21 @@ spec: | |||
|     s3_force_path_style: true | ||||
| ``` | ||||
| 
 | ||||
| Note, that cloning can also be used for [major version upgrades](administrator.md#minor-and-major-version-upgrade) | ||||
| of PostgreSQL. | ||||
| 
 | ||||
| ## Setting up a standby cluster | ||||
| 
 | ||||
| Standby clusters are like normal cluster but they are streaming from a remote | ||||
| cluster. As the first version of this feature, the only scenario covered by | ||||
| operator is to stream from a WAL archive of the master. Following the more | ||||
| popular infrastructure of using Amazon's S3 buckets, it is mentioned as | ||||
| `s3_wal_path` here. To start a cluster as standby add the following `standby` | ||||
| section in the YAML file: | ||||
| Standby cluster is a [Patroni feature](https://github.com/zalando/patroni/blob/master/docs/replica_bootstrap.rst#standby-cluster) | ||||
| that first clones a database, and keeps replicating changes afterwards. As the | ||||
| replication is happening by the means of archived WAL files (stored on S3 or | ||||
| the equivalent of other cloud providers), the standby cluster can exist in a | ||||
| different location than its source database. Unlike cloning, the PostgreSQL | ||||
| version between source and target cluster has to be the same. | ||||
| 
 | ||||
| To start a cluster as standby, add the following `standby` section in the YAML | ||||
| file and specify the S3 bucket path. An empty path will result in an error and | ||||
| no statefulset will be created. | ||||
| 
 | ||||
| ```yaml | ||||
| spec: | ||||
|  | @ -332,20 +339,62 @@ spec: | |||
|     s3_wal_path: "s3 bucket path to the master" | ||||
| ``` | ||||
| 
 | ||||
| Things to note: | ||||
| At the moment, the operator only allows to stream from the WAL archive of the | ||||
| master. Thus, it is recommended to deploy standby clusters with only [one pod](../manifests/standby-manifest.yaml#L10). | ||||
| You can raise the instance count when detaching. Note, that the same pod role | ||||
| labels like for normal clusters are used: The standby leader is labeled as | ||||
| `master`. | ||||
| 
 | ||||
| - An empty string in the `s3_wal_path` field of the standby cluster will result | ||||
|   in an error and no statefulset will be created. | ||||
| - Only one pod can be deployed for stand-by cluster. | ||||
| - To manually promote the standby_cluster, use `patronictl` and remove config | ||||
|   entry. | ||||
| - There is no way to transform a non-standby cluster to a standby cluster | ||||
|   through the operator. Adding the standby section to the manifest of a running | ||||
|   Postgres cluster will have no effect. However, it can be done through Patroni | ||||
|   by adding the [standby_cluster](https://github.com/zalando/patroni/blob/bd2c54581abb42a7d3a3da551edf0b8732eefd27/docs/replica_bootstrap.rst#standby-cluster) | ||||
|   section using `patronictl edit-config`. Note that the transformed standby | ||||
|   cluster will not be doing any streaming. It will be in standby mode and allow | ||||
|   read-only transactions only. | ||||
| ### Providing credentials of source cluster | ||||
| 
 | ||||
| A standby cluster is replicating the data (including users and passwords) from | ||||
| the source database and is read-only. The system and application users (like | ||||
| standby, postgres etc.) all have a password that does not match the credentials | ||||
| stored in secrets which are created by the operator. One solution is to create | ||||
| secrets beforehand and paste in the credentials of the source cluster. | ||||
| Otherwise, you will see errors in the Postgres logs saying users cannot log in | ||||
| and the operator logs will complain about not being able to sync resources. | ||||
| This, however, can safely be ignored as it will be sorted out once the cluster | ||||
| is detached from the source (and it’s still harmless if you don’t plan to). | ||||
| 
 | ||||
| You can also edit the secrets afterwards. Find them by: | ||||
| 
 | ||||
| ```bash | ||||
| kubectl get secrets --all-namespaces | grep <postgres-cluster-name> | ||||
| ``` | ||||
| 
 | ||||
| ### Promote the standby | ||||
| 
 | ||||
| One big advantage of standby clusters is that they can be promoted to a proper | ||||
| database cluster. This means it will stop replicating changes from the source, | ||||
| and start accept writes itself. This mechanism makes it possible to move | ||||
| databases from one place to another with minimal downtime. Currently, the | ||||
| operator does not support promoting a standby cluster. It has to be done | ||||
| manually using `patronictl edit-config` inside the postgres container of the | ||||
| standby leader pod. Remove the following lines from the YAML structure and the | ||||
| leader promotion happens immediately. Before doing so, make sure that the | ||||
| standby is not behind the source database. | ||||
| 
 | ||||
| ```yaml | ||||
| standby_cluster: | ||||
|   create_replica_methods: | ||||
|     - bootstrap_standby_with_wale | ||||
|     - basebackup_fast_xlog | ||||
|   restore_command: envdir "/home/postgres/etc/wal-e.d/env-standby" /scripts/restore_command.sh | ||||
|      "%f" "%p" | ||||
| ``` | ||||
| 
 | ||||
| Finally, remove the `standby` section from the postgres cluster manifest. | ||||
| 
 | ||||
| ### Turn a normal cluster into a standby | ||||
| 
 | ||||
| There is no way to transform a non-standby cluster to a standby cluster through | ||||
| the operator. Adding the `standby` section to the manifest of a running | ||||
| Postgres cluster will have no effect. But, as explained in the previous | ||||
| paragraph it can be done manually through `patronictl edit-config`. This time, | ||||
| by adding the `standby_cluster` section to the Patroni configuration. However, | ||||
| the transformed standby cluster will not be doing any streaming. It will be in | ||||
| standby mode and allow read-only transactions only. | ||||
| 
 | ||||
| ## Sidecar Support | ||||
| 
 | ||||
|  |  | |||
|  | @ -1048,11 +1048,13 @@ func (c *Cluster) getNumberOfInstances(spec *acidv1.PostgresSpec) int32 { | |||
| 	cur := spec.NumberOfInstances | ||||
| 	newcur := cur | ||||
| 
 | ||||
| 	/* Limit the max number of pods to one, if this is standby-cluster */ | ||||
| 	if spec.StandbyCluster != nil { | ||||
| 		c.logger.Info("Standby cluster can have maximum of 1 pod") | ||||
| 		min = 1 | ||||
| 		max = 1 | ||||
| 		if newcur > 1 { | ||||
| 			c.logger.Warningf("operator only supports standby clusters with 1 pod") | ||||
| 		} else { | ||||
| 			min = newcur | ||||
| 			max = newcur | ||||
| 		} | ||||
| 	} | ||||
| 	if max >= 0 && newcur > max { | ||||
| 		newcur = max | ||||
|  |  | |||
		Loading…
	
		Reference in New Issue