Merge branch 'master' into bump-v1.7.0

Felix Kunde 2021-08-27 13:02:05 +02:00
commit 568fdb4ff5
8 changed files with 153 additions and 96 deletions

View File

@@ -3,6 +3,21 @@
Learn how to configure and manage the Postgres Operator in your Kubernetes (K8s)
environment.
## Upgrading the operator
The Postgres Operator is upgraded by changing the Docker image within the
deployment. Before doing so, it is recommended to check the release notes
for new configuration options or changed behavior you might want to reflect
in the ConfigMap or config CRD. For example, a new feature might be introduced
that is enabled or disabled by default, and you may want to flip it with the
corresponding flag option.
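On a plain K8s deployment this boils down to bumping the image tag. A minimal
sketch, assuming the operator runs as a deployment named `postgres-operator`
with a container of the same name (names and tag will differ in your setup):
```bash
# assumption: deployment and container are both named "postgres-operator"
kubectl set image deployment/postgres-operator \
  postgres-operator=registry.opensource.zalan.do/acid/postgres-operator:v1.7.0
# wait until the new operator pod is up
kubectl rollout status deployment/postgres-operator
```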
When using Helm, be aware that installing the new chart will not update the
`Postgresql` and `OperatorConfiguration` CRDs. Make sure to update them
beforehand with the provided manifests in the `crds` folder. Otherwise, you
might face errors about new Postgres manifest or configuration options being
unknown to the CRD schema validation.
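A minimal sketch of that order of operations, assuming a checkout of the
operator repository and the default chart layout (paths and release name are
assumptions):
```bash
# update the CRDs first, then the chart
kubectl apply -f charts/postgres-operator/crds/
helm upgrade postgres-operator charts/postgres-operator
```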
## Minor and major version upgrade
Minor version upgrades for PostgreSQL are handled via updating the Spilo Docker
@@ -157,20 +172,26 @@ from numerous escape characters in the latter log entry, view it in CLI with
`PodTemplate` used by the operator is yet to be updated with the default values
used internally in K8s.
The operator also supports lazy updates of the Spilo image. That means the pod
template of a PG cluster's StatefulSet is updated immediately with the new
image, but no rolling update follows. This feature saves you a switchover - and
hence downtime - when you know pods are re-started later anyway, for instance
due to node rotation. To force a rolling update, disable this mode by
setting the `enable_lazy_spilo_upgrade` to `false` in the operator configuration
and restart the operator pod. With the standard eager rolling updates the
operator checks during Sync that all pods run the images specified in their
respective StatefulSets. The operator triggers a rolling upgrade for PG
clusters that violate this condition.
The StatefulSet is replaced if the following properties change:
- annotations
- volumeClaimTemplates
- template volumes
Changes in `$SPILO_CONFIGURATION` under the path `bootstrap.dcs` are ignored
when StatefulSets are compared. If there are changes under this path, they are
applied through the REST API interface, followed by a restart of the Patroni
instance.
The StatefulSet is replaced and a rolling update is triggered if the following
properties differ between the old and new state:
- container name, ports, image, resources, env, envFrom, securityContext and volumeMounts
- template labels, annotations, service account, securityContext, affinity, priority class and termination grace period
Note that changes in the `SPILO_CONFIGURATION` env variable under the
`bootstrap.dcs` path are ignored for the diff. They will be applied through
Patroni's REST API interface, followed by a restart of all instances.
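As a sketch, the dynamic configuration Patroni holds under `bootstrap.dcs` can
be inspected through that same REST API; the pod name below is an assumption:
```bash
# Patroni's REST API listens on port 8008 inside the Spilo container
kubectl exec acid-minimal-cluster-0 -- curl -s http://localhost:8008/config
```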
The operator also supports lazy updates of the Spilo image. In this case the
StatefulSet is only updated, but no rolling update follows. This feature saves
you a switchover - and hence downtime - when you know pods are re-started later
anyway, for instance due to node rotation. To force a rolling update,
disable this mode by setting the `enable_lazy_spilo_upgrade` to `false` in the
operator configuration and restart the operator pod.
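A minimal sketch of those two steps, assuming ConfigMap-based configuration
and default resource names:
```bash
# flip the flag in the operator ConfigMap (name is an assumption)
kubectl patch configmap postgres-operator \
  -p '{"data": {"enable_lazy_spilo_upgrade": "false"}}'
# restart the operator pod to pick up the change
kubectl rollout restart deployment/postgres-operator
```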
## Delete protection via annotations
@@ -667,6 +688,12 @@ if it ends up in your specified WAL backup path:
envdir "/run/etc/wal-e.d/env" /scripts/postgres_backup.sh "/home/postgres/pgdata/pgroot/data"
```
You can also check if Spilo is able to find any backups:
```bash
envdir "/run/etc/wal-e.d/env" wal-g backup-list
```
Depending on the cloud storage provider, different [environment variables](https://github.com/zalando/spilo/blob/master/ENVIRONMENT.rst)
have to be set for Spilo. Not all of them can be generated automatically by
the operator from its configuration. In this case you have to use an
@@ -734,8 +761,15 @@ WALE_S3_ENDPOINT='https+path://s3.eu-central-1.amazonaws.com:443'
WALE_S3_PREFIX=$WAL_S3_BUCKET/spilo/{WAL_BUCKET_SCOPE_PREFIX}{SCOPE}{WAL_BUCKET_SCOPE_SUFFIX}/wal/{PGVERSION}
```
If the prefix is not specified Spilo will generate it from `WAL_S3_BUCKET`.
When the `AWS_REGION` is set `AWS_ENDPOINT` and `WALE_S3_ENDPOINT` are
The operator sets the prefix to an empty string so that Spilo will generate it
from the configured `WAL_S3_BUCKET`.
:warning: When you overwrite the configuration by defining `WAL_S3_BUCKET` in
the [pod_environment_configmap](#custom-pod-environment-variables) you have
to set `WAL_BUCKET_SCOPE_PREFIX = ""`, too. Otherwise Spilo will not find
the physical backups on restore (next chapter).
When the `AWS_REGION` is set, `AWS_ENDPOINT` and `WALE_S3_ENDPOINT` are
generated automatically. `WALG_S3_PREFIX` is identical to `WALE_S3_PREFIX`.
`SCOPE` is the Postgres cluster name.
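To verify which of these variables Spilo actually ended up with, one can
inspect the pod environment; the pod name below is an assumption:
```bash
kubectl exec acid-minimal-cluster-0 -- env | grep -E 'WAL|AWS' | sort
```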
@@ -874,6 +908,36 @@ on one of the other running instances (preferably replicas if they do not lag
behind). You can test restoring backups by [cloning](user.md#how-to-clone-an-existing-postgresql-cluster)
clusters.
If you need to provide a [custom clone environment](#custom-pod-environment-variables),
copy the existing variables describing your setup (backup location, prefix,
access keys etc.) and prepend the `CLONE_` prefix to get them copied to the
correct directory within Spilo.
```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: postgres-pod-config
data:
  AWS_REGION: "eu-west-1"
  AWS_ACCESS_KEY_ID: "****"
  AWS_SECRET_ACCESS_KEY: "****"
  ...
  CLONE_AWS_REGION: "eu-west-1"
  CLONE_AWS_ACCESS_KEY_ID: "****"
  CLONE_AWS_SECRET_ACCESS_KEY: "****"
  ...
```
### Standby clusters
The setup for [standby clusters](user.md#setting-up-a-standby-cluster) is very
similar to cloning. At the moment, the operator only allows for streaming from
the S3 WAL archive of the master specified in the manifest. As with cloning,
if you are using [additional environment variables](#custom-pod-environment-variables)
to access your backup location, you have to copy those variables and prepend
the `STANDBY_` prefix for Spilo to find the backups and WAL files to stream.
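As a sketch, the standby variables could be added to the same pod ConfigMap as
in the clone example above; the names mirror the `CLONE_` case and the values
are placeholders:
```bash
kubectl patch configmap postgres-pod-config --type merge -p '
{"data": {
  "STANDBY_AWS_REGION": "eu-west-1",
  "STANDBY_AWS_ACCESS_KEY_ID": "****",
  "STANDBY_AWS_SECRET_ACCESS_KEY": "****"
}}'
```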
## Logical backups
The operator can manage K8s cron jobs to run logical backups (SQL dumps) of

View File

@@ -733,20 +733,21 @@ spec:
    uid: "efd12e58-5786-11e8-b5a7-06148230260c"
    cluster: "acid-batman"
    timestamp: "2017-12-19T12:40:33+01:00"
    s3_wal_path: "s3://<bucketname>/spilo/<source_db_cluster>/<UID>/wal/<PGVERSION>"
```
Here `cluster` is the name of the source cluster that is going to be cloned. A new
cluster will be cloned from S3, using the latest backup before the `timestamp`.
Note that a time zone is required for `timestamp` in the format of +00:00, which
is UTC. The `uid` field is also mandatory. The operator will use it to find the
correct key inside the S3 bucket. You can find this field in the metadata of the
source cluster:
is UTC. You can specify the `s3_wal_path` of the source cluster or let the
operator try to find it based on the configured `wal_[s3|gs]_bucket` and the
specified `uid`. You can find the UID of the source cluster in its metadata:
```yaml
apiVersion: acid.zalan.do/v1
kind: postgresql
metadata:
  name: acid-test-cluster
  name: acid-batman
  uid: efd12e58-5786-11e8-b5a7-06148230260c
```
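The same UID can also be read straight from the API with a jsonpath query (a
sketch, using the cluster name from the example above):
```bash
kubectl get postgresql acid-batman -o jsonpath='{.metadata.uid}'
```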
@@ -799,7 +800,7 @@ no statefulset will be created.
```yaml
spec:
  standby:
    s3_wal_path: "s3 bucket path to the master"
    s3_wal_path: "s3://<bucketname>/spilo/<source_db_cluster>/<UID>/wal/<PGVERSION>"
```
At the moment, the operator only allows streaming from the WAL archive of the

View File

@@ -1006,9 +1006,9 @@ func (c *Cluster) initSystemUsers() {
// Connection pooler user is an exception; if requested, it's going to be
// created by the operator as a normal pgUser
if needConnectionPooler(&c.Spec) {
// initialize empty connection pooler if not done yet
if c.Spec.ConnectionPooler == nil {
c.Spec.ConnectionPooler = &acidv1.ConnectionPooler{}
connectionPoolerSpec := c.Spec.ConnectionPooler
if connectionPoolerSpec == nil {
connectionPoolerSpec = &acidv1.ConnectionPooler{}
}
// Using superuser as pooler user is not a good idea. First of all it's
@@ -1016,13 +1016,13 @@ func (c *Cluster) initSystemUsers() {
// and second it's a bad practice.
username := c.OpConfig.ConnectionPooler.User
isSuperUser := c.Spec.ConnectionPooler.User == c.OpConfig.SuperUsername
isSuperUser := connectionPoolerSpec.User == c.OpConfig.SuperUsername
isProtectedUser := c.shouldAvoidProtectedOrSystemRole(
c.Spec.ConnectionPooler.User, "connection pool role")
connectionPoolerSpec.User, "connection pool role")
if !isSuperUser && !isProtectedUser {
username = util.Coalesce(
c.Spec.ConnectionPooler.User,
connectionPoolerSpec.User,
c.OpConfig.ConnectionPooler.User)
}

View File

@@ -3,6 +3,7 @@ package cluster
import (
"context"
"fmt"
"reflect"
"strings"
"github.com/r3labs/diff"
@@ -60,7 +61,7 @@ func needMasterConnectionPooler(spec *acidv1.PostgresSpec) bool {
}
func needMasterConnectionPoolerWorker(spec *acidv1.PostgresSpec) bool {
return (nil != spec.EnableConnectionPooler && *spec.EnableConnectionPooler) ||
return (spec.EnableConnectionPooler != nil && *spec.EnableConnectionPooler) ||
(spec.ConnectionPooler != nil && spec.EnableConnectionPooler == nil)
}
@@ -114,7 +115,7 @@ func (c *Cluster) createConnectionPooler(LookupFunction InstallFunction) (SyncRe
c.setProcessName("creating connection pooler")
// this is essentially sync with nil as oldSpec
if reason, err := c.syncConnectionPooler(nil, &c.Postgresql, LookupFunction); err != nil {
if reason, err := c.syncConnectionPooler(&acidv1.Postgresql{}, &c.Postgresql, LookupFunction); err != nil {
return reason, err
}
return reason, nil
@@ -140,11 +141,15 @@
// RESERVE_SIZE is how many additional connections to allow for a pooler.
func (c *Cluster) getConnectionPoolerEnvVars() []v1.EnvVar {
spec := &c.Spec
connectionPoolerSpec := spec.ConnectionPooler
if connectionPoolerSpec == nil {
connectionPoolerSpec = &acidv1.ConnectionPooler{}
}
effectiveMode := util.Coalesce(
spec.ConnectionPooler.Mode,
connectionPoolerSpec.Mode,
c.OpConfig.ConnectionPooler.Mode)
numberOfInstances := spec.ConnectionPooler.NumberOfInstances
numberOfInstances := connectionPoolerSpec.NumberOfInstances
if numberOfInstances == nil {
numberOfInstances = util.CoalesceInt32(
c.OpConfig.ConnectionPooler.NumberOfInstances,
@@ -152,7 +157,7 @@ func (c *Cluster) getConnectionPoolerEnvVars() []v1.EnvVar {
}
effectiveMaxDBConn := util.CoalesceInt32(
spec.ConnectionPooler.MaxDBConnections,
connectionPoolerSpec.MaxDBConnections,
c.OpConfig.ConnectionPooler.MaxDBConnections)
if effectiveMaxDBConn == nil {
@@ -201,17 +206,21 @@ func (c *Cluster) getConnectionPoolerEnvVars() []v1.EnvVar {
func (c *Cluster) generateConnectionPoolerPodTemplate(role PostgresRole) (
*v1.PodTemplateSpec, error) {
spec := &c.Spec
connectionPoolerSpec := spec.ConnectionPooler
if connectionPoolerSpec == nil {
connectionPoolerSpec = &acidv1.ConnectionPooler{}
}
gracePeriod := int64(c.OpConfig.PodTerminateGracePeriod.Seconds())
resources, err := generateResourceRequirements(
spec.ConnectionPooler.Resources,
connectionPoolerSpec.Resources,
makeDefaultConnectionPoolerResources(&c.OpConfig))
effectiveDockerImage := util.Coalesce(
spec.ConnectionPooler.DockerImage,
connectionPoolerSpec.DockerImage,
c.OpConfig.ConnectionPooler.Image)
effectiveSchema := util.Coalesce(
spec.ConnectionPooler.Schema,
connectionPoolerSpec.Schema,
c.OpConfig.ConnectionPooler.Schema)
if err != nil {
@@ -220,7 +229,7 @@ func (c *Cluster) generateConnectionPoolerPodTemplate(role PostgresRole) (
secretSelector := func(key string) *v1.SecretKeySelector {
effectiveUser := util.Coalesce(
spec.ConnectionPooler.User,
connectionPoolerSpec.User,
c.OpConfig.ConnectionPooler.User)
return &v1.SecretKeySelector{
@@ -321,12 +330,13 @@ func (c *Cluster) generateConnectionPoolerDeployment(connectionPooler *Connectio
// default values, initialize it to an empty structure. It could be done
// anywhere, but here is the earliest common entry point between sync and
// create code, so init here.
if spec.ConnectionPooler == nil {
spec.ConnectionPooler = &acidv1.ConnectionPooler{}
connectionPoolerSpec := spec.ConnectionPooler
if connectionPoolerSpec == nil {
connectionPoolerSpec = &acidv1.ConnectionPooler{}
}
podTemplate, err := c.generateConnectionPoolerPodTemplate(connectionPooler.Role)
numberOfInstances := spec.ConnectionPooler.NumberOfInstances
numberOfInstances := connectionPoolerSpec.NumberOfInstances
if numberOfInstances == nil {
numberOfInstances = util.CoalesceInt32(
c.OpConfig.ConnectionPooler.NumberOfInstances,
@@ -371,16 +381,6 @@ func (c *Cluster) generateConnectionPoolerDeployment(connectionPooler *Connectio
func (c *Cluster) generateConnectionPoolerService(connectionPooler *ConnectionPoolerObjects) *v1.Service {
spec := &c.Spec
// there are two ways to enable connection pooler, either to specify a
// connectionPooler section or enableConnectionPooler. In the second case
// spec.connectionPooler will be nil, so to make it easier to calculate
// default values, initialize it to an empty structure. It could be done
// anywhere, but here is the earliest common entry point between sync and
// create code, so init here.
if spec.ConnectionPooler == nil {
spec.ConnectionPooler = &acidv1.ConnectionPooler{}
}
serviceSpec := v1.ServiceSpec{
Ports: []v1.ServicePort{
{
@@ -668,12 +668,14 @@ func makeDefaultConnectionPoolerResources(config *config.Config) acidv1.Resource
func logPoolerEssentials(log *logrus.Entry, oldSpec, newSpec *acidv1.Postgresql) {
var v []string
var input []*bool
newMasterConnectionPoolerEnabled := needMasterConnectionPoolerWorker(&newSpec.Spec)
if oldSpec == nil {
input = []*bool{nil, nil, newSpec.Spec.EnableConnectionPooler, newSpec.Spec.EnableReplicaConnectionPooler}
input = []*bool{nil, nil, &newMasterConnectionPoolerEnabled, newSpec.Spec.EnableReplicaConnectionPooler}
} else {
input = []*bool{oldSpec.Spec.EnableConnectionPooler, oldSpec.Spec.EnableReplicaConnectionPooler, newSpec.Spec.EnableConnectionPooler, newSpec.Spec.EnableReplicaConnectionPooler}
oldMasterConnectionPoolerEnabled := needMasterConnectionPoolerWorker(&oldSpec.Spec)
input = []*bool{&oldMasterConnectionPoolerEnabled, oldSpec.Spec.EnableReplicaConnectionPooler, &newMasterConnectionPoolerEnabled, newSpec.Spec.EnableReplicaConnectionPooler}
}
for _, b := range input {
@@ -684,25 +686,16 @@ func logPoolerEssentials(log *logrus.Entry, oldSpec, newSpec *acidv1.Postgresql)
}
}
log.Debugf("syncing connection pooler from (%v, %v) to (%v, %v)", v[0], v[1], v[2], v[3])
log.Debugf("syncing connection pooler (master, replica) from (%v, %v) to (%v, %v)", v[0], v[1], v[2], v[3])
}
func (c *Cluster) syncConnectionPooler(oldSpec, newSpec *acidv1.Postgresql, LookupFunction InstallFunction) (SyncReason, error) {
var reason SyncReason
var err error
var newNeedConnectionPooler, oldNeedConnectionPooler bool
oldNeedConnectionPooler = false
var connectionPoolerNeeded bool
if oldSpec == nil {
oldSpec = &acidv1.Postgresql{
Spec: acidv1.PostgresSpec{
ConnectionPooler: &acidv1.ConnectionPooler{},
},
}
}
needSync, _ := needSyncConnectionPoolerSpecs(oldSpec.Spec.ConnectionPooler, newSpec.Spec.ConnectionPooler, c.logger)
needSync := !reflect.DeepEqual(oldSpec.Spec.ConnectionPooler, newSpec.Spec.ConnectionPooler)
masterChanges, err := diff.Diff(oldSpec.Spec.EnableConnectionPooler, newSpec.Spec.EnableConnectionPooler)
if err != nil {
c.logger.Error("Error in getting diff of master connection pooler changes")
@@ -712,15 +705,14 @@ func (c *Cluster) syncConnectionPooler(oldSpec, newSpec *acidv1.Postgresql, Look
c.logger.Error("Error in getting diff of replica connection pooler changes")
}
// skip pooler sync only
// 1. if there is no diff in spec, AND
// 2. if connection pooler is already there and is also required as per newSpec
//
// Handling the case when connectionPooler is not there but it is required
// skip pooler sync when there's no diff or it's deactivated
// but, handling the case when connectionPooler is not there but it is required
// as per spec, hence do not skip syncing in that case, even though there
// is no diff in specs
if (!needSync && len(masterChanges) <= 0 && len(replicaChanges) <= 0) &&
(c.ConnectionPooler != nil && (needConnectionPooler(&newSpec.Spec))) {
((!needConnectionPooler(&newSpec.Spec) && (c.ConnectionPooler == nil || !needConnectionPooler(&oldSpec.Spec))) ||
(c.ConnectionPooler != nil && needConnectionPooler(&newSpec.Spec) &&
(c.ConnectionPooler[Master].LookupFunction || c.ConnectionPooler[Replica].LookupFunction))) {
c.logger.Debugln("syncing pooler is not required")
return nil, nil
}
@@ -731,15 +723,9 @@ func (c *Cluster) syncConnectionPooler(oldSpec, newSpec *acidv1.Postgresql, Look
for _, role := range [2]PostgresRole{Master, Replica} {
if role == Master {
newNeedConnectionPooler = needMasterConnectionPoolerWorker(&newSpec.Spec)
if oldSpec != nil {
oldNeedConnectionPooler = needMasterConnectionPoolerWorker(&oldSpec.Spec)
}
connectionPoolerNeeded = needMasterConnectionPoolerWorker(&newSpec.Spec)
} else {
newNeedConnectionPooler = needReplicaConnectionPoolerWorker(&newSpec.Spec)
if oldSpec != nil {
oldNeedConnectionPooler = needReplicaConnectionPoolerWorker(&oldSpec.Spec)
}
connectionPoolerNeeded = needReplicaConnectionPoolerWorker(&newSpec.Spec)
}
// if the call is via createConnectionPooler, then it is required to initialize
@@ -759,24 +745,22 @@ func (c *Cluster) syncConnectionPooler(oldSpec, newSpec *acidv1.Postgresql, Look
}
}
if newNeedConnectionPooler {
if connectionPoolerNeeded {
// Try to sync in any case. If we didn't need the connection pooler before,
// it means we want to create it. If it was already present, still sync
// since it could happen that there is no difference in specs, and all
// the resources are remembered, but the deployment was manually deleted
// in between
// in this case also do not forget to install lookup function as for
// creating cluster
if !oldNeedConnectionPooler || !c.ConnectionPooler[role].LookupFunction {
newConnectionPooler := newSpec.Spec.ConnectionPooler
// in this case also do not forget to install lookup function
if !c.ConnectionPooler[role].LookupFunction {
connectionPooler := c.Spec.ConnectionPooler
specSchema := ""
specUser := ""
if newConnectionPooler != nil {
specSchema = newConnectionPooler.Schema
specUser = newConnectionPooler.User
if connectionPooler != nil {
specSchema = connectionPooler.Schema
specUser = connectionPooler.User
}
schema := util.Coalesce(
@@ -787,9 +771,10 @@ func (c *Cluster) syncConnectionPooler(oldSpec, newSpec *acidv1.Postgresql, Look
specUser,
c.OpConfig.ConnectionPooler.User)
if err = LookupFunction(schema, user, role); err != nil {
if err = LookupFunction(schema, user); err != nil {
return NoSync, err
}
c.ConnectionPooler[role].LookupFunction = true
}
if reason, err = c.syncConnectionPoolerWorker(oldSpec, newSpec, role); err != nil {
@@ -808,8 +793,8 @@ func (c *Cluster) syncConnectionPooler(oldSpec, newSpec *acidv1.Postgresql, Look
}
}
}
if !needMasterConnectionPoolerWorker(&newSpec.Spec) &&
!needReplicaConnectionPoolerWorker(&newSpec.Spec) {
if (needMasterConnectionPoolerWorker(&oldSpec.Spec) || needReplicaConnectionPoolerWorker(&oldSpec.Spec)) &&
!needMasterConnectionPoolerWorker(&newSpec.Spec) && !needReplicaConnectionPoolerWorker(&newSpec.Spec) {
if err = c.deleteConnectionPoolerSecret(); err != nil {
c.logger.Warningf("could not remove connection pooler secret: %v", err)
}
@@ -874,8 +859,6 @@ func (c *Cluster) syncConnectionPoolerWorker(oldSpec, newSpec *acidv1.Postgresql
newConnectionPooler = &acidv1.ConnectionPooler{}
}
c.logger.Infof("old: %+v, new %+v", oldConnectionPooler, newConnectionPooler)
var specSync bool
var specReason []string

View File

@@ -19,7 +19,7 @@ import (
"k8s.io/client-go/kubernetes/fake"
)
func mockInstallLookupFunction(schema string, user string, role PostgresRole) error {
func mockInstallLookupFunction(schema string, user string) error {
return nil
}

View File

@@ -508,7 +508,7 @@ func (c *Cluster) execCreateOrAlterExtension(extName, schemaName, statement, doi
// Creates a connection pool credentials lookup function in every database to
// perform remote authentication.
func (c *Cluster) installLookupFunction(poolerSchema, poolerUser string, role PostgresRole) error {
func (c *Cluster) installLookupFunction(poolerSchema, poolerUser string) error {
var stmtBytes bytes.Buffer
c.logger.Info("Installing lookup function")
@@ -604,8 +604,8 @@ func (c *Cluster) installLookupFunction(poolerSchema, poolerUser string, role Po
c.logger.Infof("pooler lookup function installed into %s", dbname)
}
if len(failedDatabases) == 0 {
c.ConnectionPooler[role].LookupFunction = true
if len(failedDatabases) > 0 {
return fmt.Errorf("could not install pooler lookup function in every specified database")
}
return nil

View File

@@ -758,6 +758,15 @@ func (c *Cluster) syncDatabases() error {
}
}
if len(createDatabases) > 0 {
// trigger creation of pooler objects in new database in syncConnectionPooler
if c.ConnectionPooler != nil {
for _, role := range [2]PostgresRole{Master, Replica} {
c.ConnectionPooler[role].LookupFunction = false
}
}
}
// set default privileges for prepared database
for _, preparedDatabase := range preparedDatabases {
if err := c.initDbConnWithName(preparedDatabase); err != nil {

View File

@@ -72,7 +72,7 @@ type ClusterStatus struct {
type TemplateParams map[string]interface{}
type InstallFunction func(schema string, user string, role PostgresRole) error
type InstallFunction func(schema string, user string) error
type SyncReason []string