* Handle retry connect to Postgres when ping return EOF error.
* Update pkg/cluster/database.go
Co-authored-by: Felix Kunde <felix-kunde@gmx.de>
---------
Co-authored-by: Trung Minh Lai <trung.lai@hitachivantara.com>
Co-authored-by: Felix Kunde <felix-kunde@gmx.de>
* add prefix /vol- on when EBS doesn't have
* add new unit test for to get the volumeID
* add a prefix to search in the string of volumeID
---------
Co-authored-by: Jociele Padilha <jociele.padilha@zalando.de>
* bump to v1.9.1
* update year in license and add links to more blog posts
* bump go to 1.19 and update dependencies
* go for 1.10.0 instead of 1.9.1
* fix unit test - removed obsolete ClusterName field
* fix DNS template in UI helm chart deployment file
* bump pooler image
* set pooler pod security context
* use hard coded RunAsUser 100 and RunAsGroup 101 for pooler pod
* unify generation of TLS secret mounts
* extend documentation on tls support
* add unit test for testing TLS support for pooler
* add e2e test for tls support
* fix to pooler TLS support, security context fsGroup added (#2216)
* add environment variable of CA cert path in pooler pod template
* additional logic for custom CA secrets and mount path
* fix ca file name
* Introduce `masterServiceAnnotations` & `replicaServiceAnnotations`
Introduce `masterServiceAnnotations` & `replicaServiceAnnotations` to the `Postgresql` CRD.
`masterServiceAnnotations` overrides `serviceAnnotations` for master role if not empty.
`replicaServiceAnnotations` overrides `serviceAnnotations` for replica role if not empty.
Existing definition of `serviceAnnotations` continue to work for backward compatibitlity when neither `masterServiceAnnotations` nor `replicaServiceAnnotations` is defined.
This closes https://github.com/zalando/postgres-operator/issues/1927
* Accumulate service annotations
First, global config, then ServiceAnnotations overriding, then MasterServiceAnnotations and ReplicaServiceAnnotations.
This addresses
https://github.com/zalando/postgres-operator/pull/2161#discussion_r1063558711.
* Update admin doc with master & replica service annotations overrides
Addressed https://github.com/zalando/postgres-operator/pull/2161#discussion_r1064744086
Co-authored-by: Felix Kunde <felix-kunde@gmx.de>
* add volume with custom TLS config to pooler deployment
* bump pg bouncer image tag which support new feature
Co-authored-by: Jérémie Seguin <jeremie.seguin@malt.com>
* new config options to specify resources for logical backup jobs
* bug in logical backup script for s3 dumps
* define enum for logical_backup_provider
* changed order of logical backup azure options
* fix unit test for stream comparison
* Bumped Spilo image tag to the one that supports PostgreSQL 15. Using CDP version temporarily until non-CDP one is released.
* Added support for PostgreSQL 15 and made it default. 9.5 and 9.6 are now no longer supported
* Bumped spilo image tag to 2.1-p9
* Bumped spilo image in test launcher
Co-authored-by: yoshihiko <ariyoshi10@gmail.com>
Co-authored-by: Felix Kunde <felix-kunde@gmx.de>
* Allow drop slots when it gets deleted from the manifest
* use leader instead replica to query slots
* fix and extend unit tests for config update checks
Co-authored-by: Felix Kunde <felix-kunde@gmx.de>
If you import the zalando postgresql v1 api with kubebuilder, it
complains about the missing tags.
```
❯ make manifests
test -s /.../bin/controller-gen && /.../bin/controller-gen --version | grep -q v0.10.0 || \
GOBIN=/.../bin go install sigs.k8s.io/controller-tools/cmd/controller-gen@v0.10.0
/.../bin/controller-gen rbac:roleName=manager-role crd:allowDangerousTypes=true webhook paths="./..." output:crd:artifacts:config=config/crd/bases
/.../go/pkg/mod/github.com/zalando/postgres-operator@v1.8.2/pkg/apis/acid.zalan.do/v1/postgresql_type.go:116:2: encountered struct field "Everyday" without JSON tag in type "MaintenanceWindow"
/.../go/pkg/mod/github.com/zalando/postgres-operator@v1.8.2/pkg/apis/acid.zalan.do/v1/postgresql_type.go:117:2: encountered struct field "Weekday" without JSON tag in type "MaintenanceWindow"
/.../go/pkg/mod/github.com/zalando/postgres-operator@v1.8.2/pkg/apis/acid.zalan.do/v1/postgresql_type.go:118:2: encountered struct field "StartTime" without JSON tag in type "MaintenanceWindow"
/.../go/pkg/mod/github.com/zalando/postgres-operator@v1.8.2/pkg/apis/acid.zalan.do/v1/postgresql_type.go:119:2: encountered struct field "EndTime" without JSON tag in type "MaintenanceWindow"
Error: not all generators ran successfully
run `controller-gen rbac:roleName=manager-role crd:allowDangerousTypes=true webhook paths=./... output:crd:artifacts:config=config/crd/bases -w` to see all available markers, or `controller-gen rbac:roleName=manager-role crd:allowDangerousTypes=true webhook paths=./... output:crd:artifacts:config=config/crd/bases -h` for usage
make: *** [manifests] Error 1
```
This commit adds support of a not-yet-released Patroni feature that allows postgres to run as primary in case of a failed leader lock update.
* Add Patroni 'failsafe_mode' local parameter (enable for a single PG cluster)
* Allow configuring Patroni 'failsafe_mode' parameter globally
* add toggle to turn off readiness probes
* include PodManagementPolicy and ReadinessProbe in stateful set comparison
* add URI scheme to generated readiness probe
* create streams only after postgres instances were restarted
* checkAndSetGlobalPostgreSQLConfiguration returns if config has been patched
* restart can be pending even without a config patch
* deprecate ClusterName field of Postgresql type
* remove for teamId from operator API endpints /status /logs /history
* update dns_format_string and yaml template in UI
* Use getSwitchoverCandidate instead of masterCandidate when trying to migrating master pod to a replica
Ref: #1983
* Remove unused masterCandidate (replaced by getSwitchoverCandidate)
Ref: #1983
* allow in place pw rotation of system users
* block postgres user from rotation
* mark pooler pods for replacement
* adding podsGetter where pooler is synced in unit tests
* move rotation code in extra function
* removing inner goroutine in cluster.Switchover and resolve race between processPodEvent and unregisterPodSubscriber
* unlock mutex after handling event, now with non-blocking default case
* reverse membership for additional owner roles
* remove type RoleOriginSpilo
* use e2e images with cron_admin inside
* let operator resolve reversed membership
* make additional owner roles part of the sync user strategy
* add more context in the docs about additional_owner_roles
* return err if teams API fails with StatusCode other than 404
* add unit test for 404 at team members
Co-authored-by: Jociele Padilha <jociele.padilha@zalando.de>
Co-authored-by: Felix Kunde <felix-kunde@gmx.de>
* add the possibility to create a standby cluster that streams from a remote primary
* extending unit tests
* add more docs and e2e test
Co-authored-by: machine424 <ayoubmrini424@gmail.com>
* return only warning if team can't be found
Co-authored-by: Jociele Padilha <jociele.padilha@zalando.de>
Co-authored-by: Felix Kunde <felix-kunde@gmx.de>
* do not create endpoints when use config maps
* delete cluster objects with 'leader' suffix
Co-authored-by: Евграфов Александр Александрович <aevgrafov@cmx.ru>
* feat: add ignored annotations when comparing during sync
Co-authored-by: Felix Kunde <felix-kunde@gmx.de>
Co-authored-by: Moshe Immerman <moshe@flanksource.com>
In Go, when a struct field is not set, it becomes a struct with
default values for all fields. These default values are included
during serialization. This causes issues with schema validation
where optional fields cannot be omitted because default values
are considered invalid.
This patch addresses this issue for `Resources` fields on several
types by using a pointer value.
* Add support for pooler load balancer
Signed-off-by: Sergey Shatunov <me@prok.pw>
* Rename to enable_master_pooler_load_balancer
Signed-off-by: Sergey Shatunov <me@prok.pw>
* target port should be intval
* enhance pooler e2e test
* add new options to crds.go
Co-authored-by: Sergey Shatunov <me@prok.pw>
* Add optional logical backup retention time
* Set defaults for potentially unbound variables, so that the script will work with older operator versions
* Document retention time parameter for logical backups
* Add retention time parameter to resources and charts
Co-authored-by: Felix Kunde <felix-kunde@gmx.de>
* provide event stream API
* check manifest settings for logical decoding before creating streams
* operator updates Postgres config and creates replication user
* name FES like the Postgres cluster
* add delete case and fix updating streams + update unit test
* check if fes CRD exists before syncing
* existing slot must use the same plugin
* make id and payload columns configurable
* sync streams only when they are defined in manifest
* introduce applicationId for separate stream CRDs
* add FES to RBAC in chart
* disable streams in chart
* switch to pgoutput plugin and let operator create publications
* reflect code review and additional refactoring
Co-authored-by: Paŭlo Ebermann <paul.ebermann@zalando.de>
* synchronous_node_count support
* notification about Patroni image version
* default synchronous_node_count to 1
Co-authored-by: Felix Kunde <felix-kunde@gmx.de>
* do not recreate pods if previous Patroni API calls fail
* move retry reads against Patroni API to pod.go
* remove final failover check in node affinity test
* make test_min_resource_limits more robust
* password rotation in K8s secrets
* add db connection to syncSecrets
* add user retention
* add e2e test
* cleanup on username mismatch if rotation was switched off
* add unit test for syncSecrets + new updateSecret func
* include tolerations in statefulset comparison
* provide alternative merge behavior of nodeSelectorTerms for node readiness label
* add config option to change affinity merge behavior
* reworked e2e tests around node affinity
* Make CRD registration configurable and drop RBAC permissions when CRD registration is disabled
* add generated deep copy functions
Co-authored-by: Damian Peckett <d.peckett_admin@mgmt.innovo-cloud.de>
* Add support for manual gs_wal_path in standby
* Remove separate standby version configuration
* Remove setting standby path via cluster/uid/version
Picking up the version doesn't work reliably without making changes to
Spilo. It's clearer to just specify the full S3/GS bucket path.
Co-authored-by: Felix Kunde <felix-kunde@gmx.de>
* init error arrays correctly
* avoid nilPointer when syncing connectionPooler
* getInfrastructureRoles should return error
* fix unit tests and return type for getInfrastructureRoles
* restart master first in some edge cases
* edge case is when desired is lower than effective
* wait after config patch and restart on sync whenever we see pending_restart
* convert options to int to check decrease and add unit test
* minor update to e2e tests
* wait only after restart not every sync
* using spilo 14 e2e images
* improve Patroni config sync
* collect new and updated slots to patch patroni
* refactor httpGet in Patroni and extend unit tests
* GetMemberData should call the patroni endpoint
* add PATCH test
* remove role from installLookupFunction and run it on database sync, too
* fix condition to decide on syncing pooler
* trigger lookup from database sync only if pooler is set
* use empty spec everywhere and do not sync if one lookupfunction was passed
* do not sync pooler after being disabled
This commit adds support for using an Azure storage account as a backup
location.
It uses the existing GCS functionality as a reference for what to do,
and follows the example set by GCS as closely as possible.
The decision to name the cloud provider key "aws_or_gcp" is unfortunate
while adding support for Azure, but I have left it alone to allow for
this changeset to be backwards compatible.