Commit Graph

626 Commits

Author SHA1 Message Date
Felix Kunde 886cb86797
allow users to opt out from globally enabled secret rotation (#2528)
* allow users to opt out from globally enabled secret rotation
* cover new option also in e2e test
* change ignore test to existing user
2024-02-09 12:19:06 +01:00
Felix Kunde 29ea863faf
allow empty resources when defaults are empty (#2524)
* allow empty resources when defaults are empty
* update codegen
* add more unit tests and remove internal resources defaults
* a unit test for min limit and raising to request
* uncomment defaults in example configmap
* simplifying pooler pod generation unit test
2024-02-09 07:35:53 +01:00
Felix Kunde bf5db676b1
replace deprecated ioutil (#2531)
* replace deprecated ioutil
* replace ioutil also in kubectl plugin
2024-02-05 11:58:36 +01:00
Chris Boot 8f3139965c
fix: no switchover candidate found with member state "streaming" (#1992) (#2515)
* fix: no switchover candidate found with member state "streaming" (#1992)
* Add test
* Also handle "in archive recovery" state
2024-01-24 10:40:58 +01:00
Felix Kunde 4a0c483514
add unit test and documentation for finalizers (#2509)
* add unit test and documentation for finalizers
* error msg with lower case and cover sync case
* try to avoid adding json-patch dependency
* use Update to remove finalizer
* changing status and finalizer during create
* do not call Delete() twice
2024-01-22 12:13:40 +01:00
Felix Kunde 3bad9aaded
fix when syncing standby description (#2513) 2024-01-12 10:41:17 +01:00
Felix Kunde 39f426d56f
hugepages empty on default and updated date in codegen files (#2512)
* hugepages empty on default using string pointer and updated date of codegen files
2024-01-12 09:25:51 +01:00
Christian Rohmann 743aade45f
Use finalizers to avoid losing delete events and to ensure full resource cleanup (#941)
* Add Finalizer functions to Cluster; add/remove finalizer on Create/Delete events
* Check if clusters have a deletion timestamp and we missed that event. Run Delete() and remove finalizer when done.
* Fix nil handling when using Service from map; Remove Service, Endpoint entries from their maps - just like with Secrets
* Add handling of ResourceNotFound to all delete functions (Service, Endpoint, LogicalBackup CronJob, PDB and Secret) - this is not a real error when deleting things
* Emit events when there are issues deleting resources so the user is informed
* Depend the removal of the Finalizer on all resources being deleted successfully first. Otherwise the next sync run should let us try again
* Add config option to enable finalizers
* Removed dangling whitespace at EOL
* config.EnableFinalizers is a bool pointer

---------

Co-authored-by: Felix Kunde <felix-kunde@gmx.de>
2024-01-04 16:22:53 +01:00
Silas 9581ba969b
Add hugepages 2Mi and 1Gi fields to ResourceDescription and pass them to the statefulset (#2311)
* Add hugepages-2Mi and 1Gi to ResourceDescription type and crd (#1549, #1788)
* Add tests for hugepages resource requests/limits
* Add tests for hugepages resource requests/limits on sidecars, too
* Add docs for hugepages support
* Add link to kubernetes docs on hugepages
* Add tests for hugepages not being set on container if not requested in custom resource
* Add hugepages resources fields to manifest docs
* Add hugepages resources fields to complete manifest example
* Add hugepages resources fields to chart crd

---------

Co-authored-by: Felix Kunde <felix-kunde@gmx.de>
2024-01-04 15:59:27 +01:00
Davide Bizzarri 3ca26d0dc8
Make PodDisruptionBudget master label selector optional (#2364)
* Make PDB master label selector optional

* Update pkg/apis/acid.zalan.do/v1/crds.go

---------

Co-authored-by: Felix Kunde <felix-kunde@gmx.de>
2024-01-04 15:58:24 +01:00
andrejshapal 0367a07ba8
Populating crd labels and annotations in logical backup job pod manifest (#2456)
* Adding custom pod labels to logical backup job
* Adding custom annotations to logical backup job pod
* Adding job InheritedAnnotations and InheritedLabel tests
2024-01-04 14:03:16 +01:00
Felix Kunde dad5b132ec
Standby cluster promotion by changing manifest (#2472)
* Standby cluster promotion by changing manifest
* Updated the documentation

---------

Co-authored-by: Senthilnathan M <snathanm@vmware.com>
2024-01-04 12:33:50 +01:00
Stef Graces bbba15f9bf
Logical backup secret (#2051)
* Add logical backup secret
2024-01-04 11:09:16 +01:00
seeker 7ceedead35
Fix VolumeClaimTemplates index out of range problem (#2493)
when the desired statefulset has a different number of volume claim templates than the current cluster, the operator panics because of an index out of range
2024-01-04 11:05:15 +01:00
Felix Kunde 39fcf2e6b9
remove Users section from Patroni Bootstrap (#2490) 2024-01-03 16:47:21 +01:00
Felix Kunde 9ee14f26cb
let isSystemUsername check all system users (#2489)
* let isSystemUsername check all system users
* extend robot user unit test
* reset system users for initSystemUser test
2023-12-08 15:21:56 +01:00
Felix Kunde e03fdaaa51
add support for recovery section in event streams (#2421) 2023-09-19 17:15:50 +02:00
Ida Novindasari 36389b27bc
Enable specifying PVC retention policy for auto deletion (#2343)
* Enable specifying PVC retention policy for auto deletion
* enable StatefulSetAutoDeletePVC in featureGates
* skip node affinity test
2023-09-08 13:17:37 +02:00
Felix Kunde 552bd26c0f
bump to v1.10.1 (#2410)
* bump to v1.10.1
2023-09-07 22:46:26 +02:00
Thinking Chen 781d17b85c
Add service account name in connection pooler (#2352) 2023-09-04 16:26:21 +02:00
Trung Minh Lai 28c27efe43
Handle retry connect to Postgres when ping returns EOF error. (#2339)
* Handle retry connect to Postgres when ping returns EOF error.
* Update pkg/cluster/database.go

Co-authored-by: Felix Kunde <felix-kunde@gmx.de>

---------

Co-authored-by: Trung Minh Lai <trung.lai@hitachivantara.com>
Co-authored-by: Felix Kunde <felix-kunde@gmx.de>
2023-08-25 17:53:18 +02:00
Felix Kunde 334ceab18e
log error why ebs volume modifying fails (#2395) 2023-08-18 13:18:35 +02:00
Jociele Padilha 04f18b9716
fix extraction of EBS volume id when there's no region prefix (#2351)
* add prefix /vol- on when EBS doesn't have
* add new unit test for getting the volumeID
* add a prefix to search in the string of volumeID

---------

Co-authored-by: Jociele Padilha <jociele.padilha@zalando.de>
2023-06-12 15:18:19 +02:00
Felix Kunde c580e509d3
Bump v1.10.0 (#2299)
* bump to v1.9.1
* update year in license and add links to more blog posts
* bump go to 1.19 and update dependencies
* go for 1.10.0 instead of 1.9.1
* fix unit test - removed obsolete ClusterName field
* fix DNS template in UI helm chart deployment file
2023-04-20 18:21:43 +02:00
drivebyer 1e64ae788e
Fix some errors being ignored (#2290)
Signed-off-by: drivebyer <yang.wu@daocloud.io>
2023-04-17 17:25:07 +02:00
Felix Kunde 0e7beb5fe5
refactor pooler tls support and set pooler pod security context (#2255)
* bump pooler image
* set pooler pod security context
* use hard coded RunAsUser 100 and RunAsGroup 101 for pooler pod
* unify generation of TLS secret mounts
* extend documentation on tls support
* add unit test for testing TLS support for pooler
* add e2e test for tls support
2023-04-17 11:38:56 +02:00
genofire 40db1f6782
fix: make map in generateUserSecrets with correct size (#2273) 2023-04-11 11:55:28 +02:00
Felix Kunde 1105228d3a
in sync mode select only syncStandby as switchover candidate (#2278)
* in sync mode select only syncStandby as switchover candidate
* do not exit retry with err
* unit test: use error from reading byte stream twice
2023-04-06 12:04:55 +02:00
Felix Kunde 80fee5bda4
continue syncing databases and extensions on err (#2262) 2023-03-14 10:58:54 +01:00
Pavel Ven Gulbin 6953f72bee
fix to pooler TLS support (#2219)
* fix to pooler TLS support, security context fsGroup added (#2216)
* add environment variable of CA cert path in pooler pod template
* additional logic for custom CA secrets and mount path
* fix ca file name
2023-03-07 16:20:28 +01:00
Felix Kunde 9973262b83
sync stateful set when syncing streams during ADD event (#2245) 2023-02-28 09:14:22 +01:00
Felix Kunde 645fcc01a2
remove debug log for generated env vars of logical backup (#2233) 2023-02-23 15:16:16 +01:00
Felix Kunde e6fb57a6bd
add c.replicationSlots on sync (#2238) 2023-02-23 13:19:35 +01:00
Felix Kunde 1d5bc2396a
minor fix to pooler TLS support (#2216) 2023-02-10 17:20:59 +01:00
Felix Kunde 7a90fbcb00
fix sync of stream slots (#2194) 2023-01-27 18:03:37 +01:00
Felix Kunde c9cada66c7
add pooler suffix to DNS annotation of pooler LoadBalancer service (#2188)
* add pooler suffix to DNS annotation of pooler LoadBalancer service
* need generatePoolerServiceAnnotations function
2023-01-27 12:07:48 +01:00
Felix Kunde 7887ebbbce
set wal_level config not on empty parameters map (#2189)
* set wal_level config not on empty parameters map
* UPDATE event must trigger statefulSet sync when streams are added
2023-01-26 09:43:03 +01:00
Felix Kunde b9165190e1
set wal_level for streams in statefulSet sync (#2187)
* set wal_level for streams in statefulSet sync
2023-01-25 17:06:31 +01:00
Felix Kunde 4741b3f734
copy rolconfig during password rotation (#2183)
* copy rolconfig during password rotation

Co-authored-by: idanovinda <idanovinda@gmail.com>
2023-01-25 10:48:23 +01:00
Felix Kunde a4f95e97e0
do not rotate secrets for standby clusters (#2175) 2023-01-17 12:58:14 +01:00
Felix Kunde 28cd2f188a
better backwards compatibility with old DNS name format for LBs (#2171)
* better backwards compatibility with legacy DNS name format for LBs
* improve docs on DNS string
2023-01-17 10:06:11 +01:00
Dmitry Volodin ce1fee8586
Ineffectual assignment of the envVars for connection pooler (#2165)
* Ineffectual assignment of the envVars for connection pooler
* Fixed codegen in case of the GOPATH is specified explicitly
2023-01-12 11:38:54 +01:00
Owen Ou 021ab07a23
Introduce `masterServiceAnnotations` & `replicaServiceAnnotations` (#2161)
* Introduce `masterServiceAnnotations` & `replicaServiceAnnotations`

Introduce `masterServiceAnnotations` & `replicaServiceAnnotations` to the `Postgresql` CRD.
`masterServiceAnnotations` overrides `serviceAnnotations` for master role if not empty.
`replicaServiceAnnotations` overrides `serviceAnnotations` for replica role if not empty.
Existing definition of `serviceAnnotations` continues to work for backward compatibility when neither `masterServiceAnnotations` nor `replicaServiceAnnotations` is defined.

This closes https://github.com/zalando/postgres-operator/issues/1927

* Accumulate service annotations

First, global config, then ServiceAnnotations overriding, then MasterServiceAnnotations and ReplicaServiceAnnotations.

This addresses
https://github.com/zalando/postgres-operator/pull/2161#discussion_r1063558711.

* Update admin doc with master & replica service annotations overrides

Addressed https://github.com/zalando/postgres-operator/pull/2161#discussion_r1064744086

Co-authored-by: Felix Kunde <felix-kunde@gmx.de>
2023-01-11 13:29:16 +01:00
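
The precedence described in the commit above (global config first, then `serviceAnnotations`, then the role-specific `masterServiceAnnotations` or `replicaServiceAnnotations`) can be illustrated with a minimal sketch. This is not the operator's actual Go code; the function name `merge_service_annotations` and its parameters are hypothetical and only show the layering order, assuming later layers override earlier ones key by key.

```python
from typing import Dict, Optional

Annotations = Optional[Dict[str, str]]


def merge_service_annotations(
    global_annotations: Annotations,
    service_annotations: Annotations,
    role_annotations: Annotations,
) -> Dict[str, str]:
    """Accumulate annotations; later layers win on key conflicts.

    Hypothetical helper: layer order is global config, then the cluster's
    serviceAnnotations, then the role-specific annotations
    (masterServiceAnnotations or replicaServiceAnnotations).
    """
    merged: Dict[str, str] = {}
    for layer in (global_annotations, service_annotations, role_annotations):
        # A missing or empty layer contributes nothing and overrides nothing.
        merged.update(layer or {})
    return merged


if __name__ == "__main__":
    master = merge_service_annotations(
        {"team": "global", "tier": "global"},
        {"tier": "cluster", "dns": "cluster"},
        {"dns": "master"},
    )
    print(master)
```

When no role-specific annotations are given, the result is just the global config overlaid with `serviceAnnotations`, which matches the backward-compatible behaviour the commit message describes.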
jeremie-seguin 3139c1f3d0
Add Support for Custom TLS Certificates in Connection Pooler (#2146)
* add volume with custom TLS config to pooler deployment
* bump pg bouncer image tag which supports the new feature

Co-authored-by: Jérémie Seguin <jeremie.seguin@malt.com>
2023-01-09 17:16:00 +01:00
Felix Kunde 29cec0ceda
configurable resources for logical backup pod template (#710)
* new config options to specify resources for logical backup jobs
* bug in logical backup script for s3 dumps
* define enum for logical_backup_provider
* changed order of logical backup azure options
* fix unit test for stream comparison
2023-01-05 15:19:36 +01:00
Stef Graces bb2617a53f
Add logical backup for azure (#2052)
* Add logical backup for azure
2023-01-05 12:16:41 +01:00
Felix Kunde c756cb2f8a
spec.env can override clone and standby variables (#2159) 2023-01-05 12:02:19 +01:00
yoshihikoueno becf8a4715
Bump spilo and target version for PostgreSQL 15 (#2139)
* Bumped Spilo image tag to the one that supports PostgreSQL 15. Using CDP version temporarily until non-CDP one is released.
* Added support for PostgreSQL 15 and made it default. 9.5 and 9.6 are now no longer supported
* Bumped spilo image tag to 2.1-p9
* Bumped spilo image in test launcher

Co-authored-by: yoshihiko <ariyoshi10@gmail.com>
Co-authored-by: Felix Kunde <felix-kunde@gmx.de>
2023-01-04 12:01:30 +01:00
idanovinda 486d5d66e0
Allow drop slots when it gets deleted from the manifest (#2089)
* Allow drop slots when it gets deleted from the manifest
* use leader instead of replica to query slots
* fix and extend unit tests for config update checks

Co-authored-by: Felix Kunde <felix-kunde@gmx.de>
2023-01-03 15:46:59 +01:00
Felix Kunde 819e410959
refactor podAffinity generation (#2156) 2023-01-03 11:34:02 +01:00
Felix Kunde d7e1fb57f1
polish global config about sharing postgresql-run socket (#2155)
* polish global config about sharing postgresql-run socket
2023-01-02 18:28:48 +01:00
Francois Parquet be7b52db92
add preferred during scheduling pod anti affinity (#2048)
* add preferred during scheduling pod anti affinity

Co-authored-by: Felix Kunde <felix-kunde@gmx.de>
2023-01-02 18:22:47 +01:00
Dmitry Volodin 93a253bde1
Bump k8s api to v0.23.5 for operator v1.9.0 (#1854)
* Bump k8s api to v0.23.5 for operator v1.8.0
* Update k8s version in Makefile
2023-01-02 16:00:17 +01:00
Christian Rohmann 024aab1f13
Add config switch to share pg_socket in /var/run/postgresql via an emptyDir with the sidecar containers (#962) 2023-01-02 12:57:36 +01:00
Felix Kunde 4534a4cd9e
fix syncing of stream CRDs (#2152)
* fix syncing of stream CRDs and improve corresponding unit tests
2022-12-30 13:09:15 +01:00
Felix Kunde c1657ec484
stream resource name must be lower case (#2149) 2022-12-29 16:55:18 +01:00
Felix Kunde e80cccb93b
use random short name for stream CRDs (#2137)
* use random short name for stream CRDs
2022-12-27 16:52:01 +01:00
Felix Kunde 3e148ea57e
enable operator support for pg15 and drop support for 9.5 and 9.6 (#2140)
* enable operator support for pg15 and drop support for 9.5 and 9.6
* not offer 15 in UI before spilo-15 is available
2022-12-15 12:17:27 +01:00
Felix Kunde 0bef3b325f
fix migration of single-node clusters (#2134) 2022-12-09 12:42:10 +01:00
Polina Bungina 4d585250db
Add Patroni failsafe_mode parameter (#2076)
This commit adds support of a not-yet-released Patroni feature that allows postgres to run as primary in case of a failed leader lock update.
* Add Patroni 'failsafe_mode' local parameter (enable for a single PG cluster)
* Allow configuring Patroni 'failsafe_mode' parameter globally
2022-12-02 13:33:02 +01:00
Felix Kunde 1d44dd4694
delete secret resource from map (#2123) 2022-12-02 10:09:19 +01:00
Felix Kunde 528bb81a78
first sync wal_level then publications (#2109) 2022-11-28 16:37:34 +01:00
Felix Kunde 529cdfc0b6
skip slots where publication sync failed (#2091) 2022-10-25 14:51:29 +02:00
Felix Kunde 70f3ee8e36
skip db sync on failed initUsers during UPDATE (#2083)
* skip db sync on failed initUsers during UPDATE
* provide unit test for teams API being unavailable
* add test for 404 case
2022-10-21 17:50:14 +02:00
Felix Kunde d55e74e1e7
create publication before creating logical replication slot (#2085) 2022-10-21 14:31:13 +02:00
machine424 640581fb46
Fix the Operator rolling update statefulsets unnecessarily when Kube API is down (#2031) (#2064) 2022-10-18 10:55:25 +02:00
Polina Bungina acb3ffd702
Fix upgrade command (#2075)
Upgrade run using postgres user used to be broken by an invalid option.
Move -o pipefail directly to upgradeCommand.
+ minor formatting corrections
2022-10-17 17:36:02 +02:00
Dmitry Volodin a85023ff10
Cluster env variables should be reflected for StatefulSet update (#2045)
* Cluster env variables should be reflected for StatefulSet update
* Add unit test for comparing StatefulSet's
2022-10-13 13:54:58 +02:00
Felix Kunde 4786f53f03
Fix password rotation (#2043)
* fix password rotation
* test connection with rotation user in e2e test + minor changes
2022-10-13 11:33:26 +02:00
Felix Kunde ce8b009c66
fix team member deprecation (#2072) 2022-10-11 18:02:41 +02:00
Philipp B 84fe38a069
switch to batch API v1 for Jobs (#2066) 2022-10-07 11:27:58 +02:00
Felix Kunde 2aa52094db
switch to policy API v1 for PDBs (#2008)
* switch to policy API v1 for PDBs
* update e2e test dependencies
* use kind 0.14.0
* bump K8s client in e2e docker image
* bump e2e tests-runner
2022-10-06 09:43:17 +02:00
Felix Kunde a119772efb
add toggle to turn off readiness probes (#2004)
* add toggle to turn off readiness probes
* include PodManagementPolicy and ReadinessProbe in stateful set comparison
* add URI scheme to generated readiness probe
2022-10-05 18:25:24 +02:00
Jan Mussler b48034d762
Fix major version upgrade return code (#2056)
Fix major version upgrade return code
2022-09-21 15:25:24 +02:00
Felix Kunde e0c4603057
create streams only after postgres instances were restarted (#2034)
* create streams only after postgres instances were restarted
* checkAndSetGlobalPostgreSQLConfiguration returns if config has been patched
* restart can be pending even without a config patch
2022-09-19 15:25:55 +02:00
Felix Kunde d209612b18
use correct keys in updateSecret (#2029) 2022-09-01 10:58:42 +02:00
Felix Kunde 4c07494ac7
deprecate ClusterName field of Postgresql type and remove team from REST endpoints (#2015)
* deprecate ClusterName field of Postgresql type
* remove teamId from operator API endpoints /status /logs /history
* update dns_format_string and yaml template in UI
2022-08-29 15:00:25 +02:00
Felix Kunde 89375186b3
use old LB DNS format when teamId prefix is disabled (#2011)
* use old LB DNS format when teamId prefix is disabled
* support both old and new format in external-dns
* switch dns template from team to namespace
2022-08-25 18:29:54 +02:00
Felix Kunde 21d00e2ed7
rework map selection in updateSecret (#2010) 2022-08-24 17:33:39 +02:00
Felix Kunde ef324494a0
fetch pooler and fes_user system user only when corresponding features are used (#2009)
* fetch pooler and fes_user system user only when corresponding features are used
* cover error case in unit test
* use string formatting instead of +
2022-08-24 16:28:49 +02:00
Felix Kunde 3bfd63cbe6
Make teamId in cluster name optional (#2001)
* making teamId in clustername optional
* move teamId check to addCluster function
2022-08-24 10:12:50 +02:00
JBWatenbergScality b91b69c736
BugFix: Switchover (during a Node drain) fails randomly in synchronous mode (#1984)
* Use getSwitchoverCandidate instead of masterCandidate when trying to migrate the master pod to a replica
Ref: #1983

* Remove unused masterCandidate (replaced by getSwitchoverCandidate)
Ref: #1983
2022-08-19 15:14:53 +02:00
Felix Kunde b2642fa2fc
allow in place pw rotation of system users (#1953)
* allow in place pw rotation of system users
* block postgres user from rotation
* mark pooler pods for replacement
* adding podsGetter where pooler is synced in unit tests
* move rotation code in extra function
2022-08-18 14:14:31 +02:00
Felix Kunde 88a2931550
bump pooler image to use new alpine base image (#1985)
* bump pooler image to use new alpine base image
* use a safe default for PGHOST pooler env variable
2022-08-08 17:36:43 +02:00
Jociele Padilha b41daf4f76
Set maximum CPU and Memory requests on K8s (#1959)
* Set maximum CPU and Memory requests on K8s
2022-07-28 14:18:27 +02:00
Felix Kunde 5e4badd99c
annotation to bypass globally configured instance limits (#1943) 2022-06-30 10:40:03 +02:00
Felix Kunde 7d4da92872
bring back CLONE_WAL_BUCKET_SCOPE_PREFIX (#1902) 2022-05-24 16:27:34 +02:00
Felix Kunde 97be5ee1cb
use uint64 for replication lag from Patroni's member endpoint (#1893)
* use int64 for replication lag from Patroni's member endpoint
2022-05-19 09:39:56 +02:00
Felix Kunde 268a86a045
removing inner goroutine in cluster.Switchover (#1876)
* removing inner goroutine in cluster.Switchover and resolve race between processPodEvent and unregisterPodSubscriber
* unlock mutex after handling event, now with non-blocking default case
2022-05-17 18:10:39 +02:00
Felix Kunde c6f2c68588
ignore case when checking for envVar existence but do not change it (#1889) 2022-05-12 11:59:05 +02:00
Felix Kunde a77d5df158
reverse membership for additional owner roles (#1862)
* reverse membership for additional owner roles
* remove type RoleOriginSpilo
* use e2e images with cron_admin inside
* let operator resolve reversed membership
* make additional owner roles part of the sync user strategy
* add more context in the docs about additional_owner_roles
2022-04-28 11:15:40 +02:00
Felix Kunde 8b6664f1a2
fix container ports (#1864) 2022-04-21 18:52:53 +02:00
Felix Kunde 532772c5cd
do not call EBS api when there are no pvs (#1851)
* do not call EBS api when there are no pvs
* no extra aws api call in executeEBSMigration, operate on fetched cluster.EBSVolumes
2022-04-20 12:12:02 +02:00
Felix Kunde eecd13169c
refactor spilo env var generation (#1848)
* refactor spilo env generation
* enhance docs on env vars
* add unit test for appendEnvVar
2022-04-14 11:47:33 +02:00
Jociele Padilha 483bf624ee
add test team member (#1842)
* return err if teams API fails with StatusCode other than 404
* add unit test for 404 at team members

Co-authored-by: Jociele Padilha <jociele.padilha@zalando.de>
Co-authored-by: Felix Kunde <felix-kunde@gmx.de>
2022-04-14 10:02:54 +02:00
Dmitry Volodin 9bcb25ac7e
Ability to set pod environment variables on cluster resource (#1794)
* Ability to set pod environment variables on cluster resource

Co-authored-by: Felix Kunde <felix-kunde@gmx.de>
2022-04-11 10:16:35 +02:00
Felix Kunde 0dc370f15d
standby cluster that streams from a remote primary (#1830)
* add the possibility to create a standby cluster that streams from a remote primary
* extending unit tests
* add more docs and e2e test

Co-authored-by: machine424 <ayoubmrini424@gmail.com>
2022-04-04 15:41:11 +02:00
Jociele Padilha 2dfb11ad4c
update team message (#1818)
* return only warning if team can't be found

Co-authored-by: Jociele Padilha <jociele.padilha@zalando.de>
Co-authored-by: Felix Kunde <felix-kunde@gmx.de>
2022-04-04 13:08:03 +02:00
neelasha-09 f5cca1a093
major version upgrade for rootless and ocp : solving #1689 (#1770)
* major version upgrade for rootless and ocp : solving #1689

Co-authored-by: Felix Kunde <felix-kunde@gmx.de>
2022-03-31 14:53:02 +02:00
Felix Kunde 2333d531d3
Fix deletion of event streams resources (#1831)
* fix deletion of event streams
* create cluster field to store stream application ids
2022-03-31 11:48:37 +02:00