This change improves the responsiveness of the operator when handling
deletion requests by running sync operations in the background and
using context cancellation to interrupt stuck operations.
Changes:
- Add context field to Cluster struct, passed through New()
- Add Cancel() method to cancel cluster's context
- Add StartSync/EndSync/NeedsResync for managing background sync state
- Run Sync() in a background goroutine so worker can process other events
- Add context-aware DB connection methods (initDbConnWithContext)
- Add RetryWithContext() that respects context cancellation
- Cancel cluster context immediately when DeletionTimestamp detected
- Use context-aware connections in syncRoles/syncDatabases
- StartSync/NeedsResync check context cancellation to prevent new syncs
during deletion (no need for separate deleted flag)
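A minimal sketch of how a context-aware retry helper could behave. The name `retryWithContext`, its signature, and the `cancelDuringRetry` demo are assumptions for illustration; the actual `RetryWithContext()` in this change may differ.

```go
package main

import (
	"context"
	"fmt"
	"time"
)

// retryWithContext is a hypothetical stand-in for the RetryWithContext helper
// mentioned above. It calls f once per interval until f reports done, f
// returns an error, the timeout elapses, or ctx is cancelled.
func retryWithContext(ctx context.Context, interval, timeout time.Duration, f func() (bool, error)) error {
	ctx, cancel := context.WithTimeout(ctx, timeout)
	defer cancel()

	ticker := time.NewTicker(interval)
	defer ticker.Stop()

	for {
		done, err := f()
		if err != nil {
			return err
		}
		if done {
			return nil
		}
		select {
		case <-ctx.Done():
			// Cancellation or timeout interrupts the wait between attempts.
			return ctx.Err()
		case <-ticker.C:
		}
	}
}

// cancelDuringRetry simulates a delete event that cancels the context while
// an operation that never succeeds is being retried.
func cancelDuringRetry() error {
	ctx, cancel := context.WithCancel(context.Background())
	go func() {
		time.Sleep(20 * time.Millisecond)
		cancel()
	}()
	return retryWithContext(ctx, 5*time.Millisecond, time.Minute, func() (bool, error) {
		return false, nil // never done, like a stuck DB connection attempt
	})
}

func main() {
	fmt.Println(cancelDuringRetry())
}
```

In this sketch the loop returns the context error shortly after cancellation instead of waiting for the one-minute timeout.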
Flow:
1. Sync event spawns background goroutine and returns immediately
2. If another sync arrives while one is running, needsResync flag is set
3. When sync completes, it checks needsResync and requeues if needed
4. Delete cancels context -> stuck DB operations return early -> mutex released
5. StartSync/NeedsResync return false when context cancelled
6. Delete proceeds without waiting for slow/stuck sync operations
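The flow above can be modelled roughly as follows. This is a sketch under assumptions: the `syncState` type, its fields, and the exact return semantics are hypothetical, and only the method names StartSync, EndSync, NeedsResync and Cancel come from the list of changes.

```go
package main

import (
	"context"
	"fmt"
	"sync"
)

// syncState is a hypothetical model of the background sync bookkeeping.
type syncState struct {
	mu          sync.Mutex
	ctx         context.Context
	cancel      context.CancelFunc
	running     bool
	needsResync bool
}

func newSyncState() *syncState {
	ctx, cancel := context.WithCancel(context.Background())
	return &syncState{ctx: ctx, cancel: cancel}
}

// StartSync reports whether the caller may start a background sync.
// If a sync is already running, it records that a resync is needed.
// It returns false once the context has been cancelled.
func (s *syncState) StartSync() bool {
	s.mu.Lock()
	defer s.mu.Unlock()
	if s.ctx.Err() != nil {
		return false
	}
	if s.running {
		s.needsResync = true
		return false
	}
	s.running = true
	return true
}

// EndSync marks the running sync as finished.
func (s *syncState) EndSync() {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.running = false
}

// NeedsResync reports and clears the resync flag.
// It returns false once the context has been cancelled.
func (s *syncState) NeedsResync() bool {
	s.mu.Lock()
	defer s.mu.Unlock()
	if s.ctx.Err() != nil {
		return false
	}
	needed := s.needsResync
	s.needsResync = false
	return needed
}

// Cancel cancels the context, which blocks any further syncs.
func (s *syncState) Cancel() { s.cancel() }

func main() {
	s := newSyncState()
	fmt.Println(s.StartSync())   // first sync event may start
	fmt.Println(s.StartSync())   // second event only sets the resync flag
	s.EndSync()
	fmt.Println(s.NeedsResync()) // finished sync sees the flag and requeues
	s.Cancel()
	fmt.Println(s.StartSync())   // no new sync after cancellation
}
```

Checking the context inside StartSync and NeedsResync is what makes a separate deleted flag unnecessary in this model.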
* skip db user actions when its secret failed to sync on update
* need to add new pgUser field to e2e test
* let's collect errors of syncSecret so we still get status updateFailed
* Replace `golang.org/x/exp` with stdlib
These experimental packages have been available in the Go standard
library since Go 1.21.
1. golang.org/x/exp/slices -> slices [1]
2. golang.org/x/exp/maps -> maps [2]
[1]: https://go.dev/doc/go1.21#slices
[2]: https://go.dev/doc/go1.21#maps
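For illustration, a small sketch of the standard library packages in use. The helper names `sortedUnique` and `withLabel` are hypothetical and not taken from the operator code; only functions that exist in the Go 1.21 `slices` and `maps` packages are called.

```go
package main

import (
	"fmt"
	"maps"   // previously golang.org/x/exp/maps
	"slices" // previously golang.org/x/exp/slices
)

// sortedUnique returns a sorted copy of names without duplicates.
func sortedUnique(names []string) []string {
	out := slices.Clone(names)
	slices.Sort(out)
	return slices.Compact(out)
}

// withLabel returns a copy of labels with key set to value,
// leaving the input map untouched.
func withLabel(labels map[string]string, key, value string) map[string]string {
	out := maps.Clone(labels)
	if out == nil {
		out = map[string]string{}
	}
	out[key] = value
	return out
}

func main() {
	users := sortedUnique([]string{"zalando", "admin", "zalando"})
	fmt.Println(users, slices.Contains(users, "admin"))

	base := map[string]string{"application": "spilo"}
	extended := withLabel(base, "spilo-role", "master")
	fmt.Println(maps.Equal(base, extended))
}
```

One caveat when migrating: the Go 1.21 `maps` package has no `Keys` or `Values`; those were added in Go 1.23 and return iterators rather than slices, unlike their `golang.org/x/exp/maps` counterparts.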
Signed-off-by: Eng Zer Jun <engzerjun@gmail.com>
* Run go mod tidy
Signed-off-by: Eng Zer Jun <engzerjun@gmail.com>
---------
Signed-off-by: Eng Zer Jun <engzerjun@gmail.com>
Co-authored-by: Felix Kunde <felix-kunde@gmx.de>
* do not remove publications of slot defined in manifest
* improve condition to sync streams
* init publication tables map when adding manifest slots
* need to update c.Stream when there is no update
Create a second PDB to cover Pods that have a special "critical operation" label set.
This label is assigned to all Pods of a PG cluster by the Operator during a PG major version upgrade, and by Patroni during a cluster/replica bootstrap. It can also be set manually or by any other automation tool.
* extend and improve hasSlotsInSync unit test
* fix sync streams and add diffs for annotations and owner references
* incl. current annotations as desired where we do not fully control them
* added one more unit test and fixed sub test names
* pass maintenance windows to function and update unit test
* sync all resources to cluster fields (CronJob, Streams, Patroni resources)
* separated sync and delete logic for Patroni resources
* align delete streams and secrets logic with other resources
* rename gatherApplicationIds to getDistinctApplicationIds
* improve slot check before syncing streams CRD
* add ownerReferences and annotations diff to Patroni objects
* add extra sync code for config service so it does not get too ugly
* some bugfixes when comparing annotations and return err on found
* sync Patroni resources on update event and extended unit tests
* add config service/endpoint owner references check to e2e test
* feat(498): Add ownerReferences to managed entities
* empty owner reference for cross namespace secret and more tests
* update ownerReferences of existing resources
* removing ownerReference requires Update API call
* CR ownerReference on PVC blocks pvc retention policy of statefulset
* make ownerreferences optional and disabled by default
* update unit test to check len ownerReferences
* update codegen
* add owner references e2e test
* update unit test
* add block_owner_deletion field to test owner reference
* fix typos and update docs once more
* reflect code feedback
---------
Co-authored-by: Max Begenau <max@begenau.com>
* Annotate PVC on Sync/Update, not only change PVC template
* Don't rotate pods when only annotations changed
* Annotate Logical Backup's and Pooler's pods
* Annotate PDB, Endpoints created by the Operator, Secrets, Logical Backup jobs
Inherited annotations are only added/updated, not removed
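The add/update-only behaviour can be sketched as a simple map merge. The function name `mergeInheritedAnnotations` is hypothetical and the operator's actual code may be structured differently.

```go
package main

import "fmt"

// mergeInheritedAnnotations returns the annotations a resource should carry:
// everything it currently has, plus the inherited ones. Inherited values
// overwrite existing values for the same key, but keys that are no longer
// inherited are kept, so nothing is ever removed.
func mergeInheritedAnnotations(current, inherited map[string]string) map[string]string {
	merged := make(map[string]string, len(current)+len(inherited))
	for k, v := range current {
		merged[k] = v
	}
	for k, v := range inherited {
		merged[k] = v
	}
	return merged
}

func main() {
	current := map[string]string{"owned-by": "team-a", "legacy": "keep-me"}
	inherited := map[string]string{"owned-by": "team-b", "cost-center": "42"}
	fmt.Println(mergeInheritedAnnotations(current, inherited))
}
```

In this sketch the `legacy` key survives even though it is absent from the inherited set, which matches the note above.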
* make bucket prefix for logical backup configurable
* include container comparison in logical backup diff
* add unit test and update description for compareContainers
* don't rely on users putting / in the config - reflect other comments from review
CRD support for synchronous_node_count was added in #1484. However, the desired SynchronousNodeCount was not compared to the actual Patroni configuration, so it was never updated.
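A minimal sketch of the missing comparison, assuming a trimmed-down config type. `patroniConfig` and `syncNodeCountPatch` are hypothetical names for illustration, not the operator's actual types or functions.

```go
package main

import "fmt"

// patroniConfig is a hypothetical, reduced view of the Patroni configuration;
// only the field discussed above is modelled.
type patroniConfig struct {
	SynchronousNodeCount uint32
}

// syncNodeCountPatch compares the desired value with the actual one. It
// returns the config fragment to send to Patroni and true when they differ,
// or nil and false when no update is needed.
func syncNodeCountPatch(desired, actual patroniConfig) (map[string]any, bool) {
	if desired.SynchronousNodeCount == actual.SynchronousNodeCount {
		return nil, false
	}
	return map[string]any{
		"synchronous_node_count": desired.SynchronousNodeCount,
	}, true
}

func main() {
	desired := patroniConfig{SynchronousNodeCount: 2}
	actual := patroniConfig{SynchronousNodeCount: 1}
	patch, changed := syncNodeCountPatch(desired, actual)
	fmt.Println(changed, patch)
}
```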
* add unit test and documentation for finalizers
* error msg with lower case and cover sync case
* try to avoid adding json-patch dependency
* use Update to remove finalizer
* changing status and finalizer during create
* do not call Delete() twice
* Allow dropping slots when they get deleted from the manifest
* use leader instead of replica to query slots
* fix and extend unit tests for config update checks
Co-authored-by: Felix Kunde <felix-kunde@gmx.de>
This commit adds support for a not-yet-released Patroni feature that allows Postgres to keep running as primary when a leader lock update fails.
* Add Patroni 'failsafe_mode' local parameter (enable for a single PG cluster)
* Allow configuring Patroni 'failsafe_mode' parameter globally
* create streams only after postgres instances were restarted
* checkAndSetGlobalPostgreSQLConfiguration returns if config has been patched
* restart can be pending even without a config patch
* allow in-place password rotation of system users
* block postgres user from rotation
* mark pooler pods for replacement
* adding podsGetter where pooler is synced in unit tests
* move rotation code into an extra function
* feat: add ignored annotations when comparing during sync
Co-authored-by: Felix Kunde <felix-kunde@gmx.de>
Co-authored-by: Moshe Immerman <moshe@flanksource.com>
* Add support for pooler load balancer
Signed-off-by: Sergey Shatunov <me@prok.pw>
* Rename to enable_master_pooler_load_balancer
Signed-off-by: Sergey Shatunov <me@prok.pw>
* target port should be intval
* enhance pooler e2e test
* add new options to crds.go
Co-authored-by: Sergey Shatunov <me@prok.pw>
* provide event stream API
* check manifest settings for logical decoding before creating streams
* operator updates Postgres config and creates replication user
* name FES like the Postgres cluster
* add delete case and fix updating streams + update unit test
* check if fes CRD exists before syncing
* existing slot must use the same plugin
* make id and payload columns configurable
* sync streams only when they are defined in manifest
* introduce applicationId for separate stream CRDs
* add FES to RBAC in chart
* disable streams in chart
* switch to pgoutput plugin and let operator create publications
* reflect code review and additional refactoring
Co-authored-by: Paŭlo Ebermann <paul.ebermann@zalando.de>