This change improves the responsiveness of the operator when handling
deletion requests by running sync operations in the background and
using context cancellation to interrupt stuck operations.
Changes:
- Add context field to Cluster struct, passed through New()
- Add Cancel() method to cancel cluster's context
- Add StartSync/EndSync/NeedsResync for managing background sync state
- Run Sync() in a background goroutine so worker can process other events
- Add context-aware DB connection methods (initDbConnWithContext)
- Add RetryWithContext() that respects context cancellation
- Cancel cluster context immediately when DeletionTimestamp detected
- Use context-aware connections in syncRoles/syncDatabases
- StartSync/NeedsResync check context cancellation to prevent new syncs
during deletion (no need for separate deleted flag)
Flow:
1. Sync event spawns background goroutine and returns immediately
2. If another sync arrives while one is running, needsResync flag is set
3. When sync completes, it checks needsResync and requeues if needed
4. Delete cancels context -> stuck DB operations return early -> mutex released
5. StartSync/NeedsResync return false when context cancelled
6. Delete proceeds without waiting for slow/stuck sync operations
* skip db user actions when its secret failed to sync on update
* need to add new pgUser field to e2e test
* lets collect errors of syncSecret so we still get status updateFailed
* Add support for pg17
* use new gcov2lcov-action
* Use ghcr spilo-17
* Update SPILO_CURRENT and SPILO_LAZY
* Update e2e/run.sh
---------
Co-authored-by: Polina Bungina <27892524+hughcapet@users.noreply.github.com>
* bump operator to v1.13.0
* align configmap with CRD config
* remove default from CRD config option additional_secret_mount_path
* enable automatic major version upgrades by default
* feat(498): Add ownerReferences to managed entities
* empty owner reference for cross namespace secret and more tests
* update ownerReferences of existing resources
* removing ownerReference requires Update API call
* CR ownerReference on PVC blocks pvc retention policy of statefulset
* make ownerreferences optional and disabled by default
* update unit test to check len ownerReferences
* update codegen
* add owner references e2e test
* update unit test
* add block_owner_deletion field to test owner reference
* fix typos and update docs once more
* reflect code feedback
---------
Co-authored-by: Max Begenau <max@begenau.com>
* Annotate PVC on Sync/Update, not only change PVC template
* Don't rotate pods when only annotations changed
* Annotate Logical Backup's and Pooler's pods
* Annotate PDB, Endpoints created by the Operator, Secrets, Logical Backup jobs
Inherited annotations are only added/updated, not removed
* bump tp v1.12.0
* code-generator and apiextensions-apiserver still on to 0.25.9 to allow code-generation on GH
* bump go in github action and mini fix in UI
* update UI Dockerfile
---------
Co-authored-by: Ida Novindasari <idanovinda@gmail.com>
* switch to ghcr image in helm chart and examples
* change logical backup config for helm chart
* change internal default for logical backup image config to ghcr, too
* Secrets deletion config
* Update e2e/tests/test_e2e.py
Co-authored-by: Felix Kunde <felix-kunde@gmx.de>
---------
Co-authored-by: Felix Kunde <felix-kunde@gmx.de>
* make bucket prefix for logical backup configurable
* include container comparison in logical backup diff
* add unit test and update description for compareContainers
* don't rely on users putting / in the config - reflect other comments from review
* add the pg version 16
* add comma after pg16 in crds api
* change minimal_major_version to 12
* add new spilo image for pg16
* edit the registry from current and lazy spilo
* Update e2e/run.sh
Co-authored-by: Felix Kunde <felix-kunde@gmx.de>
* Update README.md
Co-authored-by: Felix Kunde <felix-kunde@gmx.de>
* add pg 11 to be compatible for the existing DBs
* update pq, pyyaml,k8s and kind version
* skip test_infrastructure_roles
* skip another test
* remove the skipping
* adjust the verification of new Patroni version states
---------
Co-authored-by: Felix Kunde <felix-kunde@gmx.de>
* allow empty resources when defaults are empty
* update codegen
* add more unit tests and remove internal resources defaults
* a unit test for min limit and raising to request
* uncomment defaults in example configmap
* simplifying pooler pod generation unit test
* add unit test and documentation for finalizers
* error msg with lower case and cover sync case
* try to avoid adding json-patch dependency
* use Update to remove finalizer
* changing status and finalizer during create
* do not call Delete() twice
* Add Finalizer functions to Cluster; add/remove finalizer on Create/Delete events
* Check if clusters have a deletion timestamp and we missed that event. Run Delete() and remove finalizer when done.
* Fix nil handling when using Service from map; Remove Service, Endpoint entries from their maps - just like with Secrets
* Add handling of ResourceNotFound to all delete functions (Service, Endpoint, LogicalBackup CronJob, PDB and Secret) - this is not a real error when deleting things
* Emit events when there are issues deleting resources to the user is informed
* Depend the removal of the Finalizer on all resources being deleted successfully first. Otherwise the next sync run should let us try again
* Add config option to enable finalizers
* Removed dangling whitespace at EOL
* config.EnableFinalizers is a bool pointer
---------
Co-authored-by: Felix Kunde <felix-kunde@gmx.de>
* add prefix /vol- on when EBS doesn't have
* add new unit test for to get the volumeID
* add a prefix to search in the string of volumeID
---------
Co-authored-by: Jociele Padilha <jociele.padilha@zalando.de>
* bump to v1.9.1
* update year in license and add links to more blog posts
* bump go to 1.19 and update dependencies
* go for 1.10.0 instead of 1.9.1
* fix unit test - removed obsolete ClusterName field
* fix DNS template in UI helm chart deployment file
* new config options to specify resources for logical backup jobs
* bug in logical backup script for s3 dumps
* define enum for logical_backup_provider
* changed order of logical backup azure options
* fix unit test for stream comparison
* Bumped Spilo image tag to the one that supports PostgreSQL 15. Using CDP version temporarily until non-CDP one is released.
* Added support for PostgreSQL 15 and made it default. 9.5 and 9.6 are now no longer supported
* Bumped spilo image tag to 2.1-p9
* Bumped spilo image in test launcher
Co-authored-by: yoshihiko <ariyoshi10@gmail.com>
Co-authored-by: Felix Kunde <felix-kunde@gmx.de>