Commit Graph

925 Commits

Author SHA1 Message Date
thoro f988e4cf0e
Fix deletion timestamp handling for clusters with finalizers (#3015)
When a Postgres cluster has a finalizer, deleting it sets a DeletionTimestamp
but doesn't remove the object until the finalizer is cleared. The operator
was not properly handling these DeletionTimestamp changes:

1. postgresqlUpdate() was filtering out events where only DeletionTimestamp
   changed (it only checked Spec and Annotations), causing the delete to
   never be processed.

2. EventUpdate case in processEvent() didn't check for DeletionTimestamp,
   so even if the event reached the processor, it would run Update() instead
   of Delete().

3. removeFinalizer() used a cached object with stale resourceVersion,
   causing "object has been modified" errors.

Fixes:
- Add explicit DeletionTimestamp check in postgresqlUpdate() to queue the event
- Add DeletionTimestamp check in EventUpdate to call Delete() when set
- Fetch latest object from API before removing finalizer to avoid conflicts

Co-authored-by: Felix Kunde <felix-kunde@gmx.de>
2026-06-01 09:18:20 +02:00
laiminhtrung1997 e871a167ed
Add topologySpreadConstraints configuration to pod spec. (#2530)
* Add topologySpreadConstraints configuration to pod spec.
* Run update-codegen.sh to add deepcopy for new field to the api.
* Reuse configured TopologySpreadConstraints for logical backup.
* Remove x-kubernetes-preserve-unknown-fields and XPreserveUnknownFields.
* Add topologySpreadConstraint example in the complete manifest.
* Add support for helm chart.
* Add documentation for topologySpreadConstraint.
* Update e2e test to patch topologySpreadConstraints into the postgresqls manifest.
* For e2e test, updated the PVC retention policy to remove redundant PVCs.
* Fix expected PVC count in e2e test after config changes.
2026-05-29 17:07:47 +02:00
Sai Asish Y 4d40270890
fix: correct 'occured' typo in error messages (#3094)
* fix: correct 'occured' typo in finalizer error message
* fix: correct 'occured' typo in EBS volume error message
2026-05-11 11:16:37 +02:00
Felix Kunde 618ac156e6
Volume mount length of pooler users (#3093)
* shorten pooler secret mount
* update postgres CRD in helm chart
2026-05-08 17:25:59 +02:00
Mikkel Oscar Lyderik Larsen 3ca1884876
Remove references to registry.opensource.zalan.do (#3092)
Signed-off-by: Mikkel Oscar Lyderik Larsen <mikkel.larsen@zalando.de>
2026-05-08 09:16:10 +02:00
Felix Kunde e1713705f4
build multi-arch pooler image (#3077)
* build multi-arch pooler image
* add pooler build step in delivery.yaml and bump pooler version
* pull from docker hub not zalando registry
* add pooler step to ghcr workflow
* pass infra roles to auth file via pooler entrypoint
* introduce extra pooler secret for mounting auth_file
* use pgbouncer as image name and push to ghcr on next merge
* build with latest pgbouncer
* integrate new image in e2e process and update pooler image default
* update pooler build dependencies
* build pooler image for e2e test
* more Makefile and e2e run script tweaking

---------

Co-authored-by: Ida Novindasari <idanovinda@gmail.com>
2026-04-28 13:34:36 +02:00
annielzy 97f4de7cc0
Fix rolling update deadlock when pods are stuck in non-running state (#3051)
* add fix to recreate non running pods in syncStatefulsets

* remove TestSyncStatefulSetNonRunningPodsDoNotBlockRecreatio

* revert pod_test

* pod without status

---------

Co-authored-by: Felix Kunde <felix-kunde@gmx.de>
Co-authored-by: Ida Novindasari <idanovinda@gmail.com>
2026-04-28 12:08:34 +02:00
Felix Kunde 688bbf1b9e
update standby check in pooler code (#3088) 2026-04-28 10:17:28 +02:00
Polina Bungina 0ac28e3aad
Do not set aws-load-balancer-connection-idle-timeout by default (#3054)
Co-authored-by: Felix Kunde <felix-kunde@gmx.de>
2026-04-24 14:23:54 +02:00
Andreas Mårtensson 27c969d14b
Set securityContext for backup container (#2117)
Co-authored-by: Felix Kunde <felix-kunde@gmx.de>
2026-04-24 11:06:30 +02:00
Felix Kunde 39cc09ccaa
feature toggle for using maintenance windows (#3074)
* feature toggle for using maintenance windows
2026-04-16 17:13:18 +02:00
Polina Bungina e9478894a8
Avoid rotating pods for PGVERSION change outside of maintenance window (#3065)
* Avoid rotating pods for PGVERSION change outside of maintenance window
* Update docs
2026-04-07 12:16:55 +02:00
Ida Novindasari 421bd6d664
fix: invalid switchover scheduling with default maintenance windows (#3058) 2026-03-24 12:57:15 +01:00
Jorge Solorzano d495825f4b
Remove hardcoded VersionMap from majorversionupgrade (#3043)
Co-authored-by: Felix Kunde <felix-kunde@gmx.de>
2026-03-02 11:13:10 +01:00
Felix Kunde 2a31c403d0
do not reset secrets of standby clusters (#3044)
* do not reset secrets of standby clusters
align error message with unit test
* check for other env vars, too
2026-02-26 17:27:47 +01:00
Ida Novindasari aefe9d8298
chore: add logging for major upgrade failure (#3046) 2026-02-19 09:57:20 +01:00
Mikkel Oscar Lyderik Larsen 9f9a3acb61
Check in CRD to make go get work (#3047)
Signed-off-by: Mikkel Oscar Lyderik Larsen <mikkel.larsen@zalando.de>
2026-02-18 14:20:46 +01:00
Felix Kunde cffa0ee63c
try to set infra roles also if one fails (#3045) 2026-02-18 08:38:17 +01:00
Ida Novindasari 6ce7c50cec
Add support for pg18 and remove pg13 (#3035)
* Add support for pg18 and remove pg13
* Update general spilo image and use new rebuilt e2e spilo image

---------

Co-authored-by: Polina Bungina <polina.bungina@zalando.de>
2026-02-17 10:19:19 +01:00
Felix Kunde 4f130f9cce
provide examples for maintenance_windows in manifest examples (#3040) 2026-02-02 16:35:01 +01:00
Felix Kunde b84c58c2a6
add support for global maintenance windows (#3038)
* add support for global maintenance windows
* fix schema validation and trim \ when unmarshalling maintenance window
2026-01-30 11:37:21 +01:00
Mikkel Oscar Lyderik Larsen f05150a81e
Use UpdateStatus instead of patch (#3005)
Signed-off-by: Mikkel Oscar Lyderik Larsen <mikkel.larsen@zalando.de>
Co-authored-by: Felix Kunde <felix-kunde@gmx.de>
2026-01-27 10:44:30 +01:00
Polina Bungina b97de5d7f1
Standby section improvements (#3033)
- Allow standby_host to be specified together with wal_path
- Add standby_primary_slot_name
2026-01-19 13:54:27 +01:00
Mikkel Oscar Lyderik Larsen 32d6d0a7a7
Fix serving CRD at runtime (#3031)
* Fix serving CRD at runtime

Signed-off-by: Mikkel Oscar Lyderik Larsen <mikkel.larsen@zalando.de>

* Correctly string quote version enum

Signed-off-by: Mikkel Oscar Lyderik Larsen <mikkel.larsen@zalando.de>

---------

Signed-off-by: Mikkel Oscar Lyderik Larsen <mikkel.larsen@zalando.de>
2026-01-13 17:23:56 +01:00
Felix Kunde 97115d6e3d
add annotation to ignore resources thresholds (#3030)
* add annotation to ignore resources thresholds
* add test case when annotation key is set but value is not true
2026-01-13 09:33:24 +01:00
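The test case mentioned in the commit above (annotation key set, value not true) suggests an opt-in check along these lines. This is a sketch only: the annotation key and the function name are hypothetical, and the operator's real key and comparison may differ.

```go
package main

import "fmt"

// ignoreThresholdsAnnotation is a hypothetical key used for illustration;
// it is not taken from the operator's source.
const ignoreThresholdsAnnotation = "example.org/ignore-resources-thresholds"

// ignoreResourceThresholds reports whether the annotation is present with
// the exact value "true". A missing key or any other value keeps the
// thresholds in force.
func ignoreResourceThresholds(annotations map[string]string) bool {
	return annotations[ignoreThresholdsAnnotation] == "true"
}

func main() {
	fmt.Println(ignoreResourceThresholds(map[string]string{ignoreThresholdsAnnotation: "true"}))
	fmt.Println(ignoreResourceThresholds(map[string]string{ignoreThresholdsAnnotation: "yes"}))
	fmt.Println(ignoreResourceThresholds(nil))
}
```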
Mikkel Oscar Lyderik Larsen a585b17796
Generate postgresql CRD from go structs (#3007)
* Sort postgresql.crd.yaml
* Generate postgresql CRD from go structs
* Expand sidecars, env and initcontainers
* Embed CRD to be submitted by the operator

Signed-off-by: Mikkel Oscar Lyderik Larsen <mikkel.larsen@zalando.de>

---------

Signed-off-by: Mikkel Oscar Lyderik Larsen <mikkel.larsen@zalando.de>
2026-01-12 17:33:28 +01:00
Mikkel Oscar Lyderik Larsen 0a44252534
Generate CRD for postgresteam resource (#3004)
* Sort postgresteam.crd.yaml

Signed-off-by: Mikkel Oscar Lyderik Larsen <mikkel.larsen@zalando.de>

* Generate CRD for postgresteam resource

Signed-off-by: Mikkel Oscar Lyderik Larsen <mikkel.larsen@zalando.de>

---------

Signed-off-by: Mikkel Oscar Lyderik Larsen <mikkel.larsen@zalando.de>
Co-authored-by: Felix Kunde <felix-kunde@gmx.de>
2026-01-10 19:39:08 +01:00
Mikkel Oscar Lyderik Larsen 55cc167fca
Regenerate code for 2026 header (#3029)
Signed-off-by: Mikkel Oscar Lyderik Larsen <mikkel.larsen@zalando.de>
2026-01-09 15:38:16 +01:00
Mikkel Oscar Lyderik Larsen f6839f87b9
Modernize code generation (#3003)
Signed-off-by: Mikkel Oscar Lyderik Larsen <mikkel.larsen@zalando.de>
2026-01-09 14:22:10 +01:00
Felix Kunde 1f4ee605ae
fix docker build for UI and bumped some outdated versions in docs and config (#3017)
* fix docker build for UI and bumped some outdated versions in docs and config
* update helm chart image again because of wrong format field
* switch to new registry ghcr.io for e2e test
* update e2e test runner Dockerfile
2025-12-18 12:12:53 +01:00
Felix Kunde c4f10ceadc
bump to v1.15.1 (#3011)
* bump to v1.15.1
2025-12-16 19:25:12 +01:00
Steven Berler cd05682482
fix switchover schedule tests (#2995)
* fix switchover schedule tests

Previously the tests would fail depending on the local time zone and the
time of day the test was being run.

---------

Co-authored-by: Felix Kunde <felix-kunde@gmx.de>
Co-authored-by: Mikkel Oscar Lyderik Larsen <mikkeloscar@users.noreply.github.com>
2025-12-11 10:22:40 +01:00
Felix Kunde 04ad66f701
stop retention user cleanup early again when DB connection attempt fails (#2999)
* stop retention user cleanup early again when DB connection attempt fails
* add unit test and new returned error from updateSecret
2025-12-10 10:01:07 +01:00
ovnozdrach 42bbead4c9
Fix Sidecar without image specification issue (#2977)
Co-authored-by: Oleg Nozdrachev <ovnozdrach@mts.ru>
Co-authored-by: Felix Kunde <felix-kunde@gmx.de>
2025-12-09 09:37:11 +01:00
Felix Kunde 2c57498e43
skip db user actions when its secret failed to sync on update (#2969)
* skip db user actions when its secret failed to sync on update
* need to add new pgUser field to e2e test
* lets collect errors of syncSecret so we still get status updateFailed
2025-11-05 16:28:37 +01:00
Felix Kunde 1af4c50ed0
bump to v1.15.0 (#2965)
* bump to v1.15.0
* more linter hints
* update dependencies of kubectl-pg plugin
2025-10-21 11:56:33 +02:00
Felix Kunde 3bc244fe39
bump dependencies and reflect linter suggestions (#2963) 2025-10-16 10:23:36 +02:00
Eng Zer Jun eddf521227
Replace `golang.org/x/exp` with stdlib (#2857)
* Replace `golang.org/x/exp` with stdlib

These experimental packages are now available in the Go standard
library since Go 1.21.

	1. golang.org/x/exp/slices -> slices [1]
	2. golang.org/x/exp/maps -> maps [2]

[1]: https://go.dev/doc/go1.21#slices
[2]: https://go.dev/doc/go1.21#maps

Signed-off-by: Eng Zer Jun <engzerjun@gmail.com>

* Run go mod tidy

Signed-off-by: Eng Zer Jun <engzerjun@gmail.com>

---------

Signed-off-by: Eng Zer Jun <engzerjun@gmail.com>
Co-authored-by: Felix Kunde <felix-kunde@gmx.de>
2025-10-14 11:59:48 +02:00
Felix Kunde dc29425969
include external traffic policy comparison into service diffing (#2956) 2025-09-23 14:30:06 +02:00
Polina Bungina bcd729b2cc
Add selector to master service when switching to CM (#2955)
Add service selector comparison to compareServices
This is necessary to switch the `kubernetes_use_configmaps` configuration value properly, because the master service needs a different label selector setup depending on that setting.
2025-09-19 14:44:17 +02:00
Morten Lied Johansen ad7e590916
Skip creation of OwnerReference if user is in a different namespace (#2912)
Instead of doing a string compare on the username, check the actual namespace of the user to determine if an owner reference can be created.
2025-09-17 15:57:36 +02:00
Jociele Padilha fa4bc21538
upgrade Go from 1.23.4 to 1.25.0 (#2945)
* upgrade go to 1.25
* add minor version to be Go 1.25.0
* revert the Go version on README to keep the history of the release
2025-08-19 14:40:39 +02:00
Polina Bungina 68c4b49636
Fix wrong condition for bootstrap labels (#2875) 2025-03-10 17:05:27 +01:00
Polina Bungina c7a586d0f8
Configure (upcoming) Patroni bootstrap labels feature (#2872)
Set the value from the critical-operation-pdb's selector if PDBs are enabled
2025-03-10 10:16:01 +01:00
Felix Kunde 746df0d33d
do not remove publications of slot defined in manifest (#2868)
* do not remove publications of slot defined in manifest
* improve condition to sync streams
* init publication tables map when adding manifest slots
* need to update c.Stream when there is no update
2025-02-26 17:31:37 +01:00
Felix Kunde 2a4be1cb39
fix creating secrets for rotation users (#2863)
* fix creating secrets for rotation users
* rework annotation comparison on update to decide on when to call syncSecrets
2025-02-14 09:44:09 +01:00
Polina Bungina c8063eb78a
Protect Pods from disruptions during upgrades (#2844)
Co-authored-by: Felix Kunde <felix-kunde@gmx.de>
2025-01-30 10:41:58 +01:00
Polina Bungina a56ecaace7
Critical operation PDB (#2830)
Create the second PDB to cover Pods with a special "critical operation" label set.

This label is going to be assigned to all of the pg cluster's Pods by the Operator during a PG major version upgrade, and by Patroni during a cluster/replica bootstrap. It can also be set manually or by any other automation tool.
2025-01-29 12:41:08 +01:00
Polina Bungina f49b4f1e97
Ensure podAnnotations are removed from pods if reset in the config (#2826) 2025-01-24 16:53:14 +01:00
Polina Bungina b0cfeb30ea
Partially revert #2810 (#2849)
Only schedule switchover for pod migration, consider mainWindow for PGVERSION env change
2025-01-23 16:35:33 +01:00