Commit Graph

489 Commits

Author SHA1 Message Date
Felix Kunde 7951e9cb02 resolve conflict 2022-05-17 18:23:28 +02:00
Felix Kunde 268a86a045
removing inner goroutine in cluster.Switchover (#1876)
* removing inner goroutine in cluster.Switchover and resolve race between processPodEvent and unregisterPodSubscriber
* unlock mutex after handling event, now with non-blocking default case
2022-05-17 18:10:39 +02:00
Felix Kunde c6f2c68588
ignore case when checking for envVar existence but do not change it (#1889) 2022-05-12 11:59:05 +02:00
Felix Kunde a77d5df158
reverse membership for additional owner roles (#1862)
* reverse membership for additional owner roles
* remove type RoleOriginSpilo
* use e2e images with cron_admin inside
* let operator resolve reversed membership
* make additional owner roles part of the sync user strategy
* add more context in the docs about additional_owner_roles
2022-04-28 11:15:40 +02:00
Felix Kunde 8b6664f1a2
fix container ports (#1864) 2022-04-21 18:52:53 +02:00
Felix Kunde 532772c5cd
do not call EBS api when there are no pvs (#1851)
* do not call EBS api when there are no pvs
* no extra aws api call in executeEBSMigration, operate on fetched cluster.EBSVolumes
2022-04-20 12:12:02 +02:00
Felix Kunde eecd13169c
refactor spilo env var generation (#1848)
* refactor spilo env generation
* enhance docs on env vars
* add unit test for appendEnvVar
2022-04-14 11:47:33 +02:00
Jociele Padilha 483bf624ee
add test team member (#1842)
* return err if teams API fails with StatusCode other than 404
* add unit test for 404 at team members

Co-authored-by: Jociele Padilha <jociele.padilha@zalando.de>
Co-authored-by: Felix Kunde <felix-kunde@gmx.de>
2022-04-14 10:02:54 +02:00
Dmitry Volodin 9bcb25ac7e
Ability to set pod environment variables on cluster resource (#1794)
* Ability to set pod environment variables on cluster resource

Co-authored-by: Felix Kunde <felix-kunde@gmx.de>
2022-04-11 10:16:35 +02:00
Felix Kunde 0dc370f15d
standby cluster that streams from a remote primary (#1830)
* add the possibility to create a standby cluster that streams from a remote primary
* extending unit tests
* add more docs and e2e test

Co-authored-by: machine424 <ayoubmrini424@gmail.com>
2022-04-04 15:41:11 +02:00
Jociele Padilha 2dfb11ad4c
update team message (#1818)
* return only warning if team can't be found

Co-authored-by: Jociele Padilha <jociele.padilha@zalando.de>
Co-authored-by: Felix Kunde <felix-kunde@gmx.de>
2022-04-04 13:08:03 +02:00
neelasha-09 f5cca1a093
major version upgrade for rootless and ocp : solving #1689 (#1770)
* major version upgrade for rootless and ocp : solving #1689

Co-authored-by: Felix Kunde <felix-kunde@gmx.de>
2022-03-31 14:53:02 +02:00
Felix Kunde 2333d531d3
Fix deletion of event streams resources (#1831)
* fix deletion of event streams
* create cluster field to store stream application ids
2022-03-31 11:48:37 +02:00
Felix Kunde 60e0685c32
define readinessProbe on statefulSet (#1825)
* define readinessProbe on statefulSet 
* do not error out on deleting Patroni cluster objects
* change delete order for patroni objects
2022-03-30 18:19:34 +02:00
evsasha 30f2ba6525
do not create endpoints when use config maps (#1760)
* do not create endpoints when use config maps
* delete cluster objects with 'leader' suffix

Co-authored-by: Евграфов Александр Александрович <aevgrafov@cmx.ru>
2022-03-28 10:09:26 +02:00
Felix Kunde 654d22d04a
Configure annotations to be ignored in comparisons during sync (#1823)
* feat: add ignored annotations when comparing during sync

Co-authored-by: Felix Kunde <felix-kunde@gmx.de>
Co-authored-by: Moshe Immerman <moshe@flanksource.com>
2022-03-24 18:38:37 +01:00
Felix Kunde 36df1bc87c
refactor GenerateResourceRequirements and provide unit tests (#1822)
* refactor GenerateResourceRequirements and provide unit tests
2022-03-24 17:35:00 +01:00
Felix Kunde a020708ef1
fix unit test and improve stability in e2e test (#1819)
* fix unit test and improve stability in e2e test
* fix resource handling
2022-03-21 10:05:20 +01:00
Jakob Gillich f3b83c0b05
Fix empty resources spec field failing schema validation (#1589)
In Go, when a struct field is not set, it becomes a struct with
default values for all fields. These default values are included
during serialization. This causes issues with schema validation
where optional fields cannot be omitted because default values
are considered invalid.

This patch addresses this issue for `Resources` fields on several
types by using a pointer value.
2022-03-18 16:16:32 +01:00
Felix Kunde 1d88009ec4
fix comparison of event stream array (#1817)
* fix comparison of event stream array
* turn optional stream fields to pointers
2022-03-18 15:06:17 +01:00
Felix Kunde 2719d411c3
grant db owners to cron_admin (#1805)
* grant db owners to cron_admin
* allow specifiying more extra owner roles
* add unit test for InitAdditionalOwnerRoles
* add e2e test
2022-03-18 12:36:12 +01:00
Felix Kunde 6ba05fee22
Pooler sync fix (#1811)
* always sync pooler objects
* do not capitalize log messages in Go
* stretch pooler e2e test
2022-03-17 19:22:18 +01:00
Jociele Padilha 69254abeba
add new parameter for Patroni API (PatroniAPICheckInterval, PatroniAPICheckTimeout) (#1803)
Co-authored-by: Jociele Padilha <jociele.padilha@zalando.de>
2022-03-15 11:34:09 +01:00
Felix Kunde d032e4783e
LoadBalancer toggles for master and replica pooler pods (#1799)
* Add support for pooler load balancer

Signed-off-by: Sergey Shatunov <me@prok.pw>

* Rename to enable_master_pooler_load_balancer

Signed-off-by: Sergey Shatunov <me@prok.pw>

* target port should be intval
* enhance pooler e2e test
* add new options to crds.go

Co-authored-by: Sergey Shatunov <me@prok.pw>
2022-03-04 13:36:17 +01:00
A. Stoewer 695ad44caf
Logical backup retention time (#1337)
* Add optional logical backup retention time
* Set defaults for potentially unbound variables, so that the script will work with older operator versions
* Document retention time parameter for logical backups
* Add retention time parameter to resources and charts

Co-authored-by: Felix Kunde <felix-kunde@gmx.de>
2022-03-02 17:39:33 +01:00
david amick ca0c27a51b
Retry when getting the pod_environment_secret (#1777)
* Retry when getting the pod_environment_secret
2022-03-01 17:56:16 +01:00
Dmitry Volodin da83982313
inherited_labels and inherited_annotations not passed to PVC (#1784)
* inherited_labels and inherited_annotations not passed to PVC
* Fix developer.md related to the local operator deployment
2022-03-01 17:07:37 +01:00
Maksim Zhylinski fb8a6c7a68
Compare container ports in a smarter way (#1755)
* Compare ports ingoring order and considering protocol defaults

Co-authored-by: Felix Kunde <felix-kunde@gmx.de>
2022-02-28 11:35:41 +01:00
Felix Kunde d8a159ef1a
create CDC event stream CRD (#1570)
* provide event stream API
* check manifest settings for logical decoding before creating streams
* operator updates Postgres config and creates replication user
* name FES like the Postgres cluster
* add delete case and fix updating streams + update unit test
* check if fes CRD exists before syncing
* existing slot must use the same plugin
* make id and payload columns configurable
* sync streams only when they are defined in manifest
* introduce applicationId for separate stream CRDs
* add FES to RBAC in chart
* disable streams in chart
* switch to pgoutput plugin and let operator create publications
* reflect code review and additional refactoring

Co-authored-by: Paŭlo Ebermann <paul.ebermann@zalando.de>
2022-02-28 10:09:42 +01:00
Felix Kunde 8b404fd049
minor fixes to password rotation (#1796)
* minor fixes to password rotation
* rework unit test
2022-02-25 17:46:26 +01:00
Menzorg 06c28da97d
synchronous_node_count support (#1484)
* synchronous_node_count support
* notification about Patroni image version
* default synchronous_node_count to 1

Co-authored-by: Felix Kunde <felix-kunde@gmx.de>
2022-02-25 17:21:42 +01:00
Felix Kunde 46547c4088
do not recreate pods if previous Patroni API calls fail (#1767)
* do not recreate pods if previous Patroni API calls fail
* move retry reads against Patroni API to pod.go
* remove final failover check in node affinity test
* make test_min_resource_limits more robust
2022-02-25 09:33:04 +01:00
Felix Kunde 658923d10d
Password rotation in secrets (#1749)
* password rotation in K8s secrets
* add db connection to syncSecrets
* add user retention
* add e2e test
* cleanup on username mismatch if rotation was switched off
* add unit test for syncSecrets + new updateSecret func
2022-02-18 11:54:47 +01:00
Felix Kunde a78a619e90
toleration diff and nodeReadinessLabel merge with manifest matchExpressions (#1729)
* include tolerations in statefulset comparison
* provide alternative merge behavior of nodeSelectorTerms for node readiness label
* add config option to change affinity merge behavior
* reworked e2e tests around node affinity
2022-01-27 15:57:24 +01:00
Felix Kunde 411abbe31e
handle case when Patroni returns that lag is unknown (#1724)
* handle case when Patroni returns that lag is unknown
* remove some prints from e2e test
2021-12-17 12:36:23 +01:00
Felix Kunde 07fd4ec00b
choose switchover candidate based on lag and role (#1700)
* choose switchover candidate based on lowest lag in MB and role (in synchronous mode)
2021-12-14 10:35:21 +01:00
James McDonald def9e1d688
Support standby replication from GS (GCS) (#1446)
* Add support for manual gs_wal_path in standby
* Remove separate standby version configuration
* Remove setting standby path via cluster/uid/version
Picking up the version doesn't work reliably without making changes to
Spilo. It's clearer to just specify the full S3/GS bucket path.

Co-authored-by: Felix Kunde <felix-kunde@gmx.de>
2021-12-03 11:24:29 +01:00
Felix Kunde 1ed16fadca
make sure upgrade script runs on the master (#1715)
* make sure upgrade script runs on the master
* show a bit more logs from upgrade script
2021-12-02 14:10:58 +01:00
Felix Kunde f7858ffb70
Initialize arrays of errors / error messages + minor refactoring (#1701)
* init error arrays correctly
* avoid nilPointer when syncing connectionPooler
* getInfrastructureRoles should return error
* fix unit tests and return type for getInfrastructureRoles
2021-11-29 12:49:12 +01:00
Jan Mussler 3e275d122a
Allow individual teams to do auto upgrade via operator. (#1699)
* Allow whitelisting of teams to do auto upgrade upgrade via operator.

Co-authored-by: Felix Kunde <felix-kunde@gmx.de>
2021-11-29 12:47:18 +01:00
Rafia Sabih e98439e5b6
Add log messages for usernames (#1692)
* add log messages for usernames
* document behavior better in logs

Co-authored-by: Felix Kunde <felix-kunde@gmx.de>
2021-11-18 09:55:33 +01:00
Felix Kunde 1eafd688d0
restart master first in some edge cases (#1655)
* restart master first in some edge cases

* edge case is when desired is lower than effective

* wait after config patch and restart on sync whenever we see pending_restart

* convert options to int to check decrease and add unit test

* minor update to e2e tests

* wait only after restart not every sync

* using spilo 14 e2e images
2021-10-26 16:43:19 +02:00
Felix Kunde 2a33bf3313
improve Patroni config sync (#1635)
* improve Patroni config sync
* collect new and updated slots to patch patroni
* refactor httpGet in Patroni and extend unit tests
* GetMemberData should call the patroni endpoint
* add PATCH test
2021-10-13 17:17:26 +02:00
Felix Kunde ab25fb29b7
make Postgres 14 available (#1636)
* make Postgres 14 available
* don't be too hard to 9.5
* bump Spilo image and more docs updates
* update e2e test upgrading to 14
2021-10-12 12:00:59 +02:00
Jan Mussler d0d7a32d52
Clearing up error on resize failure message. (#1641)
* Clearing up error message.
2021-10-08 17:11:21 +02:00
Felix Kunde 62ed7e470f
improve pooler sync (#1593)
* remove role from installLookupFunction and run it on database sync, too
* fix condition to decide on syncing pooler
* trigger lookup from database sync only if pooler is set
* use empty spec everywhere and do not sync if one lookupfunction was passed
* do not sync pooler after being disabled
2021-08-27 12:41:37 +02:00
Aaron Peschel 1dd0cd9691
Add Support for Azure WAL-G Backups (#1537)
This commit adds support for using an Azure storage account as a backup
location.

It uses the existing GCS functionality as a reference for what to do,
and follows the example set by GCS as closely as possible.

The decision to name the cloud provider key "aws_or_gcp" is unfortunate
while adding support for Azure, but I have left it alone to allow for
this changeset to be backwards compatible.
2021-08-26 14:59:03 +02:00
John Rood 2d2ce6197b
Add volume selector (#1385)
* Add volume selector
* Add slightly better documentation and gofmt changes
* Update generated deepcopy
* Add test for PV selector

Co-authored-by: John Rood <j.rood@picturae.com>
2021-08-26 14:57:54 +02:00
Quan Hoang 1b3366e9f4
Support affinity in connection pooler deployments (#1464) 2021-08-24 15:25:03 +02:00
Felix Kunde f0815fc5bd
remove debug log of Spilo env vars (#1591) 2021-08-23 15:44:34 +02:00