Commit Graph

465 Commits

Author SHA1 Message Date
A. Stoewer 695ad44caf
Logical backup retention time (#1337)
* Add optional logical backup retention time
* Set defaults for potentially unbound variables, so that the script will work with older operator versions
* Document retention time parameter for logical backups
* Add retention time parameter to resources and charts

Co-authored-by: Felix Kunde <felix-kunde@gmx.de>
2022-03-02 17:39:33 +01:00
david amick ca0c27a51b
Retry when getting the pod_environment_secret (#1777)
* Retry when getting the pod_environment_secret
2022-03-01 17:56:16 +01:00
Dmitry Volodin da83982313
inherited_labels and inherited_annotations not passed to PVC (#1784)
* inherited_labels and inherited_annotations not passed to PVC
* Fix developer.md related to the local operator deployment
2022-03-01 17:07:37 +01:00
Maksim Zhylinski fb8a6c7a68
Compare container ports in a smarter way (#1755)
* Compare ports ingoring order and considering protocol defaults

Co-authored-by: Felix Kunde <felix-kunde@gmx.de>
2022-02-28 11:35:41 +01:00
Felix Kunde d8a159ef1a
create CDC event stream CRD (#1570)
* provide event stream API
* check manifest settings for logical decoding before creating streams
* operator updates Postgres config and creates replication user
* name FES like the Postgres cluster
* add delete case and fix updating streams + update unit test
* check if fes CRD exists before syncing
* existing slot must use the same plugin
* make id and payload columns configurable
* sync streams only when they are defined in manifest
* introduce applicationId for separate stream CRDs
* add FES to RBAC in chart
* disable streams in chart
* switch to pgoutput plugin and let operator create publications
* reflect code review and additional refactoring

Co-authored-by: Paŭlo Ebermann <paul.ebermann@zalando.de>
2022-02-28 10:09:42 +01:00
Felix Kunde 8b404fd049
minor fixes to password rotation (#1796)
* minor fixes to password rotation
* rework unit test
2022-02-25 17:46:26 +01:00
Menzorg 06c28da97d
synchronous_node_count support (#1484)
* synchronous_node_count support
* notification about Patroni image version
* default synchronous_node_count to 1

Co-authored-by: Felix Kunde <felix-kunde@gmx.de>
2022-02-25 17:21:42 +01:00
Felix Kunde 46547c4088
do not recreate pods if previous Patroni API calls fail (#1767)
* do not recreate pods if previous Patroni API calls fail
* move retry reads against Patroni API to pod.go
* remove final failover check in node affinity test
* make test_min_resource_limits more robust
2022-02-25 09:33:04 +01:00
Felix Kunde 658923d10d
Password rotation in secrets (#1749)
* password rotation in K8s secrets
* add db connection to syncSecrets
* add user retention
* add e2e test
* cleanup on username mismatch if rotation was switched off
* add unit test for syncSecrets + new updateSecret func
2022-02-18 11:54:47 +01:00
Felix Kunde a78a619e90
toleration diff and nodeReadinessLabel merge with manifest matchExpressions (#1729)
* include tolerations in statefulset comparison
* provide alternative merge behavior of nodeSelectorTerms for node readiness label
* add config option to change affinity merge behavior
* reworked e2e tests around node affinity
2022-01-27 15:57:24 +01:00
Felix Kunde 411abbe31e
handle case when Patroni returns that lag is unknown (#1724)
* handle case when Patroni returns that lag is unknown
* remove some prints from e2e test
2021-12-17 12:36:23 +01:00
Felix Kunde 07fd4ec00b
choose switchover candidate based on lag and role (#1700)
* choose switchover candidate based on lowest lag in MB and role (in synchronous mode)
2021-12-14 10:35:21 +01:00
James McDonald def9e1d688
Support standby replication from GS (GCS) (#1446)
* Add support for manual gs_wal_path in standby
* Remove separate standby version configuration
* Remove setting standby path via cluster/uid/version
Picking up the version doesn't work reliably without making changes to
Spilo. It's clearer to just specify the full S3/GS bucket path.

Co-authored-by: Felix Kunde <felix-kunde@gmx.de>
2021-12-03 11:24:29 +01:00
Felix Kunde 1ed16fadca
make sure upgrade script runs on the master (#1715)
* make sure upgrade script runs on the master
* show a bit more logs from upgrade script
2021-12-02 14:10:58 +01:00
Felix Kunde f7858ffb70
Initialize arrays of errors / error messages + minor refactoring (#1701)
* init error arrays correctly
* avoid nilPointer when syncing connectionPooler
* getInfrastructureRoles should return error
* fix unit tests and return type for getInfrastructureRoles
2021-11-29 12:49:12 +01:00
Jan Mussler 3e275d122a
Allow individual teams to do auto upgrade via operator. (#1699)
* Allow whitelisting of teams to do auto upgrade upgrade via operator.

Co-authored-by: Felix Kunde <felix-kunde@gmx.de>
2021-11-29 12:47:18 +01:00
Rafia Sabih e98439e5b6
Add log messages for usernames (#1692)
* add log messages for usernames
* document behavior better in logs

Co-authored-by: Felix Kunde <felix-kunde@gmx.de>
2021-11-18 09:55:33 +01:00
Felix Kunde 1eafd688d0
restart master first in some edge cases (#1655)
* restart master first in some edge cases

* edge case is when desired is lower than effective

* wait after config patch and restart on sync whenever we see pending_restart

* convert options to int to check decrease and add unit test

* minor update to e2e tests

* wait only after restart not every sync

* using spilo 14 e2e images
2021-10-26 16:43:19 +02:00
Felix Kunde 2a33bf3313
improve Patroni config sync (#1635)
* improve Patroni config sync
* collect new and updated slots to patch patroni
* refactor httpGet in Patroni and extend unit tests
* GetMemberData should call the patroni endpoint
* add PATCH test
2021-10-13 17:17:26 +02:00
Felix Kunde ab25fb29b7
make Postgres 14 available (#1636)
* make Postgres 14 available
* don't be too hard to 9.5
* bump Spilo image and more docs updates
* update e2e test upgrading to 14
2021-10-12 12:00:59 +02:00
Jan Mussler d0d7a32d52
Clearing up error on resize failure message. (#1641)
* Clearing up error message.
2021-10-08 17:11:21 +02:00
Felix Kunde 62ed7e470f
improve pooler sync (#1593)
* remove role from installLookupFunction and run it on database sync, too
* fix condition to decide on syncing pooler
* trigger lookup from database sync only if pooler is set
* use empty spec everywhere and do not sync if one lookupfunction was passed
* do not sync pooler after being disabled
2021-08-27 12:41:37 +02:00
Aaron Peschel 1dd0cd9691
Add Support for Azure WAL-G Backups (#1537)
This commit adds support for using an Azure storage account as a backup
location.

It uses the existing GCS functionality as a reference for what to do,
and follows the example set by GCS as closely as possible.

The decision to name the cloud provider key "aws_or_gcp" is unfortunate
while adding support for Azure, but I have left it alone to allow for
this changeset to be backwards compatible.
2021-08-26 14:59:03 +02:00
John Rood 2d2ce6197b
Add volume selector (#1385)
* Add volume selector
* Add slightly better documentation and gofmt changes
* Update generated deepcopy
* Add test for PV selector

Co-authored-by: John Rood <j.rood@picturae.com>
2021-08-26 14:57:54 +02:00
Quan Hoang 1b3366e9f4
Support affinity in connection pooler deployments (#1464) 2021-08-24 15:25:03 +02:00
Felix Kunde f0815fc5bd
remove debug log of Spilo env vars (#1591) 2021-08-23 15:44:34 +02:00
Felix Kunde 282b6d2863
allow secrets of default users in a different namespace (#1581)
* allow secrets of default users in a different namespace
* add warning in case secretNamespace is ignored
2021-08-18 16:00:26 +02:00
Felix Kunde 66620d5049
refactor restarting instances (#1535)
* refactor restarting instances and reduce listPods calls
* only add parameters to set if it differs from effective config
* update e2e test for updating Postgres config
* patch config only once
2021-08-09 16:23:41 +02:00
Felix Kunde 58bab073da
fix searching for users with namespace in name (#1569)
* fix searching for users with namespace in name and improve e2e test
* remove reformatting username to query
2021-07-27 09:46:55 +02:00
Rafia Sabih fa604027cf
Move flag to configmap (#1540)
* Move flag to configmap

Co-authored-by: Rafia Sabih <rafia.sabih@zalando.de>
Co-authored-by: Felix Kunde <felix-kunde@gmx.de>
2021-07-02 08:46:21 +02:00
Jan Mussler 330c2c4c0b
Do not modify if values are below gp3 minimum throughput. (#1543)
* Do not modify if values are below gp3 minimum throughput.
2021-06-30 15:01:55 +02:00
Felix Kunde 54e506c00b
define default access privileges for default users too (#1512)
* define default access privileges for default users too
* extend docs on defaultUsers
2021-06-22 16:45:28 +02:00
Sergey Dudoladov 53fb540c35
Add basic retry around switchover (#1510)
* add basic retry around switchover


Co-authored-by: Sergey Dudoladov <sergey.dudoladov@gmail.com>
2021-06-17 08:48:26 +02:00
Igor Yanchenko ebb3204cdd
restart instances via rest api instead of recreating pods, fixes bug with being unable to decrease some values, like max_connections (#1103)
* restart instances via rest api instead of recreating pods
* Ignore differences in bootstrap.dcs when compare SPILO_CONFIGURATION
* isBootstrapOnlyParameter is rewritten, instead of whitelist it uses blacklist
* added e2e test for max_connections decreasing
* documentation updated
* pending_restart flag added to restart api call, wait fot ttl seconds after restart
* refactoring, /restart returns error if pending_restart is set to true and patroni is not pending restart
* restart postgresql instances within pods only if pod's restart is not required
* patroni might need to restart postgresql after pods were recreated if values like max_connections decreased
* instancesRestart is not critical, try to restart pods if not successful
* cleanup

Co-authored-by: Felix Kunde <felix-kunde@gmx.de>
2021-06-14 11:00:58 +02:00
Rafia Sabih 75a9e2be38
Create cross namespace secrets (#1490)
* Create cross namespace secrets

* add test cases

* fixes

* Fixes
- include namespace in secret name only when namespace is provided
- use username.namespace as key to pgUsers only when namespace is
  provided
- avoid conflict in the role creation in db by checking namespace
  alongwith the username

* Update unit tests

* Fix test case

* Fixes

- update regular expression for usernames
- add test to allow check for valid usernames
- create pg roles with namespace (if any) appended in rolename

* add more test cases for valid usernames

* update docs

* fixes as per review comments

* update e2e

* fixes

* Add toggle to allow namespaced secrets

* update docs

* comment update

* Update e2e/tests/test_e2e.py

* few minor fixes

* fix unit tests

* fix e2e

* fix e2e attempt 2

* fix e2e

Co-authored-by: Rafia Sabih <rafia.sabih@zalando.de>
Co-authored-by: Felix Kunde <felix-kunde@gmx.de>
2021-06-11 10:35:30 +02:00
Felix Kunde dd9c3907b7
pick first container if postgres is not found (#1505)
* pick first container if postgres is not found

* minor change
2021-05-28 11:44:10 +02:00
Felix Kunde 7884af2d59
get postgres container by name, not index (#1504) 2021-05-27 18:56:58 +02:00
Felix Kunde 48cdca645d
rework additional volume test (#1502) 2021-05-27 18:37:30 +02:00
Quan Hoang af5378eea5
Mount additional volumes to 'postgres' container when 'targetContains` is an empty list (#1475)
* Mount additional volumes to 'postgres' container when 'targetContainers' is an empty list

Co-authored-by: Felix Kunde <felix-kunde@gmx.de>

Co-authored-by: Felix Kunde <felix-kunde@gmx.de>
2021-05-27 14:56:14 +02:00
Felix Kunde eeb59c5bfd
Rename roles that are removed from PostgresTeam CRD (#1457)
* rename db roles that are removed from manifests

* extend PostgresTeam e2e test

* make suffix configurable and add deprecated field to pgUser struct

* deny LOGIN from deprecated roles

* update feature documentation
2021-05-21 15:49:39 +02:00
Quan Hoang 18e2efe4e3
Update sts when modifying additionalVolumes (#1474) 2021-05-10 12:16:47 +02:00
Felix Kunde f0f7f25d30
Fix go lint errors (#1468)
* fix linter errors
* fix linter errors in kubectl plugin
* update PyYAML dependency in e2e tests
* declare a testVolume in volume_test
2021-05-10 11:48:03 +02:00
Felix Kunde 32e6c135b9
replace statefulset on annotation diff (#1449)
* replace statefulset on annotation diff
* remove update annotation function for statefulset
* add unit test for syncing annotations
* add inherited annotation to unit test
2021-04-22 11:22:52 +02:00
Felix Kunde 6b73ac4282
fix pooler sync with empty cluster name (#1448) 2021-04-09 14:08:28 +02:00
Felix Kunde c18241f187
Bump v1.6.2 (#1433)
* helm chart remove 1.6.0 archive from 1.6.0 archive

* bump operator to v1.6.2

* fix pointer deref

* skip connection pooler sync when empty

* revert pooler change and minor update to version msg

* do not log query on error when creating or altering users
2021-04-01 11:53:07 +02:00
neelasha-09 9e93c0a4ef
Fix for AllowPrivilegeEscalation : issue-1403 (#1412)
* Fix for AllowPrivilegeEscalation : issue-1403

* fixed syntax error

* Aligned the value for parameter

* Aligned the value for parameter

* Update crds.go

* Aligned the parameter spilo_allow_privilege_escalation

* Parameters sorted in Alphabetical order in manifests yaml

* Parameters sorted in Alphabetical order in manifests yaml

* Update pkg/controller/operator_config.go

* Update docs/reference/operator_parameters.md

Co-authored-by: Neelam Sharma <neelasha@amdocs.com>
Co-authored-by: Felix Kunde <felix-kunde@gmx.de>
2021-03-29 10:37:59 +02:00
machine424 78bfba85d2
create global default privileges in the appropriate prepared databases (#1421) 2021-03-26 14:19:26 +01:00
Felix Kunde c9acd52700
Major version upgrade config (#1386)
* reflect new major version upgrade options everywhere

* emit events during major version upgrade
2021-03-09 15:28:15 +01:00
Felix Kunde ff8143770c
Improve rolling upgrades and rolling upgrade continue (#1341)
* add TODOs for moving rooling update label on pods
* steer rolling update via pod annotation
* rename patch method and fix reading flag on sync
* pass only pods to recreatePods function
* do not take address of iterator if you use it later
* add e2e test and pass switchover targets to recreatePods
* add wait_for_pod_failover for e2e test
* add one more e2e test case
* helm chart remove 1.6.0 archive from 1.6.0 archive
* reflect code review feedback
2021-02-26 15:38:58 +01:00
Jan Mussler e837751ae0
Log result. (#1387) 2021-02-26 14:54:26 +01:00