Commit Graph

403 Commits

Author SHA1 Message Date
Jan Mußler 1e10549e43 More debug output for troubleshooting of missing sync. 2020-11-04 21:32:09 +01:00
Rafia Sabih 9de57f0182 fix unit tests and other minor changes 2020-11-04 07:56:34 +01:00
Rafia Sabih 2b44eff294 fix unit test and delete secret when required only 2020-11-03 15:57:24 +01:00
Rafia Sabih 041b9eec52 Enhance e2e test and other fixes 2020-11-03 15:36:52 +01:00
Rafia Sabih fd18e02f2a merge master 2020-10-29 13:36:07 +01:00
Jan Mussler c694a72352
Make failure in retry a warning not an error. (#1188) 2020-10-29 13:12:25 +01:00
Rafia Sabih 133f77c330
Refactor connection pooler (#1159)
Refactor connection_pooler code

- Move all the relevant code to a separate file
- Move all the related tests to a separate file
- Avoid using cluster where not required
- Simplify the logic in sync and other methods
- Cleanup of duplicated or unused code

Co-authored-by: Rafia Sabih <rafia.sabih@zalando.de>
2020-10-29 12:05:53 +01:00
Felix Kunde d658b9672e
PostgresTeam CRD for advanced team management (#1165)
* PostgresTeamCRD for advanced team management

* rework internal structure to be closer to CRD

* superusers instead of admin

* add more util functions and unit tests

* fix initHumanUsers

* check for superusers when creating normal teams

* polishing and fixes

* adding the essential missing pieces

* add documentation and update rbac

* reflect some feedback

* reflect more feedback

* fixing debug logs and raise QueueResyncPeriodTPR

* add two more flags to disable CRD and its superuser support

* fix chart

* update go modules

* move to client 1.19.3 and update codegen
2020-10-28 10:40:10 +01:00
Jan Mussler 3a86dfc8bb
End 2 End tests speedup (#1180)
* Improving end 2 end tests, especially speed of execution and error, by implementing proper eventual asserts and timeouts.
* Add documentation for running individual tests
* Fixed string encoding in Patroni state check and error case
* Printing config as multi log line entity, makes it readable and grepable on startup
* Cosmetic changes to logs. Removed quotes from diff. Move all object diffs to text diff. Enabled padding for log level.
* Mount script with tools for easy log access and watching objects.
* Set proper update strategy for Postgres operator deployment.
* Move long running test to end. Move pooler test to new functions.
* Remove quote from valid K8s identifiers.
2020-10-28 10:04:33 +01:00
Rafia Sabih 2d20982913 conflict resolution for master merge 2020-10-27 17:42:23 +01:00
preved911 d9f5d1c9df
changed PodEnvironmentSecret location namespace (#1177)
Signed-off-by: Ildar Valiullin <preved.911@gmail.com>
2020-10-22 08:49:30 +02:00
Dmitry Dolgov 1f5d0995a5
Lookup function installation (#1171)
* Lookup function installation

Due to reusing a previous database connection without closing it, lookup
function installation process was skipping the first database in the
list, installing twice into postgres db instead. To prevent that, make
internal initDbConnWithName to overwrite a connection object, and return
the same object only from initDbConn, which is sort of public interface.

Another solution for this would be to modify initDbConnWithName to
return a connection object and then generate one temporary connection
for each db. It sounds feasible, but after one attempt it seems to
require more changes around init and close of connections, and it
doesn't bring anything significantly better to the table. If
future changes prove this wrong, do not hesitate to refactor.

Change retry strategy to a more insistent one, namely:

* retry on the next sync even if we failed to process one database and
install pooler appliance.

* perform the whole installation unconditionally on update, since the
list of target databases could be changed.

And for the sake of making it even more robust, also log the case when
operator decides to skip installation.

Extend connection pooler e2e test with verification that all dbs have
required schema installed.
2020-10-19 16:18:58 +02:00
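The connection handling described in this commit can be sketched roughly as follows. Only the names `initDbConn` and `initDbConnWithName` come from the message; the `conn` and `cluster` types, the signatures, and `installLookupFunction` are simplified assumptions for illustration, not the operator's actual code.

```go
package main

import "fmt"

// conn stands in for a database connection; only the target name matters here.
type conn struct{ db string }

// cluster is a hypothetical holder of one shared connection.
type cluster struct{ pg *conn }

// initDbConn is the public entry point: it returns the existing connection
// if there is one and only opens a new one when none exists.
func (c *cluster) initDbConn() *conn {
	if c.pg == nil {
		c.initDbConnWithName("postgres")
	}
	return c.pg
}

// initDbConnWithName always overwrites the stored connection, so a leftover
// connection to another database can never be reused by accident.
func (c *cluster) initDbConnWithName(db string) {
	c.pg = &conn{db: db}
}

// installLookupFunction reconnects for every database and records which
// database each installation actually ran against.
func (c *cluster) installLookupFunction(dbs []string) []string {
	var installed []string
	for _, db := range dbs {
		c.initDbConnWithName(db)
		installed = append(installed, c.pg.db)
	}
	return installed
}

func main() {
	c := &cluster{}
	c.initDbConn() // an earlier code path leaves a connection to "postgres" open
	fmt.Println(c.installLookupFunction([]string{"app", "metrics"}))
}
```

Had `initDbConnWithName` kept an already open connection, the first iteration would have run against `postgres` instead of `app`, which is the skipped-database behaviour the commit describes.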
Dmitry Dolgov d15f2d3392
Readiness probe (#1169)
Right now there are no readiness probes defined for connection pooler,
which means after a pod restart there is a short time window (between a
container start and connection pooler starting listening to a socket)
when a service can send queries to a new pod, but connection will be
refused. The pooler container is rather lightweight and starts
listening almost immediately, so the time window is small, but it exists.

To fix this, add a readiness probe for the TCP socket opened by the
connection pooler.
2020-10-15 10:16:42 +02:00
Rafia Sabih d2c410d72b Add labels 2020-10-05 12:10:02 +02:00
Rafia Sabih a6ffdbae36 Cleanup deleteConnectionPooler 2020-10-05 12:06:58 +02:00
Rafia Sabih 86e6a51fa9 Merge branch 'master' into replica-pooler 2020-10-05 11:47:47 +02:00
Sergey Dudoladov 2a21cc4393
Compare Postgres pod priority on Sync (#1144)
* compare Postgres pod priority on Sync

Co-authored-by: Sergey Dudoladov <sergey.dudoladov@zalando.de>
2020-09-23 17:26:56 +02:00
Rafia Sabih 683cb15b0d Add and update tests 2020-09-23 17:09:36 +02:00
Rafia Sabih 1ee79887cf Add sync test 2020-09-23 11:19:01 +02:00
neelasha-09 ab95eaa6ef
Fixes #1130 (#1139)
* Fixes #1130

Co-authored-by: Felix Kunde <felix-kunde@gmx.de>
2020-09-22 17:16:05 +02:00
Rafia Sabih 033a7f4e24 Update tests 2020-09-21 22:55:10 +02:00
Rafia Sabih 2936ed0060 Refactor needConnectionPooler
Have one unified function to tell if any connection pooler is required

Add a helper function to list the roles that require connection pooler,
helps in avoiding duplication of code
2020-09-21 16:40:40 +02:00
Rico Berger d09e418b56
Set user and group in security context (#1083)
* Set user and group in security context
2020-09-15 13:27:59 +02:00
Igor Yanchenko d8884a4003
Allow to overwrite default ExternalTrafficPolicy for the service (#1136)
* Allow to overwrite default ExternalTrafficPolicy for the service
2020-09-15 13:19:22 +02:00
Rafia Sabih 7c9b459919 Minor fix 2020-09-11 10:22:29 +02:00
Rafia Sabih 46ff4bb738 Resolve review comments 2020-09-11 10:17:43 +02:00
Rafia Sabih 770fc1e612 Improvements in tests
- Fixed the issue with failing test cases
- Add more test cases for replica connection pooler
- Added docs about the new flag
2020-09-08 17:32:47 +02:00
Rafia Sabih b3dbac5b81 Adding test cases and other changes
- Refactor needConnectionPooler for master and replica separately
- Improve sync function
- Add test cases to create, delete and sync with replica connection
  pooler

Other changes
2020-09-04 17:32:35 +02:00
Rafia Sabih 374dd00538 Fix sync 2020-09-04 08:02:27 +02:00
Rafia Sabih 1814342dc3 Assorted changes
- Update deleteConnectionPooler to include role
- Rename EnableMasterConnectionPooler back to original name for backward
  compatibility
- other minor changes and code improvements
2020-09-03 16:01:25 +02:00
Rafia Sabih 503082cf1a fix for labels selector 2020-09-03 09:32:48 +02:00
Rafia Sabih a9248b1379 Fix labels for the replica pods 2020-09-02 17:17:56 +02:00
Rafia Sabih fb49376085 Enable connection pooler for replica
- Refactor code for connection pooler deployment and services
- Refactor sync code for connection pooler
- Rename EnableConnectionPooler to EnableMasterConnectionPooler
- Update yamls and tests
2020-09-02 13:46:36 +02:00
Rafia Sabih 83ddd5c85b Add new pooler service for replica 2020-08-28 14:55:49 +02:00
Rafia Sabih 3a906aba93 Add pooler for replica 2020-08-28 13:25:46 +02:00
Felix Kunde dfd0dd90ed
set search_path for default roles (#1065)
* set search_path for default roles

* deployment back to 1.5.0

Co-authored-by: Felix Kunde <felix.kunde@zalando.de>
2020-08-11 10:42:31 +02:00
Felix Kunde 0508266219
Remove all secrets on delete incl. pooler (#1091)
* fix syncSecrets and remove pooler secret

* update log for deleteSecret

* use c.credentialSecretName(username)

* minor fix
2020-08-10 18:26:26 +02:00
Felix Kunde 43163cf83b
allow using both infrastructure_roles_options (#1090)
* allow using both infrastructure_roles_options

* new default values for user and role definition

* use robot_zmon as parent role

* add operator log to debug

* right name for old secret

* only extract if rolesDefs is empty

* set password1 in old infrastructure role

* fix new infra role secret

* choose different role key for new secret

* set memberof everywhere

* reenable all tests

* reflect feedback

* remove condition for rolesDefs
2020-08-10 15:08:03 +02:00
Felix Kunde f3ddce81d5
fix random order for pod environment tests (#1085) 2020-07-30 17:48:15 +02:00
hlihhovac 47b11f7f89
change Clone attribute of PostgresSpec to *CloneDescription (#1020)
* change Clone attribute of PostgresSpec to *ConnectionPooler

* update go.mod from master

* fix TestConnectionPoolerSynchronization()

* Update pkg/apis/acid.zalan.do/v1/postgresql_type.go

Co-authored-by: Pavlo Golub <pavlo.golub@gmail.com>
Co-authored-by: Felix Kunde <felix-kunde@gmx.de>
2020-07-30 16:31:29 +02:00
Felix Kunde 3bee590d43
fix index in TestGenerateSpiloPodEnvVars (#1084)
Co-authored-by: Felix Kunde <felix.kunde@zalando.de>
2020-07-30 13:35:37 +02:00
Christian Rohmann ece341d516
Allow pod environment variables to also be sourced from a secret (#946)
* Extend operator configuration to allow for a pod_environment_secret just like pod_environment_configmap

* Add all keys from PodEnvironmentSecrets as ENV vars (using SecretKeyRef to protect the value)

* Apply envVars from pod_environment_configmap and pod_environment_secrets before doing the global settings from the operator config. This allows them to be overridden by the user (via configmap / secret)

* Add ability to use a Secret for custom pod envVars (via pod_environment_secret) to admin documentation

* Add pod_environment_secret to Helm chart values.yaml

* Add unit tests for PodEnvironmentConfigMap and PodEnvironmentSecret - highly inspired by @kupson and his very similar PR #481

* Added new parameter pod_environment_secret to operatorconfig CRD and configmap examples

* Add pod_environment_secret to the OperatorConfiguration CRD

Co-authored-by: Christian Rohmann <christian.rohmann@inovex.de>
2020-07-30 10:48:16 +02:00
Igor Yanchenko 002b47ec32
Use scram-sha-256 hash if postgresql parameter password_encryption set to do so. (#995)
* Use scram-sha-256 hash if postgresql parameter password_encryption set to do so.

* test fixed

* Refactoring

* code style
2020-07-16 14:43:57 +02:00
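For context, a value in the format PostgreSQL stores when `password_encryption` is `scram-sha-256` can be derived as sketched below. This is a minimal Go illustration under stated assumptions, not the operator's implementation: the function names are hypothetical, the salt is fixed for the demo (a real one should be random), and the password is used as-is without SASLprep normalization.

```go
package main

import (
	"crypto/hmac"
	"crypto/sha256"
	"encoding/base64"
	"fmt"
)

// hmacSHA256 returns HMAC-SHA-256 of msg under key.
func hmacSHA256(key, msg []byte) []byte {
	m := hmac.New(sha256.New, key)
	m.Write(msg)
	return m.Sum(nil)
}

// saltedPassword is PBKDF2-HMAC-SHA-256 with a 32-byte output. Since that is
// exactly one SHA-256 block, only block index 1 has to be computed.
func saltedPassword(password, salt []byte, iterations int) []byte {
	u := hmacSHA256(password, append(append([]byte{}, salt...), 0, 0, 0, 1))
	out := append([]byte{}, u...)
	for i := 1; i < iterations; i++ {
		u = hmacSHA256(password, u)
		for j := range out {
			out[j] ^= u[j]
		}
	}
	return out
}

// scramVerifier builds "SCRAM-SHA-256$<iterations>:<salt>$<StoredKey>:<ServerKey>"
// with salt and keys base64-encoded.
func scramVerifier(password string, salt []byte, iterations int) string {
	salted := saltedPassword([]byte(password), salt, iterations)
	clientKey := hmacSHA256(salted, []byte("Client Key"))
	storedKey := sha256.Sum256(clientKey)
	serverKey := hmacSHA256(salted, []byte("Server Key"))
	b64 := base64.StdEncoding.EncodeToString
	return fmt.Sprintf("SCRAM-SHA-256$%d:%s$%s:%s",
		iterations, b64(salt), b64(storedKey[:]), b64(serverKey))
}

func main() {
	salt := []byte("0123456789abcdef") // fixed 16-byte salt, for the demo only
	fmt.Println(scramVerifier("secret", salt, 4096))
}
```

Unlike an MD5 hash, the stored value carries its own salt and iteration count, so the same password yields different verifiers for different salts.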
Felix Kunde 375963424d
delete secrets the right way (#1054)
* delete secrets the right way

* make a one function

* continue deleting secrets even if one delete fails

Co-authored-by: Felix Kunde <felix.kunde@zalando.de>
2020-07-10 15:07:42 +02:00
Igor Yanchenko 88735a798a
Resize volume by changing pvc size if enabled in config. (#958)
* Try to resize pvc if resizing pv has failed

* added config option to switch between storage resize strategies

* changes according to requests

* Update pkg/controller/operator_config.go

Co-authored-by: Felix Kunde <felix-kunde@gmx.de>

* enable_storage_resize documented

added examples to the default configuration and helm value files

* enable_storage_resize renamed to volume_resize_mode, off by default

* volume_resize_mode renamed to storage_resize_mode

* Update pkg/apis/acid.zalan.do/v1/crds.go

* pkg/cluster/volumes.go updated

* Update docs/reference/operator_parameters.md

* Update manifests/postgresql-operator-default-configuration.yaml

* Update pkg/controller/operator_config.go

* Update pkg/util/config/config.go

* Update charts/postgres-operator/values-crd.yaml

* Update charts/postgres-operator/values.yaml

* Update docs/reference/operator_parameters.md

* added logging if no changes required

Co-authored-by: Felix Kunde <felix-kunde@gmx.de>
2020-07-03 10:53:37 +02:00
Felix Kunde 0c6655a22d
skip creation later to improve visibility of errors (#1013)
* try to emit error for missing team name in cluster name

* skip creation after new cluster object

* move SetStatus to k8sclient and emit event when skipping creation and rename to SetPostgresCRDStatus

Co-authored-by: Felix Kunde <felix.kunde@zalando.de>
2020-06-17 13:32:16 +02:00
Felix Kunde fa6929f028
do not block rolling updates with lazy spilo update enabled (#1012)
* do not block rolling updates with lazy spilo update enabled

* treat initContainers like Spilo image

Co-authored-by: Felix Kunde <felix.kunde@zalando.de>
2020-06-11 12:23:39 +02:00
Felix Kunde fe7ffaa112
trigger rolling update when securityContext of PodTemplate changes (#1007)
Co-authored-by: Felix Kunde <felix.kunde@zalando.de>
2020-06-09 10:27:57 +02:00
alfredw33 2b0def5bc8
Support for GCS WAL-E backups (#620)
* Support for WAL_GS_BUCKET and GOOGLE_APPLICATION_CREDENTIALS environment variables

* Fixed merge issue but also removed all changes to support macos.

* Updated test to new format

* Missed macos specific changes

* Added documentation and addressed comments

* Update docs/administrator.md

* Update docs/administrator.md

* Update e2e/run.sh

Co-authored-by: Felix Kunde <felix-kunde@gmx.de>
2020-06-03 17:33:48 +02:00
Steffen Pøhner Henriksen 0fa61a6ab3
Changed order of sidecar env vars (#980)
* Changed order of sidecar env vars

* Cleaned up test code
2020-05-25 16:32:33 +02:00