Commit Graph

1077 Commits

Author SHA1 Message Date
Felix Kunde fe7ffaa112
trigger rolling update when securityContext of PodTemplate changes (#1007)
Co-authored-by: Felix Kunde <felix.kunde@zalando.de>
2020-06-09 10:27:57 +02:00
Felix Kunde 3c352fb460
bump pooler image and more coalescing for CRD config (#1004)
Co-authored-by: Felix Kunde <felix.kunde@zalando.de>
2020-06-05 11:14:17 +02:00
Kamil Solecki 9acdcd8bbf
Make selector match labels defined in the deployment (#1001)
Currently, the deployment manifest specifies two labels: `name` and `team`.
This fixes the service not matching the deployed pods by chosing a correct selector.
2020-06-04 16:49:22 +02:00
alfredw33 2b0def5bc8
Support for GCS WAL-E backups (#620)
* Support for WAL_GS_BUCKET and GOOGLE_APPLICATION_CREDENTIALS environtment variables

* Fixed merge issue but also removed all changes to support macos.

* Updated test to new format

* Missed macos specific changes

* Added documentation and addressed comments

* Update docs/administrator.md

* Update docs/administrator.md

* Update e2e/run.sh

Co-authored-by: Felix Kunde <felix-kunde@gmx.de>
2020-06-03 17:33:48 +02:00
Steffen Pøhner Henriksen 0fa61a6ab3
Changed order of sidecar env vars (#980)
* Changed order of sidecar env vars

* Cleaned up test code
2020-05-25 16:32:33 +02:00
Felix Kunde 3a49b485e5
delete secrets of system users too (#974) 2020-05-14 11:34:02 +02:00
Christian Rohmann 8ff7658ed3
Fix pooler delete (#960)
deleteConnectionPooler function incorrectly checks that the delete api response is ResourceNotFound. Looks like the only consequence is a confusing log message, but obviously it's wrong. Remove negation, since having ResourceNotFound as error is the good case.

Co-authored-by: Christian Rohmann <christian.rohmann@inovex.de>
2020-05-13 14:55:54 +02:00
Ask Bjørn Hansen 852f29274a
Fix typo in error message (#969) 2020-05-12 10:05:42 +02:00
Damiano Albani a5bb8d913c
Fix typo (#965) 2020-05-12 09:20:09 +02:00
Felix Kunde 62bde6faa2
fix env var in UI chart (#967)
* fix env var in UI chart

* re-include 1.4.0 in helm chart index

* fix import in UI main.py and updated images
2020-05-08 13:02:27 +02:00
Felix Kunde bb3d2fa678
Bump v1.5.0 (#954)
* bump to v1.5.0

* update helm charts and docs

* update helm charts and packages

* update images for spilo, logical-backup and pooler
2020-05-05 12:52:54 +02:00
Felix Kunde 76d43525f7
define more default values for opConfig CRD (#955) 2020-05-04 16:23:21 +02:00
Rafia Sabih d52296c323
Propagate annotations to the StatefulSet (#932)
* Initial commit

* Corrections

- set the type of the new  configuration parameter to be array of
  strings
- propagate the annotations to statefulset at sync

* Enable regular expression matching

* Improvements

-handle rollingUpdate flag
-modularize code
-rename config parameter name

* fix merge error

* Pass annotations to connection pooler deployment

* update code-gen

* Add documentation and update manifests

* add e2e test and introduce option in configmap

* fix service annotations test

* Add unit test

* fix e2e tests

* better key lookup of annotations tests

* add debug message for annotation tests

* Fix typos

* minor fix for looping

* Handle update path and renaming

- handle the update path to update sts and connection pooler deployment.
  This way no need to wait for sync
- rename the parameter to downscaler_annotations
- handle other review comments

* another try to fix python loops

* Avoid unneccessary update events

* Update manifests

* some final polishing

* fix cluster_test after polishing

Co-authored-by: Rafia Sabih <rafia.sabih@zalando.de>
Co-authored-by: Felix Kunde <felix-kunde@gmx.de>
2020-05-04 14:46:56 +02:00
Petr Barborka be208b61f1
Fix S3 backup list (#880)
* Fix S3 backup list

Co-authored-by: Petr Barborka <petr.barborka@orgis.cz>
2020-04-30 17:10:16 +02:00
Felix Kunde 5af4379118
[UI] add toggle for connection pooler (#953)
* [UI] add toggle for connection pooler

* remove team service logger

* fix new.tag.pug and change port in Makefile
2020-04-30 09:58:07 +02:00
Felix Kunde 865d5b41a7
set event broadcasting to Infof and update rbac (#952) 2020-04-29 17:26:46 +02:00
Felix Kunde d76203b3f9
Bootstrapped databases with best practice role setup (#843)
* PreparedDatabases with default role setup

* merge changes from master

* include preparedDatabases spec check when syncing databases

* create a default preparedDB if not specified

* add more default privileges for schemas

* use empty brackets block for undefined objects

* cover more default privilege scenarios and always define admin role

* add DefaultUsers flag

* support extensions and defaultUsers for preparedDatabases

* remove exact version in deployment manifest

* enable CRD validation for new field

* update generated code

* reflect code review

* fix typo in SQL command

* add documentation for preparedDatabases feature + minor changes

* some datname should stay

* add unit tests

* reflect some feedback

* init users for preparedDatabases also on update

* only change DB default privileges on creation

* add one more section in user docs

* one more sentence
2020-04-29 10:56:06 +02:00
Sergey Dudoladov cc635a02e3
Lazy upgrade of the Spilo image (#859)
* initial implementation

* describe forcing the rolling upgrade

* make parameter name more descriptive

* add missing pieces

* address review

* address review

* fix bug in e2e tests

* fix cluster name label in e2e test

* raise test timeout

* load spilo test image

* use available spilo image

* delete replica pod for lazy update test

* fix e2e

* fix e2e with a vengeance

* lets wait for another 30m

* print pod name in error msg

* print pod name in error msg 2

* raise timeout, comment other tests

* subsequent updates of config

* add comma

* fix e2e test

* run unit tests before e2e

* remove conflicting dependency

* Revert "remove conflicting dependency"

This reverts commit 65fc09054b.

* improve cdp build

* dont run unit before e2e tests

* Revert "improve cdp build"

This reverts commit e2a8fa12aa.

Co-authored-by: Sergey Dudoladov <sergey.dudoladov@zalando.de>
Co-authored-by: Felix Kunde <felix-kunde@gmx.de>
2020-04-29 10:07:14 +02:00
Marcus Portmann 0016ebf7df
Allow nodePort value for the postgres-operator-ui service (#928)
* Added support for specifying a nodePort value for the postgres-operator-ui service when the type is NodePort

Co-authored-by: Marcus Portmann <marcusp@discovery.co.za>
2020-04-29 07:28:35 +02:00
Felix Kunde 1d009d9595
bump spilo and pooler version + update docs (#945) 2020-04-28 16:01:13 +02:00
Sergey Dudoladov 0ca30ba3d9
fix params in function call (#939)
Co-authored-by: Sergey Dudoladov <sergey.dudoladov@zalando.de>
2020-04-28 09:31:41 +02:00
Björn Fischer 168abfe37b
Fully speced global sidecars (#890)
* implement fully speced global sidecars

* fix issue #924
2020-04-27 17:40:22 +02:00
siku4 f32c615a53
fix typo in additionalVolume struct (#933)
* fix typo in additionalVolume struct

Co-authored-by: siku4 <sk@sik-net.de>
2020-04-27 12:22:42 +02:00
Christian Rohmann 21b9b6fcbe
Emit K8S events to the postgresql CR as feedback to the requestor / user (#896)
* Add EventsGetter to KubeClient to enable to sending K8S events

* Add eventRecorder to the controller, initialize it and hand it down to cluster via its constructor to enable it to emit events this way

* Add first set of events which then go to the postgresql custom resource the user interacts with to provide some feedback

* Add right to "create" events to operator cluster role

* Adapt cluster tests to new function sigurature with eventRecord (via NewFakeRecorder)

* Get a proper reference before sending events to a resource

Co-authored-by: Christian Rohmann <christian.rohmann@inovex.de>
2020-04-27 08:22:07 +02:00
Sergey Dudoladov 3c91bdeffa
Re-create pods only if all replicas are running (#903)
* adds a Get call to Patroni interface to fetch state of a Patroni member
* postpones re-creating pods if at least one replica is currently being created  

Co-authored-by: Sergey Dudoladov <sergey.dudoladov@zalando.de>
Co-authored-by: Felix Kunde <felix-kunde@gmx.de>
2020-04-20 15:14:11 +02:00
ReSearchITEng 5014eebfb2
when kubernetes_use_configmaps -> skip further endpoints actions even delete (#921)
* further compatibility with k8sUseConfigMaps - skip further endpoints related actions

* Update pkg/cluster/cluster.go

thanks!

Co-Authored-By: Felix Kunde <felix-kunde@gmx.de>

* Update pkg/cluster/cluster.go

Co-Authored-By: Felix Kunde <felix-kunde@gmx.de>

* Update pkg/cluster/cluster.go

Co-authored-by: Felix Kunde <felix-kunde@gmx.de>
2020-04-16 16:47:59 +02:00
Dmitry Dolgov 6a689cdc1c
Prevent empty syncs (#922)
There is a possibility to pass nil as one of the specs and an empty spec
into syncConnectionPooler. In this case it will perfom a syncronization
because nil != empty struct. Avoid such cases and make it testable by
returning list of syncronization reasons on top together with the final
error.
2020-04-16 15:14:31 +02:00
ReSearchITEng 7e8f6687eb
make tls pr798 use additionalVolumes capability from pr736 (#920)
* make tls pr798 use additionalVolumes capability from pr736

* move the volume* sections lower

* update helm chart crds and docs

* fix user.md typos
2020-04-15 15:24:55 +02:00
Thierry Sallé ea3eef45d9
Additional volumes capability (#736)
* Allow additional Volumes to be mounted

* added TargetContainers option to determine if additional volume need to be mounter or not

* fixed dependencies

* updated manifest additional volume example

* More validation

Check that there are no volume mount path clashes or "all" vs ["a", "b"]
mixtures. Also change the default behaviour to mount to "postgres"
container.

* More documentation / example about additional volumes

* Revert go.sum and go.mod from origin/master

* Declare addictionalVolume specs in CRDs

* fixed k8sres after rebase

* resolv conflict

Co-authored-by: Dmitrii Dolgov <9erthalion6@gmail.com>
Co-authored-by: Thierry <thierry@malt.com>
2020-04-15 09:13:35 +02:00
Dmitry Dolgov a1f2bd05b9
Prevent superuser from being a connection pool user (#906)
* Protected and system users can't be a connection pool user

It's not supported, neither it's a best practice. Also fix potential null
pointer access. For protected users it makes sense by intent of protecting this
users (e.g. from being overriden or used as something else than supposed). For
system users the reason is the same as for superuser, it's about replicastion
user and it's under patroni control.

This is implemented on both levels, operator config and postgresql manifest.
For the latter we just use default name in this case, assuming that operator
config is always correct. For the former, since it's a serious
misconfiguration, operator panics.
2020-04-09 09:21:45 +02:00
ReSearchITEng 7232326159
Fix val docs (#901)
* missing quotes in pooler configmap in values.yaml

* missing quotes in pooler configmap in values-crd.yaml

* docs clarifications

* helm3 --skip-crds

* Update docs/user.md

Co-Authored-By: Felix Kunde <felix-kunde@gmx.de>

* details moved in docs

Co-authored-by: Felix Kunde <felix-kunde@gmx.de>
2020-04-09 09:16:45 +02:00
Leon Albers 4dee8918bd
Allow configuration of patroni's replication mode (#869)
* Add patroni parameters for `synchronous_mode`

* Update complete-postgres-manifest.yaml, removed quotation marks

* Update k8sres_test.go, adjust result for `Patroni configured`

* Update k8sres_test.go, adjust result for `Patroni configured`

* Update complete-postgres-manifest.yaml, set synchronous mode to false in this example

* Update pkg/cluster/k8sres.go

Does the same but is shorter. So we fix that it if you like.

Co-Authored-By: Felix Kunde <felix-kunde@gmx.de>

* Update docs/reference/cluster_manifest.md

Co-Authored-By: Felix Kunde <felix-kunde@gmx.de>

* Add patroni's `synchronous_mode_strict`

* Extend `TestGenerateSpiloConfig` with `SynchronousModeStrict`

Co-authored-by: Felix Kunde <felix-kunde@gmx.de>
2020-04-06 14:27:17 +02:00
Felix Kunde 64389b8bad
update image and docs for connection pooler (#898) 2020-04-03 16:28:36 +02:00
ReSearchITEng 1249626a60
kubernetes_use_configmap (#887)
* kubernetes_use_configmap

* Update manifests/postgresql-operator-default-configuration.yaml

Co-Authored-By: Felix Kunde <felix-kunde@gmx.de>

* Update manifests/configmap.yaml

Co-Authored-By: Felix Kunde <felix-kunde@gmx.de>

* Update charts/postgres-operator/values.yaml

Co-Authored-By: Felix Kunde <felix-kunde@gmx.de>

* go.fmt

Co-authored-by: Felix Kunde <felix-kunde@gmx.de>
2020-04-02 13:20:45 +02:00
Felix Kunde b43b22dfcc
Call me pooler, not pool (#883)
* rename pooler parts and add example to manifest
* update codegen
* fix manifest and add more details to docs
* reflect renaming also in e2e tests
2020-04-01 10:34:03 +02:00
Felix Kunde e6eb10d28a
fix TestTLS (#894) 2020-04-01 10:31:31 +02:00
ReSearchITEng 6ed1030838
TLS - add OpenShift compatibility (#885)
* solves https://github.com/zalando/postgres-operator/pull/798#issuecomment-605201260
Co-authored-by: Felix Kunde <felix-kunde@gmx.de>
2020-04-01 09:39:54 +02:00
Felix Kunde 64d816c556
add short sleep before redistributing pods (#891) 2020-03-31 11:47:39 +02:00
Felix Kunde 66f2cda87f
Move operator to go 1.14 (#882)
* update go modules march 2020
* update to GO 1.14
* reflect k8s client API changes
2020-03-30 15:50:17 +02:00
Felix Kunde ba9cf68650
Change type of pod environment config map to NamespacedName (#870)
* allow PodEnvironmentConfigMap in other namespaces
* update codegen
* update docs and comments
2020-03-25 15:59:31 +01:00
Dmitry Dolgov 9dfa433363
Connection pooler (#799)
Connection pooler support

Add support for a connection pooler. The idea is to make it generic enough to
be able to switch between different implementations (e.g. pgbouncer or
odyssey). Operator needs to create a deployment with pooler and a service for
it to access.

For connection pool to work properly, a database needs to be prepared by
operator, namely a separate user have to be created with an access to an
installed lookup function (to fetch credential for other users).

This setups is supposed to be used only by robot/application users. Usually a
connection pool implementation is more CPU bounded, so it makes sense to create
several pods for connection pool with more emphasize on cpu resources. At the
moment there are no special affinity or tolerations assigned to bring those
pods closer to the database. For availability purposes minimal number of
connection pool pods is 2, ideally they have to be distributed between
different nodes/AZ, but it's not enforced in the operator itself. Available
configuration supposed to be ergonomic and in the normal case require minimum
changes to a manifest to enable connection pool. To have more control over the
configuration and functionality on the pool side one can customize the
corresponding docker image.

Co-authored-by: Felix Kunde <felix-kunde@gmx.de>
2020-03-25 12:57:26 +01:00
Felix Kunde 579f78864b
pass cluster labels as JSON to Spilo (#877) 2020-03-25 09:59:54 +01:00
Felix Kunde cc1ffdc7b6
enable controllerID for chart and allow configurable pod cluster role (#876) 2020-03-25 09:31:30 +01:00
Felix Kunde 07c5da35e3
fix minor issues in docs and manifests (#866)
* fix minor issues in docs and manifests
* double retry_timeout_sec
2020-03-18 15:02:13 +01:00
Fredrik Østrem 9ddee8f302
Use cryptographically secure password generation (#854)
The current password generation algorithm is extremely deterministic, due to being based on the standard random number generator with a deterministic seed based on the current Unix timestamp (in seconds).

This can lead to a number of security issues, including:

The same passwords being used in different Kubernetes clusters if the operator is deployed in parallel. (This issue was discovered because of four deployments having the same generated passwords due to automatically being deployed in parallel.)
The passwords being easily guessable based on the time the operator pod started when the database was created. (This would typically be present in logs, metrics, etc., that may typically be accessible to more people than should have database access.)
Fix this issue by replacing the current randomness source with crypto/rand, which should produce cryptographically secure random data that is virtually unguessable. This will avoid both of the above problems as each deployment will be guaranteed to have unique, indeterministic passwords.
2020-03-18 10:28:39 +01:00
Felix Kunde cf829df1a4
define ownership between operator and clusters via annotation (#802)
* define ownership between operator and postgres clusters
* add documentation
* add unit test
2020-03-17 16:34:31 +01:00
Dmitry Dolgov d666c52172
ClusterDomain default (#863)
* add json:omitempty option to ClusterDomain

* Add default value for ClusterDomain

Unfortunately, omitempty in operator configuration CRD doesn't mean that
defauls from operator config object will be picked up automatically.
Make sure that ClusterDomain default is specified, so that even when
someone will set cluster_domain = "", it will be overwritted with a
default value.

Co-authored-by: mlu42 <mlu42pro@gmail.com>
2020-03-13 11:51:39 +01:00
Felix Kunde b66734a0a9
omit PgVersion diff on sync (#860)
* use PostgresParam.PgVersion everywhere
* on sync compare pgVersion with SpiloConfiguration
* update getNewPgVersion and added tests
2020-03-13 11:48:19 +01:00
zimbatm 65fb2ce1a6
add support for custom TLS certificates (#798)
* add support for custom TLS certificates
2020-03-13 11:44:38 +01:00
grantlanglois 650b8daf77
add json:omitempty option to ClusterDomain (#851)
Co-authored-by: mlu42 <mlu42pro@gmail.com>
2020-03-12 12:12:53 +01:00