Commit Graph

523 Commits

Author SHA1 Message Date
Jonathan Juares Beber ba60e15d07 Add ServiceAnnotations cluster config (#803)
The [operator parameters][1] already support the
`custom_service_annotations` config.With this parameter is possible to
define custom annotations that will be used on the services created by the
operator. The `custom_service_annotations` as all the other
[operator parameters][1] are defined on the operator level and do not allow
customization on the cluster level. A cluster may require different service
annotations, as for example, set up different cloud load balancers
timeouts, different ingress annotations, and/or enable more customizable
environments.

This commit introduces a new parameter on the cluster level, called
`serviceAnnotations`, responsible for defining custom annotations just for
the services created by the operator to the specifically defined cluster.
It allows a mix of configuration between `custom_service_annotations` and
`serviceAnnotations` where the latest one will have priority. In order to
allow custom service annotations to be used on services without
LoadBalancers (as for example, service mesh services annotations) both
`custom_service_annotations` and `serviceAnnotations` are applied
independently of load-balancing configuration. For retro-compatibility
purposes, `custom_service_annotations` is still under
[Load balancer related options][2]. The two default annotations when using
LoadBalancer services, `external-dns.alpha.kubernetes.io/hostname` and
`service.beta.kubernetes.io/aws-load-balancer-connection-idle-timeout` are
still defined by the operator.
`service.beta.kubernetes.io/aws-load-balancer-connection-idle-timeout` can
be overridden by `custom_service_annotations` or `serviceAnnotations`,
allowing a more customizable environment.
`external-dns.alpha.kubernetes.io/hostname` can not be overridden once
there is no differentiation between custom service annotations for
replicas and masters.

It updates the documentation and creates the necessary unit and e2e
tests to the above-described feature too.

[1]: https://github.com/zalando/postgres-operator/blob/master/docs/reference/operator_parameters.md
[2]: https://github.com/zalando/postgres-operator/blob/master/docs/reference/operator_parameters.md#load-balancer-related-options
2020-02-10 12:03:25 +01:00
Vito Botta a660d758a5 Add region setting for logical backups to non-AWS storage (#813)
* Add region setting for logical backups to non-AWS storage
2020-02-10 11:48:24 +01:00
Felix Kunde 1f0312a014
make minimum limits boundaries configurable (#808)
* make minimum limits boundaries configurable
* add e2e test
2020-02-03 11:43:18 +01:00
Felix Kunde 7fb163252c
standby clusters can only have 1 pod for now (#797) 2020-01-16 10:47:34 +01:00
Felix Kunde cd110aabf4
Enforce minimum cpu and memory limits (#731)
* add validation for PG resources and volume size
* check resource requests also on UPDATE and SYNC + update docs
* if cluster was running don't error on sync
2019-12-12 16:43:55 +01:00
Felix Kunde 107334fe71
Add global option to enable/disable init containers and sidecars (#478)
* Add global option to enable/disable init containers and sidecars
* update dependencies
2019-12-10 15:45:54 +01:00
Armin Nesiren 5f87384d7f Passing endpoint, access and secret key to logical-backup container (#628)
* Added possibility to add custom annotations to LoadBalancer service.

* Added parameters for custom endpoint, access and secret key for logical backup.

* Modified dump.sh so it knows how to handle new features. Configurable S3 SSE
2019-11-26 10:40:49 +01:00
Felix Kunde 2ce602fcd7 fix errors when changing service type (#716)
* fix errors when changing service type

* nullify service and endpoint before recreation

* improve wait for delete logic and reuse config parameters
2019-11-26 10:28:32 +01:00
Felix Kunde f9487e41c1 inject cluster name label into logical backup pod (#725)
* inject cluster name label into logical backup pod
2019-11-20 13:58:41 +01:00
Felix Kunde 0b544ae43f
pass additionalSecretMount to logical backup pod (#714) 2019-11-19 18:06:55 +01:00
Thomas Runyon 535517cd1b Custom annotations 329 (#657)
* Add ability for custom annotations to database pods
2019-11-11 10:45:35 +01:00
Eric 6e682fd6b5 Fixing spelling mistake in delete PVC function name (#691) 2019-10-18 16:41:56 +02:00
Felix Kunde f0e29060b1
move StatefulSet to apps/v1 (#675) 2019-09-30 16:42:04 +02:00
Felix Kunde 4a863d2280 Avoid orphaned objects on delete (#654)
* Make setSpec function work correctly when updating cluster status fails
2019-08-27 12:54:35 +02:00
Felix Kunde 1d45a6aec3
change app label for logical backup pod (#621)
* change app label for logical backup pod
2019-07-23 15:43:07 +02:00
Felix Kunde 2c3c7fd244
query namespaced K8s API in logical backup script (#623) 2019-07-18 14:00:30 +02:00
Felix Kunde 3a914f9a3c
camelCasing all manifest parameters (#602)
* deprecate snake_case manifest parameters
* move backward compatible check and update test
2019-07-05 18:14:03 +02:00
Felix Kunde 36003b8264
enable shmVolume setting in OperatorConfiguration (#605)
* enable shmVolume setting in OperatorConfiguration
2019-07-05 16:48:37 +02:00
Rafia Sabih 540d58d5bd
Adding the support for standby cluster
This will set up a continuous wal streaming cluster, by adding the corresponding section in postgres manifest. Instead of having a full-fledged standby cluster as in Patroni, here we use only the wal path of the source cluster and stream from there.

Since, standby cluster is streaming from the master and does not require to create or use databases of it's own. Hence, it bypasses the creation of users or databases.

There is a separate sample manifest added to set up a standby-cluster.
2019-06-21 10:11:39 +02:00
Markus 93bfed3e75 Add secret mount to operator (#535)
* add secret mount to operator
2019-06-19 12:40:49 +02:00
Felix Kunde 6918394562
Add PDB configuration toggle (#583)
* Don't create an impossible disruption budget for smaller clusters.
* sync PDB also on update
2019-06-18 10:48:21 +02:00
Maxim Ivanov 3553144cda Support subPath in generated container (#452)
* mounted volumes now provide a subPath
2019-06-17 15:49:01 +02:00
Erik Inge Bolsø c65a9baedf specify ReadOnlyRootFilesystem: false for pod security policies (#560)
Explicitly specify ReadOnlyRootFilesystem: false so kubernetes can pick
a less restrictive policy the operator has access to.
2019-06-17 14:03:33 +02:00
teuto.net Netzdienste GmbH bbf28c4df7 Add additional S3 settings for cloning (#497) 2019-06-14 12:28:00 +02:00
Rafia Sabih 2886027516
Some typos/spelling mistakes fix (#580)
Harmless typos fix.
2019-06-06 14:20:15 +02:00
Aaron Miller ec5b1d4d58 StatefulSet fsGroup config option to allow non-root spilo (#531)
* StatefulSet fsGroup config option to allow non-root spilo

* Allow Postgres CRD to overide SpiloFSGroup of the Operator.

* Document FSGroup of a Pod cannot be changed after creation.
2019-06-04 16:38:26 +02:00
Erik Inge Bolsø ebda39368e database.go: remove hardcoded .svc.cluster.local dns suffix (#561)
* database.go: substitute hardcoded .svc.cluster.local dns suffix with config parameter

Use the pod's configured dns search path, for clusters where .svc.cluster.local is not correct.
2019-05-31 16:32:00 +02:00
Felix Kunde 24d412a562
generate spilo config can return error (with test) (#570)
* fix: raise explicit error when failing to generate spilo config
Signed-off-by: Stephane Tang <hi@stang.sh>
2019-05-22 17:35:03 +02:00
Stephane T 1f4267eb05 fix: remove headless service config when deleting cluster (#567)
see: https://github.com/zalando/postgres-operator/issues/566

Signed-off-by: Stephane Tang <hi@stang.sh>
2019-05-21 13:49:34 +02:00
Sergey Dudoladov f3e1e80aaf
Add logical backup (#442)
* Add k8s cron job to spawn logical backups

* Minor doc updates
2019-05-16 15:52:01 +02:00
Sergey Dudoladov 2c02b371e2
fix statefulset sync (#563) 2019-05-14 11:15:47 +02:00
Dmitry Dolgov f29bdaf96a
Override clone s3 bucket path (#487)
Override clone s3 bucket path

Add possibility to use a custom s3 bucket path for cloning a cluster
from an arbitrary bucket (e.g. from another k8s cluster). For that
a new config options is introduced `s3_wal_path`, that should point
to a location that spilo would understand.
2019-05-10 12:52:42 +02:00
Felix Kunde 0fbfbb23bb
Use /status subresource instead of plain manifest field (#534)
* turns PostgresStatus type into a struct with field PostgresClusterStatus
* setStatus patch target is now /status subresource
* unmarshalling PostgresStatus takes care of previous status field convention
* new simple bool functions status.Running(), status.Creating()
2019-05-07 12:01:45 +02:00
Aaron Miller 15ec6a920d Config option to allow Spilo container to run non-privileged. (#525)
* Config option to allow Spilo container to run non-privileged.

Runs non-privileged by default.

Fixes #395

* add spilo_privileged to manifests/configmap.yaml

* add spilo_privileged to helm chart's values.yaml
2019-04-03 17:13:39 +02:00
Stephane T edeb06d39c fix: update init_containers (#518)
* fix: PATH expension in Makefile

Signed-off-by: Stephane Tang <hi@stang.sh>

* refact: pass list of containers to compareContainers()

Signed-off-by: Stephane Tang <hi@stang.sh>

* compare initContainers while comparing StatefulSet

  Fixes #517

Signed-off-by: Stephane Tang <hi@stang.sh>

* refact: compareContainers()

Signed-off-by: Stephane Tang <hi@stang.sh>
2019-03-19 17:46:12 +01:00
Sergey Dudoladov 0b53dbe5dc
Set statefulset update and management policy explicitly (#515)
* fix logging in retry

* explicitly set the stateful set update strategy to onDelete

* add podManagementPolicy
2019-03-13 11:49:18 +01:00
Vineeth Reddy db72d82f14 gofmt and golint fixes (#506)
* fix gofmt and golint issues
2019-03-04 13:13:55 +01:00
Sergey Dudoladov 587d9091e7
Set HUMAN_ROLE Spilo env var (#409)
* Set HUMAN_ROLE Spilo env var
2019-02-27 13:40:42 +01:00
Felix Kunde 31e568157b reflect change in github url (#496)
Project was moved from the incubator to the Zalando main org, hence the rename
2019-02-25 11:26:55 +01:00
teuto.net Netzdienste GmbH 26a7fdfa9f Add Pod Anti Affinity (#489)
* Add Pod Anti Affinity
2019-02-21 16:37:03 +01:00
Stephane T d11b23bd71 Add inherited_labels (#459)
* add support for inherited_labels

Signed-off-by: Stephane Tang <hi@stang.sh>

* update docs with inherited_labels

Signed-off-by: Stephane Tang <hi@stang.sh>
2019-02-14 12:29:06 +01:00
Maxim Ivanov ed6acc1178 Correctly report success in .status on Update (#469) 2019-01-31 13:09:17 +01:00
Maxim Ivanov 3544cc90fa Allow specifying init_containers in Postgres CRD (#445)
* Add support for init_containers
2019-01-29 11:08:44 +01:00
Armin Nesiren 6f6a599c90 Added possibility to add custom annotations to LoadBalancer service. (#461)
* Added possibility to add custom annotations to LoadBalancer service.
2019-01-25 11:35:27 +01:00
Maxim Ivanov 8330905ce7 Don't panic if Service for the role was not found (#451) 2019-01-18 13:38:47 +01:00
Jan Mussler c70905ae8b Modifying some of the logging to be more descriptive. (#440)
* Modifying some of the logging to be more descriptive.
2019-01-08 13:07:36 +01:00
zerg-junior 4b5d3cd121
Fix golint failures
* Fix golint fails based on the original work from  the user u5surf

* Skip installing Docker as CDP now have one pre-installed (repairs builds on CDP)
2019-01-08 13:04:48 +01:00
Arve Knudsen f7058c754d Pass more variables to Spilo container (#437)
Pass KUBERNETES_SCOPE_LABEL, KUBERNETES_ROLE_LABEL and KUBERNETES_LABELS
to spilo container, so that they could be changed. Fix for #411
2019-01-04 13:42:52 +01:00
zerg-junior 5cfcc453a9
Update CRD configuration docs and fix the CDP build (#414)
* Update CRD configuration docs

* document resource consumption of the operator

* Add talks by Oleksii
2019-01-02 12:01:47 +01:00
zerg-junior c0b0b9a832
[WIP] Add 'admin' option to create role (#425)
* Add 'admin' option to create role

* Fix run_locally_script
2018-12-27 10:14:33 +01:00
Dmitry Dolgov d6e6b00770
Add shm_volume option (#427)
Add possibility to mount a tmpfs volume to /dev/shm to avoid issues like
[this](https://github.com/docker-library/postgres/issues/416). To achieve that
two new options were introduced:

* `enableShmVolume` to PostgreSQL manifest, to specify whether or not mount
this volume per database cluster

* `enable_shm_volume` to operator configuration, to specify whether or not mount
per operator.

The first one, `enableShmVolume` takes precedence to allow us to be more flexible.
2018-12-21 16:22:30 +01:00
zerg-junior 45c89b3da4
[WIP] Add set_memory_request_to_limit option (#406)
* Add set_memory_request_to_limit option
2018-11-15 14:00:08 +01:00
zerg-junior 96e3ea9511
Properly overwrite empty allowed source ranges for load balancers (#392)
* Properly overwrite empty allowed source ranges for load balancers
2018-11-06 11:08:45 +01:00
zerg-junior 86ba92ad02
Rename 'permanent_slots' field to 'slots' (#401) 2018-10-31 16:11:28 +01:00
zerg-junior 1b4181a724
[WIP] Add the ability to configure replications slots in Patroni (#398)
* Add the ability to configure replication slots in Patroni

* Add debugging  to Makefile for CDP builds
2018-10-31 13:10:56 +01:00
zerg-junior 7907f95d2f
Improve reporting about rolling updates (#391) 2018-09-24 11:57:43 +02:00
Noah Kantrowitz 688d252752 Some tweaks to ensure compat with newer Go. (#383) 2018-09-17 10:13:07 +02:00
Noah Kantrowitz 0b75a89920 Fix the casing of github.com/Sirupsen/logrus to match what the project itself uses. (#380)
Dep enforces this.
2018-09-06 10:26:48 +02:00
zerg-junior 25fa45fd58 [WIP] Grant 'superuser' to the members of Postgres admin teams (#371)
Added support for superuser team in addition to the admin team that owns the postgres cluster.
2018-08-30 10:51:37 +02:00
zerg-junior aeae0a6ef2
Use cluster's own namespace to patch the cluster manifest (#373) 2018-08-22 11:07:12 +02:00
Oleksii Kliukin e1ed4b847d
Use code-generation for CRD API and deepcopy methods (#369)
Client-go provides a https://github.com/kubernetes/code-generator package in order to provide the API to work with CRDs similar to the one available for built-in types, i.e. Pods, Statefulsets and so on.

Use this package to generate deepcopy methods (required for CRDs), instead of using an external deepcopy package; we also generate APIs used to manipulate both Postgres and OperatorConfiguration CRDs, as well as informers and listers for the Postgres CRD, instead of using generic informers and CRD REST API; by using generated code we can get rid of some custom and obscure CRD-related code and use a better API.

All generated code resides in /pkg/generated, with an exception of zz_deepcopy.go in apis/acid.zalan.do/v1

Rename postgres-operator-configuration CRD to OperatorConfiguration, since the former broke naming convention in the code-generator.

Moved Postgresql, PostgresqlList, OperatorConfiguration and OperatorConfigurationList and other types used by them into

Change the type of  the Error field in the Postgresql crd to a string, so that client-go could generate a deepcopy for it.

Use generated code to set status of CRD objects as well. Right now this is done with patch, however, Kubernetes 1.11 introduces the /status subresources, allowing us to set the status with
the special updateStatus call in the future. For now, we keep the code that is compatible with earlier versions of Kubernetes.

Rename postgresql.go to database.go and status.go to logs_and_api.go to reflect the purpose of each of those files.

Update client-go dependencies.

Minor reformatting and renaming.
2018-08-15 17:22:25 +02:00
Oleksii Kliukin e933908084
Configure pg_hba in the local postgresql configuration of Patroni. (#361)
Previously, the operator put pg_hba into the bootstrap/pg_hba key of
Patroni. That had 2 adverse effects:
 - pg_hba.conf was shadowed by Spilo default section in the local
   postgresql configuration
 - when updating pg_hba in the cluster manifest, the updated lines were
   not propagated to DCS, since the key was defined in the boostrap
   section of Patroni.

Include some minor refactoring, moving methods to unexported when
possible and commenting out usage of md5, so that gosec won't complain.

Per https://github.com/zalando-incubator/postgres-operator/issues/330

Review by @zerg-junior
2018-08-08 11:01:26 +02:00
Oleksii Kliukin acf46bfa62
Include CREATEROLE to the list of allowed flags. (#365)
Previously it has been supported by the operator, but the validity check
excluded it for no reason.
2018-08-08 10:53:08 +02:00
Oleksii Kliukin b06186eb41
Linter-induced code refactoring, run round 2. (#360)
Run more linters in the gometalinter, i.e. deadcode, megacheck,
nakedret, dup.

More consistent code formatting, remove two dead functions, eliminate
naked a bunch of naked returns, refactor a few functions to avoid code
duplication.
2018-08-06 12:09:19 +02:00
Oleksii Kliukin 59f0c5551e
Allow configuring pod priority globally and per cluster. (#353)
* Allow configuring pod priority globally and per cluster.

Allow to specify pod priority class for all pods managed by the operator,
as well as for those belonging to individual clusters.

Controlled by the pod_priority_class_name operator configuration
parameter and the podPriorityClassName manifest option.

See https://kubernetes.io/docs/concepts/configuration/pod-priority-preemption/#priorityclass
for the explanation on how to define priority classes since Kubernetes 1.8.

Some import order changes are due to go fmt.
Removal of OrphanDependents deprecated field.

Code review by @zerg-junior
2018-08-03 14:03:37 +02:00
Oleksii Kliukin ac7b132314
Refactoring inspired by gometalinter. (#357)
Among other things, fix a few issues with deepcopy implementation.
2018-08-03 11:09:45 +02:00
Oleksii Kliukin d2d3f21dc2 Client go upgrade v6 (#352)
There are shortcuts in this code, i.e. we created the deepcopy function
by using the deepcopy package instead of the generated code, that will
be addressed once migrated to client-go v8. Also, some objects,
particularly statefulsets, are still taken from v1beta, this will also
be addressed in further commits once the changes are stabilized.
2018-08-01 11:08:01 +02:00
Oleksii Kliukin 0181a1b5b1
Introduce a repair scan to fix failing clusters (#304)
A repair is a sync scan that acts only on those clusters that indicate
that the last add, update or sync operation on them has failed. It is
supposed to kick in more frequently than the repair scan. The repair
scan still remains to be useful to fix the consequences of external
actions (i.e. someone deletes a postgres-related service by mistake)
unbeknownst to the operator.

The repair scan is controlled by the new repair_period parameter in the
operator configuration. It has to be at least 2 times more frequent than
a sync scan to have any effect (a normal sync scan will update both last
synced and last repaired attributes of the controller, since repair is
just a sync underneath).

A repair scan could be queued for a cluster that is already being synced
if the sync period exceeds the interval between repairs. In that case a
repair event will be discarded once the corresponding worker finds out
that the cluster is not failing anymore.

Review by @zerg-junior
2018-07-24 11:21:45 +02:00
Oleksii Kliukin 1a0e5357dc
Improve generation of Scalyr container environment. (#346)
* Improve generting of Scalyr container environment.

Avoid duplicating POD_NAME and POD_NAMESPACE that already bundled
every sidecar.

Do not complain on the lack of SCLALYR_SERVER_HOST, since it is set to
https://upload.eu.scalyr.com in the container we use.

Do not mentioned SCALYR_SERVER_HOST in the error messages, since it is
derived from the cluster name automatically.
2018-07-24 11:16:24 +02:00
Oleksii Kliukin 12871aad1a
Avoid showing an extra error when resizing volume fails (#350)
Do not show 'persistent volumes are not compatible' errors for the
volumes that failed to be resized because of the other reasons (i.e.
the new size is smaller than the existing one).
2018-07-20 14:12:25 +02:00
zerg-junior 417f13c0bd
Submit RBAC credentials during initial Event processing (#344)
* During initial Event processing submit the service account for pods and bind it to a cluster role that allows Patroni to successfully start. The cluster role is assumed to be created by the k8s cluster administrator.
2018-07-19 16:40:40 +02:00
Oleksii Kliukin 3a9378d3b8
Allow configuring the operator via the YAML manifest. (#326)
* Up until now, the operator read its own configuration from the
configmap.  That has a number of limitations, i.e. when the
configuration value is not a scalar, but a map or a list. We use a
custom code based on github.com/kelseyhightower/envconfig to decode
non-scalar values out of plain text keys, but that breaks when the data
inside the keys contains both YAML-special elememtns (i.e. commas) and
complex quotes, one good example for that is search_path inside
`team_api_role_configuration`. In addition, reliance on the configmap
forced a flag structure on the configuration, making it hard to write
and to read (see
https://github.com/zalando-incubator/postgres-operator/pull/308#issuecomment-395131778).

The changes allow to supply the operator configuration in a proper YAML
file. That required registering a custom CRD to support the operator
configuration and provide an example at
manifests/postgresql-operator-default-configuration.yaml. At the moment,
both old configmap and the new CRD configuration is supported, so no
compatibility issues, however, in the future I'd like to deprecate the
configmap-based configuration altogether. Contrary to the
configmap-based configuration, the CRD one doesn't embed defaults into
the operator code, however, one can use the
manifests/postgresql-operator-default-configuration.yaml as a starting
point in order to build a custom configuration.

Since previously `ReadyWaitInterval` and `ReadyWaitTimeout` parameters
used to create the CRD were taken from the operator configuration, which
is not possible if the configuration itself is stored in the CRD object,
I've added the ability to specify them as environment variables
`CRD_READY_WAIT_INTERVAL` and `CRD_READY_WAIT_TIMEOUT` respectively.

Per review by @zerg-junior  and  @Jan-M.
2018-07-16 16:20:46 +02:00
Oleksii Kliukin e90a01050c
Switchover must wait for the inner goroutine before it returns. (#343)
* Switchover must wait for the inner goroutine before it returns.

Otherwise, two corner cases may happen:

 - waitForPodLabel writes to the podLabelErr channel that has been
   already closed by the outer routine

 - the outer routine exists and the caller subscribes to the pod
   the inner goroutine has already subscribed to, resulting in panic.

 The previous commit fe47f9ebea
 that touched that code added the cancellation channel, but didn't bother
 to actually wait for the goroutine to be cancelled.

 Per report and review from @valer-cara.
 Original issue: https://github.com/zalando-incubator/postgres-operator/issues/342
2018-07-16 11:50:35 +02:00
Oleksii Kliukin b7b950eb28
Use the StorageClassName field of the volumeClaimTemplate. (#338)
The old way of specifying it with the annotation is deprecated and not
available in recent Kubernetes versions. We will keep it there anyway
until upgrading to the new go-client that is incompatible with those
versions.

Per report from @schmitch
2018-07-16 11:49:58 +02:00
Oleksii Kliukin 25a306244f
Support for per-cluster and operator global sidecars (#331)
* Define sidecars in the operator configuration.

Right now only the name and the docker image can be defined, but with
the help of the pod_environment_configmap parameter arbitrary
environment variables can be passed to the sidecars.

* Refactoring around generatePodTemplate.

Original implementation of per-cluster sidecars by @theRealWardo 

Per review by @zerg-junior and @Jan-M
2018-07-02 16:25:27 +02:00
zerg-junior 7394c15d0a
Make AWS region configurable in the operator cofig map (#333) 2018-06-27 17:29:02 +02:00
Oleksii Kliukin 04b660519a Fix exec into pods to resize volumes for multi-container pods.
The original code assumed only one container per pod.
2018-06-04 14:51:39 +02:00
Oleksii Kliukin 48a5744314
Use Patroni API to set bootstrap-only options. (#299)
Call Patroni API /config in order to set special options that are
ignored when set in the configuration file, such as max_connections.
Per https://github.com/zalando-incubator/postgres-operator/issues/297

* Some minor refacoring:

Rename Cluster ManualFailover to Swithover
Rename Patroni Failover to Switchover
Add more details to error messages and comments introduced in this PR.

Review by @zerg-junior
2018-05-29 12:35:25 +02:00
Oleksii Kliukin 76ea754fc3 Be lenient when asked to shrink a persisten volume.
Do not hard error, emit a warning instead. The cluster is not going
to be broken because of our refusal to shrink a volume.
2018-05-24 11:17:42 +02:00
Oleksii Kliukin 1ea8b3bbe6 Fix a crash on node migration.
After an unsuccessful initial cluster sync it may happen that the
cluster statefulset is empty. This has been made more likely since
88d6a7be3, since it has introduced syncing volumes before statefulsets,
and the volume sync mail fail for different reasons (i.e. the volume has
been shrinked, or too many calls to Amazon).
2018-05-24 11:05:19 +02:00
Oleksii Kliukin e84ecb1d03 Address code review by @zerg-junior 2018-05-23 11:36:38 +02:00
Oleksii Kliukin f5550c337b Put special patroni parameters to the bootstrap.
Some special patroni postgresql parameters, like max_connections,
should reside in the bootstrap.dcs.postgresql.parameters section
to come into effect.
2018-05-22 18:27:12 +02:00
zerg-junior e6d12b3480
Merge pull request #295 from zalando-incubator/continue_on_delete_errors
Avoid terminating delete on errors.
2018-05-22 10:44:43 +02:00
Oleksii Kliukin 27c7245fed Avoid terminating delete on errors.
When there is an error happening upon deletion of the Kubernetes object
belonging to the cluster being removed, it makes no sense to abort the
deletion: the manifest will be removed anyway, therefore all the objects
after the one we aborted at will stay forever.
2018-05-18 18:10:37 +02:00
Oleksii Kliukin a8fdd3f2db Fix crash during sync.
Do not use statefulset number of pods to figure out running ones
for volume resizing, since the statefulset pointer could be nil.
Instead, look at the actual running pods.
2018-05-18 14:42:20 +02:00
Oleksii Kliukin 88d6a7be3f
Sync persistent volumes before statefulsets. (#293)
Avoid the condition of waiting for the pod that cannot start
PostgreSQL because it ran out of disk space.
2018-05-18 12:01:43 +02:00
Oleksii Kliukin 52ddcd25cc Sync persistent volumes before statefulsets.
Avoid the condition of waiting for the pod that cannot start
PostgreSQL because it ran out of disk space.
2018-05-18 11:43:45 +02:00
Oleksii Kliukin cf800aef90 Minor import fix 2018-05-15 16:53:12 +02:00
Oleksii Kliukin 11d568bf65 Address code review by @zerg-junior
- new info messages, rename the annotation flag.
2018-05-15 16:50:03 +02:00
Oleksii Kliukin 0c616a802f Merge branch 'master' into rolling_updates_with_statefulset_annotations
# Conflicts:
#	pkg/cluster/k8sres.go
2018-05-15 15:33:34 +02:00
Oleksii Kliukin 987b43456b
Deprecate old LB options, fix endpoint sync. (#287)
* Depreate old LB options, fix endpoint sync.

- deprecate useLoadBalancer, replicaLoadBalancer from the manifest
  and enable_load_balancer from the operator configuration. The old
  operator configuration options become no-op with this commit. For
  the old manifest options, `useLoadBalancer` and `replicaLoadBalancer`
  are still consulted,  but only in the absense of the new ones
  (enableMasterLoadBalancer and enableReplicaLoadBalancer).

- Make sure the endpoint being created during the sync receives proper
  addresses subset. This is more critical for the replicas, as for the
  masters Patroni will normally re-create the endpoint before the
  operator.

- Avoid creating the replica endpoint, since it will be created automatically
  by the corresponding service.
- Update the README and unit tests.

Code review by @mgomezch and @zerg-junior
2018-05-15 15:19:18 +02:00
Oleksii Kliukin 332dab5237 Merge branch 'rolling_updates_with_statefulset_annotations' of github.com:zalando-incubator/postgres-operator into rolling_updates_with_statefulset_annotations 2018-05-08 14:51:10 +02:00
Oleksii Kliukin f41a42f922 Merge branch 'rolling_updates_with_statefulset_annotations' of github.com:zalando-incubator/postgres-operator into rolling_updates_with_statefulset_annotations 2018-05-07 10:16:30 +02:00
Oleksii Kliukin ce0d4af91c Initial implementation for the statefulset annotations indicating rolling updates. 2018-05-07 08:07:37 +02:00
Oleksii Kliukin 1a20362c5b Initial implementation for the statefulset annotations indicating rolling updates. 2018-05-04 18:59:23 +02:00
Oleksii Kliukin 43a1db2128 Merge branch 'master' into pending_rolling_updates 2018-05-03 11:27:16 +02:00
Oleksii Kliukin fe47f9ebea
Improve the pod moving behavior during the Kubernetes cluster upgrade. (#281)
* Improve the pod moving behavior during the Kubernetes cluster upgrade.

Fix an issue of not waiting for at least one replica to become ready
(if the Statefulset indicates there are replicas) when moving the master
pod off the decomissioned node. Resolves the first part of #279.

Small fixes to error messages.

* Eliminate a race condition during the swithover.

When the operator initiates the failover (switchover) that fails and
then retries it for a second time it may happen that the previous
waitForPodChannel is still active. As a result, the operator subscribes
to the former master pod two times, causing a panic.

The problem was that the original code didn't bother to cancel the
waitForPodLalbel for the new master pod in the case when the failover
fails. This commit fixes it by adding a stop channel to that function.

Code review by @zerg-junior
2018-05-03 10:20:24 +02:00
Sergey Dudoladov 59ded0c212 Shorten bucket name 2018-05-02 14:05:57 +02:00
Sergey Dudoladov c45219bafa Set up an S3 bucket for the postgres daily logs 2018-05-02 12:52:42 +02:00
Oleksii Kliukin 37caa3f60b Fix a bug with syncing services
Avoid showing "there is no service in the cluster" when syncing a
service for the cluster if the operator has been restarted after
the cluster had been created.
2018-04-27 12:35:25 +02:00
zerg-junior 8f08bef67c
Merge pull request #277 from zalando-incubator/automatically-deploy-service-account
Deploy service account for pod creation on demand
2018-04-26 14:44:37 +02:00
Sergey Dudoladov 1b718fd4c2 Minor improvemets in reporting service account creation 2018-04-26 13:47:25 +02:00
Sergey Dudoladov d99b553ec1 Convert default account definiton into JSON 2018-04-25 12:35:16 +02:00
Sergey Dudoladov 485ec4b8ea Move service account to Controller 2018-04-24 15:13:08 +02:00
Sergey Dudoladov bc8b950da4 Tolerate issues of the Teams API 2018-04-23 16:31:53 +02:00
Sergey Dudoladov c31c76281c Make operator unaware of its own service account 2018-04-23 14:38:20 +02:00
Sergey Dudoladov 5daf0a4172 Fix error reporting during pod service account creation 2018-04-20 14:20:38 +02:00
Sergey Dudoladov bd51d2922b Turn ServiceAccount into struct value to avoid race conditon during account creation 2018-04-20 13:05:05 +02:00
Sergey Dudoladov 23f893647c Remove sync of pod service accounts 2018-04-19 15:48:58 +02:00
Sergey Dudoladov 214ae04aa7 Deploy service account for pod creation on demand 2018-04-18 16:20:20 +02:00
Oleksii Kliukin 0618723a61 Check rolling updates using controller revisions.
Compare pods controller revisions with the one for the statefulset
to determine whether the pod is running the latest revision and,
therefore, no rolling update is necessary. This is performed only
during the operator start, afterwards the rolling update status
that is stored locally in the cluster structure is used for all
rolling update decisions.
2018-04-09 18:07:24 +02:00
Manuel Gómez 88c68712b6
Fix statefulset label selector diffing (#273)
Otherwise, rolling updates are done unnecessarily.
2018-04-06 17:21:57 +02:00
Oleksii Kliukin 9bf80afa6b
Remove team from statefulset selector (#271)
* Remove 'team' label from the statefulset selector.

I was never supposed to be there, but implicitely statefulset
creates a selector out of meta.labels field. That is the problem
with recent Kubernetes, since statefulset cannot pick up pods
with non-matching label selectors, and we rely on statefulset
picking up old pods after statefulset replacement.

Make sure selector changes trigger replacement of the statefulset.

In the case new selector has more labels than the old one nothing
should be done with a statefulset, otherwise the new statefulset
won't see orphaned pods from the old one, as they won't match the
selector. 

See https://github.com/kubernetes/kubernetes/issues/46901#issuecomment-356418393
2018-04-06 13:58:47 +02:00
Oleksii Kliukin 26db91c53e
Improve infrastructure role definitions (#208)
Enhance definitions of infrastructure roles by allowing membership in multiple roles, role options and per-role configuration to be specified in the infrastructure role configmap, which must have the same name as the infrastructure role secret. See manifests/infrastructure-roles-configmap.yaml for the examples and updated README for the description of different types of database roles supposed by the operator and their purposes.

Change the logic of merging infrastructure roles with the manifest roles when they have the same name, to return the infrastructure role unchanged instead of merging. Previously, we used to propagate flags from the manifest role to the resulting infrastructure one, as there were no way to define flags for the infrastructure role; however, this is not the case anymore.

Code review and tests by @erthalion
2018-04-04 17:21:36 +02:00
zerg-junior d264be9faa
Merge pull request #261 from zalando-incubator/wal_bucket_scope_prefix
Fix clone for origins in non-default namespaces.
2018-04-03 17:47:18 +02:00
zerg-junior ff5793b584
Merge pull request #258 from zalando-incubator/always-create-replica-service
[WIP] Always create replica service
2018-03-29 14:42:26 +02:00
erthalion 8967a3be2c Add tests for load balancer function logic 2018-03-27 12:11:46 +02:00
Sergey Dudoladov ced770a827 Respond to code review 2018-03-26 11:07:32 +02:00
Sergey Dudoladov a8862aeee1 Enable backward compatibility for enable_load_balancer setting from operator configmap 2018-03-19 17:19:50 +01:00
Sergey Dudoladov 386d7b6bdb Implement backward compatibility with older load balancer settings 2018-03-16 13:27:38 +01:00
Sergey Dudoladov 20f30d3739 Update the method for deciding about load balancers 2018-03-14 12:46:58 +01:00
Sergey Dudoladov 0986e56226 Add separate params for master and replica load balancers to operator configuration 2018-03-14 12:12:28 +01:00
Sergey Dudoladov ac6c5bcf09 Explicitly name replica and master load balancer params in PostgresSpec 2018-03-14 12:03:27 +01:00
Sergey Dudoladov 5bc5e70c81 Log if replica service has no load balancer 2018-03-12 16:48:44 +01:00
Sergey Dudoladov 5ff562a607 Minor improvements 2018-03-02 14:03:41 +01:00
Sergey Dudoladov 2aeff096f7 Make ReplicaLoadBalancer a separate toggler 2018-03-02 13:35:25 +01:00
Oleksii Kliukin 59a214727c Fix clone for origins in non-default namespaces.
By default, spilo sets WAL_BUCKET_SCOPE_PREFIX depending on the cluster
namespace, possibly to a non-empty string. However, we won't be able to
clone those clusters, as the clone prefix is always set to an empty string.

We could go the other way around and set both WAL_BUCKET_SCOPE_PREFIX
and CLONE_WAL_BUCKET_SCOPE_PREFIX to a non-default value that depends
on the cluster's namespace, but it seems that we don't need this
feature for now (no conflict will occur even for clusters with the
same name and different namespaces because of the SCOPE_SUFFIX) and
it requires some additional testing first.
2018-03-01 12:26:09 +01:00
Sergey Dudoladov 35104cb72b Add CLONE_ prefix to the env var 2018-03-01 11:19:15 +01:00
Sergey Dudoladov bcb8caeddf Set WAL_BUCKET_SCOPE_PREFIX to the empty string 2018-03-01 11:16:47 +01:00
Sergey Dudoladov fb21246fcd Remove early stopping conditions that rely on the relica service being absent 2018-02-27 17:21:51 +01:00
Sergey Dudoladov 28fed26845 Do not delete an endpoint for the replica service w/o load balancer during sync 2018-02-27 17:18:30 +01:00
Sergey Dudoladov b107d781e8 Do not delete replica service w/o load balancer during sync 2018-02-27 17:16:00 +01:00
Sergey Dudoladov 2ef069ee93 Create/delete replica service regardless of load balancer setup 2018-02-27 17:10:49 +01:00
zerg-junior 0f392c2007
Merge pull request #252 from zalando-incubator/label-teams
Add 'team' label to pods, stateful sets, secrets and pod disruption budgets
2018-02-26 12:57:26 +01:00
Sergey Dudoladov 071547e5bf Modify to add extra labels only during resource creation 2018-02-26 11:11:50 +01:00
Oleksii Kliukin 2bb7e98268
update individual role secrets from infrastructure roles (#206)
* Track origin of roles.

* Propagate changes on infrastructure roles to corresponding secrets.

When the password in the infrastructure role is updated, re-generate the
secret for that role.

Previously, the password for an infrastructure role was always fetched from
the secret, making any updates to such role a no-op after the corresponding
secret had been generated.
2018-02-23 17:24:04 +01:00
Sergey Dudoladov 00dc810544 Add 'team' label to pods, stateful sets, secrets and pod disruption budgets 2018-02-23 14:36:10 +01:00
Dmitrii Dolgov ef50b147c5 Use list of checks instead of a map 2018-02-23 14:24:33 +01:00
Dmitrii Dolgov 95d86c7600 Move container comparison logic to a separate function 2018-02-23 11:58:37 +01:00
Oleksii Kliukin c4aab502b3
Remove Patroni leftover objects on cluster deletion. (#244)
* Remove all endpoints and configmaps from Patroni when Patroni is running with Kubernetes support on cluster deletion.
2018-02-23 09:52:22 +01:00
Dmitry Dolgov bf4b0f0f33
Merge pull request #240 from zalando-incubator/feature/goreport-improvements
Some improvements for golint, ineffassign and misspell
2018-02-22 11:31:08 +01:00
Oleksii Kliukin cca73e30b7
Make code around recreating pods and creating objects in the database less brittle (#213)
There used to be a masterLess flag that was supposed to indicate whether the cluster it belongs to runs without the acting master by design. At some point, as we didn't really have support for such clusters, the flag has been misused to indicate there is no master in the cluster. However, that was not done consistently (a cluster without all pods running would never be masterless, even when the master is not among the running pods) and it was based on the wrong assumption that the masterless cluster will remain masterless until the next attempt to change that flag, ignoring the possibility of master coming up or some node doing a successful promotion. Therefore, this PR gets rid of that flag completely.

When the cluster is running with 0 instances, there is obviously no master and it makes no sense to create any database objects inside the non-existing master. Therefore, this PR introduces an additional check for that.

recreatePods were assuming that the roles of the pods recorded when the function has stared will not change; for instance, terminated replica pods should start as replicas. Revisit that assumption by looking at the actual role of the re-spawned pods; that avoids a failover if some replica has promoted to the master role while being re-spawned. In addition, if the failover from the old master was unsuccessful, we used to stop and leave the old master running on an old pod, without recording this fact anywhere. This PR makes the failover failure emit a warning, but not stop recreating the last master pod; in the worst case, the running master will be terminated, however, this case is rather unlikely one.

As a side effect, make waitForPodLabel return the pod definition it waited for, avoiding extra API calls in recreatePods and movePodFromEndOfLifeNode
2018-02-22 10:42:05 +01:00
zerg-junior b0549c3c9c
Merge pull request #225 from zalando-incubator/support-many-namespaces
Support many namespaces
2018-02-20 17:39:42 +01:00
Oleksii Kliukin 99c090899f
Change the suffix delimiter to slash. (#242)
This allows using S3 API in order to simplify finding all folders that are different only by a suffix, since the suffix delimiter will not occur in the suffix itself (currently being a UID).
2018-02-20 16:31:44 +01:00
Oleksii Kliukin c597377617
Use cluster UID as a suffix to the WAL bucket. (#211)
Avoid reusing WAL S3 buckets of the older cluster with the same name as the existing one.

For the new cluster, the S3 bucket name will include a suffix that is equal to the UID of the PostgreSQL object describing the cluster. That way, the bucket name will stay the same for all members iff  they correspond to the same PostgreSQL cluster object.

When "clone: uid:" key is present in the cluster manifest and the cluster is cloned from an S3 bucket (currently that happens if the endTimestamp is present in the clone description) the S3 bucket to clone from is suffixed with the -uid value.
2018-02-20 15:36:43 +01:00
Dmitrii Dolgov a7cd859919 Some improvements for golint, ineffassign and misspell 2018-02-19 17:46:31 +01:00
Sergey Dudoladov f194a2ae5a Introduce changes from the PR #200 by @alexeyklyukin 2018-02-07 14:02:32 +01:00
Sergey Dudoladov ea84f9d577 Rename the configmap 'namespace' entry to avoid confusion with the map's owm namespace 2018-02-06 15:09:00 +01:00
Oleksii Kliukin b90a36c909
Set node_readiness_label default to an empty value. (#204)
Previously, it was set to the lifecycle-status:ready, breaking a
lot of minikube deployments. Also it was not possible befor to run
with this label set to an empty value.

Document the effect of the label in the new section of the
documentation.
2018-01-16 15:43:03 +01:00
Manuel Gómez bf4406d2a4 Consider container names in Statefulset diffs (#210)
This includes a comparison on container names being equal in the
decision of whether a Statefulset has been updated.
2018-01-16 12:06:11 +01:00
Oleksii Kliukin 23011bdf9a
Migrate only master pods. Migrate single masters. (#199)
Avoid migrating replica pods, since they will be handled by the
node draining anyway (the PDB specifies that only masters are to
be kept).

Allow migration of the single-pod clusters.
2018-01-09 11:55:11 +01:00
zerg-junior bb5ce6cbbe
Merge pull request #195 from zalando-incubator/databases-rest-endpoint
Add a REST endpoint to list databases in all clusters
2018-01-09 11:53:32 +01:00
Oleksii Kliukin 8e99518eeb
Improve behavior on node decomissionining (#184)
* Trigger the node migration on the lack of  the readiness label.

* Examine the node's readiness status on node add.

Make sure we don't miss the not ready node, especially when the
operator is killed during the migration.
2018-01-04 11:53:15 +01:00
Manuel Gómez 1109cfa7a1
Add PostgreSQL pod namespace Scalyr sidecar environment (#196)
Another tiny bit of information that could be useful for log filters
once we start deploying clusters into separate namespaces.
2017-12-22 17:12:50 +01:00
Oleksii Kliukin 9720ac1f7e WIP: Hold the proper locks while examining the list of databases.
Introduce a new lock called specMu lock to protect the cluster spec.
This lock is held on update and sync, and when retrieving the spec in
the API code. There is no need to acquire it for cluster creation and
deletion: creation assigns the spec to the cluster before linking it to
the controller, and deletion just removes the cluster from the list in
the controller, both holding the global clustersMu Lock.
2017-12-22 13:06:11 +01:00
Manuel Gómez cd9bc7bdc5
Add PostgreSQL pod name Scalyr sidecar environment (#194)
This will allow the Scalyr image to add a custom attribute to shipped
log entries that notes the name of the originating pod.
2017-12-21 16:52:27 +01:00
Manuel Gómez 15c278d4e8
Scalyr agent sidecar for log shipping (#190)
* Scalyr agent sidecar for log shipping

* Remove the default for the Scalyr image

Now the image needs to be specified explicitly to enable log shipping to
Scalyr.  This removes the problem of having to generate the config file
or publish our agent image repository.

* Add configuration variable for Scalyr server URL

Defaults to the EU address.

* Alter style

Newlines are cheap and make code easier to edit/refactor, but ok.

* Fix StatefulSet comparison logic

I broke it when I made the comparison consider all containers in the
PostgreSQL pod.
2017-12-21 15:34:26 +01:00
Oleksii Kliukin da0de8cff7
Make sure the statefulset that is deleted manually gets re-created. (#191)
* Make sure the statefulset that is deleted manually gets re-created.

Per report and analysis by Manuel Gomez.

* Move the existence checks for other objects out of the Create functions.

create{Object} for services, endpoints and PDBs refused to continue if
there is a cached definition in the cluster, however, the only place
where it makes sense is when creating a new cluster. Note that contrary
to the statefulset this doesn't fix any issues, since those definitions
were nullified correspondingly when the sync code detected there is no
object present in the Kubernetes cluster.
2017-12-21 15:20:43 +01:00
zerg-junior 5d5fa680a3
Merge pull request #180 from zalando-incubator/container-name
Make pod's single container name static
2017-12-15 16:13:33 +01:00
Oleksii Kliukin bf80f5225e
Introduce higher and lower bounds for the number of instances (#178)
* Introduce higher and lower bounds for the number of instances

Reduce the number of instances to the min_instances if it is lower and
to the max_instances if it is higher. -1 for either of those means there
is no lower or upper bound.

In addition, terminate the operator when there is a nonsense in the
configuration (i.e. max_instances < min_instances).

Reviewed by Jan Mußler and Sergey Dudoladov.
2017-12-15 16:02:50 +01:00
Sergey Dudoladov 52e358ba8f Make pod's single container name static 2017-12-15 15:53:53 +01:00
Oleksii Kliukin 0e255f82c6 Provide more information about variable conflicts.
They are mentioned in the documentation and the operator will emit a
warning each time the variable from the pod environment configmap is
ignored because the same variable is defined by the operator.

Some minor changes in the variable names to make the code more readable.

Per review from Sergey Dudoladov.
2017-12-14 14:39:33 +01:00
Oleksii Kliukin da4b66210a Expand variables from the PodEnvironmentConfigMap (#4)
Inject PodEnvironmentConfigMap variables inline into the
statefulset definition in order to be able to figure out
changes to the statefulset when only PodEnvironmentConfigMap
has changed.
2017-12-14 14:39:33 +01:00
Oleksii Kliukin 1c5451cd7d Spelling fix. 2017-12-14 14:39:33 +01:00
Oleksii Kliukin 55dc12e512 Examine custom environment sources when syncing.
When comparing statefulsets, make sure EnvFrom fields are compared
as well.
2017-12-14 14:39:33 +01:00
Georg Kunz e8d9c75949 Allow custom Postgres pod environment variables 2017-12-14 14:39:33 +01:00
Oleksii Kliukin 87bc47d8d0 Fixes for the case of re-creating the cluster after deletion.
- make sure that the secrets for the system users (superuser, replication)
  are not deleted when the main cluster is. Therefore, we can re-create
  the cluster, potentially forcing Patroni to restore it from the backup
  and enable Patroni to connect, since it will use the old password, not
  the newly generated random one.

- when syncing users, always check whether they are already in the DB.
  Previously, we did this only for the sync cluster case, but the new
  cluster could be actually the one restored from the backup by Patroni,
  having all or some of the users already in place.

 - delete endponts last. Patroni uses the $clustername endpoint in order
   to store the leader related metadata. If we remove it before removing
   all pods, one of those pods running Patroni will re-create it and the
   next attempt to create the cluster with the same name will stuble on
   the existing endpoint.

 - Use db.Exec instead of db.Query for queries that expect no result.
   This also fixes the issue with the DB creation, since we didn't
   release an empty Row object it was not possible to create more than
   one database for a cluster.
2017-12-13 16:49:00 +01:00
Oleksii Kliukin 1fb8cf7ea0
Avoid overwriting critical users. (#172)
* Avoid overwriting critical users.

Disallow defining new users either in the cluster manifest, teams
API or infrastructure roles with the names mentioned in the new
protected_role_names parameter (list of comma-separated names)

Additionally, forbid defining a user with the name matching either
super_username or replication_username, so that we don't overwrite
system roles required for correct working of the operator itself.

Also, clear PostgreSQL roles on each sync first in order to avoid using
the old definitions that are no longer present in the current manifest,
infrastructure roles secret or the teams API.
2017-12-05 14:27:12 +01:00
Oleksii Kliukin 022ce29314 Make an error message more verbose. 2017-12-04 10:49:25 +01:00
Oleksii Kliukin 637921cdee Tests for initHumanUsers and initinitRobotUsers.
Change the Cluster class in the process to implelement Teams API
calls and Oauth token fetches as interfaces, so that we can mock
them in the tests.
2017-12-04 10:49:25 +01:00
Oleksii Kliukin 611cfe96d6 Fix an issue when not assigning the merge result.
Add some tests.
2017-12-04 10:49:25 +01:00
Oleksii Kliukin 831ebb1f32 Fix the error reporting. 2017-12-04 10:49:25 +01:00
Oleksii Kliukin 2e226dee26 Avoid overwriting infrastrure roles.
When a role is defined in the infrastructure roles and the cluster
manifest use the infrastructure role definition and add flags
defined in the manifest.

Previously the role has been overwritten by the definition from the
manifest.  Because a random password is generated for each role from the
manifest the applications relying on the infrastructure role credentials
from the infrastructure roles secret were unable to connect.
2017-12-04 10:49:25 +01:00
Oleksii Kliukin dd0affc390 Tweak our reaction to the cluster upgrade process.
Previously, the operator started to move the pods off the nodes to be
decomissioned by watching the eol_node_label value. Every new postgres
pod has been created with the anti-affinity to that label, making sure
that the pods being moved won't land on another to be decomissioned
node.

The changes introduce another label that indicates the ready node.  The
new pod affinity will esnure that the pod is only scheduled to the node
marked as ready, discarding the previous anti-affinity.  That way the
nodes can transition from the pending-decomission to the other statuses
(drained, terminating) without having pods suddently scaled to them.

In addition, rename the label that triggers the start of the upgrade
process to node_eol_label (for consistency with node_readiness_label)
and set its default vvalue to lifecycle-status:pending-decomission.
2017-11-30 14:11:49 +01:00
Oleksii Kliukin 1ffe98ba9f Fix the connection leak and user options sync.
- fix the lack of closing the cursor for the query that returned no
rows.
- fix syncing of the user options, as previously those were not
  fetched from the database.
2017-11-27 16:46:34 +01:00
Oleksii Kliukin 975b21f633 Rename api roles configuration parameter.
Change api_roles_configuration to team_api_role_configuration
2017-11-22 10:43:35 +01:00
Oleksii Kliukin 2352fc9a39 go fmt run 2017-11-22 10:43:35 +01:00
Oleksii Kliukin 415a7fdc4d Allow global configuration options for API roles.
Add options to the PgUser structure, potentially allowing to set
per-role options in the cluster definition as well.

Introduce api_roles_configuration operator option with the default
of log_statement=all
2017-11-22 10:43:35 +01:00
Oleksii Kliukin 6dcd074ea0 Allow per-cluster setting of a docker image.
Add dockerImage cluster configuration parameter that overrides global
operator defaults when set to a non-empty value.
2017-11-14 11:53:04 +01:00
Oleksii Kliukin c25e849fe4 Fix a failure to create new statefulset at sync.
Also do a fmt run.
2017-11-08 18:24:17 +01:00
Murat Kabilov 86803406db
use sync methods while updating the cluster 2017-11-03 12:00:43 +01:00
Georg Kunz 47dd766fa7 Add node toleration config to PodSpec (#151)
* Add node toleration config to PodSpec

This allows to taint nodes dedicated to Postgres and prevents other pods from running on these nodes.

* Document taint and toleration setup

And remove setting from default operator ConfigMap

* Allow to overwrite tolerations with Postgres manifest
2017-11-02 19:10:44 +01:00
Oleksii Kliukin ce960e892a
Create new databases and change owners of existing ones during sync. (#153)
* Create new databases and change owners of existing ones during sync.
2017-11-02 17:46:33 +01:00
Oleksii Kliukin 7a76be7d3e Minor fixes around PDB (pod-distruption-budget) syncing: (#147)
- Call comparison function in the case of the sync as well as for update
- Include full cluster name in PDB name
- Assign cluster labels to the PDB object
2017-10-23 12:26:59 +02:00
Murat Kabilov c17aabb642 fix pod disruption budget labels (#146) 2017-10-20 15:01:51 +02:00
Murat Kabilov 661b141849 Fix Pod Disruption Budget null pointer exception 2017-10-20 11:43:50 +02:00
Murat Kabilov a1deae198b add missing master matchLabel for the PDB (#144) 2017-10-20 11:26:40 +02:00
Oleksii Kliukin eba23279c8 Kube cluster upgrade 2017-10-19 10:49:42 +02:00
Oleksii Kliukin 1dbf259c76 Retry opening DB connections. (#140)
Make sure DB connection retry also reopens a connection after closing it
2017-10-18 16:28:00 +02:00
Oleksii Kliukin 99870d8eac Fix division by zero when connecting to the DB.
Apparently the retry function's first parameter is the duration of
a single attempt and it cannot be zero.
2017-10-18 10:44:49 +02:00
Murat Kabilov 202f2de988 Retry connecting to pg 2017-10-17 17:03:50 +02:00
Murat Kabilov 6c4cb4e9da Perform manual failover during the scale down 2017-10-16 17:41:23 +02:00
Murat Kabilov 5b29576a8e Remove redundant constants 2017-10-16 15:52:48 +02:00
Murat Kabilov 3b32265258 Set status of the cluster on sync fail/success 2017-10-12 15:10:42 +02:00
Jan Mussler cec695d48e Superuser toggle for team members
Make superuser toggleable for team members. Add and "admin" role to team members if superuser is disabled.
2017-10-12 15:01:54 +02:00
Murat Kabilov 8d5faaa5a5 return idle status when worker has nothing to do 2017-10-11 15:42:20 +02:00
Oleksii Kliukin 793defef72 Fix pod wait timeouts.
Previously, a timer had been reset on every message received through
the pod channel.
2017-10-11 14:58:37 +02:00
Murat Kabilov 83c8d6c419 Extend diagnostic api with worker status info 2017-10-11 12:26:09 +02:00
Murat Kabilov 71a540ff48 Merge branch 'master' into crd 2017-10-09 11:55:18 +02:00
Murat Kabilov a35e9c6119 move from tpr to crd 2017-10-06 15:12:08 +02:00
Murat Kabilov 3b8c06416e skip manual failover for 1-pod clusters 2017-10-05 13:30:15 +03:00
Jan Mussler c4af0ac6a6 Update cluster.go 2017-10-05 10:58:23 +02:00
Jan M 4a1170855a Adding '_' to allowed chars. 2017-10-05 10:53:19 +02:00
Murat Kabilov 48ec6b35b9 perform manual failover on pg cluster rolling upgrade 2017-10-04 16:56:47 +03:00
Murat Kabilov 00194d0130 create dbs on cluster create 2017-10-04 16:24:27 +03:00
Murat Kabilov 5cfdabb63e fix regexp for api endpoint urls 2017-09-28 12:00:40 +02:00
Murat Kabilov be8bf22c00 add missing return 2017-09-28 11:23:56 +02:00
Murat Kabilov 93d4bf2b55 Merge branch 'master' into api-improvements 2017-09-26 14:47:13 +02:00
Murat Kabilov 19de2a24b7 go lint 2017-09-26 13:44:30 +02:00
Murat Kabilov d876f4d88e set secret name template via config map 2017-09-18 14:25:09 +02:00
Oleksii Kliukin 7667847bfe Feature/validate role options (#101)
Be more rigorous about validating user flags.

Only accept CREATE ROLE flags that doesn't have any params (i.e.
not ADMIN or CONNECTION LIMIT). Check that both flag and NOflag
are not used at the same time.
2017-09-15 13:57:48 +02:00
Murat Kabilov 969a06f521 Use DCS_ENABLE_KUBERNETES_API=true environment to enable kubernetes native deployment 2017-09-14 11:39:49 +02:00
Murat Kabilov 8430ee86c9 add comments on roles 2017-09-11 17:44:32 +02:00
Murat Kabilov 90b49a24ba make postgresql roles public 2017-09-11 17:44:32 +02:00
Oleksii Kliukin 8b85935a7a Allow cloning clusters from the operator. (#90)
Allow cloning clusters from the operator.

The changes add a new JSON node `clone` with possible values `cluster`
and `timestamp`. `cluster` is mandatory, and setting a non-empty
`timestamp` triggers wal-e point in time recovery. Spilo and Patroni do
the whole heavy-lifting, the operator just defines certain variables and
gathers some data about how to connect to the host to clone or the
target S3 bucket.

As a minor change, set the image pull policy to IfNotPresent instead
of Always to simplify local testing.

Change the default replication username to standby.
2017-09-08 16:47:03 +02:00
Oleksii Kliukin a0a9e8f849 Feature/configure replication role (#97)
Configure superuser and replication usernames
2017-09-07 10:12:34 +02:00
Murat Kabilov 39c123e96a fetch cluster resources by name, not by label selectors 2017-09-04 18:03:54 +02:00
Murat Kabilov 8aa11ecee2 Add patroni api client 2017-08-30 16:01:18 +02:00
Murat Kabilov 899c0bef45 Use warningf instead of warnf 2017-08-30 14:35:56 +02:00
Murat Kabilov f44c8e1206 Make pod termination grace period configurable 2017-08-18 16:52:19 +02:00
Murat Kabilov 71dfb33b2b make pod termination grace period configurable 2017-08-18 16:38:25 +02:00
Murat Kabilov 5967837875 pass the name of the status in the log message on set cluster status failure 2017-08-17 12:18:53 +02:00
Murat Kabilov d2828e5ece remove var shading; fix imports 2017-08-15 15:59:10 +02:00
Murat Kabilov 272d7e1bcf rename service field to services as it contains service per role 2017-08-15 15:55:56 +02:00
Murat Kabilov 82f58b57d8 add cluster and controller methods for getting status 2017-08-15 12:11:06 +02:00
Murat Kabilov 5470f20be4 always pass a cluster name as a logger field 2017-08-15 10:29:18 +02:00
Murat Kabilov e26db66cb5 start all the log messages with lowercase letters 2017-08-15 10:12:36 +02:00
Oleksii Kliukin 87a379f663 Avoid reusing closed DB connection. (#79)
Set DB connection to nil upon closing it.
2017-08-10 18:19:35 +02:00
Oleksii Kliukin f15f93f479 Bugfix/close db connections (#78)
Open and close DB connections on-demand.

Previously, we used to leave the DB connection open while the
cluster was registered with the operator, potentially resutling
in dangled connections if the operator terminates abnormally.

Small refactoring around the role syncing code.
2017-08-10 10:10:00 +02:00
Oleksii Kliukin a8b5b77cc4 Fix missing labels in the replica service selector. 2017-08-02 17:46:24 +02:00
Murat Kabilov cf663cb841 Fix golint warnings 2017-08-01 16:08:56 +02:00
Murat Kabilov 1f8b37f33d Make use of kubernetes client-go v4
* client-go v4.0.0-beta0
* remove unnecessary methods for tpr object
* rest client: use interface instead of structure pointer
* proper names for constants; some clean up for log messages
* remove teams api client from controller and make it per cluster
2017-07-25 15:25:17 +02:00
Oleksii Kliukin 4455f1b639 Feature/unit tests (#53)
- Avoid relying on Clientset structure to call Kubernetes API functions.
While Clientset is a convinient "catch-all" abstraction for calling
REST API related to different Kubernetes objects, it's impossible to
mock. Replacing it wih the kubernetes.Interface would be quite
straightforward, but would require an exra level of mocked interfaces,
because of the versioning. Instead, a new interface is defined, which
contains only the objects we need of the pre-defined versions.

-  Move KubernetesClient to k8sutil package.
- Add more tests.
2017-07-24 16:56:46 +02:00
Oleksii Kliukin a8ed1e25b4 Avoid re-creating master pod if it is empty during sync. (#58)
Fixes #59
2017-07-12 10:57:20 +02:00
Oleksii Kliukin 00150711e4 Configure load balancer on a per-cluster and operator-wide level (#57)
* Deny all requests to the load balancer by default.
* Operator-wide toggle for the load-balancer.
* Define per-cluster useLoadBalancer option.

If useLoadBalancer is not set - then operator-wide defaults take place. If it
is true - the load balancer is created, otherwise a service type clusterIP is
created.

Internally, we have to completely replace the service if the service type
changes. We cannot patch, since some fields from the old service that will
remain after patch are incompatible with the new one, and handling them
explicitly when updating the service is ugly and error-prone. We cannot
update the service because of the immutable fields, that leaves us the only
option of deleting the old service and creating the new one. Unfortunately,
there is still an issue of unnecessary removal of endpoints associated with
the service, it will be addressed in future commits.

* Revert the unintended effect of go fmt

* Recreate endpoints on service update.

When the service type is changed, the service is deleted and then
the one with the new type is created. Unfortnately, endpoints are
deleted as well. Re-create them afterwards, preserving the original
addresses stored in them.

* Improve error messages and comments. Use generate instead of gen in names.
2017-06-30 13:38:49 +02:00
Oleksii Kliukin 987990fb0e Move service annotation patch template into the constants. 2017-06-12 10:24:23 +02:00
Oleksii Kliukin 17826ee434 Go fmt run. 2017-06-12 10:24:23 +02:00
Oleksii Kliukin 51d73fb172 Replace service annotations when updating services.
In case the whole annotation changes (like the external DNS) we
don't want to keep the old one hanging around. Unline specs, we
don't expect anyone except the operator to change the annotations.

Use StrategicMergePatchType in order to replace the annotations
map completely.
2017-06-12 10:24:23 +02:00
Murat Kabilov 1540a2ba65 fix typos;
remove unnecessary tests;
go fmt -s
2017-06-08 15:52:01 +02:00
Oleksii Kliukin bc0e9ab4bc Add error checks per report from errcheck-ng 2017-06-08 10:41:44 +02:00
Murat Kabilov 292a9bda05 Check for dns annotation of the service 2017-06-07 16:41:39 +02:00
Oleksii Kliukin dc36c4ca12 Implement replicaLoadBalancer boolean flag. (#38)
The flag adds a replica service with the name cluster_name-repl and
a DNS name that defaults to {cluster}-repl.{team}.{hostedzone}.

The implementation converted Service field of the cluster into a map
with one or two elements and deals with the cases when the new flag
is changed on a running cluster
(the update and the sync should create or delete the replica service).
In order to pick up master and replica service and master endpoint
when listing cluster resources.

* Update the spec when updating the cluster.
2017-06-07 13:54:17 +02:00
Oleksii Kliukin 7b0ca31bfb Implements EBS volume resizing #35.
In order to support volumes different from EBS and filesystems other than EXT2/3/4 the respective code parts were implemented as interfaces. Adding the new resize for the volume or the filesystem will require implementing the interface, but no other changes in the cluster code itself.

Volume resizing first changes the EBS and the filesystem, and only afterwards is reflected in the Kubernetes "PersistentVolume" object. This is done deliberately to be able to check if the volume needs resizing by peeking at the Size of the PersistentVolume structure. We recheck, nevertheless, in the EBSVolumeResizer, whether the actual EBS volume size doesn't match the spec, since call to the AWS ModifyVolume is counted against the resize limit of once every 6 hours, even for those calls that shouldn't result in an actual resize (i.e. when the size matches the one for the running volume).

As a collateral, split the constants into multiple files, move the volume code into a separate file and fix minor issues related to the error reporting.
2017-06-06 13:53:27 +02:00
Murat Kabilov 1fb05212a9 Refactor teams API package 2017-05-30 10:14:30 +02:00
Murat Kabilov 009db16c7c Use queues for the pod events (#30) 2017-05-23 15:24:14 +02:00
Oleksii Kliukin afce38f6f0 Fix error messages (#27)
Use lowercase for kubernetes objects
Use %v instead of %s for errors
Start error messages with a lowercase letter.
2017-05-22 14:12:06 +02:00
Oleksii Kliukin 8beb5936b1 Don't error out at sync on existence of the object. (#26) 2017-05-22 12:58:47 +02:00
Murat Kabilov 4acaf27a5d Remove etcd requests (#25)
update glide
2017-05-19 17:18:37 +02:00
Murat Kabilov d34273543e Fix the golint, gosimple warnings 2017-05-18 17:38:54 +02:00
Murat Kabilov 233e8529c1 Return error instead of logging it 2017-05-18 17:24:44 +02:00