Commit Graph

98 Commits

Author SHA1 Message Date
Aaron Miller ec5b1d4d58 StatefulSet fsGroup config option to allow non-root spilo (#531)
* StatefulSet fsGroup config option to allow non-root spilo

* Allow Postgres CRD to overide SpiloFSGroup of the Operator.

* Document FSGroup of a Pod cannot be changed after creation.
2019-06-04 16:38:26 +02:00
Felix Kunde 5a0e95ac45
Add CRD configuration to Helm chart values.yaml (#559)
* add templates for CRDs incl. crd-install hooks
* support both config styles in values.yaml
* fix ServiceAccount naming in values.yaml
2019-06-03 14:48:32 +02:00
Erik Inge Bolsø ebda39368e database.go: remove hardcoded .svc.cluster.local dns suffix (#561)
* database.go: substitute hardcoded .svc.cluster.local dns suffix with config parameter

Use the pod's configured dns search path, for clusters where .svc.cluster.local is not correct.
2019-05-31 16:32:00 +02:00
Sergey Dudoladov f3e1e80aaf
Add logical backup (#442)
* Add k8s cron job to spawn logical backups

* Minor doc updates
2019-05-16 15:52:01 +02:00
Aaron Miller 15ec6a920d Config option to allow Spilo container to run non-privileged. (#525)
* Config option to allow Spilo container to run non-privileged.

Runs non-privileged by default.

Fixes #395

* add spilo_privileged to manifests/configmap.yaml

* add spilo_privileged to helm chart's values.yaml
2019-04-03 17:13:39 +02:00
Sergey Dudoladov 0b53dbe5dc
Set statefulset update and management policy explicitly (#515)
* fix logging in retry

* explicitly set the stateful set update strategy to onDelete

* add podManagementPolicy
2019-03-13 11:49:18 +01:00
Vineeth Reddy db72d82f14 gofmt and golint fixes (#506)
* fix gofmt and golint issues
2019-03-04 13:13:55 +01:00
Sergey Dudoladov f400539b69
Retry moving master pods (#463)
* Retry moving master pods

* bump up master pod wait timeout
2019-02-28 16:19:27 +01:00
Felix Kunde 31e568157b reflect change in github url (#496)
Project was moved from the incubator to the Zalando main org, hence the rename
2019-02-25 11:26:55 +01:00
teuto.net Netzdienste GmbH 26a7fdfa9f Add Pod Anti Affinity (#489)
* Add Pod Anti Affinity
2019-02-21 16:37:03 +01:00
Stephane T d11b23bd71 Add inherited_labels (#459)
* add support for inherited_labels

Signed-off-by: Stephane Tang <hi@stang.sh>

* update docs with inherited_labels

Signed-off-by: Stephane Tang <hi@stang.sh>
2019-02-14 12:29:06 +01:00
Armin Nesiren 6f6a599c90 Added possibility to add custom annotations to LoadBalancer service. (#461)
* Added possibility to add custom annotations to LoadBalancer service.
2019-01-25 11:35:27 +01:00
zerg-junior c0b0b9a832
[WIP] Add 'admin' option to create role (#425)
* Add 'admin' option to create role

* Fix run_locally_script
2018-12-27 10:14:33 +01:00
Dmitry Dolgov d6e6b00770
Add shm_volume option (#427)
Add possibility to mount a tmpfs volume to /dev/shm to avoid issues like
[this](https://github.com/docker-library/postgres/issues/416). To achieve that
two new options were introduced:

* `enableShmVolume` to PostgreSQL manifest, to specify whether or not mount
this volume per database cluster

* `enable_shm_volume` to operator configuration, to specify whether or not mount
per operator.

The first one, `enableShmVolume` takes precedence to allow us to be more flexible.
2018-12-21 16:22:30 +01:00
zerg-junior 45c89b3da4
[WIP] Add set_memory_request_to_limit option (#406)
* Add set_memory_request_to_limit option
2018-11-15 14:00:08 +01:00
zerg-junior 25fa45fd58 [WIP] Grant 'superuser' to the members of Postgres admin teams (#371)
Added support for superuser team in addition to the admin team that owns the postgres cluster.
2018-08-30 10:51:37 +02:00
Oleksii Kliukin e1ed4b847d
Use code-generation for CRD API and deepcopy methods (#369)
Client-go provides a https://github.com/kubernetes/code-generator package in order to provide the API to work with CRDs similar to the one available for built-in types, i.e. Pods, Statefulsets and so on.

Use this package to generate deepcopy methods (required for CRDs), instead of using an external deepcopy package; we also generate APIs used to manipulate both Postgres and OperatorConfiguration CRDs, as well as informers and listers for the Postgres CRD, instead of using generic informers and CRD REST API; by using generated code we can get rid of some custom and obscure CRD-related code and use a better API.

All generated code resides in /pkg/generated, with an exception of zz_deepcopy.go in apis/acid.zalan.do/v1

Rename postgres-operator-configuration CRD to OperatorConfiguration, since the former broke naming convention in the code-generator.

Moved Postgresql, PostgresqlList, OperatorConfiguration and OperatorConfigurationList and other types used by them into

Change the type of  the Error field in the Postgresql crd to a string, so that client-go could generate a deepcopy for it.

Use generated code to set status of CRD objects as well. Right now this is done with patch, however, Kubernetes 1.11 introduces the /status subresources, allowing us to set the status with
the special updateStatus call in the future. For now, we keep the code that is compatible with earlier versions of Kubernetes.

Rename postgresql.go to database.go and status.go to logs_and_api.go to reflect the purpose of each of those files.

Update client-go dependencies.

Minor reformatting and renaming.
2018-08-15 17:22:25 +02:00
Oleksii Kliukin 59f0c5551e
Allow configuring pod priority globally and per cluster. (#353)
* Allow configuring pod priority globally and per cluster.

Allow to specify pod priority class for all pods managed by the operator,
as well as for those belonging to individual clusters.

Controlled by the pod_priority_class_name operator configuration
parameter and the podPriorityClassName manifest option.

See https://kubernetes.io/docs/concepts/configuration/pod-priority-preemption/#priorityclass
for the explanation on how to define priority classes since Kubernetes 1.8.

Some import order changes are due to go fmt.
Removal of OrphanDependents deprecated field.

Code review by @zerg-junior
2018-08-03 14:03:37 +02:00
Oleksii Kliukin 0181a1b5b1
Introduce a repair scan to fix failing clusters (#304)
A repair is a sync scan that acts only on those clusters that indicate
that the last add, update or sync operation on them has failed. It is
supposed to kick in more frequently than the repair scan. The repair
scan still remains to be useful to fix the consequences of external
actions (i.e. someone deletes a postgres-related service by mistake)
unbeknownst to the operator.

The repair scan is controlled by the new repair_period parameter in the
operator configuration. It has to be at least 2 times more frequent than
a sync scan to have any effect (a normal sync scan will update both last
synced and last repaired attributes of the controller, since repair is
just a sync underneath).

A repair scan could be queued for a cluster that is already being synced
if the sync period exceeds the interval between repairs. In that case a
repair event will be discarded once the corresponding worker finds out
that the cluster is not failing anymore.

Review by @zerg-junior
2018-07-24 11:21:45 +02:00
zerg-junior 417f13c0bd
Submit RBAC credentials during initial Event processing (#344)
* During initial Event processing submit the service account for pods and bind it to a cluster role that allows Patroni to successfully start. The cluster role is assumed to be created by the k8s cluster administrator.
2018-07-19 16:40:40 +02:00
Oleksii Kliukin 25a306244f
Support for per-cluster and operator global sidecars (#331)
* Define sidecars in the operator configuration.

Right now only the name and the docker image can be defined, but with
the help of the pod_environment_configmap parameter arbitrary
environment variables can be passed to the sidecars.

* Refactoring around generatePodTemplate.

Original implementation of per-cluster sidecars by @theRealWardo 

Per review by @zerg-junior and @Jan-M
2018-07-02 16:25:27 +02:00
zerg-junior 7394c15d0a
Make AWS region configurable in the operator cofig map (#333) 2018-06-27 17:29:02 +02:00
Oleksii Kliukin 9cb48e0889
Document operator configuration parameters. (#313) 2018-06-08 13:21:57 +02:00
Sergey Dudoladov 2e041c50e6 Bump up default Spilo image 2018-05-28 16:54:27 +02:00
Manuel Gómez 32a1456a68
Update config.go 2018-05-24 16:58:46 +02:00
Sergey Dudoladov 749d723f55 Shorten the commen 2018-05-24 16:22:13 +02:00
Sergey Dudoladov 9824ddae5e Fix etcd_host default 2018-05-24 16:05:45 +02:00
Oleksii Kliukin 987b43456b
Deprecate old LB options, fix endpoint sync. (#287)
* Depreate old LB options, fix endpoint sync.

- deprecate useLoadBalancer, replicaLoadBalancer from the manifest
  and enable_load_balancer from the operator configuration. The old
  operator configuration options become no-op with this commit. For
  the old manifest options, `useLoadBalancer` and `replicaLoadBalancer`
  are still consulted,  but only in the absense of the new ones
  (enableMasterLoadBalancer and enableReplicaLoadBalancer).

- Make sure the endpoint being created during the sync receives proper
  addresses subset. This is more critical for the replicas, as for the
  masters Patroni will normally re-create the endpoint before the
  operator.

- Avoid creating the replica endpoint, since it will be created automatically
  by the corresponding service.
- Update the README and unit tests.

Code review by @mgomezch and @zerg-junior
2018-05-15 15:19:18 +02:00
Sergey Dudoladov 59ded0c212 Shorten bucket name 2018-05-02 14:05:57 +02:00
Sergey Dudoladov c45219bafa Set up an S3 bucket for the postgres daily logs 2018-05-02 12:52:42 +02:00
Sergey Dudoladov d99b553ec1 Convert default account definiton into JSON 2018-04-25 12:35:16 +02:00
Sergey Dudoladov e3f7fac443 Comment on the default value for pod service account name 2018-04-24 15:41:28 +02:00
Sergey Dudoladov 485ec4b8ea Move service account to Controller 2018-04-24 15:13:08 +02:00
Sergey Dudoladov c31c76281c Make operator unaware of its own service account 2018-04-23 14:38:20 +02:00
Sergey Dudoladov bd51d2922b Turn ServiceAccount into struct value to avoid race conditon during account creation 2018-04-20 13:05:05 +02:00
Sergey Dudoladov 214ae04aa7 Deploy service account for pod creation on demand 2018-04-18 16:20:20 +02:00
Sergey Dudoladov 96d46252f5 Change the default values to closer match previous behaviour 2018-03-26 11:43:46 +02:00
Sergey Dudoladov a8862aeee1 Enable backward compatibility for enable_load_balancer setting from operator configmap 2018-03-19 17:19:50 +01:00
Sergey Dudoladov 145689c950 Disable load balancer for master service by default (it may cost money) 2018-03-16 13:18:13 +01:00
Sergey Dudoladov 0986e56226 Add separate params for master and replica load balancers to operator configuration 2018-03-14 12:12:28 +01:00
Sergey Dudoladov e048328d6a Comment on special values for watched namespace 2018-02-20 17:26:17 +01:00
Sergey Dudoladov dcfc9925f6 Respond to code review 2018-02-20 14:43:02 +01:00
Sergey Dudoladov 74fa7b9492 Restrict operator to single watched namespace via env var 2018-02-07 16:44:49 +01:00
Sergey Dudoladov ea84f9d577 Rename the configmap 'namespace' entry to avoid confusion with the map's owm namespace 2018-02-06 15:09:00 +01:00
Oleksii Kliukin b90a36c909
Set node_readiness_label default to an empty value. (#204)
Previously, it was set to the lifecycle-status:ready, breaking a
lot of minikube deployments. Also it was not possible befor to run
with this label set to an empty value.

Document the effect of the label in the new section of the
documentation.
2018-01-16 15:43:03 +01:00
Oleksii Kliukin 8e99518eeb
Improve behavior on node decomissionining (#184)
* Trigger the node migration on the lack of  the readiness label.

* Examine the node's readiness status on node add.

Make sure we don't miss the not ready node, especially when the
operator is killed during the migration.
2018-01-04 11:53:15 +01:00
Manuel Gómez 15c278d4e8
Scalyr agent sidecar for log shipping (#190)
* Scalyr agent sidecar for log shipping

* Remove the default for the Scalyr image

Now the image needs to be specified explicitly to enable log shipping to
Scalyr.  This removes the problem of having to generate the config file
or publish our agent image repository.

* Add configuration variable for Scalyr server URL

Defaults to the EU address.

* Alter style

Newlines are cheap and make code easier to edit/refactor, but ok.

* Fix StatefulSet comparison logic

I broke it when I made the comparison consider all containers in the
PostgreSQL pod.
2017-12-21 15:34:26 +01:00
Oleksii Kliukin bf80f5225e
Introduce higher and lower bounds for the number of instances (#178)
* Introduce higher and lower bounds for the number of instances

Reduce the number of instances to the min_instances if it is lower and
to the max_instances if it is higher. -1 for either of those means there
is no lower or upper bound.

In addition, terminate the operator when there is a nonsense in the
configuration (i.e. max_instances < min_instances).

Reviewed by Jan Mußler and Sergey Dudoladov.
2017-12-15 16:02:50 +01:00
Georg Kunz e8d9c75949 Allow custom Postgres pod environment variables 2017-12-14 14:39:33 +01:00
Oleksii Kliukin 1fb8cf7ea0
Avoid overwriting critical users. (#172)
* Avoid overwriting critical users.

Disallow defining new users either in the cluster manifest, teams
API or infrastructure roles with the names mentioned in the new
protected_role_names parameter (list of comma-separated names)

Additionally, forbid defining a user with the name matching either
super_username or replication_username, so that we don't overwrite
system roles required for correct working of the operator itself.

Also, clear PostgreSQL roles on each sync first in order to avoid using
the old definitions that are no longer present in the current manifest,
infrastructure roles secret or the teams API.
2017-12-05 14:27:12 +01:00