Commit Graph

59 Commits

Author SHA1 Message Date
Oleksii Kliukin 26db91c53e
Improve infrastructure role definitions (#208)
Enhance definitions of infrastructure roles by allowing membership in multiple roles, role options and per-role configuration to be specified in the infrastructure role configmap, which must have the same name as the infrastructure role secret. See manifests/infrastructure-roles-configmap.yaml for the examples and updated README for the description of different types of database roles supposed by the operator and their purposes.

Change the logic of merging infrastructure roles with the manifest roles when they have the same name, to return the infrastructure role unchanged instead of merging. Previously, we used to propagate flags from the manifest role to the resulting infrastructure one, as there were no way to define flags for the infrastructure role; however, this is not the case anymore.

Code review and tests by @erthalion
2018-04-04 17:21:36 +02:00
zerg-junior ff5793b584
Merge pull request #258 from zalando-incubator/always-create-replica-service
[WIP] Always create replica service
2018-03-29 14:42:26 +02:00
Sergey Dudoladov 96d46252f5 Change the default values to closer match previous behaviour 2018-03-26 11:43:46 +02:00
Sergey Dudoladov a8862aeee1 Enable backward compatibility for enable_load_balancer setting from operator configmap 2018-03-19 17:19:50 +01:00
Sergey Dudoladov 931b48fcbb Respond to code reviews 2018-03-16 15:36:42 +01:00
Sergey Dudoladov 0986e56226 Add separate params for master and replica load balancers to operator configuration 2018-03-14 12:12:28 +01:00
Sergey Dudoladov ac6c5bcf09 Explicitly name replica and master load balancer params in PostgresSpec 2018-03-14 12:03:27 +01:00
zerg-junior cca50122a6
Delete config file added by mistake 2018-03-12 12:54:02 +01:00
Sergey Dudoladov 6839ce0170 Fix configuration of dns names 2018-03-12 12:45:52 +01:00
Jan Mussler cb55749c1b
Update postgres-operator.yaml (#255)
Bump operator image version.
2018-02-26 20:03:56 +01:00
Sergey Dudoladov dcfc9925f6 Respond to code review 2018-02-20 14:43:02 +01:00
Sergey Dudoladov 4c23917d42 Watch all namespaces if the relevant param is empty string / 'default' if param is unset 2018-02-12 11:47:56 +01:00
Sergey Dudoladov c0bc8eaa6d Comment manifests 2018-02-08 15:15:47 +01:00
Sergey Dudoladov 8b7bbde06e Make env var overwrite configmap setting for watching namespaces 2018-02-06 16:12:47 +01:00
Sergey Dudoladov 0ef801f4e0 Add example of the watched namespace to the operator config map 2018-02-06 15:16:21 +01:00
Oleksii Kliukin b90a36c909
Set node_readiness_label default to an empty value. (#204)
Previously, it was set to the lifecycle-status:ready, breaking a
lot of minikube deployments. Also it was not possible befor to run
with this label set to an empty value.

Document the effect of the label in the new section of the
documentation.
2018-01-16 15:43:03 +01:00
zerg-junior 6c57334666 Add an example for cloning a backup from existing cluster (#189)
Add an example for cloning a backup from existing cluster
2017-12-19 16:21:06 +01:00
Sergey Dudoladov c1b3ce8028 Fix loadBalancerConfig 2017-12-18 17:32:22 +01:00
zerg-junior 3c178f68df Warn on infrastructure-roles.yaml format violations (#177)
Emit a warning if there are unprocessed entries in the infrastructure-roles secret.
2017-12-15 17:21:41 +01:00
Oleksii Kliukin dd0affc390 Tweak our reaction to the cluster upgrade process.
Previously, the operator started to move the pods off the nodes to be
decomissioned by watching the eol_node_label value. Every new postgres
pod has been created with the anti-affinity to that label, making sure
that the pods being moved won't land on another to be decomissioned
node.

The changes introduce another label that indicates the ready node.  The
new pod affinity will esnure that the pod is only scheduled to the node
marked as ready, discarding the previous anti-affinity.  That way the
nodes can transition from the pending-decomission to the other statuses
(drained, terminating) without having pods suddently scaled to them.

In addition, rename the label that triggers the start of the upgrade
process to node_eol_label (for consistency with node_readiness_label)
and set its default vvalue to lifecycle-status:pending-decomission.
2017-11-30 14:11:49 +01:00
Oleksii Kliukin 975b21f633 Rename api roles configuration parameter.
Change api_roles_configuration to team_api_role_configuration
2017-11-22 10:43:35 +01:00
Oleksii Kliukin 415a7fdc4d Allow global configuration options for API roles.
Add options to the PgUser structure, potentially allowing to set
per-role options in the cluster definition as well.

Introduce api_roles_configuration operator option with the default
of log_statement=all
2017-11-22 10:43:35 +01:00
Jan Mussler a98a7c95c2 Reorganize Readme (#142)
removing parts of config.

* chaning secret name pattern to make things shorter.

* Move section on self building docker image.

* Fix typo.

* Bump image.

* bump version for pdb fix.

* Changes in regards to review.

* Fix xhyve driver link.

* Move to new api, remove service account, not needed for minikube.

* Changed minimal manifest and example to use right file.

* Added service account for operator again, it is needed in pods anyways later.
2017-10-24 20:42:22 +02:00
Alexander Kukushkin 39200ba8d4 Enable k8s leader election (#145)
and bump docker image version
2017-10-20 13:58:15 +02:00
Alexander Kukushkin a98c712a52 Change spilo docker image to demospilo (#141)
Image size is slightly more than 24MB, it doesn't contain wal-e and not suitable for production, but it is very good for demo purposes.
2017-10-19 13:53:12 +02:00
Oleksii Kliukin eba23279c8 Kube cluster upgrade 2017-10-19 10:49:42 +02:00
Jan Mussler cec695d48e Superuser toggle for team members
Make superuser toggleable for team members. Add and "admin" role to team members if superuser is disabled.
2017-10-12 15:01:54 +02:00
Murat Kabilov 702d901bd9 use clear name for env var denoting namespace to watch (#129) 2017-10-12 10:42:20 +02:00
Murat Kabilov a35e9c6119 move from tpr to crd 2017-10-06 15:12:08 +02:00
Murat Kabilov 00194d0130 create dbs on cluster create 2017-10-04 16:24:27 +03:00
Murat Kabilov 93d4bf2b55 Merge branch 'master' into api-improvements 2017-09-26 14:47:13 +02:00
Murat Kabilov 9a66e09b88 cluster history api endpoint 2017-09-26 14:30:45 +02:00
Murat Kabilov d876f4d88e set secret name template via config map 2017-09-18 14:25:09 +02:00
Oleksii Kliukin 8b85935a7a Allow cloning clusters from the operator. (#90)
Allow cloning clusters from the operator.

The changes add a new JSON node `clone` with possible values `cluster`
and `timestamp`. `cluster` is mandatory, and setting a non-empty
`timestamp` triggers wal-e point in time recovery. Spilo and Patroni do
the whole heavy-lifting, the operator just defines certain variables and
gathers some data about how to connect to the host to clone or the
target S3 bucket.

As a minor change, set the image pull policy to IfNotPresent instead
of Always to simplify local testing.

Change the default replication username to standby.
2017-09-08 16:47:03 +02:00
Murat Kabilov 71dfb33b2b make pod termination grace period configurable 2017-08-18 16:38:25 +02:00
Murat Kabilov 228639b839 add api port and ring log size values to the config map 2017-08-15 12:37:58 +02:00
Oleksii Kliukin 00150711e4 Configure load balancer on a per-cluster and operator-wide level (#57)
* Deny all requests to the load balancer by default.
* Operator-wide toggle for the load-balancer.
* Define per-cluster useLoadBalancer option.

If useLoadBalancer is not set - then operator-wide defaults take place. If it
is true - the load balancer is created, otherwise a service type clusterIP is
created.

Internally, we have to completely replace the service if the service type
changes. We cannot patch, since some fields from the old service that will
remain after patch are incompatible with the new one, and handling them
explicitly when updating the service is ugly and error-prone. We cannot
update the service because of the immutable fields, that leaves us the only
option of deleting the old service and creating the new one. Unfortunately,
there is still an issue of unnecessary removal of endpoints associated with
the service, it will be addressed in future commits.

* Revert the unintended effect of go fmt

* Recreate endpoints on service update.

When the service type is changed, the service is deleted and then
the one with the new type is created. Unfortnately, endpoints are
deleted as well. Re-create them afterwards, preserving the original
addresses stored in them.

* Improve error messages and comments. Use generate instead of gen in names.
2017-06-30 13:38:49 +02:00
Murat Kabilov e104a67260 Fix resync of the clusters 2017-06-08 11:51:48 +02:00
Murat Kabilov f7aaf8863d Change maintenance window format 2017-05-30 09:56:10 +02:00
Murat Kabilov 95a57d1e4f Use named arguments in the DNS name format 2017-05-18 17:23:59 +02:00
Murat Kabilov 0fd498d4d3 set image pull policy to ifnotpresent 2017-05-12 16:38:42 +02:00
Murat Kabilov deef84e606 remove new line from the token;
remove unnecessary data keys from the postgresq-operator secret
2017-05-12 16:38:42 +02:00
Murat Kabilov 9ee9e286ec make use of the local fake teams api 2017-05-12 16:38:42 +02:00
Murat Kabilov 2370659c69 Parallel cluster processing
Run operations concerning multiple clusters in parallel. Each cluster gets its
own worker in order to create, update, sync or delete clusters.  Each worker
acquires the lock on a cluster.  Subsequent operations on the same cluster
have to wait until the current one finishes.  There is a pool of parallel
workers, configurable with the `workers` parameter in the configmap and set by
default to 4. The cluster-related tasks  are assigned to the workers based on
a cluster name: the tasks for the same cluster will be always assigned to the
same worker. There is no blocking between workers, although there is a chance
that a single worker will become a bottleneck if too many clusters are
assigned to it; therefore, for large-scale deployments it might be necessary
to bump up workers from the default value.
2017-05-12 11:41:35 +02:00
Oleksii Kliukin 1c4bce86df Avoid "bulk-comparing" pod resources during sync. (#109)
* Avoid "bulk-comparing" pod resources during sync.

First attempt to fix bogus restarts due to the reported mismatch
of container resources where one of the resources is an empty struct,
while the other has all fields set to nil.

In addition, add an ability to set limits and requests per pod, as well as the operator-level defaults.
2017-05-12 11:41:35 +02:00
Murat Kabilov 8026c69222 update default config param values 2017-05-12 11:41:34 +02:00
Murat Kabilov da438aab3a Use ConfigMap to store operator's config 2017-05-12 11:41:34 +02:00
Oleksii Kliukin 71b93b4cc2 Feature/infrastructure roles (#91)
* Add infrastructure roles configured globally.

Those are the roles defined in the operator itself. The operator's
configuration refers to the secret containing role names, passwords
and membership information. While they are referred to as roles, in
reality those are users.

In addition, improve the regex to filter out invalid users and
make sure user secret names are compatible with DNS name spec.

Add an example manifest for the infrastructure roles.
2017-05-12 11:41:33 +02:00
Murat Kabilov dd2ed5ff9d Add team name to tpr object metadata name 2017-05-12 11:41:33 +02:00
Murat Kabilov c2d2a67ad5 Get config from environment variables;
ignore pg major version change;
get rid of resources package;
2017-05-12 11:41:29 +02:00