The node_eol_label is obsolete and not used.
The node_readiness_label, if set, will prevent scheduling pods on the node without that label, by default minikube doesn't set any label on the node.
* Sanity checks for the cluster name, improve tests.
- check that the normal and clone cluster name complies with the valid
service name. For clone cluster, only do it if clone timestamp is not
set; with a clone timestamp set, the clone name points to the S3 bucket
- add tests and improve existing ones, making sure we don't call Error()
method for an empty error, as well as that we don't miss cases where
expected error is not empty, but actual call to be tested does not
return an error.
Code review by @zerg-junior and @Jan-M
* Improve the pod moving behavior during the Kubernetes cluster upgrade.
Fix an issue of not waiting for at least one replica to become ready
(if the Statefulset indicates there are replicas) when moving the master
pod off the decomissioned node. Resolves the first part of #279.
Small fixes to error messages.
* Eliminate a race condition during the swithover.
When the operator initiates the failover (switchover) that fails and
then retries it for a second time it may happen that the previous
waitForPodChannel is still active. As a result, the operator subscribes
to the former master pod two times, causing a panic.
The problem was that the original code didn't bother to cancel the
waitForPodLalbel for the new master pod in the case when the failover
fails. This commit fixes it by adding a stop channel to that function.
Code review by @zerg-junior
Avoid showing "there is no service in the cluster" when syncing a
service for the cluster if the operator has been restarted after
the cluster had been created.
* Remove 'team' label from the statefulset selector.
I was never supposed to be there, but implicitely statefulset
creates a selector out of meta.labels field. That is the problem
with recent Kubernetes, since statefulset cannot pick up pods
with non-matching label selectors, and we rely on statefulset
picking up old pods after statefulset replacement.
Make sure selector changes trigger replacement of the statefulset.
In the case new selector has more labels than the old one nothing
should be done with a statefulset, otherwise the new statefulset
won't see orphaned pods from the old one, as they won't match the
selector.
See https://github.com/kubernetes/kubernetes/issues/46901#issuecomment-356418393
Note that the account here is named zalando-postgres-operator and not
the 'operator' default that is created in the serviceaccount.yaml and
also used by the operator configmap to create new postgres clusters.
This is done intentionally, as to avoid breaking those setups that
already work. Ideally, the operator should be run under the
zalando-postgres-operator service account. However, the service account
used to run Postgres clusters does not require all those privileges and
is described at
https://github.com/zalando/patroni/blob/master/kubernetes/patroni_k8s.yaml
The service account defined here acquires some privileges not really
used by the operator (i.e. we only need list and watch on configmaps),
this is also done intentionally to avoid breaking things if someone
decides to configure the same service account in the operator's
configmap to run postgres clusters.
Documentation and further testing by @zerg-junior
Enhance definitions of infrastructure roles by allowing membership in multiple roles, role options and per-role configuration to be specified in the infrastructure role configmap, which must have the same name as the infrastructure role secret. See manifests/infrastructure-roles-configmap.yaml for the examples and updated README for the description of different types of database roles supposed by the operator and their purposes.
Change the logic of merging infrastructure roles with the manifest roles when they have the same name, to return the infrastructure role unchanged instead of merging. Previously, we used to propagate flags from the manifest role to the resulting infrastructure one, as there were no way to define flags for the infrastructure role; however, this is not the case anymore.
Code review and tests by @erthalion