* StatefulSet fsGroup config option to allow non-root spilo
* Allow Postgres CRD to override SpiloFSGroup of the Operator.
* Document that the FSGroup of a Pod cannot be changed after creation.
* database.go: substitute hardcoded .svc.cluster.local dns suffix with config parameter
Use the pod's configured dns search path, for clusters where .svc.cluster.local is not correct.
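A minimal sketch of how a service host name might be assembled once the suffix is configurable. The function name `serviceHost` and the `dnsSuffix` parameter are hypothetical and are not taken from database.go; an empty suffix is assumed to mean "rely on the pod's DNS search path".

```go
package main

import (
	"fmt"
	"strings"
)

// serviceHost builds the host name used to reach a cluster service.
// dnsSuffix is a hypothetical stand-in for the new config parameter;
// when it is empty, the short name is returned and resolution is left
// to the pod's configured DNS search path.
func serviceHost(service, namespace, dnsSuffix string) string {
	host := service + "." + namespace
	suffix := strings.Trim(dnsSuffix, ".")
	if suffix == "" {
		return host
	}
	return host + "." + suffix
}

func main() {
	fmt.Println(serviceHost("acid-test", "default", ".svc.cluster.local"))
	fmt.Println(serviceHost("acid-test", "default", ""))
}
```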
Override clone s3 bucket path
Add the possibility to use a custom s3 bucket path for cloning a cluster
from an arbitrary bucket (e.g. from another k8s cluster). For that,
a new config option `s3_wal_path` is introduced, which should point
to a location that spilo would understand.
* turns PostgresStatus type into a struct with field PostgresClusterStatus
* setStatus patch target is now /status subresource
* unmarshalling PostgresStatus takes care of previous status field convention
* new simple bool functions status.Running(), status.Creating()
* Config option to allow Spilo container to run non-privileged.
Runs non-privileged by default.
Fixes #395
* add spilo_privileged to manifests/configmap.yaml
* add spilo_privileged to helm chart's values.yaml
Add the possibility to mount a tmpfs volume to /dev/shm to avoid issues like
[this](https://github.com/docker-library/postgres/issues/416). To achieve that,
two new options were introduced:
* `enableShmVolume` in the PostgreSQL manifest, to specify whether or not to mount
this volume per database cluster
* `enable_shm_volume` in the operator configuration, to specify whether or not to mount
it per operator.
The first one, `enableShmVolume`, takes precedence to allow us to be more flexible.
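The precedence rule can be sketched as below. The function name and the use of a `*bool` to represent "not set in the manifest" are assumptions made for this illustration.

```go
package main

import "fmt"

// mountShmVolume decides whether to mount the tmpfs volume on /dev/shm.
// manifestValue models `enableShmVolume` from the PostgreSQL manifest
// (nil when the option is absent); operatorDefault models
// `enable_shm_volume` from the operator configuration.
func mountShmVolume(manifestValue *bool, operatorDefault bool) bool {
	if manifestValue != nil {
		return *manifestValue // the per-cluster setting takes precedence
	}
	return operatorDefault
}

func main() {
	disabled := false
	fmt.Println(mountShmVolume(nil, true))       // falls back to the operator setting
	fmt.Println(mountShmVolume(&disabled, true)) // manifest overrides the operator
}
```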
* Minor improvements
* Document empty list vs null for users without privileges
* Change the wording for null values
* Add talk by Oleksii in Atmosphere
Client-go provides the https://github.com/kubernetes/code-generator package, which generates an API for working with CRDs similar to the one available for built-in types, e.g. Pods, StatefulSets and so on.
Use this package to generate deepcopy methods (required for CRDs) instead of using an external deepcopy package. We also generate the APIs used to manipulate both the Postgres and OperatorConfiguration CRDs, as well as informers and listers for the Postgres CRD, instead of using generic informers and the CRD REST API. By using generated code we can get rid of some custom and obscure CRD-related code and use a better API.
All generated code resides in /pkg/generated, with the exception of zz_deepcopy.go in apis/acid.zalan.do/v1.
Rename postgres-operator-configuration CRD to OperatorConfiguration, since the former broke naming convention in the code-generator.
Moved Postgresql, PostgresqlList, OperatorConfiguration and OperatorConfigurationList and other types used by them into
Change the type of the Error field in the Postgresql crd to a string, so that client-go could generate a deepcopy for it.
Use generated code to set the status of CRD objects as well. Right now this is done with a patch; however, Kubernetes 1.11 introduces the /status subresource, allowing us to set the status with
the special updateStatus call in the future. For now, we keep the code that is compatible with earlier versions of Kubernetes.
Rename postgresql.go to database.go and status.go to logs_and_api.go to reflect the purpose of each of those files.
Update client-go dependencies.
Minor reformatting and renaming.
A repair is a sync scan that acts only on those clusters that indicate
that the last add, update or sync operation on them has failed. It is
supposed to kick in more frequently than the sync scan. The sync
scan still remains useful to fix the consequences of external
actions (e.g. someone deletes a postgres-related service by mistake)
unbeknownst to the operator.
The repair scan is controlled by the new repair_period parameter in the
operator configuration. It has to be at least 2 times more frequent than
a sync scan to have any effect (a normal sync scan will update both last
synced and last repaired attributes of the controller, since repair is
just a sync underneath).
A repair scan could be queued for a cluster that is already being synced
if the sync period exceeds the interval between repairs. In that case a
repair event will be discarded once the corresponding worker finds out
that the cluster is not failing anymore.
Review by @zerg-junior
* During initial Event processing submit the service account for pods and bind it to a cluster role that allows Patroni to successfully start. The cluster role is assumed to be created by the k8s cluster administrator.
* Up until now, the operator read its own configuration from the
configmap. That has a number of limitations, e.g. when the
configuration value is not a scalar, but a map or a list. We use
custom code based on github.com/kelseyhightower/envconfig to decode
non-scalar values out of plain text keys, but that breaks when the data
inside the keys contains both YAML-special elements (e.g. commas) and
complex quotes; one good example of that is search_path inside
`team_api_role_configuration`. In addition, reliance on the configmap
forced a flat structure on the configuration, making it hard to write
and to read (see
https://github.com/zalando-incubator/postgres-operator/pull/308#issuecomment-395131778).
The changes allow supplying the operator configuration in a proper YAML
file. That required registering a custom CRD to support the operator
configuration and providing an example at
manifests/postgresql-operator-default-configuration.yaml. At the moment,
both the old configmap and the new CRD configuration are supported, so there
are no compatibility issues; however, in the future I'd like to deprecate the
configmap-based configuration altogether. Contrary to the
configmap-based configuration, the CRD one doesn't embed defaults into
the operator code; however, one can use
manifests/postgresql-operator-default-configuration.yaml as a starting
point in order to build a custom configuration.
Since previously `ReadyWaitInterval` and `ReadyWaitTimeout` parameters
used to create the CRD were taken from the operator configuration, which
is not possible if the configuration itself is stored in the CRD object,
I've added the ability to specify them as environment variables
`CRD_READY_WAIT_INTERVAL` and `CRD_READY_WAIT_TIMEOUT` respectively.
Per review by @zerg-junior and @Jan-M.
* Bump up a Spilo version to use Patroni >= v1.4.4; this fixes issues with k8s 1.10 API changes
* Bump up an operator version to use the new 'etcd_host' default value
* Re-use 'zalando-postgres-operator' as a pod service account and add extra RBAC permissions to make it work
* Document in quickstart connecting to Postgres via psql
The node_eol_label is obsolete and not used.
The node_readiness_label, if set, will prevent scheduling pods on nodes without that label; by default, minikube doesn't set any label on the node.
Note that the account here is named zalando-postgres-operator and not
the 'operator' default that is created in the serviceaccount.yaml and
also used by the operator configmap to create new postgres clusters.
This is done intentionally, so as to avoid breaking those setups that
already work. Ideally, the operator should be run under the
zalando-postgres-operator service account. However, the service account
used to run Postgres clusters does not require all those privileges and
is described at
https://github.com/zalando/patroni/blob/master/kubernetes/patroni_k8s.yaml
The service account defined here acquires some privileges not really
used by the operator (e.g. we only need list and watch on configmaps);
this is also done intentionally to avoid breaking things if someone
decides to configure the same service account in the operator's
configmap to run postgres clusters.
Documentation and further testing by @zerg-junior
Enhance definitions of infrastructure roles by allowing membership in multiple roles, role options and per-role configuration to be specified in the infrastructure role configmap, which must have the same name as the infrastructure role secret. See manifests/infrastructure-roles-configmap.yaml for examples and the updated README for the description of the different types of database roles supported by the operator and their purposes.
Change the logic of merging infrastructure roles with the manifest roles when they have the same name, to return the infrastructure role unchanged instead of merging. Previously, we used to propagate flags from the manifest role to the resulting infrastructure one, as there was no way to define flags for the infrastructure role; however, this is not the case anymore.
Code review and tests by @erthalion
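The new merge behaviour for roles with the same name can be sketched as follows. The `pgUser` type and `resolveRoles` function are hypothetical simplifications; only the rule that the infrastructure role is returned unchanged comes from the description above.

```go
package main

import "fmt"

// pgUser is a simplified, hypothetical role definition.
type pgUser struct {
	Name     string
	Flags    []string
	MemberOf []string
}

// resolveRoles combines manifest roles with infrastructure roles.
// On a name clash the infrastructure role is kept unchanged; flags from
// the manifest role are no longer propagated into it.
func resolveRoles(manifest, infrastructure map[string]pgUser) map[string]pgUser {
	result := make(map[string]pgUser, len(manifest)+len(infrastructure))
	for name, role := range manifest {
		result[name] = role
	}
	for name, role := range infrastructure {
		result[name] = role // overrides any manifest role of the same name
	}
	return result
}

func main() {
	manifest := map[string]pgUser{
		"robot": {Name: "robot", Flags: []string{"SUPERUSER"}},
		"app":   {Name: "app", Flags: []string{"CREATEDB"}},
	}
	infrastructure := map[string]pgUser{
		"robot": {Name: "robot", Flags: []string{"LOGIN"}, MemberOf: []string{"monitoring"}},
	}
	roles := resolveRoles(manifest, infrastructure)
	fmt.Println(roles["robot"].Flags, roles["app"].Flags)
}
```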