postgres-operator

Commit Graph

Author	SHA1	Message	Date
Felix Kunde	76d43525f7	define more default values for opConfig CRD (#955 )	2020-05-04 16:23:21 +02:00
Rafia Sabih	d52296c323	Propagate annotations to the StatefulSet (#932 ) * Initial commit * Corrections - set the type of the new configuration parameter to be array of strings - propagate the annotations to statefulset at sync * Enable regular expression matching * Improvements -handle rollingUpdate flag -modularize code -rename config parameter name * fix merge error * Pass annotations to connection pooler deployment * update code-gen * Add documentation and update manifests * add e2e test and introduce option in configmap * fix service annotations test * Add unit test * fix e2e tests * better key lookup of annotations tests * add debug message for annotation tests * Fix typos * minor fix for looping * Handle update path and renaming - handle the update path to update sts and connection pooler deployment. This way no need to wait for sync - rename the parameter to downscaler_annotations - handle other review comments * another try to fix python loops * Avoid unnecessary update events * Update manifests * some final polishing * fix cluster_test after polishing Co-authored-by: Rafia Sabih <rafia.sabih@zalando.de> Co-authored-by: Felix Kunde <felix-kunde@gmx.de>	2020-05-04 14:46:56 +02:00
Felix Kunde	865d5b41a7	set event broadcasting to Infof and update rbac (#952 )	2020-04-29 17:26:46 +02:00
Felix Kunde	d76203b3f9	Bootstrapped databases with best practice role setup (#843 ) * PreparedDatabases with default role setup * merge changes from master * include preparedDatabases spec check when syncing databases * create a default preparedDB if not specified * add more default privileges for schemas * use empty brackets block for undefined objects * cover more default privilege scenarios and always define admin role * add DefaultUsers flag * support extensions and defaultUsers for preparedDatabases * remove exact version in deployment manifest * enable CRD validation for new field * update generated code * reflect code review * fix typo in SQL command * add documentation for preparedDatabases feature + minor changes * some datname should stay * add unit tests * reflect some feedback * init users for preparedDatabases also on update * only change DB default privileges on creation * add one more section in user docs * one more sentence	2020-04-29 10:56:06 +02:00
Sergey Dudoladov	cc635a02e3	Lazy upgrade of the Spilo image (#859 ) * initial implementation * describe forcing the rolling upgrade * make parameter name more descriptive * add missing pieces * address review * address review * fix bug in e2e tests * fix cluster name label in e2e test * raise test timeout * load spilo test image * use available spilo image * delete replica pod for lazy update test * fix e2e * fix e2e with a vengeance * lets wait for another 30m * print pod name in error msg * print pod name in error msg 2 * raise timeout, comment other tests * subsequent updates of config * add comma * fix e2e test * run unit tests before e2e * remove conflicting dependency * Revert "remove conflicting dependency" This reverts commit `65fc09054b`. * improve cdp build * dont run unit before e2e tests * Revert "improve cdp build" This reverts commit `e2a8fa12aa`. Co-authored-by: Sergey Dudoladov <sergey.dudoladov@zalando.de> Co-authored-by: Felix Kunde <felix-kunde@gmx.de>	2020-04-29 10:07:14 +02:00
Felix Kunde	1d009d9595	bump spilo and pooler version + update docs (#945 )	2020-04-28 16:01:13 +02:00
Björn Fischer	168abfe37b	Fully speced global sidecars (#890 ) * implement fully speced global sidecars * fix issue #924	2020-04-27 17:40:22 +02:00
siku4	f32c615a53	fix typo in additionalVolume struct (#933 ) * fix typo in additionalVolume struct Co-authored-by: siku4 <sk@sik-net.de>	2020-04-27 12:22:42 +02:00
Christian Rohmann	21b9b6fcbe	Emit K8S events to the postgresql CR as feedback to the requestor / user (#896 ) * Add EventsGetter to KubeClient to enable sending K8S events * Add eventRecorder to the controller, initialize it and hand it down to cluster via its constructor to enable it to emit events this way * Add first set of events which then go to the postgresql custom resource the user interacts with to provide some feedback * Add right to "create" events to operator cluster role * Adapt cluster tests to new function signature with eventRecord (via NewFakeRecorder) * Get a proper reference before sending events to a resource Co-authored-by: Christian Rohmann <christian.rohmann@inovex.de>	2020-04-27 08:22:07 +02:00
ReSearchITEng	7e8f6687eb	make tls pr798 use additionalVolumes capability from pr736 (#920 ) * make tls pr798 use additionalVolumes capability from pr736 * move the volume* sections lower * update helm chart crds and docs * fix user.md typos	2020-04-15 15:24:55 +02:00
Thierry Sallé	ea3eef45d9	Additional volumes capability (#736 ) * Allow additional Volumes to be mounted * added TargetContainers option to determine whether an additional volume needs to be mounted or not * fixed dependencies * updated manifest additional volume example * More validation Check that there are no volume mount path clashes or "all" vs ["a", "b"] mixtures. Also change the default behaviour to mount to "postgres" container. * More documentation / example about additional volumes * Revert go.sum and go.mod from origin/master * Declare additionalVolume specs in CRDs * fixed k8sres after rebase * resolve conflict Co-authored-by: Dmitrii Dolgov <9erthalion6@gmail.com> Co-authored-by: Thierry <thierry@malt.com>	2020-04-15 09:13:35 +02:00
ReSearchITEng	7232326159	Fix val docs (#901 ) * missing quotes in pooler configmap in values.yaml * missing quotes in pooler configmap in values-crd.yaml * docs clarifications * helm3 --skip-crds * Update docs/user.md Co-Authored-By: Felix Kunde <felix-kunde@gmx.de> * details moved in docs Co-authored-by: Felix Kunde <felix-kunde@gmx.de>	2020-04-09 09:16:45 +02:00
Leon Albers	4dee8918bd	Allow configuration of patroni's replication mode (#869 ) * Add patroni parameters for `synchronous_mode` * Update complete-postgres-manifest.yaml, removed quotation marks * Update k8sres_test.go, adjust result for `Patroni configured` * Update k8sres_test.go, adjust result for `Patroni configured` * Update complete-postgres-manifest.yaml, set synchronous mode to false in this example * Update pkg/cluster/k8sres.go Does the same but is shorter. So we fix that it if you like. Co-Authored-By: Felix Kunde <felix-kunde@gmx.de> * Update docs/reference/cluster_manifest.md Co-Authored-By: Felix Kunde <felix-kunde@gmx.de> * Add patroni's `synchronous_mode_strict` * Extend `TestGenerateSpiloConfig` with `SynchronousModeStrict` Co-authored-by: Felix Kunde <felix-kunde@gmx.de>	2020-04-06 14:27:17 +02:00
Felix Kunde	64389b8bad	update image and docs for connection pooler (#898 )	2020-04-03 16:28:36 +02:00
ReSearchITEng	1249626a60	kubernetes_use_configmap (#887 ) * kubernetes_use_configmap * Update manifests/postgresql-operator-default-configuration.yaml Co-Authored-By: Felix Kunde <felix-kunde@gmx.de> * Update manifests/configmap.yaml Co-Authored-By: Felix Kunde <felix-kunde@gmx.de> * Update charts/postgres-operator/values.yaml Co-Authored-By: Felix Kunde <felix-kunde@gmx.de> * go.fmt Co-authored-by: Felix Kunde <felix-kunde@gmx.de>	2020-04-02 13:20:45 +02:00
Felix Kunde	b43b22dfcc	Call me pooler, not pool (#883 ) * rename pooler parts and add example to manifest * update codegen * fix manifest and add more details to docs * reflect renaming also in e2e tests	2020-04-01 10:34:03 +02:00
ReSearchITEng	6ed1030838	TLS - add OpenShift compatibility (#885 ) * solves https://github.com/zalando/postgres-operator/pull/798#issuecomment-605201260 Co-authored-by: Felix Kunde <felix-kunde@gmx.de>	2020-04-01 09:39:54 +02:00
Felix Kunde	ba9cf68650	Change type of pod environment config map to NamespacedName (#870 ) * allow PodEnvironmentConfigMap in other namespaces * update codegen * update docs and comments	2020-03-25 15:59:31 +01:00
Dmitry Dolgov	9dfa433363	Connection pooler (#799 ) Connection pooler support Add support for a connection pooler. The idea is to make it generic enough to be able to switch between different implementations (e.g. pgbouncer or odyssey). Operator needs to create a deployment with pooler and a service for it to access. For the connection pool to work properly, a database needs to be prepared by the operator, namely a separate user has to be created with access to an installed lookup function (to fetch credentials for other users). This setup is supposed to be used only by robot/application users. Usually a connection pool implementation is more CPU-bound, so it makes sense to create several pods for the connection pool with more emphasis on CPU resources. At the moment there are no special affinity or tolerations assigned to bring those pods closer to the database. For availability purposes the minimal number of connection pool pods is 2; ideally they have to be distributed between different nodes/AZ, but it's not enforced in the operator itself. The available configuration is supposed to be ergonomic and in the normal case require minimum changes to a manifest to enable the connection pool. To have more control over the configuration and functionality on the pool side one can customize the corresponding docker image. Co-authored-by: Felix Kunde <felix-kunde@gmx.de>	2020-03-25 12:57:26 +01:00
Felix Kunde	07c5da35e3	fix minor issues in docs and manifests (#866 ) * fix minor issues in docs and manifests * double retry_timeout_sec	2020-03-18 15:02:13 +01:00
Felix Kunde	cf829df1a4	define ownership between operator and clusters via annotation (#802 ) * define ownership between operator and postgres clusters * add documentation * add unit test	2020-03-17 16:34:31 +01:00
zimbatm	65fb2ce1a6	add support for custom TLS certificates (#798 ) * add support for custom TLS certificates	2020-03-13 11:44:38 +01:00
Hengchu Zhang	51909204fd	Change `logging_rest_api.api_port` to `8080` instead of `8008` (#848 ) The documentation states that the default operator REST service is at port `8080`, but the current default CRD based configuration is `8008`. Changing the default config to match documentation.	2020-02-28 14:13:58 +01:00
Felix Kunde	b24da3201c	bump version to 1.4.0 + some polishing (#839 ) * bump version to 1.4.0 + some polishing * align version for UI chart * update user docs to warn for standby replicas * minor log message changes for RBAC resources	2020-02-25 09:50:54 +01:00
Felix Kunde	7b94060d17	fix validation for S3ForcePathStyle (#841 )	2020-02-21 16:36:23 +01:00
Felix Kunde	e2a9b03913	bump spilo version to latest release (#836 )	2020-02-20 16:21:21 +01:00
Felix Kunde	742d7334a1	use cluster-name as default label everywhere (#782 ) * use cluster-name as default label everywhere * fix e2e test	2020-02-19 15:01:01 +01:00
Felix Kunde	d5660f65bb	[UI] add tab for monthly costs per cluster (#796 ) * add tab for monthly costs per cluster * sync run_local and update version number * lowering resources * some Makefile polishing and updated admin docs on UI * extend admin docs on UI * add api-service manifest for operator * set min limits in UI to default min limits of operator * reflect new UI helm charts in docs * make cluster name label configurable	2020-02-19 12:58:24 +01:00
Felix Kunde	aea9e9bd33	postgres-pod clusterrole (#832 ) * define postgres-pod clusterrole and align rbac in chart * align UI chart rbac with operator and update doc * operator RBAC needs podsecuritypolicy to grant it to postgres-pod	2020-02-19 12:32:54 +01:00
Felix Kunde	702a194c41	switch to rbac/v1 (#829 ) * switch to rbac/v1	2020-02-17 11:25:07 +01:00
Felix Kunde	3b10dc645d	patch/update services on type change (#824 ) * use Update when disabling LoadBalancer + added e2e test	2020-02-13 16:24:15 +01:00
Jonathan Juares Beber	744c71d16b	Allow services update when changing annotations (#818 ) The current implementation of `pkg.util.k8sutil.SameService` considers only changes to the default service annotations created by the operator. Custom annotations are not compared and consequently not applied after the first service creation. This commit introduces a complete annotations comparison between the current service created by the operator and the new one generated based on the configs. Also, it adds tests on the above-mentioned function.	2020-02-13 10:55:30 +01:00
Jonathan Juares Beber	ba60e15d07	Add ServiceAnnotations cluster config (#803 ) The [operator parameters][1] already support the `custom_service_annotations` config. With this parameter it is possible to define custom annotations that will be used on the services created by the operator. The `custom_service_annotations`, like all the other [operator parameters][1], are defined on the operator level and do not allow customization on the cluster level. A cluster may require different service annotations, for example to set up different cloud load balancer timeouts or different ingress annotations, and/or to enable more customizable environments. This commit introduces a new parameter on the cluster level, called `serviceAnnotations`, responsible for defining custom annotations just for the services created by the operator for the specifically defined cluster. It allows a mix of configuration between `custom_service_annotations` and `serviceAnnotations`, where the latter will have priority. In order to allow custom service annotations to be used on services without LoadBalancers (for example, service mesh service annotations), both `custom_service_annotations` and `serviceAnnotations` are applied independently of the load-balancing configuration. For backward-compatibility purposes, `custom_service_annotations` is still under [Load balancer related options][2]. The two default annotations when using LoadBalancer services, `external-dns.alpha.kubernetes.io/hostname` and `service.beta.kubernetes.io/aws-load-balancer-connection-idle-timeout`, are still defined by the operator. `service.beta.kubernetes.io/aws-load-balancer-connection-idle-timeout` can be overridden by `custom_service_annotations` or `serviceAnnotations`, allowing a more customizable environment. `external-dns.alpha.kubernetes.io/hostname` cannot be overridden, since there is no differentiation between custom service annotations for replicas and masters. It updates the documentation and creates the necessary unit and e2e tests for the above-described feature too. [1]: https://github.com/zalando/postgres-operator/blob/master/docs/reference/operator_parameters.md [2]: https://github.com/zalando/postgres-operator/blob/master/docs/reference/operator_parameters.md#load-balancer-related-options	2020-02-10 12:03:25 +01:00
Vito Botta	a660d758a5	Add region setting for logical backups to non-AWS storage (#813 ) * Add region setting for logical backups to non-AWS storage	2020-02-10 11:48:24 +01:00
Felix Kunde	1f0312a014	make minimum limits boundaries configurable (#808 ) * make minimum limits boundaries configurable * add e2e test	2020-02-03 11:43:18 +01:00
Felix Kunde	077f9af4e3	bump to v1.3.1 (#780 )	2020-01-06 14:08:47 +01:00
Felix Kunde	9d7604ecf0	use v1.3.0-dirty tag (#778 )	2020-01-02 14:06:23 +01:00
Felix Kunde	59a329d77b	update operator release image (#777 )	2020-01-02 13:41:58 +01:00
Felix Kunde	7af1de890c	bump operator v1.3.0 with Spilo 12 image (#770 )	2019-12-17 17:13:56 +01:00
Felix Kunde	182e3bc7db	add missing fields to OperatorConfiguration CRD validation (#767 )	2019-12-16 17:08:09 +01:00
Felix Kunde	629feac98f	Remove bind verb and explain privileges (#765 ) Closes #256	2019-12-16 17:07:36 +01:00
Felix Kunde	97e0d6d388	extend docs and polish manifest examples (#762 )	2019-12-12 17:55:41 +01:00
Felix Kunde	cd110aabf4	Enforce minimum cpu and memory limits (#731 ) * add validation for PG resources and volume size * check resource requests also on UPDATE and SYNC + update docs * if cluster was running don't error on sync	2019-12-12 16:43:55 +01:00
Felix Kunde	0628439256	fix cpu resource validation (#757 )	2019-12-10 16:30:57 +01:00
Felix Kunde	107334fe71	Add global option to enable/disable init containers and sidecars (#478 ) * Add global option to enable/disable init containers and sidecars * update dependencies	2019-12-10 15:45:54 +01:00
Felix Kunde	11c2e815f7	include status subresource in validation (#744 ) * include status subresource in validation	2019-12-02 15:27:47 +01:00
Felix Kunde	a3b34f146f	Add CRD validation (#599 ) * add CRD manifests with validation * update documentation * patroni slots is not an array but a nested hash map * make deps call tools * cover validation in docs and export it in crds.go * add toggle to disable creation of CRD validation and document it * use templated service account also for CRD-configured helm deployment	2019-11-28 12:02:05 +01:00
Armin Nesiren	5f87384d7f	Passing endpoint, access and secret key to logical-backup container (#628 ) * Added possibility to add custom annotations to LoadBalancer service. * Added parameters for custom endpoint, access and secret key for logical backup. * Modified dump.sh so it knows how to handle new features. Configurable S3 SSE	2019-11-26 10:40:49 +01:00
Felix Kunde	f9487e41c1	inject cluster name label into logical backup pod (#725 ) * inject cluster name label into logical backup pod	2019-11-20 13:58:41 +01:00
Felix Kunde	63d1f3bbe4	add get for deployments to operator RBAC (#724 )	2019-11-18 12:13:05 +01:00
Felix Kunde	de14323f8e	disable load balancers by default in examples (#715 )	2019-11-12 10:57:48 +01:00
Thomas Runyon	535517cd1b	Custom annotations 329 (#657 ) * Add ability for custom annotations to database pods	2019-11-11 10:45:35 +01:00
Emre Hasegeli	b738283f6f	charts: Add pods/exec permission (#694 )	2019-10-23 11:22:23 +02:00
Erik Inge Bolsø	e3b39a5cbe	document configmap variant of inherited_labels (#678 ) * document configmap variant of inherited_labels and remove application label from cluster example since we will get application:spilo by default	2019-10-05 10:10:02 +02:00
Sergey Dudoladov	cf97ebb2b8	fix e2e tests (#672 ) * fix e2e tests * change Spilo version everywhere	2019-09-23 17:48:53 +02:00
Felix Kunde	4a099d698d	bump to v1.2.0 (#631 ) * bump to v1.2.0 * yaml lint: add one more space before inline comments	2019-07-25 12:23:53 +02:00
Felix Kunde	0aff809958	use h1 tags to not render titles in sidebar (#626 )	2019-07-19 12:50:39 +02:00
Jakub Román	8ec3fa1ec8	Kustomization (#608 ) Add ability to install the operator via kustomization.	2019-07-15 17:17:42 +02:00
Felix Kunde	4fc5822b24	Update docs for v1.2 (#609 ) * update docs and move parts from README to index.md * fix typos, headings and code alignment in docs	2019-07-11 17:19:27 +02:00
Felix Kunde	7c19cf50db	align config map, operator config, helm chart values and templates (#595 ) * align config map, operator config, helm chart values and templates * follow helm chart conventions also in CRD templates * split up values files and add comments * avoid yaml confusion in postgres manifests * bump spilo version and use example for logical_backup_s3_bucket * add ConfigTarget switch to values	2019-07-08 17:49:25 +02:00
Felix Kunde	3a914f9a3c	camelCasing all manifest parameters (#602 ) * deprecate snake_case manifest parameters * move backward compatible check and update test	2019-07-05 18:14:03 +02:00
Felix Kunde	36003b8264	enable shmVolume setting in OperatorConfiguration (#605 ) * enable shmVolume setting in OperatorConfiguration	2019-07-05 16:48:37 +02:00
Erik Inge Bolsø	d69211032e	update postgres-operator deployment to apps/v1 (#598 )	2019-06-28 13:41:50 +02:00
Rafia Sabih	540d58d5bd	Adding the support for standby cluster This will set up a continuous wal streaming cluster, by adding the corresponding section in postgres manifest. Instead of having a full-fledged standby cluster as in Patroni, here we use only the wal path of the source cluster and stream from there. Since the standby cluster streams from the master and does not need to create or use databases of its own, it bypasses the creation of users or databases. There is a separate sample manifest added to set up a standby-cluster.	2019-06-21 10:11:39 +02:00
Markus	93bfed3e75	Add secret mount to operator (#535 ) * add secret mount to operator	2019-06-19 12:40:49 +02:00
Taehyun Kim	0ed92ed04e	add deletecollection verb (#589 ) Fixing privileges to execute `patronictl remove`. You could/should have also just used the operator delete cluster flow (remove manifest). It is not really the plan to use patroni inside a pod to remove an existing cluster.	2019-06-19 10:47:27 +02:00
Felix Kunde	6918394562	Add PDB configuration toggle (#583 ) * Don't create an impossible disruption budget for smaller clusters. * sync PDB also on update	2019-06-18 10:48:21 +02:00
Erik Inge Bolsø	e1d9395338	rbac: add user-facing clusterroles (#585 ) * rbac: add user-facing clusterroles	2019-06-14 15:59:51 +02:00
Erik Inge Bolsø	028b834ea6	postgres-operator deployment template: run operator as non-root, and with readonly filesystem (#582 )	2019-06-14 15:47:08 +02:00
Erik Inge Bolsø	ad5fec9bee	docs: add storageclass to complete-postgres-manifest example (#586 )	2019-06-11 16:25:02 +02:00
Aaron Miller	ec5b1d4d58	StatefulSet fsGroup config option to allow non-root spilo (#531 ) * StatefulSet fsGroup config option to allow non-root spilo * Allow Postgres CRD to override SpiloFSGroup of the Operator. * Document FSGroup of a Pod cannot be changed after creation.	2019-06-04 16:38:26 +02:00
Felix Kunde	5a0e95ac45	Add CRD configuration to Helm chart values.yaml (#559 ) * add templates for CRDs incl. crd-install hooks * support both config styles in values.yaml * fix ServiceAccount naming in values.yaml	2019-06-03 14:48:32 +02:00
Erik Inge Bolsø	ebda39368e	database.go: remove hardcoded .svc.cluster.local dns suffix (#561 ) * database.go: substitute hardcoded .svc.cluster.local dns suffix with config parameter Use the pod's configured dns search path, for clusters where .svc.cluster.local is not correct.	2019-05-31 16:32:00 +02:00
Erik Inge Bolsø	b619569e28	Improve cluster sidecar documentation (#573 )	2019-05-27 15:31:52 +02:00
Sergey Dudoladov	f3e1e80aaf	Add logical backup (#442 ) * Add k8s cron job to spawn logical backups * Minor doc updates	2019-05-16 15:52:01 +02:00
Felix Kunde	4b9e6058e1	add update for CRD to RBAC (#564 )	2019-05-13 17:36:15 +02:00
Dmitry Dolgov	f29bdaf96a	Override clone s3 bucket path (#487 ) Override clone s3 bucket path Add possibility to use a custom s3 bucket path for cloning a cluster from an arbitrary bucket (e.g. from another k8s cluster). For that a new config option `s3_wal_path` is introduced, which should point to a location that spilo would understand.	2019-05-10 12:52:42 +02:00
Felix Kunde	ad0b250b5b	patch CRD on operator update (#558 ) * patch existing CRD each time there is an operator update	2019-05-09 12:35:15 +02:00
Felix Kunde	0fbfbb23bb	Use /status subresource instead of plain manifest field (#534 ) * turns PostgresStatus type into a struct with field PostgresClusterStatus * setStatus patch target is now /status subresource * unmarshalling PostgresStatus takes care of previous status field convention * new simple bool functions status.Running(), status.Creating()	2019-05-07 12:01:45 +02:00
Sergey Dudoladov	c1d108a832	Fix CRD-based operator configuration (#541 ) * Fix CRD-based operator configuration * add inherited labels, update docker image	2019-04-15 13:52:38 +02:00
Aaron Miller	15ec6a920d	Config option to allow Spilo container to run non-privileged. (#525 ) * Config option to allow Spilo container to run non-privileged. Runs non-privileged by default. Fixes #395 * add spilo_privileged to manifests/configmap.yaml * add spilo_privileged to helm chart's values.yaml	2019-04-03 17:13:39 +02:00
Sergey Dudoladov	0b53dbe5dc	Set statefulset update and management policy explicitly (#515 ) * fix logging in retry * explicitly set the stateful set update strategy to onDelete * add podManagementPolicy	2019-03-13 11:49:18 +01:00
Sergey Dudoladov	f400539b69	Retry moving master pods (#463 ) * Retry moving master pods * bump up master pod wait timeout	2019-02-28 16:19:27 +01:00
Stephane T	d11b23bd71	Add inherited_labels (#459 ) * add support for inherited_labels Signed-off-by: Stephane Tang <hi@stang.sh> * update docs with inherited_labels Signed-off-by: Stephane Tang <hi@stang.sh>	2019-02-14 12:29:06 +01:00
Sergey Dudoladov	43e8288751	Fix run operator locally (#462 ) * make test namespace optional * Update spilo/operator images * Add a command to replace operator image w/o minikube restart	2019-01-29 11:10:14 +01:00
Maxim Ivanov	3544cc90fa	Allow specifying init_containers in Postgres CRD (#445 ) * Add support for init_containers	2019-01-29 11:08:44 +01:00
Armin Nesiren	6f6a599c90	Added possibility to add custom annotations to LoadBalancer service. (#461 ) * Added possibility to add custom annotations to LoadBalancer service.	2019-01-25 11:35:27 +01:00
Jan Mussler	7445678261	bump spilo versions. (#439 )	2019-01-04 12:25:38 +01:00
zerg-junior	5cfcc453a9	Update CRD configuration docs and fix the CDP build (#414 ) * Update CRD configuration docs * document resource consumption of the operator * Add talks by Oleksii	2019-01-02 12:01:47 +01:00
zerg-junior	c0b0b9a832	[WIP] Add 'admin' option to create role (#425 ) * Add 'admin' option to create role * Fix run_locally_script	2018-12-27 10:14:33 +01:00
Dmitry Dolgov	d6e6b00770	Add shm_volume option (#427 ) Add possibility to mount a tmpfs volume to /dev/shm to avoid issues like [this](https://github.com/docker-library/postgres/issues/416). To achieve that two new options were introduced: * `enableShmVolume` to PostgreSQL manifest, to specify whether or not to mount this volume per database cluster * `enable_shm_volume` to operator configuration, to specify whether or not to mount it per operator. The first one, `enableShmVolume`, takes precedence to allow us to be more flexible.	2018-12-21 16:22:30 +01:00
zerg-junior	45c89b3da4	[WIP] Add set_memory_request_to_limit option (#406 ) * Add set_memory_request_to_limit option	2018-11-15 14:00:08 +01:00
jens-totemic	f25351c36a	Make OperatorConfiguration work (#410 ) * Fixes #404	2018-11-13 11:22:07 +01:00
zerg-junior	ccaee94a35	Minor improvements (#381 ) * Minor improvements * Document empty list vs null for users without privileges * Change the wording for null values * Add talk by Oleksii in Atmosphere	2018-11-06 11:08:13 +01:00
zerg-junior	86ba92ad02	Rename 'permanent_slots' field to 'slots' (#401 )	2018-10-31 16:11:28 +01:00
zerg-junior	1b4181a724	[WIP] Add the ability to configure replications slots in Patroni (#398 ) * Add the ability to configure replication slots in Patroni * Add debugging to Makefile for CDP builds	2018-10-31 13:10:56 +01:00
Oleksii Kliukin	e1ed4b847d	Use code-generation for CRD API and deepcopy methods (#369 ) Client-go provides a https://github.com/kubernetes/code-generator package in order to provide the API to work with CRDs similar to the one available for built-in types, i.e. Pods, Statefulsets and so on. Use this package to generate deepcopy methods (required for CRDs), instead of using an external deepcopy package; we also generate APIs used to manipulate both Postgres and OperatorConfiguration CRDs, as well as informers and listers for the Postgres CRD, instead of using generic informers and CRD REST API; by using generated code we can get rid of some custom and obscure CRD-related code and use a better API. All generated code resides in /pkg/generated, with an exception of zz_deepcopy.go in apis/acid.zalan.do/v1 Rename postgres-operator-configuration CRD to OperatorConfiguration, since the former broke naming convention in the code-generator. Moved Postgresql, PostgresqlList, OperatorConfiguration and OperatorConfigurationList and other types used by them into Change the type of the Error field in the Postgresql crd to a string, so that client-go could generate a deepcopy for it. Use generated code to set status of CRD objects as well. Right now this is done with patch, however, Kubernetes 1.11 introduces the /status subresources, allowing us to set the status with the special updateStatus call in the future. For now, we keep the code that is compatible with earlier versions of Kubernetes. Rename postgresql.go to database.go and status.go to logs_and_api.go to reflect the purpose of each of those files. Update client-go dependencies. Minor reformatting and renaming.	2018-08-15 17:22:25 +02:00
Jan Mussler	6e8dcabac7	Update postgres-operator.yaml Bump manifest to use v1.0.0 operator	2018-08-10 14:17:44 +02:00
Oleksii Kliukin	0181a1b5b1	Introduce a repair scan to fix failing clusters (#304 ) A repair is a sync scan that acts only on those clusters that indicate that the last add, update or sync operation on them has failed. It is supposed to kick in more frequently than the sync scan. The sync scan still remains useful to fix the consequences of external actions (i.e. someone deletes a postgres-related service by mistake) unbeknownst to the operator. The repair scan is controlled by the new repair_period parameter in the operator configuration. It has to be at least 2 times more frequent than a sync scan to have any effect (a normal sync scan will update both last synced and last repaired attributes of the controller, since repair is just a sync underneath). A repair scan could be queued for a cluster that is already being synced if the sync period exceeds the interval between repairs. In that case a repair event will be discarded once the corresponding worker finds out that the cluster is not failing anymore. Review by @zerg-junior	2018-07-24 11:21:45 +02:00
zerg-junior	accbe20804	Upgrade version to enable RBAC in multiple namespace (#348 )	2018-07-19 18:22:30 +02:00
zerg-junior	417f13c0bd	Submit RBAC credentials during initial Event processing (#344 ) * During initial Event processing submit the service account for pods and bind it to a cluster role that allows Patroni to successfully start. The cluster role is assumed to be created by the k8s cluster administrator.	2018-07-19 16:40:40 +02:00
Oleksii Kliukin	3a9378d3b8	Allow configuring the operator via the YAML manifest. (#326 ) * Up until now, the operator read its own configuration from the configmap. That has a number of limitations, i.e. when the configuration value is not a scalar, but a map or a list. We use a custom code based on github.com/kelseyhightower/envconfig to decode non-scalar values out of plain text keys, but that breaks when the data inside the keys contains both YAML-special elements (i.e. commas) and complex quotes, one good example for that is search_path inside `team_api_role_configuration`. In addition, reliance on the configmap forced a flag structure on the configuration, making it hard to write and to read (see https://github.com/zalando-incubator/postgres-operator/pull/308#issuecomment-395131778). The changes allow supplying the operator configuration in a proper YAML file. That required registering a custom CRD to support the operator configuration and provide an example at manifests/postgresql-operator-default-configuration.yaml. At the moment, both the old configmap and the new CRD configuration are supported, so no compatibility issues, however, in the future I'd like to deprecate the configmap-based configuration altogether. Contrary to the configmap-based configuration, the CRD one doesn't embed defaults into the operator code, however, one can use the manifests/postgresql-operator-default-configuration.yaml as a starting point in order to build a custom configuration. Since previously `ReadyWaitInterval` and `ReadyWaitTimeout` parameters used to create the CRD were taken from the operator configuration, which is not possible if the configuration itself is stored in the CRD object, I've added the ability to specify them as environment variables `CRD_READY_WAIT_INTERVAL` and `CRD_READY_WAIT_TIMEOUT` respectively. Per review by @zerg-junior and @Jan-M.	2018-07-16 16:20:46 +02:00
zerg-junior	7394c15d0a	Make AWS region configurable in the operator config map (#333 )	2018-06-27 17:29:02 +02:00
erthalion	e661ea1ea7	Mention `uid` field	2018-06-01 16:44:57 +02:00
zerg-junior	69e4ae2d95	Update postgres-operator.yaml Tags are of fixed length (not arbitrary long prefixes of commit hashes)	2018-05-25 12:59:12 +02:00
zerg-junior	9c86f8bd96	Fix conf for minikube (#301 ) * Bump up a Spilo version to use Patroni >= v1.4.4 ; this fixes issues with k8s 1.10 API changes * Bump up an operator version to use the new 'etcd_host' default value * Re-use 'zalando-postgres-operator' as a pod service account and add extra RBAC permissions to make it work * Document in quickstart connecting to Postgres via psql	2018-05-25 12:25:42 +02:00
Sergey Dudoladov	83a26fb78b	Rename RBAC file	2018-05-17 12:05:31 +02:00
Sergey Dudoladov	a926515530	Employ RBAC when run on minikube	2018-05-16 15:28:45 +02:00
Sergey Dudoladov	ca8542185a	Add RBAC to Quickstart guide	2018-05-16 11:01:16 +02:00
Oleksii Kliukin	40163677c7	Remove Kubernetes upgrade-related labels The node_eol_label is obsolete and not used. The node_readiness_label, if set, will prevent scheduling pods on the node without that label, by default minikube doesn't set any label on the node.	2018-05-08 15:50:10 +02:00
Jan M	2bb3bdeeb4	Slimming out README and config map, targeting easy first time deployers to minikube.	2018-05-04 12:20:54 +02:00
zerg-junior	8f08bef67c	Merge pull request #277 from zalando-incubator/automatically-deploy-service-account Deploy service account for pod creation on demand	2018-04-26 14:44:37 +02:00
Sergey Dudoladov	c31c76281c	Make operator unaware of its own service account	2018-04-23 14:38:20 +02:00
Manuel Gómez	5e1d86e31e	Fix clone timestamp key in example manifest (#276 ) It was set to `endTimestamp`, but it should be `timestamp`.	2018-04-16 18:23:41 +02:00
Oleksii Kliukin	c44cd9e4e6	Define the operator RBAC (#234 ) Note that the account here is named zalando-postgres-operator and not the 'operator' default that is created in the serviceaccount.yaml and also used by the operator configmap to create new postgres clusters. This is done intentionally, as to avoid breaking those setups that already work. Ideally, the operator should be run under the zalando-postgres-operator service account. However, the service account used to run Postgres clusters does not require all those privileges and is described at https://github.com/zalando/patroni/blob/master/kubernetes/patroni_k8s.yaml The service account defined here acquires some privileges not really used by the operator (i.e. we only need list and watch on configmaps), this is also done intentionally to avoid breaking things if someone decides to configure the same service account in the operator's configmap to run postgres clusters. Documentation and further testing by @zerg-junior	2018-04-05 11:24:24 +02:00
Oleksii Kliukin	26db91c53e	Improve infrastructure role definitions (#208 ) Enhance definitions of infrastructure roles by allowing membership in multiple roles, role options and per-role configuration to be specified in the infrastructure role configmap, which must have the same name as the infrastructure role secret. See manifests/infrastructure-roles-configmap.yaml for the examples and updated README for the description of different types of database roles supported by the operator and their purposes. Change the logic of merging infrastructure roles with the manifest roles when they have the same name, to return the infrastructure role unchanged instead of merging. Previously, we used to propagate flags from the manifest role to the resulting infrastructure one, as there was no way to define flags for the infrastructure role; however, this is not the case anymore. Code review and tests by @erthalion	2018-04-04 17:21:36 +02:00
zerg-junior	ff5793b584	Merge pull request #258 from zalando-incubator/always-create-replica-service [WIP] Always create replica service	2018-03-29 14:42:26 +02:00
Sergey Dudoladov	96d46252f5	Change the default values to closer match previous behaviour	2018-03-26 11:43:46 +02:00
Sergey Dudoladov	a8862aeee1	Enable backward compatibility for enable_load_balancer setting from operator configmap	2018-03-19 17:19:50 +01:00
Sergey Dudoladov	931b48fcbb	Respond to code reviews	2018-03-16 15:36:42 +01:00
Sergey Dudoladov	0986e56226	Add separate params for master and replica load balancers to operator configuration	2018-03-14 12:12:28 +01:00
Sergey Dudoladov	ac6c5bcf09	Explicitly name replica and master load balancer params in PostgresSpec	2018-03-14 12:03:27 +01:00
zerg-junior	cca50122a6	Delete config file added by mistake	2018-03-12 12:54:02 +01:00
Sergey Dudoladov	6839ce0170	Fix configuration of dns names	2018-03-12 12:45:52 +01:00
Jan Mussler	cb55749c1b	Update postgres-operator.yaml (#255 ) Bump operator image version.	2018-02-26 20:03:56 +01:00
Sergey Dudoladov	dcfc9925f6	Respond to code review	2018-02-20 14:43:02 +01:00
Sergey Dudoladov	4c23917d42	Watch all namespaces if the relevant param is empty string / 'default' if param is unset	2018-02-12 11:47:56 +01:00
Sergey Dudoladov	c0bc8eaa6d	Comment manifests	2018-02-08 15:15:47 +01:00
Sergey Dudoladov	8b7bbde06e	Make env var overwrite configmap setting for watching namespaces	2018-02-06 16:12:47 +01:00
Sergey Dudoladov	0ef801f4e0	Add example of the watched namespace to the operator config map	2018-02-06 15:16:21 +01:00
Oleksii Kliukin	b90a36c909	Set node_readiness_label default to an empty value. (#204 ) Previously, it was set to the lifecycle-status:ready, breaking a lot of minikube deployments. Also it was not possible before to run with this label set to an empty value. Document the effect of the label in the new section of the documentation.	2018-01-16 15:43:03 +01:00
zerg-junior	6c57334666	Add an example for cloning a backup from existing cluster (#189 ) Add an example for cloning a backup from existing cluster	2017-12-19 16:21:06 +01:00
Sergey Dudoladov	c1b3ce8028	Fix loadBalancerConfig	2017-12-18 17:32:22 +01:00
zerg-junior	3c178f68df	Warn on infrastructure-roles.yaml format violations (#177 ) Emit a warning if there are unprocessed entries in the infrastructure-roles secret.	2017-12-15 17:21:41 +01:00
Oleksii Kliukin	dd0affc390	Tweak our reaction to the cluster upgrade process. Previously, the operator started to move the pods off the nodes to be decommissioned by watching the eol_node_label value. Every new postgres pod has been created with the anti-affinity to that label, making sure that the pods being moved won't land on another to be decommissioned node. The changes introduce another label that indicates the ready node. The new pod affinity will ensure that the pod is only scheduled to the node marked as ready, discarding the previous anti-affinity. That way the nodes can transition from the pending-decomission to the other statuses (drained, terminating) without having pods suddenly scaled to them. In addition, rename the label that triggers the start of the upgrade process to node_eol_label (for consistency with node_readiness_label) and set its default value to lifecycle-status:pending-decomission.	2017-11-30 14:11:49 +01:00
Oleksii Kliukin	975b21f633	Rename api roles configuration parameter. Change api_roles_configuration to team_api_role_configuration	2017-11-22 10:43:35 +01:00
Oleksii Kliukin	415a7fdc4d	Allow global configuration options for API roles. Add options to the PgUser structure, potentially allowing to set per-role options in the cluster definition as well. Introduce api_roles_configuration operator option with the default of log_statement=all	2017-11-22 10:43:35 +01:00
Jan Mussler	a98a7c95c2	Reorganize Readme (#142 ) removing parts of config. * changing secret name pattern to make things shorter. * Move section on self building docker image. * Fix typo. * Bump image. * bump version for pdb fix. * Changes in regards to review. * Fix xhyve driver link. * Move to new api, remove service account, not needed for minikube. * Changed minimal manifest and example to use right file. * Added service account for operator again, it is needed in pods anyways later.	2017-10-24 20:42:22 +02:00
Alexander Kukushkin	39200ba8d4	Enable k8s leader election (#145 ) and bump docker image version	2017-10-20 13:58:15 +02:00
Alexander Kukushkin	a98c712a52	Change spilo docker image to demospilo (#141 ) Image size is slightly more than 24MB, it doesn't contain wal-e and not suitable for production, but it is very good for demo purposes.	2017-10-19 13:53:12 +02:00
Oleksii Kliukin	eba23279c8	Kube cluster upgrade	2017-10-19 10:49:42 +02:00
Jan Mussler	cec695d48e	Superuser toggle for team members Make superuser toggleable for team members. Add an "admin" role to team members if superuser is disabled.	2017-10-12 15:01:54 +02:00
Murat Kabilov	702d901bd9	use clear name for env var denoting namespace to watch (#129 )	2017-10-12 10:42:20 +02:00
Murat Kabilov	a35e9c6119	move from tpr to crd	2017-10-06 15:12:08 +02:00
Murat Kabilov	00194d0130	create dbs on cluster create	2017-10-04 16:24:27 +03:00
Murat Kabilov	93d4bf2b55	Merge branch 'master' into api-improvements	2017-09-26 14:47:13 +02:00
Murat Kabilov	9a66e09b88	cluster history api endpoint	2017-09-26 14:30:45 +02:00
Murat Kabilov	d876f4d88e	set secret name template via config map	2017-09-18 14:25:09 +02:00
Oleksii Kliukin	8b85935a7a	Allow cloning clusters from the operator. (#90 ) Allow cloning clusters from the operator. The changes add a new JSON node `clone` with possible values `cluster` and `timestamp`. `cluster` is mandatory, and setting a non-empty `timestamp` triggers wal-e point in time recovery. Spilo and Patroni do the whole heavy-lifting, the operator just defines certain variables and gathers some data about how to connect to the host to clone or the target S3 bucket. As a minor change, set the image pull policy to IfNotPresent instead of Always to simplify local testing. Change the default replication username to standby.	2017-09-08 16:47:03 +02:00
Murat Kabilov	71dfb33b2b	make pod termination grace period configurable	2017-08-18 16:38:25 +02:00
Murat Kabilov	228639b839	add api port and ring log size values to the config map	2017-08-15 12:37:58 +02:00
Oleksii Kliukin	00150711e4	Configure load balancer on a per-cluster and operator-wide level (#57 ) * Deny all requests to the load balancer by default. * Operator-wide toggle for the load-balancer. * Define per-cluster useLoadBalancer option. If useLoadBalancer is not set - then operator-wide defaults take place. If it is true - the load balancer is created, otherwise a service type clusterIP is created. Internally, we have to completely replace the service if the service type changes. We cannot patch, since some fields from the old service that will remain after patch are incompatible with the new one, and handling them explicitly when updating the service is ugly and error-prone. We cannot update the service because of the immutable fields, that leaves us the only option of deleting the old service and creating the new one. Unfortunately, there is still an issue of unnecessary removal of endpoints associated with the service, it will be addressed in future commits. * Revert the unintended effect of go fmt * Recreate endpoints on service update. When the service type is changed, the service is deleted and then the one with the new type is created. Unfortunately, endpoints are deleted as well. Re-create them afterwards, preserving the original addresses stored in them. * Improve error messages and comments. Use generate instead of gen in names.	2017-06-30 13:38:49 +02:00
Murat Kabilov	e104a67260	Fix resync of the clusters	2017-06-08 11:51:48 +02:00
Murat Kabilov	f7aaf8863d	Change maintenance window format	2017-05-30 09:56:10 +02:00
Murat Kabilov	95a57d1e4f	Use named arguments in the DNS name format	2017-05-18 17:23:59 +02:00
Murat Kabilov	0fd498d4d3	set image pull policy to ifnotpresent	2017-05-12 16:38:42 +02:00
Murat Kabilov	deef84e606	remove new line from the token; remove unnecessary data keys from the postgresq-operator secret	2017-05-12 16:38:42 +02:00
Murat Kabilov	9ee9e286ec	make use of the local fake teams api	2017-05-12 16:38:42 +02:00
Murat Kabilov	2370659c69	Parallel cluster processing Run operations concerning multiple clusters in parallel. Each cluster gets its own worker in order to create, update, sync or delete clusters. Each worker acquires the lock on a cluster. Subsequent operations on the same cluster have to wait until the current one finishes. There is a pool of parallel workers, configurable with the `workers` parameter in the configmap and set by default to 4. The cluster-related tasks are assigned to the workers based on a cluster name: the tasks for the same cluster will be always assigned to the same worker. There is no blocking between workers, although there is a chance that a single worker will become a bottleneck if too many clusters are assigned to it; therefore, for large-scale deployments it might be necessary to bump up workers from the default value.	2017-05-12 11:41:35 +02:00
Oleksii Kliukin	1c4bce86df	Avoid "bulk-comparing" pod resources during sync. (#109 ) * Avoid "bulk-comparing" pod resources during sync. First attempt to fix bogus restarts due to the reported mismatch of container resources where one of the resources is an empty struct, while the other has all fields set to nil. In addition, add an ability to set limits and requests per pod, as well as the operator-level defaults.	2017-05-12 11:41:35 +02:00
Murat Kabilov	8026c69222	update default config param values	2017-05-12 11:41:34 +02:00
Murat Kabilov	da438aab3a	Use ConfigMap to store operator's config	2017-05-12 11:41:34 +02:00
Oleksii Kliukin	71b93b4cc2	Feature/infrastructure roles (#91 ) * Add infrastructure roles configured globally. Those are the roles defined in the operator itself. The operator's configuration refers to the secret containing role names, passwords and membership information. While they are referred to as roles, in reality those are users. In addition, improve the regex to filter out invalid users and make sure user secret names are compatible with DNS name spec. Add an example manifest for the infrastructure roles.	2017-05-12 11:41:33 +02:00
Murat Kabilov	dd2ed5ff9d	Add team name to tpr object metadata name	2017-05-12 11:41:33 +02:00
Murat Kabilov	c2d2a67ad5	Get config from environment variables; ignore pg major version change; get rid of resources package;	2017-05-12 11:41:29 +02:00
Oleksii Kliukin	1817bf65a1	Make example manifests minikube-friendly. Remove fixed namespace from all manifests, reduce resource requests. Remove the storageclass default, since it is not present in minikube. Use the team name instead of integer id, remove unused robots. The manifests are still compatible with the non-local deployment, the only difference is that now a namespace is required (assuming that the operator can only be deployed in a specific namespace.)	2017-05-12 11:41:28 +02:00
Oleksii Kliukin	a2e78ac2ec	Feature/persistent volumes	2017-05-12 11:41:25 +02:00
Murat Kabilov	ae77fa15e8	Pod Rolling update introduce Pod events channel; add parsing of the MaintenanceWindows section; skip deleting Etcd key on cluster delete; use external etcd host; watch for tpr/pods in the namespace of the operator pod only;	2017-05-12 11:41:25 +02:00
Murat Kabilov	2b8956bd33	Add service account manifest	2017-05-12 11:41:19 +02:00
Murat Kabilov	dfde075c66	Use TPR object namespace while creating its objects	2017-05-12 11:37:09 +02:00
Murat Kabilov	6e2d64bd50	Create human users from teams api	2017-05-12 11:37:09 +02:00
Murat Kabilov	58506634c4	Create pg users	2017-05-12 11:37:09 +02:00
Murat Kabilov	abb1173035	Code refactor	2017-05-12 11:37:09 +02:00
Murat Kabilov	75e6bfa55c	makefile improvements	2017-05-12 11:37:07 +02:00

324 Commits