postgres-operator

Commit Graph

Author	SHA1	Message	Date
Felix Kunde	1d44dd4694	delete secret resource from map (#2123 )	2022-12-02 10:09:19 +01:00
Philipp B	84fe38a069	switch to batch API v1 for Jobs (#2066 )	2022-10-07 11:27:58 +02:00
Felix Kunde	2aa52094db	switch to policy API v1 for PDBs (#2008 ) * switch to policy API v1 for PDBs * update e2e test dependencies * use kind 0.14.0 * bump K8s client in e2e docker image * bump e2e tests-runner	2022-10-06 09:43:17 +02:00
Felix Kunde	60e0685c32	define readinessProbe on statefulSet (#1825 ) * define readinessProbe on statefulSet * do not error out on deleting Patroni cluster objects * change delete order for patroni objects	2022-03-30 18:19:34 +02:00
Felix Kunde	d032e4783e	LoadBalancer toggles for master and replica pooler pods (#1799 ) * Add support for pooler load balancer Signed-off-by: Sergey Shatunov <me@prok.pw> * Rename to enable_master_pooler_load_balancer Signed-off-by: Sergey Shatunov <me@prok.pw> * target port should be intval * enhance pooler e2e test * add new options to crds.go Co-authored-by: Sergey Shatunov <me@prok.pw>	2022-03-04 13:36:17 +01:00
Felix Kunde	f7858ffb70	Initialize arrays of errors / error messages + minor refactoring (#1701 ) * init error arrays correctly * avoid nilPointer when syncing connectionPooler * getInfrastructureRoles should return error * fix unit tests and return type for getInfrastructureRoles	2021-11-29 12:49:12 +01:00
Rafia Sabih	75a9e2be38	Create cross namespace secrets (#1490 ) * Create cross namespace secrets * add test cases * fixes * Fixes - include namespace in secret name only when namespace is provided - use username.namespace as key to pgUsers only when namespace is provided - avoid conflict in the role creation in db by checking namespace along with the username * Update unit tests * Fix test case * Fixes - update regular expression for usernames - add test to allow check for valid usernames - create pg roles with namespace (if any) appended in rolename * add more test cases for valid usernames * update docs * fixes as per review comments * update e2e * fixes * Add toggle to allow namespaced secrets * update docs * comment update * Update e2e/tests/test_e2e.py * few minor fixes * fix unit tests * fix e2e * fix e2e attempt 2 * fix e2e Co-authored-by: Rafia Sabih <rafia.sabih@zalando.de> Co-authored-by: Felix Kunde <felix-kunde@gmx.de>	2021-06-11 10:35:30 +02:00
Felix Kunde	32e6c135b9	replace statefulset on annotation diff (#1449 ) * replace statefulset on annotation diff * remove update annotation function for statefulset * add unit test for syncing annotations * add inherited annotation to unit test	2021-04-22 11:22:52 +02:00
Felix Kunde	ff8143770c	Improve rolling upgrades and rolling upgrade continue (#1341 ) * add TODOs for moving rolling update label on pods * steer rolling update via pod annotation * rename patch method and fix reading flag on sync * pass only pods to recreatePods function * do not take address of iterator if you use it later * add e2e test and pass switchover targets to recreatePods * add wait_for_pod_failover for e2e test * add one more e2e test case * helm chart remove 1.6.0 archive from 1.6.0 archive * reflect code review feedback	2021-02-26 15:38:58 +01:00
Rafia Sabih	49158ecb68	Connection pooler for replica (#1127 ) * Enable connection pooler for replica * Refactor code for connection pooler - Move all the relevant code to a separate file - Move all the related tests to a separate file - Avoid using cluster where not required - Simplify the logic in sync and other methods - Cleanup of duplicated or unused code * Fix labels for the replica pods * Update deleteConnectionPooler to include role * Adding test cases and other changes - Fix unit test and delete secret when required only - Make sure we use empty fresh cluster for every test case. * enhance e2e test * Disable pooler in complete manifest as this is source for e2e too and creates unnecessary pooler setups. Co-authored-by: Rafia Sabih <rafia.sabih@zalando.de> Co-authored-by: Jan Mussler <janm81@gmail.com>	2020-11-13 14:52:21 +01:00
Jan Mussler	3a86dfc8bb	End 2 End tests speedup (#1180 ) * Improving end 2 end tests, especially speed of execution and error, by implementing proper eventual asserts and timeouts. * Add documentation for running individual tests * Fixed String encoding in Patroni state check and error case * Printing config as multi log line entity, makes it readable and grepable on startup * Cosmetic changes to logs. Removed quotes from diff. Move all object diffs to text diff. Enabled padding for log level. * Mount script with tools for easy log access and watching objects. * Set proper update strategy for Postgres operator deployment. * Move long running test to end. Move pooler test to new functions. * Remove quote from valid K8s identifiers.	2020-10-28 10:04:33 +01:00
Felix Kunde	0508266219	Remove all secrets on delete incl. pooler (#1091 ) * fix syncSecrets and remove pooler secret * update log for deleteSecret * use c.credentialSecretName(username) * minor fix	2020-08-10 18:26:26 +02:00
Felix Kunde	43163cf83b	allow using both infrastructure_roles_options (#1090 ) * allow using both infrastructure_roles_options * new default values for user and role definition * use robot_zmon as parent role * add operator log to debug * right name for old secret * only extract if rolesDefs is empty * set password1 in old infrastructure role * fix new infra role secret * choose different role key for new secret * set memberof everywhere * reenable all tests * reflect feedback * remove condition for rolesDefs	2020-08-10 15:08:03 +02:00
Felix Kunde	375963424d	delete secrets the right way (#1054 ) * delete secrets the right way * make a one function * continue deleting secrets even if one delete fails Co-authored-by: Felix Kunde <felix.kunde@zalando.de>	2020-07-10 15:07:42 +02:00
Christian Rohmann	8ff7658ed3	Fix pooler delete (#960 ) deleteConnectionPooler function incorrectly checks that the delete api response is ResourceNotFound. Looks like the only consequence is a confusing log message, but obviously it's wrong. Remove negation, since having ResourceNotFound as error is the good case. Co-authored-by: Christian Rohmann <christian.rohmann@inovex.de>	2020-05-13 14:55:54 +02:00
Rafia Sabih	d52296c323	Propagate annotations to the StatefulSet (#932 ) * Initial commit * Corrections - set the type of the new configuration parameter to be array of strings - propagate the annotations to statefulset at sync * Enable regular expression matching * Improvements -handle rollingUpdate flag -modularize code -rename config parameter name * fix merge error * Pass annotations to connection pooler deployment * update code-gen * Add documentation and update manifests * add e2e test and introduce option in configmap * fix service annotations test * Add unit test * fix e2e tests * better key lookup of annotations tests * add debug message for annotation tests * Fix typos * minor fix for looping * Handle update path and renaming - handle the update path to update sts and connection pooler deployment. This way no need to wait for sync - rename the parameter to downscaler_annotations - handle other review comments * another try to fix python loops * Avoid unnecessary update events * Update manifests * some final polishing * fix cluster_test after polishing Co-authored-by: Rafia Sabih <rafia.sabih@zalando.de> Co-authored-by: Felix Kunde <felix-kunde@gmx.de>	2020-05-04 14:46:56 +02:00
Dmitry Dolgov	a1f2bd05b9	Prevent superuser from being a connection pool user (#906 ) * Protected and system users can't be a connection pool user It's not supported, nor is it a best practice. Also fix potential null pointer access. For protected users it makes sense by intent of protecting these users (e.g. from being overridden or used as something else than supposed). For system users the reason is the same as for superuser, it's about replication user and it's under patroni control. This is implemented on both levels, operator config and postgresql manifest. For the latter we just use default name in this case, assuming that operator config is always correct. For the former, since it's a serious misconfiguration, operator panics.	2020-04-09 09:21:45 +02:00
Felix Kunde	b43b22dfcc	Call me pooler, not pool (#883 ) * rename pooler parts and add example to manifest * update codegen * fix manifest and add more details to docs * reflect renaming also in e2e tests	2020-04-01 10:34:03 +02:00
Felix Kunde	66f2cda87f	Move operator to go 1.14 (#882 ) * update go modules march 2020 * update to GO 1.14 * reflect k8s client API changes	2020-03-30 15:50:17 +02:00
Dmitry Dolgov	9dfa433363	Connection pooler (#799 ) Connection pooler support Add support for a connection pooler. The idea is to make it generic enough to be able to switch between different implementations (e.g. pgbouncer or odyssey). Operator needs to create a deployment with pooler and a service for it to access. For connection pool to work properly, a database needs to be prepared by operator, namely a separate user has to be created with access to an installed lookup function (to fetch credentials for other users). This setup is supposed to be used only by robot/application users. Usually a connection pool implementation is more CPU bound, so it makes sense to create several pods for connection pool with more emphasis on CPU resources. At the moment there are no special affinity or tolerations assigned to bring those pods closer to the database. For availability purposes minimal number of connection pool pods is 2, ideally they have to be distributed between different nodes/AZ, but it's not enforced in the operator itself. Available configuration is supposed to be ergonomic and in the normal case require minimum changes to a manifest to enable connection pool. To have more control over the configuration and functionality on the pool side one can customize the corresponding docker image. Co-authored-by: Felix Kunde <felix-kunde@gmx.de>	2020-03-25 12:57:26 +01:00
Felix Kunde	3b10dc645d	patch/update services on type change (#824 ) * use Update when disabling LoadBalancer + added e2e test	2020-02-13 16:24:15 +01:00
Felix Kunde	107334fe71	Add global option to enable/disable init containers and sidecars (#478 ) * Add global option to enable/disable init containers and sidecars * update dependencies	2019-12-10 15:45:54 +01:00
Felix Kunde	2ce602fcd7	fix errors when changing service type (#716 ) * fix errors when changing service type * nullify service and endpoint before recreation * improve wait for delete logic and reuse config parameters	2019-11-26 10:28:32 +01:00
Eric	6e682fd6b5	Fixing spelling mistake in delete PVC function name (#691 )	2019-10-18 16:41:56 +02:00
Felix Kunde	f0e29060b1	move StatefulSet to apps/v1 (#675 )	2019-09-30 16:42:04 +02:00
Rafia Sabih	2886027516	Some typos/spelling mistakes fix (#580 ) Harmless typos fix.	2019-06-06 14:20:15 +02:00
Sergey Dudoladov	f3e1e80aaf	Add logical backup (#442 ) * Add k8s cron job to spawn logical backups * Minor doc updates	2019-05-16 15:52:01 +02:00
Felix Kunde	31e568157b	reflect change in github url (#496 ) Project was moved from the incubator to the Zalando main org, hence the rename	2019-02-25 11:26:55 +01:00
Maxim Ivanov	8330905ce7	Don't panic if Service for the role was not found (#451 )	2019-01-18 13:38:47 +01:00
zerg-junior	7907f95d2f	Improve reporting about rolling updates (#391 )	2018-09-24 11:57:43 +02:00
Oleksii Kliukin	e933908084	Configure pg_hba in the local postgresql configuration of Patroni. (#361 ) Previously, the operator put pg_hba into the bootstrap/pg_hba key of Patroni. That had 2 adverse effects: - pg_hba.conf was shadowed by Spilo default section in the local postgresql configuration - when updating pg_hba in the cluster manifest, the updated lines were not propagated to DCS, since the key was defined in the bootstrap section of Patroni. Include some minor refactoring, moving methods to unexported when possible and commenting out usage of md5, so that gosec won't complain. Per https://github.com/zalando-incubator/postgres-operator/issues/330 Review by @zerg-junior	2018-08-08 11:01:26 +02:00
Oleksii Kliukin	b06186eb41	Linter-induced code refactoring, run round 2. (#360 ) Run more linters in the gometalinter, i.e. deadcode, megacheck, nakedret, dup. More consistent code formatting, remove two dead functions, eliminate a bunch of naked returns, refactor a few functions to avoid code duplication.	2018-08-06 12:09:19 +02:00
Oleksii Kliukin	59f0c5551e	Allow configuring pod priority globally and per cluster. (#353 ) * Allow configuring pod priority globally and per cluster. Allow to specify pod priority class for all pods managed by the operator, as well as for those belonging to individual clusters. Controlled by the pod_priority_class_name operator configuration parameter and the podPriorityClassName manifest option. See https://kubernetes.io/docs/concepts/configuration/pod-priority-preemption/#priorityclass for the explanation on how to define priority classes since Kubernetes 1.8. Some import order changes are due to go fmt. Removal of OrphanDependents deprecated field. Code review by @zerg-junior	2018-08-03 14:03:37 +02:00
Oleksii Kliukin	ac7b132314	Refactoring inspired by gometalinter. (#357 ) Among other things, fix a few issues with deepcopy implementation.	2018-08-03 11:09:45 +02:00
Oleksii Kliukin	d2d3f21dc2	Client go upgrade v6 (#352 ) There are shortcuts in this code, i.e. we created the deepcopy function by using the deepcopy package instead of the generated code, that will be addressed once migrated to client-go v8. Also, some objects, particularly statefulsets, are still taken from v1beta, this will also be addressed in further commits once the changes are stabilized.	2018-08-01 11:08:01 +02:00
Oleksii Kliukin	48a5744314	Use Patroni API to set bootstrap-only options. (#299 ) Call Patroni API /config in order to set special options that are ignored when set in the configuration file, such as max_connections. Per https://github.com/zalando-incubator/postgres-operator/issues/297 * Some minor refactoring: Rename Cluster ManualFailover to Switchover Rename Patroni Failover to Switchover Add more details to error messages and comments introduced in this PR. Review by @zerg-junior	2018-05-29 12:35:25 +02:00
Oleksii Kliukin	11d568bf65	Address code review by @zerg-junior - new info messages, rename the annotation flag.	2018-05-15 16:50:03 +02:00
Oleksii Kliukin	0c616a802f	Merge branch 'master' into rolling_updates_with_statefulset_annotations # Conflicts: # pkg/cluster/k8sres.go	2018-05-15 15:33:34 +02:00
Oleksii Kliukin	987b43456b	Deprecate old LB options, fix endpoint sync. (#287 ) * Deprecate old LB options, fix endpoint sync. - deprecate useLoadBalancer, replicaLoadBalancer from the manifest and enable_load_balancer from the operator configuration. The old operator configuration options become no-op with this commit. For the old manifest options, `useLoadBalancer` and `replicaLoadBalancer` are still consulted, but only in the absence of the new ones (enableMasterLoadBalancer and enableReplicaLoadBalancer). - Make sure the endpoint being created during the sync receives proper addresses subset. This is more critical for the replicas, as for the masters Patroni will normally re-create the endpoint before the operator. - Avoid creating the replica endpoint, since it will be created automatically by the corresponding service. - Update the README and unit tests. Code review by @mgomezch and @zerg-junior	2018-05-15 15:19:18 +02:00
Oleksii Kliukin	332dab5237	Merge branch 'rolling_updates_with_statefulset_annotations' of github.com:zalando-incubator/postgres-operator into rolling_updates_with_statefulset_annotations	2018-05-08 14:51:10 +02:00
Oleksii Kliukin	f41a42f922	Merge branch 'rolling_updates_with_statefulset_annotations' of github.com:zalando-incubator/postgres-operator into rolling_updates_with_statefulset_annotations	2018-05-07 10:16:30 +02:00
Oleksii Kliukin	ce0d4af91c	Initial implementation for the statefulset annotations indicating rolling updates.	2018-05-07 08:07:37 +02:00
Oleksii Kliukin	1a20362c5b	Initial implementation for the statefulset annotations indicating rolling updates.	2018-05-04 18:59:23 +02:00
Dmitry Dolgov	bf4b0f0f33	Merge pull request #240 from zalando-incubator/feature/goreport-improvements Some improvements for golint, ineffassign and misspell	2018-02-22 11:31:08 +01:00
Dmitrii Dolgov	a7cd859919	Some improvements for golint, ineffassign and misspell	2018-02-19 17:46:31 +01:00
Sergey Dudoladov	f194a2ae5a	Introduce changes from the PR #200 by @alexeyklyukin	2018-02-07 14:02:32 +01:00
Sergey Dudoladov	ea84f9d577	Rename the configmap 'namespace' entry to avoid confusion with the map's own namespace	2018-02-06 15:09:00 +01:00
Oleksii Kliukin	23011bdf9a	Migrate only master pods. Migrate single masters. (#199 ) Avoid migrating replica pods, since they will be handled by the node draining anyway (the PDB specifies that only masters are to be kept). Allow migration of the single-pod clusters.	2018-01-09 11:55:11 +01:00
Oleksii Kliukin	da0de8cff7	Make sure the statefulset that is deleted manually gets re-created. (#191 ) * Make sure the statefulset that is deleted manually gets re-created. Per report and analysis by Manuel Gomez. * Move the existence checks for other objects out of the Create functions. create{Object} for services, endpoints and PDBs refused to continue if there is a cached definition in the cluster, however, the only place where it makes sense is when creating a new cluster. Note that contrary to the statefulset this doesn't fix any issues, since those definitions were nullified correspondingly when the sync code detected there is no object present in the Kubernetes cluster.	2017-12-21 15:20:43 +01:00
Oleksii Kliukin	87bc47d8d0	Fixes for the case of re-creating the cluster after deletion. - make sure that the secrets for the system users (superuser, replication) are not deleted when the main cluster is. Therefore, we can re-create the cluster, potentially forcing Patroni to restore it from the backup and enable Patroni to connect, since it will use the old password, not the newly generated random one. - when syncing users, always check whether they are already in the DB. Previously, we did this only for the sync cluster case, but the new cluster could be actually the one restored from the backup by Patroni, having all or some of the users already in place. - delete endpoints last. Patroni uses the $clustername endpoint in order to store the leader related metadata. If we remove it before removing all pods, one of those pods running Patroni will re-create it and the next attempt to create the cluster with the same name will stumble on the existing endpoint. - Use db.Exec instead of db.Query for queries that expect no result. This also fixes the issue with the DB creation, since we didn't release an empty Row object it was not possible to create more than one database for a cluster.	2017-12-13 16:49:00 +01:00

100 Commits