Commit Graph

48 Commits

Author SHA1 Message Date
Polina Bungina 47efca33c9
Improve inherited annotations (#2657)
* Annotate PVC on Sync/Update, not only change PVC template
* Don't rotate pods when only annotations changed
* Annotate Logical Backup's and Pooler's pods
* Annotate PDB, Endpoints created by the Operator, Secrets, Logical Backup jobs

Inherited annotations are only added/updated, not removed
2024-06-26 13:10:37 +02:00
Felix Kunde 83878fe447
make bucket prefix for logical backup configurable (#2609)
* make bucket prefix for logical backup configurable
* include container comparison in logical backup diff
* add unit test and update description for compareContainers
* don't rely on users putting / in the config - reflect other comments from review
2024-04-23 14:24:04 +02:00
Felix Kunde 29ea863faf
allow empty resources when defaults are empty (#2524)
* allow empty resources when defaults are empty
* update codegen
* add more unit tests and remove internal resources defaults
* a unit test for min limit and raising to request
* uncomment defaults in example configmap
* simplifying pooler pod generation unit test
2024-02-09 07:35:53 +01:00
Felix Kunde 4a0c483514
add unit test and documentation for finalizers (#2509)
* add unit test and documentation for finalizers
* error msg with lower case and cover sync case
* try to avoid adding json-patch dependency
* use Update to remove finalizer
* changing status and finalizer during create
* do not call Delete() twice
2024-01-22 12:13:40 +01:00
Felix Kunde 39fcf2e6b9
remove Users section from Patroni Bootstrap (#2490) 2024-01-03 16:47:21 +01:00
Felix Kunde 9ee14f26cb
let isSystemUsername check all system users (#2489)
* let isSystemUsername check all system users
* extend robot user unit test
* reset system users for initSystemUser test
2023-12-08 15:21:56 +01:00
Felix Kunde 28cd2f188a
better backwards compatibility with old DNS name format for LBs (#2171)
* better backwards compatibility with legacy DNS name format for LBs
* improve docs on DNS string
2023-01-17 10:06:11 +01:00
Owen Ou 021ab07a23
Introduce `masterServiceAnnotations` & `replicaServiceAnnotations` (#2161)
* Introduce `masterServiceAnnotations` & `replicaServiceAnnotations`

Introduce `masterServiceAnnotations` & `replicaServiceAnnotations` to the `Postgresql` CRD.
`masterServiceAnnotations` overrides `serviceAnnotations` for master role if not empty.
`replicaServiceAnnotations` overrides `serviceAnnotations` for replica role if not empty.
Existing definition of `serviceAnnotations` continue to work for backward compatibitlity when neither `masterServiceAnnotations` nor `replicaServiceAnnotations` is defined.

This closes https://github.com/zalando/postgres-operator/issues/1927

* Accumulate service annotations

First, global config, then ServiceAnnotations overriding, then MasterServiceAnnotations and ReplicaServiceAnnotations.

This addresses
https://github.com/zalando/postgres-operator/pull/2161#discussion_r1063558711.

* Update admin doc with master & replica service annotations overrides

Addressed https://github.com/zalando/postgres-operator/pull/2161#discussion_r1064744086

Co-authored-by: Felix Kunde <felix-kunde@gmx.de>
2023-01-11 13:29:16 +01:00
Felix Kunde 70f3ee8e36
skip db sync on failed initUsers during UPDATE (#2083)
* skip db sync on failed initUsers during UPDATE
* provide unit test for teams API being unavailable
* add test for 404 case
2022-10-21 17:50:14 +02:00
Dmitry Volodin a85023ff10
Cluster env variables should be reflected for StatefulSet update (#2045)
* Cluster env variables should be reflected for StatefulSet update
* Add unit test for comparing StatefulSet's
2022-10-13 13:54:58 +02:00
Felix Kunde 4c07494ac7
deprecate ClusterName field of Postgresql type and remove team from REST endpoints (#2015)
* deprecate ClusterName field of Postgresql type
* remove for teamId from operator API endpints /status /logs /history
* update dns_format_string and yaml template in UI
2022-08-29 15:00:25 +02:00
Felix Kunde 89375186b3
use old LB DNS format when teamId prefix is disabled (#2011)
* use old LB DNS format when teamId prefix is disabled
* support both old and new format in external-dns
* switch dns template from team to namespace
2022-08-25 18:29:54 +02:00
Felix Kunde ef324494a0
fetch pooler and fes_user system user only when corresponding features are used (#2009)
* fetch pooler and fes_user system user only when corresponding features are used
* cover error case in unit test
* use string formatting instead of +
2022-08-24 16:28:49 +02:00
Felix Kunde b2642fa2fc
allow in place pw rotation of system users (#1953)
* allow in place pw rotation of system users
* block postgres user from rotation
* mark pooler pods for replacement
* adding podsGetter where pooler is synced in unit tests
* move rotation code in extra function
2022-08-18 14:14:31 +02:00
Felix Kunde a77d5df158
reverse membership for additional owner roles (#1862)
* reverse membership for additional owner roles
* remove type RoleOriginSpilo
* use e2e images with cron_admin inside
* let operator resolve reversed membership
* make additional owner roles part of the sync user strategy
* add more context in the docs about additional_owner_roles
2022-04-28 11:15:40 +02:00
Felix Kunde eecd13169c
refactor spilo env var generation (#1848)
* refactor spilo env generation
* enhance docs on env vars
* add unit test for appendEnvVar
2022-04-14 11:47:33 +02:00
Jociele Padilha 483bf624ee
add test team member (#1842)
* return err if teams API fails with StatusCode other than 404
* add unit test for 404 at team members

Co-authored-by: Jociele Padilha <jociele.padilha@zalando.de>
Co-authored-by: Felix Kunde <felix-kunde@gmx.de>
2022-04-14 10:02:54 +02:00
Felix Kunde 654d22d04a
Configure annotations to be ignored in comparisons during sync (#1823)
* feat: add ignored annotations when comparing during sync

Co-authored-by: Felix Kunde <felix-kunde@gmx.de>
Co-authored-by: Moshe Immerman <moshe@flanksource.com>
2022-03-24 18:38:37 +01:00
Felix Kunde a020708ef1
fix unit test and improve stability in e2e test (#1819)
* fix unit test and improve stability in e2e test
* fix resource handling
2022-03-21 10:05:20 +01:00
Jakob Gillich f3b83c0b05
Fix empty resources spec field failing schema validation (#1589)
In Go, when a struct field is not set, it becomes a struct with
default values for all fields. These default values are included
during serialization. This causes issues with schema validation
where optional fields cannot be omitted because default values
are considered invalid.

This patch addresses this issue for `Resources` fields on several
types by using a pointer value.
2022-03-18 16:16:32 +01:00
Felix Kunde 2719d411c3
grant db owners to cron_admin (#1805)
* grant db owners to cron_admin
* allow specifiying more extra owner roles
* add unit test for InitAdditionalOwnerRoles
* add e2e test
2022-03-18 12:36:12 +01:00
Felix Kunde d032e4783e
LoadBalancer toggles for master and replica pooler pods (#1799)
* Add support for pooler load balancer

Signed-off-by: Sergey Shatunov <me@prok.pw>

* Rename to enable_master_pooler_load_balancer

Signed-off-by: Sergey Shatunov <me@prok.pw>

* target port should be intval
* enhance pooler e2e test
* add new options to crds.go

Co-authored-by: Sergey Shatunov <me@prok.pw>
2022-03-04 13:36:17 +01:00
Maksim Zhylinski fb8a6c7a68
Compare container ports in a smarter way (#1755)
* Compare ports ingoring order and considering protocol defaults

Co-authored-by: Felix Kunde <felix-kunde@gmx.de>
2022-02-28 11:35:41 +01:00
Rafia Sabih fa604027cf
Move flag to configmap (#1540)
* Move flag to configmap

Co-authored-by: Rafia Sabih <rafia.sabih@zalando.de>
Co-authored-by: Felix Kunde <felix-kunde@gmx.de>
2021-07-02 08:46:21 +02:00
Igor Yanchenko ebb3204cdd
restart instances via rest api instead of recreating pods, fixes bug with being unable to decrease some values, like max_connections (#1103)
* restart instances via rest api instead of recreating pods
* Ignore differences in bootstrap.dcs when compare SPILO_CONFIGURATION
* isBootstrapOnlyParameter is rewritten, instead of whitelist it uses blacklist
* added e2e test for max_connections decreasing
* documentation updated
* pending_restart flag added to restart api call, wait fot ttl seconds after restart
* refactoring, /restart returns error if pending_restart is set to true and patroni is not pending restart
* restart postgresql instances within pods only if pod's restart is not required
* patroni might need to restart postgresql after pods were recreated if values like max_connections decreased
* instancesRestart is not critical, try to restart pods if not successful
* cleanup

Co-authored-by: Felix Kunde <felix-kunde@gmx.de>
2021-06-14 11:00:58 +02:00
Rafia Sabih 75a9e2be38
Create cross namespace secrets (#1490)
* Create cross namespace secrets

* add test cases

* fixes

* Fixes
- include namespace in secret name only when namespace is provided
- use username.namespace as key to pgUsers only when namespace is
  provided
- avoid conflict in the role creation in db by checking namespace
  alongwith the username

* Update unit tests

* Fix test case

* Fixes

- update regular expression for usernames
- add test to allow check for valid usernames
- create pg roles with namespace (if any) appended in rolename

* add more test cases for valid usernames

* update docs

* fixes as per review comments

* update e2e

* fixes

* Add toggle to allow namespaced secrets

* update docs

* comment update

* Update e2e/tests/test_e2e.py

* few minor fixes

* fix unit tests

* fix e2e

* fix e2e attempt 2

* fix e2e

Co-authored-by: Rafia Sabih <rafia.sabih@zalando.de>
Co-authored-by: Felix Kunde <felix-kunde@gmx.de>
2021-06-11 10:35:30 +02:00
Felix Kunde 3a49b485e5
delete secrets of system users too (#974) 2020-05-14 11:34:02 +02:00
Rafia Sabih d52296c323
Propagate annotations to the StatefulSet (#932)
* Initial commit

* Corrections

- set the type of the new  configuration parameter to be array of
  strings
- propagate the annotations to statefulset at sync

* Enable regular expression matching

* Improvements

-handle rollingUpdate flag
-modularize code
-rename config parameter name

* fix merge error

* Pass annotations to connection pooler deployment

* update code-gen

* Add documentation and update manifests

* add e2e test and introduce option in configmap

* fix service annotations test

* Add unit test

* fix e2e tests

* better key lookup of annotations tests

* add debug message for annotation tests

* Fix typos

* minor fix for looping

* Handle update path and renaming

- handle the update path to update sts and connection pooler deployment.
  This way no need to wait for sync
- rename the parameter to downscaler_annotations
- handle other review comments

* another try to fix python loops

* Avoid unneccessary update events

* Update manifests

* some final polishing

* fix cluster_test after polishing

Co-authored-by: Rafia Sabih <rafia.sabih@zalando.de>
Co-authored-by: Felix Kunde <felix-kunde@gmx.de>
2020-05-04 14:46:56 +02:00
Felix Kunde d76203b3f9
Bootstrapped databases with best practice role setup (#843)
* PreparedDatabases with default role setup

* merge changes from master

* include preparedDatabases spec check when syncing databases

* create a default preparedDB if not specified

* add more default privileges for schemas

* use empty brackets block for undefined objects

* cover more default privilege scenarios and always define admin role

* add DefaultUsers flag

* support extensions and defaultUsers for preparedDatabases

* remove exact version in deployment manifest

* enable CRD validation for new field

* update generated code

* reflect code review

* fix typo in SQL command

* add documentation for preparedDatabases feature + minor changes

* some datname should stay

* add unit tests

* reflect some feedback

* init users for preparedDatabases also on update

* only change DB default privileges on creation

* add one more section in user docs

* one more sentence
2020-04-29 10:56:06 +02:00
Christian Rohmann 21b9b6fcbe
Emit K8S events to the postgresql CR as feedback to the requestor / user (#896)
* Add EventsGetter to KubeClient to enable to sending K8S events

* Add eventRecorder to the controller, initialize it and hand it down to cluster via its constructor to enable it to emit events this way

* Add first set of events which then go to the postgresql custom resource the user interacts with to provide some feedback

* Add right to "create" events to operator cluster role

* Adapt cluster tests to new function sigurature with eventRecord (via NewFakeRecorder)

* Get a proper reference before sending events to a resource

Co-authored-by: Christian Rohmann <christian.rohmann@inovex.de>
2020-04-27 08:22:07 +02:00
Dmitry Dolgov a1f2bd05b9
Prevent superuser from being a connection pool user (#906)
* Protected and system users can't be a connection pool user

It's not supported, neither it's a best practice. Also fix potential null
pointer access. For protected users it makes sense by intent of protecting this
users (e.g. from being overriden or used as something else than supposed). For
system users the reason is the same as for superuser, it's about replicastion
user and it's under patroni control.

This is implemented on both levels, operator config and postgresql manifest.
For the latter we just use default name in this case, assuming that operator
config is always correct. For the former, since it's a serious
misconfiguration, operator panics.
2020-04-09 09:21:45 +02:00
Felix Kunde b43b22dfcc
Call me pooler, not pool (#883)
* rename pooler parts and add example to manifest
* update codegen
* fix manifest and add more details to docs
* reflect renaming also in e2e tests
2020-04-01 10:34:03 +02:00
Dmitry Dolgov 9dfa433363
Connection pooler (#799)
Connection pooler support

Add support for a connection pooler. The idea is to make it generic enough to
be able to switch between different implementations (e.g. pgbouncer or
odyssey). Operator needs to create a deployment with pooler and a service for
it to access.

For connection pool to work properly, a database needs to be prepared by
operator, namely a separate user have to be created with an access to an
installed lookup function (to fetch credential for other users).

This setups is supposed to be used only by robot/application users. Usually a
connection pool implementation is more CPU bounded, so it makes sense to create
several pods for connection pool with more emphasize on cpu resources. At the
moment there are no special affinity or tolerations assigned to bring those
pods closer to the database. For availability purposes minimal number of
connection pool pods is 2, ideally they have to be distributed between
different nodes/AZ, but it's not enforced in the operator itself. Available
configuration supposed to be ergonomic and in the normal case require minimum
changes to a manifest to enable connection pool. To have more control over the
configuration and functionality on the pool side one can customize the
corresponding docker image.

Co-authored-by: Felix Kunde <felix-kunde@gmx.de>
2020-03-25 12:57:26 +01:00
Jonathan Juares Beber ba60e15d07 Add ServiceAnnotations cluster config (#803)
The [operator parameters][1] already support the
`custom_service_annotations` config.With this parameter is possible to
define custom annotations that will be used on the services created by the
operator. The `custom_service_annotations` as all the other
[operator parameters][1] are defined on the operator level and do not allow
customization on the cluster level. A cluster may require different service
annotations, as for example, set up different cloud load balancers
timeouts, different ingress annotations, and/or enable more customizable
environments.

This commit introduces a new parameter on the cluster level, called
`serviceAnnotations`, responsible for defining custom annotations just for
the services created by the operator to the specifically defined cluster.
It allows a mix of configuration between `custom_service_annotations` and
`serviceAnnotations` where the latest one will have priority. In order to
allow custom service annotations to be used on services without
LoadBalancers (as for example, service mesh services annotations) both
`custom_service_annotations` and `serviceAnnotations` are applied
independently of load-balancing configuration. For retro-compatibility
purposes, `custom_service_annotations` is still under
[Load balancer related options][2]. The two default annotations when using
LoadBalancer services, `external-dns.alpha.kubernetes.io/hostname` and
`service.beta.kubernetes.io/aws-load-balancer-connection-idle-timeout` are
still defined by the operator.
`service.beta.kubernetes.io/aws-load-balancer-connection-idle-timeout` can
be overridden by `custom_service_annotations` or `serviceAnnotations`,
allowing a more customizable environment.
`external-dns.alpha.kubernetes.io/hostname` can not be overridden once
there is no differentiation between custom service annotations for
replicas and masters.

It updates the documentation and creates the necessary unit and e2e
tests to the above-described feature too.

[1]: https://github.com/zalando/postgres-operator/blob/master/docs/reference/operator_parameters.md
[2]: https://github.com/zalando/postgres-operator/blob/master/docs/reference/operator_parameters.md#load-balancer-related-options
2020-02-10 12:03:25 +01:00
Thomas Runyon 535517cd1b Custom annotations 329 (#657)
* Add ability for custom annotations to database pods
2019-11-11 10:45:35 +01:00
Felix Kunde 0fbfbb23bb
Use /status subresource instead of plain manifest field (#534)
* turns PostgresStatus type into a struct with field PostgresClusterStatus
* setStatus patch target is now /status subresource
* unmarshalling PostgresStatus takes care of previous status field convention
* new simple bool functions status.Running(), status.Creating()
2019-05-07 12:01:45 +02:00
Felix Kunde 31e568157b reflect change in github url (#496)
Project was moved from the incubator to the Zalando main org, hence the rename
2019-02-25 11:26:55 +01:00
Noah Kantrowitz 688d252752 Some tweaks to ensure compat with newer Go. (#383) 2018-09-17 10:13:07 +02:00
Noah Kantrowitz 0b75a89920 Fix the casing of github.com/Sirupsen/logrus to match what the project itself uses. (#380)
Dep enforces this.
2018-09-06 10:26:48 +02:00
zerg-junior 25fa45fd58 [WIP] Grant 'superuser' to the members of Postgres admin teams (#371)
Added support for superuser team in addition to the admin team that owns the postgres cluster.
2018-08-30 10:51:37 +02:00
Oleksii Kliukin e1ed4b847d
Use code-generation for CRD API and deepcopy methods (#369)
Client-go provides a https://github.com/kubernetes/code-generator package in order to provide the API to work with CRDs similar to the one available for built-in types, i.e. Pods, Statefulsets and so on.

Use this package to generate deepcopy methods (required for CRDs), instead of using an external deepcopy package; we also generate APIs used to manipulate both Postgres and OperatorConfiguration CRDs, as well as informers and listers for the Postgres CRD, instead of using generic informers and CRD REST API; by using generated code we can get rid of some custom and obscure CRD-related code and use a better API.

All generated code resides in /pkg/generated, with an exception of zz_deepcopy.go in apis/acid.zalan.do/v1

Rename postgres-operator-configuration CRD to OperatorConfiguration, since the former broke naming convention in the code-generator.

Moved Postgresql, PostgresqlList, OperatorConfiguration and OperatorConfigurationList and other types used by them into

Change the type of  the Error field in the Postgresql crd to a string, so that client-go could generate a deepcopy for it.

Use generated code to set status of CRD objects as well. Right now this is done with patch, however, Kubernetes 1.11 introduces the /status subresources, allowing us to set the status with
the special updateStatus call in the future. For now, we keep the code that is compatible with earlier versions of Kubernetes.

Rename postgresql.go to database.go and status.go to logs_and_api.go to reflect the purpose of each of those files.

Update client-go dependencies.

Minor reformatting and renaming.
2018-08-15 17:22:25 +02:00
Oleksii Kliukin d2d3f21dc2 Client go upgrade v6 (#352)
There are shortcuts in this code, i.e. we created the deepcopy function
by using the deepcopy package instead of the generated code, that will
be addressed once migrated to client-go v8. Also, some objects,
particularly statefulsets, are still taken from v1beta, this will also
be addressed in further commits once the changes are stabilized.
2018-08-01 11:08:01 +02:00
Oleksii Kliukin fe47f9ebea
Improve the pod moving behavior during the Kubernetes cluster upgrade. (#281)
* Improve the pod moving behavior during the Kubernetes cluster upgrade.

Fix an issue of not waiting for at least one replica to become ready
(if the Statefulset indicates there are replicas) when moving the master
pod off the decomissioned node. Resolves the first part of #279.

Small fixes to error messages.

* Eliminate a race condition during the swithover.

When the operator initiates the failover (switchover) that fails and
then retries it for a second time it may happen that the previous
waitForPodChannel is still active. As a result, the operator subscribes
to the former master pod two times, causing a panic.

The problem was that the original code didn't bother to cancel the
waitForPodLalbel for the new master pod in the case when the failover
fails. This commit fixes it by adding a stop channel to that function.

Code review by @zerg-junior
2018-05-03 10:20:24 +02:00
Oleksii Kliukin 26db91c53e
Improve infrastructure role definitions (#208)
Enhance definitions of infrastructure roles by allowing membership in multiple roles, role options and per-role configuration to be specified in the infrastructure role configmap, which must have the same name as the infrastructure role secret. See manifests/infrastructure-roles-configmap.yaml for the examples and updated README for the description of different types of database roles supposed by the operator and their purposes.

Change the logic of merging infrastructure roles with the manifest roles when they have the same name, to return the infrastructure role unchanged instead of merging. Previously, we used to propagate flags from the manifest role to the resulting infrastructure one, as there were no way to define flags for the infrastructure role; however, this is not the case anymore.

Code review and tests by @erthalion
2018-04-04 17:21:36 +02:00
Oleksii Kliukin 2bb7e98268
update individual role secrets from infrastructure roles (#206)
* Track origin of roles.

* Propagate changes on infrastructure roles to corresponding secrets.

When the password in the infrastructure role is updated, re-generate the
secret for that role.

Previously, the password for an infrastructure role was always fetched from
the secret, making any updates to such role a no-op after the corresponding
secret had been generated.
2018-02-23 17:24:04 +01:00
Oleksii Kliukin 87bc47d8d0 Fixes for the case of re-creating the cluster after deletion.
- make sure that the secrets for the system users (superuser, replication)
  are not deleted when the main cluster is. Therefore, we can re-create
  the cluster, potentially forcing Patroni to restore it from the backup
  and enable Patroni to connect, since it will use the old password, not
  the newly generated random one.

- when syncing users, always check whether they are already in the DB.
  Previously, we did this only for the sync cluster case, but the new
  cluster could be actually the one restored from the backup by Patroni,
  having all or some of the users already in place.

 - delete endponts last. Patroni uses the $clustername endpoint in order
   to store the leader related metadata. If we remove it before removing
   all pods, one of those pods running Patroni will re-create it and the
   next attempt to create the cluster with the same name will stuble on
   the existing endpoint.

 - Use db.Exec instead of db.Query for queries that expect no result.
   This also fixes the issue with the DB creation, since we didn't
   release an empty Row object it was not possible to create more than
   one database for a cluster.
2017-12-13 16:49:00 +01:00
Oleksii Kliukin 1fb8cf7ea0
Avoid overwriting critical users. (#172)
* Avoid overwriting critical users.

Disallow defining new users either in the cluster manifest, teams
API or infrastructure roles with the names mentioned in the new
protected_role_names parameter (list of comma-separated names)

Additionally, forbid defining a user with the name matching either
super_username or replication_username, so that we don't overwrite
system roles required for correct working of the operator itself.

Also, clear PostgreSQL roles on each sync first in order to avoid using
the old definitions that are no longer present in the current manifest,
infrastructure roles secret or the teams API.
2017-12-05 14:27:12 +01:00
Oleksii Kliukin 637921cdee Tests for initHumanUsers and initinitRobotUsers.
Change the Cluster class in the process to implelement Teams API
calls and Oauth token fetches as interfaces, so that we can mock
them in the tests.
2017-12-04 10:49:25 +01:00