Commit Graph

588 Commits

Author SHA1 Message Date
Jan Mußler 085d0da6a9 Added example for fake client. 2020-11-04 23:43:57 +01:00
Jan Mußler 1e10549e43 More debug output for troubleshooting of missing sync. 2020-11-04 21:32:09 +01:00
Rafia Sabih 9de57f0182 fix unit tests and other minor changes 2020-11-04 07:56:34 +01:00
Rafia Sabih 2b44eff294 fix unit test and delete secret when required only 2020-11-03 15:57:24 +01:00
Rafia Sabih 041b9eec52 Enhance e2e test and other fixes 2020-11-03 15:36:52 +01:00
Rafia Sabih fd18e02f2a merge master 2020-10-29 13:36:07 +01:00
Jan Mussler c694a72352
Make failure in retry a warning not an error. (#1188) 2020-10-29 13:12:25 +01:00
Rafia Sabih 133f77c330
Refactor connection pooler (#1159)
Refactor connection_pooler code

- Move all the relevant code to a separate file
- Move all the related tests to a separate file
- Avoid using cluster where not required
- Simplify the logic in sync and other methods
- Cleanup of duplicated or unused code

Co-authored-by: Rafia Sabih <rafia.sabih@zalando.de>
2020-10-29 12:05:53 +01:00
Felix Kunde 9a11e85d57
disable PostgresTeam by default (#1186)
* disable PostgresTeam by default

* fix version in chart
2020-10-28 17:51:37 +01:00
Felix Kunde d658b9672e
PostgresTeam CRD for advanced team management (#1165)
* PostgresTeamCRD for advanced team management

* rework internal structure to be closer to CRD

* superusers instead of admin

* add more util functions and unit tests

* fix initHumanUsers

* check for superusers when creating normal teams

* polishing and fixes

* adding the essential missing pieces

* add documentation and update rbac

* reflect some feedback

* reflect more feedback

* fixing debug logs and raise QueueResyncPeriodTPR

* add two more flags to disable CRD and its superuser support

* fix chart

* update go modules

* move to client 1.19.3 and update codegen
2020-10-28 10:40:10 +01:00
Jan Mussler 3a86dfc8bb
End 2 End tests speedup (#1180)
* Improving end 2 end tests, especially speed of execution and error, by implementing proper eventual asserts and timeouts.
* Add documentation for running individual tests
* Fixed string encoding in Patroni state check and error case
* Printing config as multi log line entity, makes it readable and grepable on startup
* Cosmetic changes to logs. Removed quotes from diff. Move all object diffs to text diff. Enabled padding for log level.
* Mount script with tools for easy log access and watching objects.
* Set proper update strategy for Postgres operator deployment.
* Move long running test to end. Move pooler test to new functions.
* Remove quote from valid K8s identifiers.
2020-10-28 10:04:33 +01:00
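
The "eventual asserts and timeouts" mentioned in the commit above can be illustrated with a small polling helper. This is a minimal sketch in Python, not the project's actual test code; the helper name `eventually` and its parameters are assumptions made for illustration.

```python
import time


def eventually(check, timeout=5.0, interval=0.1):
    """Poll `check` until it returns a truthy value or `timeout` seconds pass.

    `check` may either return a falsy value or raise AssertionError while the
    condition is not yet met; both are treated as "try again".
    """
    deadline = time.monotonic() + timeout
    last_error = None
    while True:
        try:
            result = check()
            if result:
                return result
        except AssertionError as exc:
            last_error = exc  # remember the most recent failure for the report
        if time.monotonic() >= deadline:
            raise TimeoutError(f"condition not met within {timeout}s") from last_error
        time.sleep(interval)
```

A test would wrap a state check, for example `eventually(lambda: pod_count() == 2, timeout=60)`, where `pod_count` is a hypothetical helper, instead of sleeping for a fixed time and asserting once.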
arminfelder 7730ecfdec
fixed case where no ready label is defined but node is unschedulable (#1162)
* fixed case where no ready label is defined but node is unschedulable
2020-10-28 09:33:52 +01:00
Rafia Sabih 2d20982913 conflict resolution for master merge 2020-10-27 17:42:23 +01:00
Felix Kunde e97235aa39
update dependencies oct 2020 (#1184)
* update dependencies oct 2020

* update codegen
2020-10-27 16:59:26 +01:00
preved911 d9f5d1c9df
changed PodEnvironmentSecret location namespace (#1177)
Signed-off-by: Ildar Valiullin <preved.911@gmail.com>
2020-10-22 08:49:30 +02:00
Felix Kunde 22fa0875e2
add maxLength constraint for CRD (#1175)
* add maxLength constraint for CRD
2020-10-22 08:44:04 +02:00
刘新 a8bfe4eb87
Remove repeated initialization of Pod ServiceAccount (#1164)
Co-authored-by: xin.liu <xin.liu@woqutech.com>
2020-10-20 14:18:22 +02:00
Dmitry Dolgov 1f5d0995a5
Lookup function installation (#1171)
* Lookup function installation

Due to reusing a previous database connection without closing it, the lookup
function installation process was skipping the first database in the
list, installing twice into the postgres db instead. To prevent that, make
the internal initDbConnWithName overwrite the connection object, and return
the same object only from initDbConn, which is a sort of public interface.

Another solution would be to modify initDbConnWithName to
return a connection object and then generate one temporary connection
for each db. It sounds feasible, but after one attempt it seems to
require more changes elsewhere (init, close connections) and
doesn't bring anything significantly better to the table. If
future changes prove this wrong, do not hesitate to refactor.

Change the retry strategy to a more insistent one, namely:

* retry on the next sync even if we failed to process one database and
install pooler appliance.

* perform the whole installation unconditionally on update, since the
list of target databases could be changed.

And for the sake of making it even more robust, also log the case when
operator decides to skip installation.

Extend connection pooler e2e test with verification that all dbs have
required schema installed.
2020-10-19 16:18:58 +02:00
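
The connection-reuse problem described in the commit above can be reproduced with a small simulation. This is a hedged sketch in Python, not the operator's Go code: `FakeConn`, `Installer` and the `overwrite` flag are hypothetical names, and the model assumes that a connection to `postgres` is already open when installation starts and that each connection is dropped after its database is processed.

```python
class FakeConn:
    """Stand-in for a database connection; only remembers its target database."""

    def __init__(self, dbname):
        self.dbname = dbname


class Installer:
    def __init__(self, existing_db):
        # A connection left open by earlier work, as in the bug description.
        self.conn = FakeConn(existing_db)

    def _connect_reusing(self, dbname):
        # Buggy variant: an already open connection is kept as-is.
        if self.conn is None:
            self.conn = FakeConn(dbname)

    def _connect_overwriting(self, dbname):
        # Fixed variant: always replace the connection object.
        self.conn = FakeConn(dbname)

    def install(self, databases, overwrite):
        installed = []
        for db in databases:
            if overwrite:
                self._connect_overwriting(db)
            else:
                self._connect_reusing(db)
            installed.append(self.conn.dbname)  # "install" into the connected db
            self.conn = None  # the connection is dropped after each database
        return installed
```

With the list `["app", "postgres"]`, the reusing variant installs into `postgres` twice and never reaches `app`, while the overwriting variant visits each database once.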
Dmitry Dolgov d15f2d3392
Readiness probe (#1169)
Right now there are no readiness probes defined for the connection pooler,
which means that after a pod restart there is a short time window (between
container start and the connection pooler starting to listen on a socket)
when a service can send queries to a new pod but the connection will be
refused. The pooler container is rather lightweight and starts
listening immediately, so the time window is small, but still.

To fix this add a readiness probe for tcp socket opened by connection
pooler.
2020-10-15 10:16:42 +02:00
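
What a TCP readiness check amounts to can be sketched in a few lines of Python. This only illustrates the idea behind the probe in the commit above (try to open a TCP connection, report ready on success); it is not the probe definition used by the operator, and the function name `tcp_ready` is an assumption.

```python
import socket


def tcp_ready(host, port, timeout=1.0):
    """Return True if a TCP connection to (host, port) can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False
```

Until the pooler process listens on its port, such a check fails, so traffic would not be routed to the pod during the window described above.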
Alex Stockinger 692c721854
Introduce ENABLE_JSON_LOGGING env variable (#1158) 2020-10-08 15:32:15 +02:00
Rafia Sabih d2c410d72b Add labels 2020-10-05 12:10:02 +02:00
Rafia Sabih a6ffdbae36 Cleanup deleteConnectionPooler 2020-10-05 12:06:58 +02:00
Rafia Sabih 86e6a51fa9 Merge branch 'master' into replica-pooler 2020-10-05 11:47:47 +02:00
Felix Kunde 21475f4547
Cleanup config examples (#1151)
* post polishing for latest PRs

* update travis and go modules

* make deprecation comments in structs less confusing

* have separate pod priority classes for operator and database pods
2020-09-30 17:24:14 +02:00
Sergey Dudoladov 2a21cc4393
Compare Postgres pod priority on Sync (#1144)
* compare Postgres pod priority on Sync

Co-authored-by: Sergey Dudoladov <sergey.dudoladov@zalando.de>
2020-09-23 17:26:56 +02:00
Rafia Sabih 683cb15b0d Add and update tests 2020-09-23 17:09:36 +02:00
Rafia Sabih 1ee79887cf Add sync test 2020-09-23 11:19:01 +02:00
neelasha-09 ab95eaa6ef
Fixes #1130 (#1139)
* Fixes #1130

Co-authored-by: Felix Kunde <felix-kunde@gmx.de>
2020-09-22 17:16:05 +02:00
Rafia Sabih 033a7f4e24 Update tests 2020-09-21 22:55:10 +02:00
Rafia Sabih 2936ed0060 Refactor needConnectionPooler
Have one unified function to tell if any connection pooler is required

Add a helper function to list the roles that require connection pooler,
helps in avoiding duplication of code
2020-09-21 16:40:40 +02:00
Rico Berger d09e418b56
Set user and group in security context (#1083)
* Set user and group in security context
2020-09-15 13:27:59 +02:00
Igor Yanchenko d8884a4003
Allow to overwrite default ExternalTrafficPolicy for the service (#1136)
* Allow to overwrite default ExternalTrafficPolicy for the service
2020-09-15 13:19:22 +02:00
Rafia Sabih 7c9b459919 Minor fix 2020-09-11 10:22:29 +02:00
Rafia Sabih 46ff4bb738 Resolve review comments 2020-09-11 10:17:43 +02:00
Rafia Sabih 770fc1e612 Improvements in tests
- Fixed the issue with failing test cases
- Add more test cases for replica connection pooler
- Added docs about the new flag
2020-09-08 17:32:47 +02:00
Rafia Sabih b3dbac5b81 Adding test cases and other changes
- Refactor needConnectionPooler for master and replica separately
- Improve sync function
- Add test cases to create, delete and sync with replica connection
  pooler

Other changes
2020-09-04 17:32:35 +02:00
Rafia Sabih 374dd00538 Fix sync 2020-09-04 08:02:27 +02:00
Rafia Sabih 1814342dc3 Assorted changes
- Update deleteConnectionPooler to include role
- Rename EnableMasterConnectionPooler back to original name for backward
  compatibility
- other minor changes and code improvements
2020-09-03 16:01:25 +02:00
Rafia Sabih 503082cf1a fix for labels selector 2020-09-03 09:32:48 +02:00
Rafia Sabih a9248b1379 Fix labels for the replica pods 2020-09-02 17:17:56 +02:00
Rafia Sabih fb49376085 Enable connection pooler for replica
- Refactor code for connection pooler deployment and services
- Refactor sync code for connection pooler
- Rename EnableConnectionPooler to EnableMasterConnectionPooler
- Update yamls and tests
2020-09-02 13:46:36 +02:00
hlihhovac e03e9f919a
add missing omitempty directive to the attributes of PostgresSpec (#1128)
Co-authored-by: Pavlo Golub <pavlo.golub@gmail.com>
2020-08-31 12:28:52 +02:00
Rafia Sabih 83ddd5c85b Add new pooler service for replica 2020-08-28 14:55:49 +02:00
Rafia Sabih 3a906aba93 Add pooler for replica 2020-08-28 13:25:46 +02:00
Felix Kunde 3ddc56e5b9
allow delete only if annotations meet configured criteria (#1069)
* define annotations for delete protection

* change log level and reduce log lines for e2e tests

* reduce wait_for_pod_start even further
2020-08-13 16:36:22 +02:00
Felix Kunde dfd0dd90ed
set search_path for default roles (#1065)
* set search_path for default roles

* deployment back to 1.5.0

Co-authored-by: Felix Kunde <felix.kunde@zalando.de>
2020-08-11 10:42:31 +02:00
Felix Kunde 0508266219
Remove all secrets on delete incl. pooler (#1091)
* fix syncSecrets and remove pooler secret

* update log for deleteSecret

* use c.credentialSecretName(username)

* minor fix
2020-08-10 18:26:26 +02:00
Felix Kunde 43163cf83b
allow using both infrastructure_roles_options (#1090)
* allow using both infrastructure_roles_options

* new default values for user and role definition

* use robot_zmon as parent role

* add operator log to debug

* right name for old secret

* only extract if rolesDefs is empty

* set password1 in old infrastructure role

* fix new infra role secret

* choose different role key for new secret

* set memberof everywhere

* reenable all tests

* reflect feedback

* remove condition for rolesDefs
2020-08-10 15:08:03 +02:00
Dmitry Dolgov 7cf2fae6df
[WIP] Extend infrastructure roles handling (#1064)
Extend infrastructure roles handling

Postgres Operator uses infrastructure roles to provide access to a database for
external users e.g. for monitoring purposes. Such infrastructure roles are
expected to be present in the form of k8s secrets with the following content:

    inrole1: some_encrypted_role
    password1: some_encrypted_password
    user1: some_encrypted_name

    inrole2: some_encrypted_role
    password2: some_encrypted_password
    user2: some_encrypted_name

The format of this content is implicit and not flexible enough. If we cannot
change the format of a secret we want to use in the Operator, we need to
recreate it in this format.

To address this, let's make the format of the secret content explicit. The idea
is to introduce a new configuration option for the Operator.

    infrastructure_roles_secrets:
    - secretname: k8s_secret_name
      userkey: some_encrypted_name
      passwordkey: some_encrypted_password
      rolekey: some_encrypted_role

    - secretname: k8s_secret_name
      userkey: some_encrypted_name
      passwordkey: some_encrypted_password
      rolekey: some_encrypted_role

This would allow the Operator to use any available secrets to prepare
infrastructure roles. To make it backward compatible, simulate the old
behaviour if the new option is not present.

The new configuration option is intended to be used mainly from the CRD, but
it's also available via the Operator ConfigMap in a limited fashion. For the
ConfigMap one can put there only a string with one secret definition in the
following format (as a string):

    infrastructure_roles_secrets: |
        secretname: k8s_secret_name,
        userkey: some_encrypted_name,
        passwordkey: some_encrypted_password,
        rolekey: some_encrypted_role

Note that only one secret can be specified this way; multiple secrets are not
allowed.

Eventually the resulting list of infrastructure roles would be a total sum of
all supported ways to describe it, namely legacy via
infrastructure_roles_secret_name and infrastructure_roles_secrets from both
ConfigMap and CRD.
2020-08-05 14:18:56 +02:00
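
The single-string ConfigMap format described in the commit above can be read as comma-separated `key: value` pairs. The following Python sketch shows one way to parse such a string and use the resulting keys to look up a role in secret data. It is an illustration only, not the operator's Go implementation; the function names and the shape of the returned dictionaries are assumptions.

```python
def parse_secret_definition(text):
    """Parse comma-separated 'key: value' pairs into a dict."""
    definition = {}
    for part in text.split(","):
        part = part.strip()
        if not part:
            continue
        key, sep, value = part.partition(":")
        if not sep:
            raise ValueError(f"expected 'key: value', got {part!r}")
        definition[key.strip()] = value.strip()
    return definition


def extract_role(definition, secret_data):
    """Look up user, password and role in secret data via the configured keys."""
    return {
        "user": secret_data[definition["userkey"]],
        "password": secret_data[definition["passwordkey"]],
        # Treating the role key as optional is an assumption of this sketch.
        "member_of": secret_data.get(definition.get("rolekey", ""), ""),
    }
```

Given the example string from the commit message, `parse_secret_definition` yields the four keys `secretname`, `userkey`, `passwordkey` and `rolekey`, and `extract_role` then reads the matching entries from the (already decoded) secret data.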
Felix Kunde f3ddce81d5
fix random order for pod environment tests (#1085) 2020-07-30 17:48:15 +02:00