postgres-operator

Commit Graph

Author	SHA1	Message	Date
Jan Mussler	3e275d122a	Allow individual teams to do auto upgrade via operator. (#1699 ) * Allow whitelisting of teams to do auto upgrade via operator. Co-authored-by: Felix Kunde <felix-kunde@gmx.de>	2021-11-29 12:47:18 +01:00
Rafia Sabih	e98439e5b6	Add log messages for usernames (#1692 ) * add log messages for usernames * document behavior better in logs Co-authored-by: Felix Kunde <felix-kunde@gmx.de>	2021-11-18 09:55:33 +01:00
Felix Kunde	1eafd688d0	restart master first in some edge cases (#1655 ) * restart master first in some edge cases * edge case is when desired is lower than effective * wait after config patch and restart on sync whenever we see pending_restart * convert options to int to check decrease and add unit test * minor update to e2e tests * wait only after restart not every sync * using spilo 14 e2e images	2021-10-26 16:43:19 +02:00
Felix Kunde	2a33bf3313	improve Patroni config sync (#1635 ) * improve Patroni config sync * collect new and updated slots to patch patroni * refactor httpGet in Patroni and extend unit tests * GetMemberData should call the patroni endpoint * add PATCH test	2021-10-13 17:17:26 +02:00
Felix Kunde	ab25fb29b7	make Postgres 14 available (#1636 ) * make Postgres 14 available * don't be too hard to 9.5 * bump Spilo image and more docs updates * update e2e test upgrading to 14	2021-10-12 12:00:59 +02:00
Jan Mussler	d0d7a32d52	Clearing up error on resize failure message. (#1641 ) * Clearing up error message.	2021-10-08 17:11:21 +02:00
Felix Kunde	62ed7e470f	improve pooler sync (#1593 ) * remove role from installLookupFunction and run it on database sync, too * fix condition to decide on syncing pooler * trigger lookup from database sync only if pooler is set * use empty spec everywhere and do not sync if one lookupfunction was passed * do not sync pooler after being disabled	2021-08-27 12:41:37 +02:00
Aaron Peschel	1dd0cd9691	Add Support for Azure WAL-G Backups (#1537 ) This commit adds support for using an Azure storage account as a backup location. It uses the existing GCS functionality as a reference for what to do, and follows the example set by GCS as closely as possible. The decision to name the cloud provider key "aws_or_gcp" is unfortunate while adding support for Azure, but I have left it alone to allow for this changeset to be backwards compatible.	2021-08-26 14:59:03 +02:00
John Rood	2d2ce6197b	Add volume selector (#1385 ) * Add volume selector * Add slightly better documentation and gofmt changes * Update generated deepcopy * Add test for PV selector Co-authored-by: John Rood <j.rood@picturae.com>	2021-08-26 14:57:54 +02:00
Quan Hoang	1b3366e9f4	Support affinity in connection pooler deployments (#1464 )	2021-08-24 15:25:03 +02:00
Felix Kunde	f0815fc5bd	remove debug log of Spilo env vars (#1591 )	2021-08-23 15:44:34 +02:00
Felix Kunde	282b6d2863	allow secrets of default users in a different namespace (#1581 ) * allow secrets of default users in a different namespace * add warning in case secretNamespace is ignored	2021-08-18 16:00:26 +02:00
Felix Kunde	66620d5049	refactor restarting instances (#1535 ) * refactor restarting instances and reduce listPods calls * only add parameters to set if it differs from effective config * update e2e test for updating Postgres config * patch config only once	2021-08-09 16:23:41 +02:00
Felix Kunde	58bab073da	fix searching for users with namespace in name (#1569 ) * fix searching for users with namespace in name and improve e2e test * remove reformatting username to query	2021-07-27 09:46:55 +02:00
Rafia Sabih	fa604027cf	Move flag to configmap (#1540 ) * Move flag to configmap Co-authored-by: Rafia Sabih <rafia.sabih@zalando.de> Co-authored-by: Felix Kunde <felix-kunde@gmx.de>	2021-07-02 08:46:21 +02:00
Jan Mussler	330c2c4c0b	Do not modify if values are below gp3 minimum throughput. (#1543 ) * Do not modify if values are below gp3 minimum throughput.	2021-06-30 15:01:55 +02:00
Felix Kunde	54e506c00b	define default access privileges for default users too (#1512 ) * define default access privileges for default users too * extend docs on defaultUsers	2021-06-22 16:45:28 +02:00
Sergey Dudoladov	53fb540c35	Add basic retry around switchover (#1510 ) * add basic retry around switchover Co-authored-by: Sergey Dudoladov <sergey.dudoladov@gmail.com>	2021-06-17 08:48:26 +02:00
Igor Yanchenko	ebb3204cdd	restart instances via rest api instead of recreating pods, fixes bug with being unable to decrease some values, like max_connections (#1103 ) * restart instances via rest api instead of recreating pods * Ignore differences in bootstrap.dcs when comparing SPILO_CONFIGURATION * isBootstrapOnlyParameter is rewritten, instead of whitelist it uses blacklist * added e2e test for max_connections decreasing * documentation updated * pending_restart flag added to restart api call, wait for ttl seconds after restart * refactoring, /restart returns error if pending_restart is set to true and patroni is not pending restart * restart postgresql instances within pods only if pod's restart is not required * patroni might need to restart postgresql after pods were recreated if values like max_connections decreased * instancesRestart is not critical, try to restart pods if not successful * cleanup Co-authored-by: Felix Kunde <felix-kunde@gmx.de>	2021-06-14 11:00:58 +02:00
Rafia Sabih	75a9e2be38	Create cross namespace secrets (#1490 ) * Create cross namespace secrets * add test cases * fixes * Fixes - include namespace in secret name only when namespace is provided - use username.namespace as key to pgUsers only when namespace is provided - avoid conflict in the role creation in db by checking namespace along with the username * Update unit tests * Fix test case * Fixes - update regular expression for usernames - add test to allow check for valid usernames - create pg roles with namespace (if any) appended in rolename * add more test cases for valid usernames * update docs * fixes as per review comments * update e2e * fixes * Add toggle to allow namespaced secrets * update docs * comment update * Update e2e/tests/test_e2e.py * few minor fixes * fix unit tests * fix e2e * fix e2e attempt 2 * fix e2e Co-authored-by: Rafia Sabih <rafia.sabih@zalando.de> Co-authored-by: Felix Kunde <felix-kunde@gmx.de>	2021-06-11 10:35:30 +02:00
Felix Kunde	dd9c3907b7	pick first container if postgres is not found (#1505 ) * pick first container if postgres is not found * minor change	2021-05-28 11:44:10 +02:00
Felix Kunde	7884af2d59	get postgres container by name, not index (#1504 )	2021-05-27 18:56:58 +02:00
Felix Kunde	48cdca645d	rework additional volume test (#1502 )	2021-05-27 18:37:30 +02:00
Quan Hoang	af5378eea5	Mount additional volumes to 'postgres' container when 'targetContainers' is an empty list (#1475 ) * Mount additional volumes to 'postgres' container when 'targetContainers' is an empty list Co-authored-by: Felix Kunde <felix-kunde@gmx.de>	2021-05-27 14:56:14 +02:00
Felix Kunde	eeb59c5bfd	Rename roles that are removed from PostgresTeam CRD (#1457 ) * rename db roles that are removed from manifests * extend PostgresTeam e2e test * make suffix configurable and add deprecated field to pgUser struct * deny LOGIN from deprecated roles * update feature documentation	2021-05-21 15:49:39 +02:00
Quan Hoang	18e2efe4e3	Update sts when modifying additionalVolumes (#1474 )	2021-05-10 12:16:47 +02:00
Felix Kunde	f0f7f25d30	Fix go lint errors (#1468 ) * fix linter errors * fix linter errors in kubectl plugin * update PyYAML dependency in e2e tests * declare a testVolume in volume_test	2021-05-10 11:48:03 +02:00
Felix Kunde	32e6c135b9	replace statefulset on annotation diff (#1449 ) * replace statefulset on annotation diff * remove update annotation function for statefulset * add unit test for syncing annotations * add inherited annotation to unit test	2021-04-22 11:22:52 +02:00
Felix Kunde	6b73ac4282	fix pooler sync with empty cluster name (#1448 )	2021-04-09 14:08:28 +02:00
Felix Kunde	c18241f187	Bump v1.6.2 (#1433 ) * helm chart remove 1.6.0 archive from 1.6.0 archive * bump operator to v1.6.2 * fix pointer deref * skip connection pooler sync when empty * revert pooler change and minor update to version msg * do not log query on error when creating or altering users	2021-04-01 11:53:07 +02:00
neelasha-09	9e93c0a4ef	Fix for AllowPrivilegeEscalation : issue-1403 (#1412 ) * Fix for AllowPrivilegeEscalation : issue-1403 * fixed syntax error * Aligned the value for parameter * Aligned the value for parameter * Update crds.go * Aligned the parameter spilo_allow_privilege_escalation * Parameters sorted in Alphabetical order in manifests yaml * Parameters sorted in Alphabetical order in manifests yaml * Update pkg/controller/operator_config.go * Update docs/reference/operator_parameters.md Co-authored-by: Neelam Sharma <neelasha@amdocs.com> Co-authored-by: Felix Kunde <felix-kunde@gmx.de>	2021-03-29 10:37:59 +02:00
machine424	78bfba85d2	create global default privileges in the appropriate prepared databases (#1421 )	2021-03-26 14:19:26 +01:00
Felix Kunde	c9acd52700	Major version upgrade config (#1386 ) * reflect new major version upgrade options everywhere * emit events during major version upgrade	2021-03-09 15:28:15 +01:00
Felix Kunde	ff8143770c	Improve rolling upgrades and rolling upgrade continue (#1341 ) * add TODOs for moving rolling update label on pods * steer rolling update via pod annotation * rename patch method and fix reading flag on sync * pass only pods to recreatePods function * do not take address of iterator if you use it later * add e2e test and pass switchover targets to recreatePods * add wait_for_pod_failover for e2e test * add one more e2e test case * helm chart remove 1.6.0 archive from 1.6.0 archive * reflect code review feedback	2021-02-26 15:38:58 +01:00
Jan Mussler	e837751ae0	Log result. (#1387 )	2021-02-26 14:54:26 +01:00
Jan Mussler	636a9a8191	Support major version upgrade via manifest and global upgrades via min version (#1372 ) Support major version upgrade trigger via manifest. There is `off` `manual` and `full`. Manual is what you expect, and full will auto upgrade clusters below a certain threshold.	2021-02-25 11:42:43 +01:00
Felix Kunde	ca968ca150	Fix empty capabilities (#1380 ) * helm chart remove 1.6.0 archive from 1.6.0 archive * empty pod capabilities should be nil	2021-02-22 17:27:32 +01:00
Felix Kunde	41858a702c	making pgTeamMap a pointer (#1349 ) * making pgTeamMap a pointer * init empty map * add e2e test for additional teams and members * update test_min_resource_limits * add more waiting in node_affinity_test * no need for pointers in map of postgresTeamMembership * another minor update on node affinity test * refactor and fix fetching additional members	2021-02-16 10:38:20 +01:00
Michael Seiwald	17da6bc649	Truncate cronjob name at 52 characters (#1208 )	2021-02-15 17:00:21 +01:00
Jan Mussler	772f0ca771	Fix volume sync order. (#1340 )	2021-02-12 17:36:11 +01:00
zvier	6aeb92f024	code optimization (#1350 ) * pre-allocate cap for slice structure * if clause is no need because of range, and kubelet also use range method to get each capability so there is no side-effect Signed-off-by: Jeff Zvier <zvier20@gmail.com>	2021-02-09 09:35:24 +01:00
Felix Kunde	0cce565b65	fix when adding only one capability (#1339 ) * fix when adding only one capability * fix error messages in unit test	2021-01-29 16:10:27 +01:00
Felix Kunde	12ad8c91fa	configurable container capabilities (#1336 ) * configurable container capabilities * revert change on TestTLS * fix e2e test * minor fix	2021-01-29 14:54:48 +01:00
Jan Mussler	43168ca622	Also sync volumes on updates. (#1330 )	2021-01-25 20:28:37 +01:00
Felix Kunde	ac2a00c45e	set allowPrivilegeEscalation for deployment templates (#1328 ) * set allowPrivilegeEscalation for deployment templates * securityContext of container, not pod * aligning * default service account for pooler	2021-01-25 18:23:29 +01:00
Jan Mussler	4a88f00a3f	Full AWS gp3 support for iops and throughput config. (#1261 ) Support new AWS EBS volume type `gp3` with `iops` and `throughput` in the manifest. Co-authored-by: Felix Kunde <felix-kunde@gmx.de>	2021-01-25 10:07:18 +01:00
Felix Kunde	4ea0b5f432	set AllowPrivilegeEscalation on container securityContext (#1326 )	2021-01-22 14:06:19 +01:00
Rafia Sabih	a9b677c957	Use fake client for connection pooler (#1301 ) Connection pooler creation, deletion, and synchronization now tested using fake client API. Co-authored-by: Rafia Sabih <rafia.sabih@zalando.de>	2021-01-19 17:40:20 +01:00
Felix Kunde	258799b420	allow additional members from other teams (#1314 )	2021-01-15 15:11:02 +01:00
Rafia Sabih	e398cf8c7e	Avoid syncing when possible (#1274 ) Avoid extra syncing in case there are no changes in pooler requirements. Add pooler specific labels to pooler secrets. Add test case to check for pooler secret creation and deletion. Co-authored-by: Rafia Sabih <rafia.sabih@zalando.de>	2021-01-14 09:53:09 +01:00
Sergey Dudoladov	b7f4cde541	wrap getting Patroni state into retry (#1293 ) Retry calls to Patroni API to get cluster state Co-authored-by: Sergey Dudoladov <sergey.dudoladov@zalando.de>	2021-01-08 15:08:44 +01:00
Sergey Dudoladov	168b679506	add a prefix for the name of a logical backup job (#1287 ) * add a prefix for the name of a logical backup job Co-authored-by: Sergey Dudoladov <sergey.dudoladov@zalando.de>	2021-01-07 10:38:07 +01:00
Felix Kunde	07c4f52ede	use pointer type for nodeAffinity (#1263 ) * use pointer type for nodeAffinity	2020-12-18 12:07:59 +01:00
Jan Mussler	a63ad49ef8	Initial commit for new 1.6 release with Postgres 13 support. (#1257 ) * Initial commit for new 1.6 release with Postgres 13 support. * Updating maintainers, Go version, Codeowners. * Use lazy upgrade image that contains pg13. * fix typo for ownerReference * fix clusterrole in helm chart * reflect GCP logical backup in validation * improve PostgresTeam docs * change defaults for enable_pgversion_env_var and storage_resize_mode * explain manual part of in-place upgrade * remove gsoc docs Co-authored-by: Felix Kunde <felix-kunde@gmx.de>	2020-12-17 15:00:29 +01:00
Pavel Tumik	77252e316c	Add node affinity support (#1166 ) * Adding nodeaffinity support alongside node_readiness_label * add documentation for node affinity * add node affinity e2e test * add unit test for node affinity Co-authored-by: Steffen Pøhner Henriksen <str3sses@gmail.com> Co-authored-by: Adrian Astley <adrian.astley@activision.com>	2020-12-16 14:56:28 +01:00
Rafia Sabih	f28706e940	Sync sts at pgversion upgrade (#1256 ) When pgversion is updated to a higher major version number, sync statefulSets also. Co-authored-by: Rafia Sabih <rafia.sabih@zalando.de>	2020-12-16 13:50:24 +01:00
Pavel Tumik	fbd04896c2	Add ability to upload logical backup to gcs (#1173 ) Support logical backup provider/storage S3 and GCS equivalent	2020-12-16 10:41:08 +01:00
Felix Kunde	929075814a	diff SecurityContext of containers (#1255 ) * diff SecurityContext of containers * change log messages to use "does not" vs "doesn't"	2020-12-15 10:06:53 +01:00
Felix Kunde	83fbccac5a	new env var for backwards compatibility between spilo 12 and 13 (#1254 )	2020-12-14 18:43:53 +01:00
Jan Mussler	b88d8e34e1	Fix function name in test (#1250 ) * Fix function name in test Error was somehow introduced in last 2 PRs merged. * Update volumes_test.go	2020-12-12 00:35:27 +01:00
Felix Kunde	6a97316a69	Support inherited annotations for all major objects (#1236 ) * add comments where inherited annotations could be added * add inheritedAnnotations feature * return nil if no annotations are set * minor changes * first downscaler then inherited annotations * add unit test for inherited annotations * add pvc to test + minor changes * missing comma * fix nil map assignment * set annotations in the same order it is done in other places * replace acidClientSet with acid getters in K8s client * more fixes on clientSet vs getters * minor changes * remove endpoints from annotation test * refine unit test - but deployment and sts are still empty * fix checking sts and deployment * make annotations setter one liners * no need for len check anymore Co-authored-by: Rafia Sabih <rafia.sabih@zalando.de>	2020-12-11 16:34:01 +01:00
Jan Mussler	549f71bb49	Support EBS gp2 to gp3 migration on sync for below 1tb volumes (#1242 ) * initial commit for gp3 migration. * Default volume migration done. * Added Gomock and one test case with mock. * Dep update. * more changes for code gen. * push fake package. * Rename var. * Changes to Makefile and return value. * Make mocks phony due to overlap in foldername. * Learning as one goes. Initialize map. * Wrong toggle. * Expect modify call. * Fix mapping of ids in test. * Fix volume id. * volume ids. * Fixing test setup. Late night... * create all pvs. * Fix test case config. * store volumes and compare. * More logs. * Logging of migration action. * Ensure to log errors. * Log warning if modify failed, e.g. due to ebs volume state. * Add more output. * Skip local e2e tests. * Reflect k8s volume id in test data. Extract aws volume id from k8s value. * Finalizing ebs migration. * More logs. describe fails. * Fix non existing fields in gp2 discovery. * Remove nothing to do flag for migration. * Final commit for migration. * add new options to all places Co-authored-by: Felix Kunde <felix-kunde@gmx.de>	2020-12-11 15:52:32 +01:00
Rafia Sabih	5a6da7275f	avoid hard-coded spilo-role (#1246 ) Co-authored-by: Rafia Sabih <rafia.sabih@zalando.de>	2020-12-09 13:00:06 +01:00
Sergey Dudoladov	dc9a5b1e61	Introduce PGVERSION (#1172 ) * introduce PGVERSION Co-authored-by: Sergey Dudoladov <sergey.dudoladov@zalando.de>	2020-11-27 18:49:49 +01:00
Sergey Dudoladov	6f5751fe55	raise log level for malformed secrets (#1235 ) Co-authored-by: Sergey Dudoladov <sergey.dudoladov@zalando.de>	2020-11-27 18:47:50 +01:00
Boyan Bonev	85d1a72cd6	Add scheduler name support - [Update #990 ] (#1226 ) * Add ability to specify alternative schedulers via schedulerName. Co-authored-by: micah.coletti@gmail.com <micah.coletti@gmail.com>	2020-11-25 10:55:05 +01:00
Jan Mussler	c4ae11629b	Fix connection pooler deployment selectors (#1213 ) Stick with the existing pooler deployment selector labels to make it compatible with existing deployments. Make the use of additional labels clear and avoid where not needed. Deployment Selector and Service Selector now do not use extra labels, pod spec does.	2020-11-23 17:18:18 +01:00
Rafia Sabih	49158ecb68	Connection pooler for replica (#1127 ) * Enable connection pooler for replica * Refactor code for connection pooler - Move all the relevant code to a separate file - Move all the related tests to a separate file - Avoid using cluster where not required - Simplify the logic in sync and other methods - Cleanup of duplicated or unused code * Fix labels for the replica pods * Update deleteConnectionPooler to include role * Adding test cases and other changes - Fix unit test and delete secret when required only - Make sure we use empty fresh cluster for every test case. * enhance e2e test * Disable pooler in complete manifest as this is source for e2e too and creates unnecessary pooler setups. Co-authored-by: Rafia Sabih <rafia.sabih@zalando.de> Co-authored-by: Jan Mussler <janm81@gmail.com>	2020-11-13 14:52:21 +01:00
Felix Kunde	3fed565328	check resize mode on update events (#1194 ) * check resize mode on update events * add unit test for PVC resizing * set resize mode to pvc in charts and manifests * add test for quantityToGigabyte * just one debug line for syncing volumes * extend test and update log msg	2020-11-11 13:22:43 +01:00
Sergey Dudoladov	e779eab22f	Update e2e pipeline (#1202 ) * clean up after test_multi_namespace test * see the PR description for complete list of changes Co-authored-by: Sergey Dudoladov <sergey.dudoladov@zalando.de>	2020-11-11 10:21:46 +01:00
Pavel Tumik	db0d089e75	Fix cloning from GCS (#1176 ) * Fix clone from gcs * pass google credentials env var if using GS bucket * remove requirement for timezone as GCS returns timestamp in local time to the region it is in * Revert "remove requirement for timezone as GCS returns timestamp in local time to the region it is in" This reverts commit `ac4eb350d9`. * update GCS documentation * remove sentence about logical backups * reword pod environment configmap section * fix documentation	2020-11-03 15:05:44 +01:00
Sergey Dudoladov	4f3bb6aa8c	Remove operator checks that prevent PG major version upgrade (#1160 ) * remove checks that prevent major version upgrade Co-authored-by: Sergey Dudoladov <sergey.dudoladov@zalando.de>	2020-11-02 16:49:29 +01:00
Jan Mussler	c694a72352	Make failure in retry a warning not an error. (#1188 )	2020-10-29 13:12:25 +01:00
Felix Kunde	d658b9672e	PostgresTeam CRD for advanced team management (#1165 ) * PostgresTeamCRD for advanced team management * rework internal structure to be closer to CRD * superusers instead of admin * add more util functions and unit tests * fix initHumanUsers * check for superusers when creating normal teams * polishing and fixes * adding the essential missing pieces * add documentation and update rbac * reflect some feedback * reflect more feedback * fixing debug logs and raise QueueResyncPeriodTPR * add two more flags to disable CRD and its superuser support * fix chart * update go modules * move to client 1.19.3 and update codegen	2020-10-28 10:40:10 +01:00
Jan Mussler	3a86dfc8bb	End 2 End tests speedup (#1180 ) * Improving end 2 end tests, especially speed of execution and error, by implementing proper eventual asserts and timeouts. * Add documentation for running individual tests * Fixed String encoding in Patroni state check and error case * Printing config as multi log line entity, makes it readable and grepable on startup * Cosmetic changes to logs. Removed quotes from diff. Move all object diffs to text diff. Enabled padding for log level. * Mount script with tools for easy logaccess and watching objects. * Set proper update strategy for Postgres operator deployment. * Move long running test to end. Move pooler test to new functions. * Remove quote from valid K8s identifiers.	2020-10-28 10:04:33 +01:00
preved911	d9f5d1c9df	changed PodEnvironmentSecret location namespace (#1177 ) Signed-off-by: Ildar Valiullin <preved.911@gmail.com>	2020-10-22 08:49:30 +02:00
Dmitry Dolgov	1f5d0995a5	Lookup function installation (#1171 ) * Lookup function installation Due to reusing a previous database connection without closing it, the lookup function installation process was skipping the first database in the list, installing twice into the postgres db instead. To prevent that, make the internal initDbConnWithName overwrite the connection object, and return the same object only from initDbConn, which is sort of a public interface. Another solution would be to modify initDbConnWithName to return a connection object and then generate one temporary connection for each db. It sounds feasible, but after one attempt it seems to require a few more changes around it (init, close connections) and doesn't bring anything significantly better to the table. If future changes prove this wrong, do not hesitate to refactor. Change the retry strategy to a more insistent one, namely: * retry on the next sync even if we failed to process one database and install pooler appliance. * perform the whole installation unconditionally on update, since the list of target databases could be changed. And for the sake of making it even more robust, also log the case when the operator decides to skip installation. Extend connection pooler e2e test with verification that all dbs have the required schema installed.	2020-10-19 16:18:58 +02:00
Dmitry Dolgov	d15f2d3392	Readiness probe (#1169 ) Right now there are no readiness probes defined for the connection pooler, which means that after a pod restart there is a short time window (between the container start and the connection pooler starting to listen on a socket) when a service can send queries to a new pod, but the connection will be refused. The pooler container is rather lightweight and it starts to listen immediately, so the time window is small, but it still exists. To fix this, add a readiness probe for the tcp socket opened by the connection pooler.	2020-10-15 10:16:42 +02:00
Sergey Dudoladov	2a21cc4393	Compare Postgres pod priority on Sync (#1144 ) * compare Postgres pod priority on Sync Co-authored-by: Sergey Dudoladov <sergey.dudoladov@zalando.de>	2020-09-23 17:26:56 +02:00
neelasha-09	ab95eaa6ef	Fixes #1130 (#1139 ) * Fixes #1130 Co-authored-by: Felix Kunde <felix-kunde@gmx.de>	2020-09-22 17:16:05 +02:00
Rico Berger	d09e418b56	Set user and group in security context (#1083 ) * Set user and group in security context	2020-09-15 13:27:59 +02:00
Igor Yanchenko	d8884a4003	Allow to overwrite default ExternalTrafficPolicy for the service (#1136 ) * Allow to overwrite default ExternalTrafficPolicy for the service	2020-09-15 13:19:22 +02:00
Felix Kunde	dfd0dd90ed	set search_path for default roles (#1065 ) * set search_path for default roles * deployment back to 1.5.0 Co-authored-by: Felix Kunde <felix.kunde@zalando.de>	2020-08-11 10:42:31 +02:00
Felix Kunde	0508266219	Remove all secrets on delete incl. pooler (#1091 ) * fix syncSecrets and remove pooler secret * update log for deleteSecret * use c.credentialSecretName(username) * minor fix	2020-08-10 18:26:26 +02:00
Felix Kunde	43163cf83b	allow using both infrastructure_roles_options (#1090 ) * allow using both infrastructure_roles_options * new default values for user and role definition * use robot_zmon as parent role * add operator log to debug * right name for old secret * only extract if rolesDefs is empty * set password1 in old infrastructure role * fix new infra role secret * choose different role key for new secret * set memberof everywhere * reenable all tests * reflect feedback * remove condition for rolesDefs	2020-08-10 15:08:03 +02:00
Felix Kunde	f3ddce81d5	fix random order for pod environment tests (#1085 )	2020-07-30 17:48:15 +02:00
hlihhovac	47b11f7f89	change Clone attribute of PostgresSpec to CloneDescription (#1020 ) change Clone attribute of PostgresSpec to ConnectionPooler update go.mod from master * fix TestConnectionPoolerSynchronization() * Update pkg/apis/acid.zalan.do/v1/postgresql_type.go Co-authored-by: Felix Kunde <felix-kunde@gmx.de> Co-authored-by: Pavlo Golub <pavlo.golub@gmail.com>	2020-07-30 16:31:29 +02:00
Felix Kunde	3bee590d43	fix index in TestGenerateSpiloPodEnvVarswq (#1084 ) Co-authored-by: Felix Kunde <felix.kunde@zalando.de>	2020-07-30 13:35:37 +02:00
Christian Rohmann	ece341d516	Allow pod environment variables to also be sourced from a secret (#946 ) * Extend operator configuration to allow for a pod_environment_secret just like pod_environment_configmap * Add all keys from PodEnvironmentSecrets as ENV vars (using SecretKeyRef to protect the value) * Apply envVars from pod_environment_configmap and pod_environment_secrets before doing the global settings from the operator config. This allows them to be overridden by the user (via configmap / secret) * Add ability to use a Secret for custom pod envVars (via pod_environment_secret) to admin documentation * Add pod_environment_secret to Helm chart values.yaml * Add unit tests for PodEnvironmentConfigMap and PodEnvironmentSecret - highly inspired by @kupson and his very similar PR #481 * Added new parameter pod_environment_secret to operatorconfig CRD and configmap examples * Add pod_environment_secret to the operationconfiguration CRD Co-authored-by: Christian Rohmann <christian.rohmann@inovex.de>	2020-07-30 10:48:16 +02:00
Igor Yanchenko	002b47ec32	Use scram-sha-256 hash if postgresql parameter password_encryption set to do so. (#995 ) * Use scram-sha-256 hash if postgresql parameter password_encryption set to do so. * test fixed * Refactoring * code style	2020-07-16 14:43:57 +02:00
Felix Kunde	375963424d	delete secrets the right way (#1054 ) * delete secrets the right way * make a one function * continue deleting secrets even if one delete fails Co-authored-by: Felix Kunde <felix.kunde@zalando.de>	2020-07-10 15:07:42 +02:00
Igor Yanchenko	88735a798a	Resize volume by changing pvc size if enabled in config. (#958 ) * Try to resize pvc if resizing pv has failed * added config option to switch between storage resize strategies * changes according to requests * Update pkg/controller/operator_config.go Co-authored-by: Felix Kunde <felix-kunde@gmx.de> * enable_storage_resize documented added examples to the default configuration and helm value files * enable_storage_resize renamed to volume_resize_mode, off by default * volume_resize_mode renamed to storage_resize_mode * Update pkg/apis/acid.zalan.do/v1/crds.go * pkg/cluster/volumes.go updated * Update docs/reference/operator_parameters.md * Update manifests/postgresql-operator-default-configuration.yaml * Update pkg/controller/operator_config.go * Update pkg/util/config/config.go * Update charts/postgres-operator/values-crd.yaml * Update charts/postgres-operator/values.yaml * Update docs/reference/operator_parameters.md * added logging if no changes required Co-authored-by: Felix Kunde <felix-kunde@gmx.de>	2020-07-03 10:53:37 +02:00
Felix Kunde	0c6655a22d	skip creation later to improve visibility of errors (#1013 ) * try to emit error for missing team name in cluster name * skip creation after new cluster object * move SetStatus to k8sclient and emit event when skipping creation and rename to SetPostgresCRDStatus Co-authored-by: Felix Kunde <felix.kunde@zalando.de>	2020-06-17 13:32:16 +02:00
Felix Kunde	fa6929f028	do not block rolling updates with lazy spilo update enabled (#1012 ) * do not block rolling updates with lazy spilo update enabled * treat initContainers like Spilo image Co-authored-by: Felix Kunde <felix.kunde@zalando.de>	2020-06-11 12:23:39 +02:00
Felix Kunde	fe7ffaa112	trigger rolling update when securityContext of PodTemplate changes (#1007 ) Co-authored-by: Felix Kunde <felix.kunde@zalando.de>	2020-06-09 10:27:57 +02:00
alfredw33	2b0def5bc8	Support for GCS WAL-E backups (#620 ) * Support for WAL_GS_BUCKET and GOOGLE_APPLICATION_CREDENTIALS environment variables * Fixed merge issue but also removed all changes to support macos. * Updated test to new format * Missed macos specific changes * Added documentation and addressed comments * Update docs/administrator.md * Update docs/administrator.md * Update e2e/run.sh Co-authored-by: Felix Kunde <felix-kunde@gmx.de>	2020-06-03 17:33:48 +02:00
Steffen Pøhner Henriksen	0fa61a6ab3	Changed order of sidecar env vars (#980 ) * Changed order of sidecar env vars * Cleaned up test code	2020-05-25 16:32:33 +02:00
Felix Kunde	3a49b485e5	delete secrets of system users too (#974 )	2020-05-14 11:34:02 +02:00
Christian Rohmann	8ff7658ed3	Fix pooler delete (#960 ) deleteConnectionPooler function incorrectly checks that the delete api response is ResourceNotFound. Looks like the only consequence is a confusing log message, but obviously it's wrong. Remove negation, since having ResourceNotFound as error is the good case. Co-authored-by: Christian Rohmann <christian.rohmann@inovex.de>	2020-05-13 14:55:54 +02:00
Ask Bjørn Hansen	852f29274a	Fix typo in error message (#969 )	2020-05-12 10:05:42 +02:00
Rafia Sabih	d52296c323	Propagate annotations to the StatefulSet (#932 ) * Initial commit * Corrections - set the type of the new configuration parameter to be array of strings - propagate the annotations to statefulset at sync * Enable regular expression matching * Improvements -handle rollingUpdate flag -modularize code -rename config parameter name * fix merge error * Pass annotations to connection pooler deployment * update code-gen * Add documentation and update manifests * add e2e test and introduce option in configmap * fix service annotations test * Add unit test * fix e2e tests * better key lookup of annotations tests * add debug message for annotation tests * Fix typos * minor fix for looping * Handle update path and renaming - handle the update path to update sts and connection pooler deployment. This way no need to wait for sync - rename the parameter to downscaler_annotations - handle other review comments * another try to fix python loops * Avoid unnecessary update events * Update manifests * some final polishing * fix cluster_test after polishing Co-authored-by: Rafia Sabih <rafia.sabih@zalando.de> Co-authored-by: Felix Kunde <felix-kunde@gmx.de>	2020-05-04 14:46:56 +02:00
Felix Kunde	d76203b3f9	Bootstrapped databases with best practice role setup (#843 ) * PreparedDatabases with default role setup * merge changes from master * include preparedDatabases spec check when syncing databases * create a default preparedDB if not specified * add more default privileges for schemas * use empty brackets block for undefined objects * cover more default privilege scenarios and always define admin role * add DefaultUsers flag * support extensions and defaultUsers for preparedDatabases * remove exact version in deployment manifest * enable CRD validation for new field * update generated code * reflect code review * fix typo in SQL command * add documentation for preparedDatabases feature + minor changes * some datname should stay * add unit tests * reflect some feedback * init users for preparedDatabases also on update * only change DB default privileges on creation * add one more section in user docs * one more sentence	2020-04-29 10:56:06 +02:00
Sergey Dudoladov	cc635a02e3	Lazy upgrade of the Spilo image (#859 ) * initial implementation * describe forcing the rolling upgrade * make parameter name more descriptive * add missing pieces * address review * address review * fix bug in e2e tests * fix cluster name label in e2e test * raise test timeout * load spilo test image * use available spilo image * delete replica pod for lazy update test * fix e2e * fix e2e with a vengeance * lets wait for another 30m * print pod name in error msg * print pod name in error msg 2 * raise timeout, comment other tests * subsequent updates of config * add comma * fix e2e test * run unit tests before e2e * remove conflicting dependency * Revert "remove conflicting dependency" This reverts commit `65fc09054b`. * improve cdp build * dont run unit before e2e tests * Revert "improve cdp build" This reverts commit `e2a8fa12aa`. Co-authored-by: Sergey Dudoladov <sergey.dudoladov@zalando.de> Co-authored-by: Felix Kunde <felix-kunde@gmx.de>	2020-04-29 10:07:14 +02:00
Sergey Dudoladov	0ca30ba3d9	fix params in function call (#939 ) Co-authored-by: Sergey Dudoladov <sergey.dudoladov@zalando.de>	2020-04-28 09:31:41 +02:00
Björn Fischer	168abfe37b	Fully speced global sidecars (#890 ) * implement fully speced global sidecars * fix issue #924	2020-04-27 17:40:22 +02:00
Christian Rohmann	21b9b6fcbe	Emit K8S events to the postgresql CR as feedback to the requestor / user (#896 ) * Add EventsGetter to KubeClient to enable to sending K8S events * Add eventRecorder to the controller, initialize it and hand it down to cluster via its constructor to enable it to emit events this way * Add first set of events which then go to the postgresql custom resource the user interacts with to provide some feedback * Add right to "create" events to operator cluster role * Adapt cluster tests to new function sigurature with eventRecord (via NewFakeRecorder) * Get a proper reference before sending events to a resource Co-authored-by: Christian Rohmann <christian.rohmann@inovex.de>	2020-04-27 08:22:07 +02:00
Sergey Dudoladov	3c91bdeffa	Re-create pods only if all replicas are running (#903 ) * adds a Get call to Patroni interface to fetch state of a Patroni member * postpones re-creating pods if at least one replica is currently being created Co-authored-by: Sergey Dudoladov <sergey.dudoladov@zalando.de> Co-authored-by: Felix Kunde <felix-kunde@gmx.de>	2020-04-20 15:14:11 +02:00
ReSearchITEng	5014eebfb2	when kubernetes_use_configmaps -> skip further endpoints actions even delete (#921 ) * further compatibility with k8sUseConfigMaps - skip further endpoints related actions * Update pkg/cluster/cluster.go thanks! Co-Authored-By: Felix Kunde <felix-kunde@gmx.de> * Update pkg/cluster/cluster.go Co-Authored-By: Felix Kunde <felix-kunde@gmx.de> * Update pkg/cluster/cluster.go Co-authored-by: Felix Kunde <felix-kunde@gmx.de>	2020-04-16 16:47:59 +02:00
Dmitry Dolgov	6a689cdc1c	Prevent empty syncs (#922 ) There is a possibility to pass nil as one of the specs and an empty spec into syncConnectionPooler. In this case it will perfom a syncronization because nil != empty struct. Avoid such cases and make it testable by returning list of syncronization reasons on top together with the final error.	2020-04-16 15:14:31 +02:00
ReSearchITEng	7e8f6687eb	make tls pr798 use additionalVolumes capability from pr736 (#920 ) * make tls pr798 use additionalVolumes capability from pr736 * move the volume* sections lower * update helm chart crds and docs * fix user.md typos	2020-04-15 15:24:55 +02:00
Thierry Sallé	ea3eef45d9	Additional volumes capability (#736 ) * Allow additional Volumes to be mounted * added TargetContainers option to determine if additional volume need to be mounter or not * fixed dependencies * updated manifest additional volume example * More validation Check that there are no volume mount path clashes or "all" vs ["a", "b"] mixtures. Also change the default behaviour to mount to "postgres" container. * More documentation / example about additional volumes * Revert go.sum and go.mod from origin/master * Declare addictionalVolume specs in CRDs * fixed k8sres after rebase * resolv conflict Co-authored-by: Dmitrii Dolgov <9erthalion6@gmail.com> Co-authored-by: Thierry <thierry@malt.com>	2020-04-15 09:13:35 +02:00
Dmitry Dolgov	a1f2bd05b9	Prevent superuser from being a connection pool user (#906 ) * Protected and system users can't be a connection pool user It's not supported, neither it's a best practice. Also fix potential null pointer access. For protected users it makes sense by intent of protecting this users (e.g. from being overriden or used as something else than supposed). For system users the reason is the same as for superuser, it's about replicastion user and it's under patroni control. This is implemented on both levels, operator config and postgresql manifest. For the latter we just use default name in this case, assuming that operator config is always correct. For the former, since it's a serious misconfiguration, operator panics.	2020-04-09 09:21:45 +02:00
Leon Albers	4dee8918bd	Allow configuration of patroni's replication mode (#869 ) * Add patroni parameters for `synchronous_mode` * Update complete-postgres-manifest.yaml, removed quotation marks * Update k8sres_test.go, adjust result for `Patroni configured` * Update k8sres_test.go, adjust result for `Patroni configured` * Update complete-postgres-manifest.yaml, set synchronous mode to false in this example * Update pkg/cluster/k8sres.go Does the same but is shorter. So we fix that it if you like. Co-Authored-By: Felix Kunde <felix-kunde@gmx.de> * Update docs/reference/cluster_manifest.md Co-Authored-By: Felix Kunde <felix-kunde@gmx.de> * Add patroni's `synchronous_mode_strict` * Extend `TestGenerateSpiloConfig` with `SynchronousModeStrict` Co-authored-by: Felix Kunde <felix-kunde@gmx.de>	2020-04-06 14:27:17 +02:00
ReSearchITEng	1249626a60	kubernetes_use_configmap (#887 ) * kubernetes_use_configmap * Update manifests/postgresql-operator-default-configuration.yaml Co-Authored-By: Felix Kunde <felix-kunde@gmx.de> * Update manifests/configmap.yaml Co-Authored-By: Felix Kunde <felix-kunde@gmx.de> * Update charts/postgres-operator/values.yaml Co-Authored-By: Felix Kunde <felix-kunde@gmx.de> * go.fmt Co-authored-by: Felix Kunde <felix-kunde@gmx.de>	2020-04-02 13:20:45 +02:00
Felix Kunde	b43b22dfcc	Call me pooler, not pool (#883 ) * rename pooler parts and add example to manifest * update codegen * fix manifest and add more details to docs * reflect renaming also in e2e tests	2020-04-01 10:34:03 +02:00
Felix Kunde	e6eb10d28a	fix TestTLS (#894 )	2020-04-01 10:31:31 +02:00
ReSearchITEng	6ed1030838	TLS - add OpenShift compatibility (#885 ) * solves https://github.com/zalando/postgres-operator/pull/798#issuecomment-605201260 Co-authored-by: Felix Kunde <felix-kunde@gmx.de>	2020-04-01 09:39:54 +02:00
Felix Kunde	66f2cda87f	Move operator to go 1.14 (#882 ) * update go modules march 2020 * update to GO 1.14 * reflect k8s client API changes	2020-03-30 15:50:17 +02:00
Felix Kunde	ba9cf68650	Change type of pod environment config map to NamespacedName (#870 ) * allow PodEnvironmentConfigMap in other namespaces * update codegen * update docs and comments	2020-03-25 15:59:31 +01:00
Dmitry Dolgov	9dfa433363	Connection pooler (#799 ) Connection pooler support Add support for a connection pooler. The idea is to make it generic enough to be able to switch between different implementations (e.g. pgbouncer or odyssey). Operator needs to create a deployment with pooler and a service for it to access. For connection pool to work properly, a database needs to be prepared by operator, namely a separate user have to be created with an access to an installed lookup function (to fetch credential for other users). This setups is supposed to be used only by robot/application users. Usually a connection pool implementation is more CPU bounded, so it makes sense to create several pods for connection pool with more emphasize on cpu resources. At the moment there are no special affinity or tolerations assigned to bring those pods closer to the database. For availability purposes minimal number of connection pool pods is 2, ideally they have to be distributed between different nodes/AZ, but it's not enforced in the operator itself. Available configuration supposed to be ergonomic and in the normal case require minimum changes to a manifest to enable connection pool. To have more control over the configuration and functionality on the pool side one can customize the corresponding docker image. Co-authored-by: Felix Kunde <felix-kunde@gmx.de>	2020-03-25 12:57:26 +01:00
Felix Kunde	579f78864b	pass cluster labels as JSON to Spilo (#877 )	2020-03-25 09:59:54 +01:00
Felix Kunde	b66734a0a9	omit PgVersion diff on sync (#860 ) * use PostgresParam.PgVersion everywhere * on sync compare pgVersion with SpiloConfiguration * update getNewPgVersion and added tests	2020-03-13 11:48:19 +01:00
zimbatm	65fb2ce1a6	add support for custom TLS certificates (#798 ) * add support for custom TLS certificates	2020-03-13 11:44:38 +01:00
Felix Kunde	b997e3682f	be more permissive with standbys (#842 ) * be more permissive with standbys * reflect feedback and updated docs	2020-02-24 15:14:14 +01:00
Felix Kunde	742d7334a1	use cluster-name as default label everywhere (#782 ) * use cluster-name as default label everywhere * fix e2e test	2020-02-19 15:01:01 +01:00
Felix Kunde	702a194c41	switch to rbac/v1 (#829 ) * switch to rbac/v1	2020-02-17 11:25:07 +01:00
Felix Kunde	3b10dc645d	patch/update services on type change (#824 ) * use Update when disabling LoadBalancer + added e2e test	2020-02-13 16:24:15 +01:00
Jonathan Juares Beber	ba60e15d07	Add ServiceAnnotations cluster config (#803 ) The [operator parameters][1] already support the `custom_service_annotations` config. With this parameter is possible to define custom annotations that will be used on the services created by the operator. The `custom_service_annotations` as all the other [operator parameters][1] are defined on the operator level and do not allow customization on the cluster level. A cluster may require different service annotations, as for example, set up different cloud load balancers timeouts, different ingress annotations, and/or enable more customizable environments. This commit introduces a new parameter on the cluster level, called `serviceAnnotations`, responsible for defining custom annotations just for the services created by the operator to the specifically defined cluster. It allows a mix of configuration between `custom_service_annotations` and `serviceAnnotations` where the latest one will have priority. In order to allow custom service annotations to be used on services without LoadBalancers (as for example, service mesh services annotations) both `custom_service_annotations` and `serviceAnnotations` are applied independently of load-balancing configuration. For retro-compatibility purposes, `custom_service_annotations` is still under [Load balancer related options][2]. The two default annotations when using LoadBalancer services, `external-dns.alpha.kubernetes.io/hostname` and `service.beta.kubernetes.io/aws-load-balancer-connection-idle-timeout` are still defined by the operator. `service.beta.kubernetes.io/aws-load-balancer-connection-idle-timeout` can be overridden by `custom_service_annotations` or `serviceAnnotations`, allowing a more customizable environment. `external-dns.alpha.kubernetes.io/hostname` can not be overridden once there is no differentiation between custom service annotations for replicas and masters. It updates the documentation and creates the necessary unit and e2e tests to the above-described feature too. [1]: https://github.com/zalando/postgres-operator/blob/master/docs/reference/operator_parameters.md [2]: https://github.com/zalando/postgres-operator/blob/master/docs/reference/operator_parameters.md#load-balancer-related-options	2020-02-10 12:03:25 +01:00
Vito Botta	a660d758a5	Add region setting for logical backups to non-AWS storage (#813 ) * Add region setting for logical backups to non-AWS storage	2020-02-10 11:48:24 +01:00
Felix Kunde	1f0312a014	make minimum limits boundaries configurable (#808 ) * make minimum limits boundaries configurable * add e2e test	2020-02-03 11:43:18 +01:00
Felix Kunde	7fb163252c	standby clusters can only have 1 pod for now (#797 )	2020-01-16 10:47:34 +01:00
Felix Kunde	cd110aabf4	Enforce minimum cpu and memory limits (#731 ) * add validation for PG resources and volume size * check resource requests also on UPDATE and SYNC + update docs * if cluster was running don't error on sync	2019-12-12 16:43:55 +01:00
Felix Kunde	107334fe71	Add global option to enable/disable init containers and sidecars (#478 ) * Add global option to enable/disable init containers and sidecars * update dependencies	2019-12-10 15:45:54 +01:00
Armin Nesiren	5f87384d7f	Passing endpoint, access and secret key to logical-backup container (#628 ) * Added possibility to add custom annotations to LoadBalancer service. * Added parameters for custom endpoint, access and secret key for logical backup. * Modified dump.sh so it knows how to handle new features. Configurable S3 SSE	2019-11-26 10:40:49 +01:00
Felix Kunde	2ce602fcd7	fix errors when changing service type (#716 ) * fix errors when changing service type * nullify service and endpoint before recreation * improve wait for delete logic and reuse config parameters	2019-11-26 10:28:32 +01:00
Felix Kunde	f9487e41c1	inject cluster name label into logical backup pod (#725 ) * inject cluster name label into logical backup pod	2019-11-20 13:58:41 +01:00
Felix Kunde	0b544ae43f	pass additionalSecretMount to logical backup pod (#714 )	2019-11-19 18:06:55 +01:00
Thomas Runyon	535517cd1b	Custom annotations 329 (#657 ) * Add ability for custom annotations to database pods	2019-11-11 10:45:35 +01:00
Eric	6e682fd6b5	Fixing spelling mistake in delete PVC function name (#691 )	2019-10-18 16:41:56 +02:00
Felix Kunde	f0e29060b1	move StatefulSet to apps/v1 (#675 )	2019-09-30 16:42:04 +02:00
Felix Kunde	4a863d2280	Avoid orphaned objects on delete (#654 ) * Make setSpec function work correctly when updating cluster status fails	2019-08-27 12:54:35 +02:00
Felix Kunde	1d45a6aec3	change app label for logical backup pod (#621 ) * change app label for logical backup pod	2019-07-23 15:43:07 +02:00
Felix Kunde	2c3c7fd244	query namespaced K8s API in logical backup script (#623 )	2019-07-18 14:00:30 +02:00
Felix Kunde	3a914f9a3c	camelCasing all manifest parameters (#602 ) * deprecate snake_case manifest parameters * move backward compatible check and update test	2019-07-05 18:14:03 +02:00
Felix Kunde	36003b8264	enable shmVolume setting in OperatorConfiguration (#605 ) * enable shmVolume setting in OperatorConfiguration	2019-07-05 16:48:37 +02:00
Rafia Sabih	540d58d5bd	Adding the support for standby cluster This will set up a continuous wal streaming cluster, by adding the corresponding section in postgres manifest. Instead of having a full-fledged standby cluster as in Patroni, here we use only the wal path of the source cluster and stream from there. Since, standby cluster is streaming from the master and does not require to create or use databases of it's own. Hence, it bypasses the creation of users or databases. There is a separate sample manifest added to set up a standby-cluster.	2019-06-21 10:11:39 +02:00
Markus	93bfed3e75	Add secret mount to operator (#535 ) * add secret mount to operator	2019-06-19 12:40:49 +02:00
Felix Kunde	6918394562	Add PDB configuration toggle (#583 ) * Don't create an impossible disruption budget for smaller clusters. * sync PDB also on update	2019-06-18 10:48:21 +02:00
Maxim Ivanov	3553144cda	Support subPath in generated container (#452 ) * mounted volumes now provide a subPath	2019-06-17 15:49:01 +02:00
Erik Inge Bolsø	c65a9baedf	specify ReadOnlyRootFilesystem: false for pod security policies (#560 ) Explicitly specify ReadOnlyRootFilesystem: false so kubernetes can pick a less restrictive policy the operator has access to.	2019-06-17 14:03:33 +02:00
teuto.net Netzdienste GmbH	bbf28c4df7	Add additional S3 settings for cloning (#497 )	2019-06-14 12:28:00 +02:00
Rafia Sabih	2886027516	Some typos/spelling mistakes fix (#580 ) Harmless typos fix.	2019-06-06 14:20:15 +02:00
Aaron Miller	ec5b1d4d58	StatefulSet fsGroup config option to allow non-root spilo (#531 ) * StatefulSet fsGroup config option to allow non-root spilo * Allow Postgres CRD to overide SpiloFSGroup of the Operator. * Document FSGroup of a Pod cannot be changed after creation.	2019-06-04 16:38:26 +02:00
Erik Inge Bolsø	ebda39368e	database.go: remove hardcoded .svc.cluster.local dns suffix (#561 ) * database.go: substitute hardcoded .svc.cluster.local dns suffix with config parameter Use the pod's configured dns search path, for clusters where .svc.cluster.local is not correct.	2019-05-31 16:32:00 +02:00
Felix Kunde	24d412a562	generate spilo config can return error (with test) (#570 ) * fix: raise explicit error when failing to generate spilo config Signed-off-by: Stephane Tang <hi@stang.sh>	2019-05-22 17:35:03 +02:00
Stephane T	1f4267eb05	fix: remove headless service config when deleting cluster (#567 ) see: https://github.com/zalando/postgres-operator/issues/566 Signed-off-by: Stephane Tang <hi@stang.sh>	2019-05-21 13:49:34 +02:00
Sergey Dudoladov	f3e1e80aaf	Add logical backup (#442 ) * Add k8s cron job to spawn logical backups * Minor doc updates	2019-05-16 15:52:01 +02:00
Sergey Dudoladov	2c02b371e2	fix statefulset sync (#563 )	2019-05-14 11:15:47 +02:00
Dmitry Dolgov	f29bdaf96a	Override clone s3 bucket path (#487 ) Override clone s3 bucket path Add possibility to use a custom s3 bucket path for cloning a cluster from an arbitrary bucket (e.g. from another k8s cluster). For that a new config options is introduced `s3_wal_path`, that should point to a location that spilo would understand.	2019-05-10 12:52:42 +02:00
Felix Kunde	0fbfbb23bb	Use /status subresource instead of plain manifest field (#534 ) * turns PostgresStatus type into a struct with field PostgresClusterStatus * setStatus patch target is now /status subresource * unmarshalling PostgresStatus takes care of previous status field convention * new simple bool functions status.Running(), status.Creating()	2019-05-07 12:01:45 +02:00
Aaron Miller	15ec6a920d	Config option to allow Spilo container to run non-privileged. (#525 ) * Config option to allow Spilo container to run non-privileged. Runs non-privileged by default. Fixes #395 * add spilo_privileged to manifests/configmap.yaml * add spilo_privileged to helm chart's values.yaml	2019-04-03 17:13:39 +02:00
Stephane T	edeb06d39c	fix: update init_containers (#518 ) * fix: PATH expension in Makefile Signed-off-by: Stephane Tang <hi@stang.sh> * refact: pass list of containers to compareContainers() Signed-off-by: Stephane Tang <hi@stang.sh> * compare initContainers while comparing StatefulSet Fixes #517 Signed-off-by: Stephane Tang <hi@stang.sh> * refact: compareContainers() Signed-off-by: Stephane Tang <hi@stang.sh>	2019-03-19 17:46:12 +01:00
Sergey Dudoladov	0b53dbe5dc	Set statefulset update and management policy explicitly (#515 ) * fix logging in retry * explicitly set the stateful set update strategy to onDelete * add podManagementPolicy	2019-03-13 11:49:18 +01:00
Vineeth Reddy	db72d82f14	gofmt and golint fixes (#506 ) * fix gofmt and golint issues	2019-03-04 13:13:55 +01:00
Sergey Dudoladov	587d9091e7	Set HUMAN_ROLE Spilo env var (#409 ) * Set HUMAN_ROLE Spilo env var	2019-02-27 13:40:42 +01:00
Felix Kunde	31e568157b	reflect change in github url (#496 ) Project was moved from the incubator to the Zalando main org, hence the rename	2019-02-25 11:26:55 +01:00
teuto.net Netzdienste GmbH	26a7fdfa9f	Add Pod Anti Affinity (#489 ) * Add Pod Anti Affinity	2019-02-21 16:37:03 +01:00
Stephane T	d11b23bd71	Add inherited_labels (#459 ) * add support for inherited_labels Signed-off-by: Stephane Tang <hi@stang.sh> * update docs with inherited_labels Signed-off-by: Stephane Tang <hi@stang.sh>	2019-02-14 12:29:06 +01:00
Maxim Ivanov	ed6acc1178	Correctly report success in .status on Update (#469 )	2019-01-31 13:09:17 +01:00
Maxim Ivanov	3544cc90fa	Allow specifying init_containers in Postgres CRD (#445 ) * Add support for init_containers	2019-01-29 11:08:44 +01:00
Armin Nesiren	6f6a599c90	Added possibility to add custom annotations to LoadBalancer service. (#461 ) * Added possibility to add custom annotations to LoadBalancer service.	2019-01-25 11:35:27 +01:00
Maxim Ivanov	8330905ce7	Don't panic if Service for the role was not found (#451 )	2019-01-18 13:38:47 +01:00
Jan Mussler	c70905ae8b	Modifying some of the logging to be more descriptive. (#440 ) * Modifying some of the logging to be more descriptive.	2019-01-08 13:07:36 +01:00
zerg-junior	4b5d3cd121	Fix golint failures * Fix golint fails based on the original work from the user u5surf * Skip installing Docker as CDP now have one pre-installed (repairs builds on CDP)	2019-01-08 13:04:48 +01:00
Arve Knudsen	f7058c754d	Pass more variables to Spilo container (#437 ) Pass KUBERNETES_SCOPE_LABEL, KUBERNETES_ROLE_LABEL and KUBERNETES_LABELS to spilo container, so that they could be changed. Fix for #411	2019-01-04 13:42:52 +01:00
zerg-junior	5cfcc453a9	Update CRD configuration docs and fix the CDP build (#414 ) * Update CRD configuration docs * document resource consumption of the operator * Add talks by Oleksii	2019-01-02 12:01:47 +01:00
zerg-junior	c0b0b9a832	[WIP] Add 'admin' option to create role (#425 ) * Add 'admin' option to create role * Fix run_locally_script	2018-12-27 10:14:33 +01:00
Dmitry Dolgov	d6e6b00770	Add shm_volume option (#427 ) Add possibility to mount a tmpfs volume to /dev/shm to avoid issues like [this](https://github.com/docker-library/postgres/issues/416). To achieve that two new options were introduced: * `enableShmVolume` to PostgreSQL manifest, to specify whether or not mount this volume per database cluster * `enable_shm_volume` to operator configuration, to specify whether or not mount per operator. The first one, `enableShmVolume` takes precedence to allow us to be more flexible.	2018-12-21 16:22:30 +01:00
zerg-junior	45c89b3da4	[WIP] Add set_memory_request_to_limit option (#406 ) * Add set_memory_request_to_limit option	2018-11-15 14:00:08 +01:00
zerg-junior	96e3ea9511	Properly overwrite empty allowed source ranges for load balancers (#392 ) * Properly overwrite empty allowed source ranges for load balancers	2018-11-06 11:08:45 +01:00
zerg-junior	86ba92ad02	Rename 'permanent_slots' field to 'slots' (#401 )	2018-10-31 16:11:28 +01:00
zerg-junior	1b4181a724	[WIP] Add the ability to configure replications slots in Patroni (#398 ) * Add the ability to configure replication slots in Patroni * Add debugging to Makefile for CDP builds	2018-10-31 13:10:56 +01:00
zerg-junior	7907f95d2f	Improve reporting about rolling updates (#391 )	2018-09-24 11:57:43 +02:00
Noah Kantrowitz	688d252752	Some tweaks to ensure compat with newer Go. (#383 )	2018-09-17 10:13:07 +02:00
Noah Kantrowitz	0b75a89920	Fix the casing of github.com/Sirupsen/logrus to match what the project itself uses. (#380 ) Dep enforces this.	2018-09-06 10:26:48 +02:00
zerg-junior	25fa45fd58	[WIP] Grant 'superuser' to the members of Postgres admin teams (#371 ) Added support for superuser team in addition to the admin team that owns the postgres cluster.	2018-08-30 10:51:37 +02:00
zerg-junior	aeae0a6ef2	Use cluster's own namespace to patch the cluster manifest (#373 )	2018-08-22 11:07:12 +02:00
Oleksii Kliukin	e1ed4b847d	Use code-generation for CRD API and deepcopy methods (#369 ) Client-go provides a https://github.com/kubernetes/code-generator package in order to provide the API to work with CRDs similar to the one available for built-in types, i.e. Pods, Statefulsets and so on. Use this package to generate deepcopy methods (required for CRDs), instead of using an external deepcopy package; we also generate APIs used to manipulate both Postgres and OperatorConfiguration CRDs, as well as informers and listers for the Postgres CRD, instead of using generic informers and CRD REST API; by using generated code we can get rid of some custom and obscure CRD-related code and use a better API. All generated code resides in /pkg/generated, with an exception of zz_deepcopy.go in apis/acid.zalan.do/v1 Rename postgres-operator-configuration CRD to OperatorConfiguration, since the former broke naming convention in the code-generator. Moved Postgresql, PostgresqlList, OperatorConfiguration and OperatorConfigurationList and other types used by them into Change the type of the Error field in the Postgresql crd to a string, so that client-go could generate a deepcopy for it. Use generated code to set status of CRD objects as well. Right now this is done with patch, however, Kubernetes 1.11 introduces the /status subresources, allowing us to set the status with the special updateStatus call in the future. For now, we keep the code that is compatible with earlier versions of Kubernetes. Rename postgresql.go to database.go and status.go to logs_and_api.go to reflect the purpose of each of those files. Update client-go dependencies. Minor reformatting and renaming.	2018-08-15 17:22:25 +02:00
Oleksii Kliukin	e933908084	Configure pg_hba in the local postgresql configuration of Patroni. (#361 ) Previously, the operator put pg_hba into the bootstrap/pg_hba key of Patroni. That had 2 adverse effects: - pg_hba.conf was shadowed by Spilo default section in the local postgresql configuration - when updating pg_hba in the cluster manifest, the updated lines were not propagated to DCS, since the key was defined in the boostrap section of Patroni. Include some minor refactoring, moving methods to unexported when possible and commenting out usage of md5, so that gosec won't complain. Per https://github.com/zalando-incubator/postgres-operator/issues/330 Review by @zerg-junior	2018-08-08 11:01:26 +02:00
Oleksii Kliukin	acf46bfa62	Include CREATEROLE to the list of allowed flags. (#365 ) Previously it has been supported by the operator, but the validity check excluded it for no reason.	2018-08-08 10:53:08 +02:00
Oleksii Kliukin	b06186eb41	Linter-induced code refactoring, run round 2. (#360 ) Run more linters in the gometalinter, i.e. deadcode, megacheck, nakedret, dup. More consistent code formatting, remove two dead functions, eliminate naked a bunch of naked returns, refactor a few functions to avoid code duplication.	2018-08-06 12:09:19 +02:00
Oleksii Kliukin	59f0c5551e	Allow configuring pod priority globally and per cluster. (#353 ) * Allow configuring pod priority globally and per cluster. Allow to specify pod priority class for all pods managed by the operator, as well as for those belonging to individual clusters. Controlled by the pod_priority_class_name operator configuration parameter and the podPriorityClassName manifest option. See https://kubernetes.io/docs/concepts/configuration/pod-priority-preemption/#priorityclass for the explanation on how to define priority classes since Kubernetes 1.8. Some import order changes are due to go fmt. Removal of OrphanDependents deprecated field. Code review by @zerg-junior	2018-08-03 14:03:37 +02:00
Oleksii Kliukin	ac7b132314	Refactoring inspired by gometalinter. (#357 ) Among other things, fix a few issues with deepcopy implementation.	2018-08-03 11:09:45 +02:00
Oleksii Kliukin	d2d3f21dc2	Client go upgrade v6 (#352 ) There are shortcuts in this code, i.e. we created the deepcopy function by using the deepcopy package instead of the generated code, that will be addressed once migrated to client-go v8. Also, some objects, particularly statefulsets, are still taken from v1beta, this will also be addressed in further commits once the changes are stabilized.	2018-08-01 11:08:01 +02:00
Oleksii Kliukin	0181a1b5b1	Introduce a repair scan to fix failing clusters (#304 ) A repair is a sync scan that acts only on those clusters that indicate that the last add, update or sync operation on them has failed. It is supposed to kick in more frequently than the repair scan. The repair scan still remains to be useful to fix the consequences of external actions (i.e. someone deletes a postgres-related service by mistake) unbeknownst to the operator. The repair scan is controlled by the new repair_period parameter in the operator configuration. It has to be at least 2 times more frequent than a sync scan to have any effect (a normal sync scan will update both last synced and last repaired attributes of the controller, since repair is just a sync underneath). A repair scan could be queued for a cluster that is already being synced if the sync period exceeds the interval between repairs. In that case a repair event will be discarded once the corresponding worker finds out that the cluster is not failing anymore. Review by @zerg-junior	2018-07-24 11:21:45 +02:00
Oleksii Kliukin	1a0e5357dc	Improve generation of Scalyr container environment. (#346 ) * Improve generting of Scalyr container environment. Avoid duplicating POD_NAME and POD_NAMESPACE that already bundled every sidecar. Do not complain on the lack of SCLALYR_SERVER_HOST, since it is set to https://upload.eu.scalyr.com in the container we use. Do not mentioned SCALYR_SERVER_HOST in the error messages, since it is derived from the cluster name automatically.	2018-07-24 11:16:24 +02:00
Oleksii Kliukin	12871aad1a	Avoid showing an extra error when resizing volume fails (#350 ) Do not show 'persistent volumes are not compatible' errors for the volumes that failed to be resized because of the other reasons (i.e. the new size is smaller than the existing one).	2018-07-20 14:12:25 +02:00
zerg-junior	417f13c0bd	Submit RBAC credentials during initial Event processing (#344 ) * During initial Event processing submit the service account for pods and bind it to a cluster role that allows Patroni to successfully start. The cluster role is assumed to be created by the k8s cluster administrator.	2018-07-19 16:40:40 +02:00
Oleksii Kliukin	3a9378d3b8	Allow configuring the operator via the YAML manifest. (#326 ) * Up until now, the operator read its own configuration from the configmap. That has a number of limitations, i.e. when the configuration value is not a scalar, but a map or a list. We use a custom code based on github.com/kelseyhightower/envconfig to decode non-scalar values out of plain text keys, but that breaks when the data inside the keys contains both YAML-special elememtns (i.e. commas) and complex quotes, one good example for that is search_path inside `team_api_role_configuration`. In addition, reliance on the configmap forced a flag structure on the configuration, making it hard to write and to read (see https://github.com/zalando-incubator/postgres-operator/pull/308#issuecomment-395131778). The changes allow to supply the operator configuration in a proper YAML file. That required registering a custom CRD to support the operator configuration and provide an example at manifests/postgresql-operator-default-configuration.yaml. At the moment, both old configmap and the new CRD configuration is supported, so no compatibility issues, however, in the future I'd like to deprecate the configmap-based configuration altogether. Contrary to the configmap-based configuration, the CRD one doesn't embed defaults into the operator code, however, one can use the manifests/postgresql-operator-default-configuration.yaml as a starting point in order to build a custom configuration. Since previously `ReadyWaitInterval` and `ReadyWaitTimeout` parameters used to create the CRD were taken from the operator configuration, which is not possible if the configuration itself is stored in the CRD object, I've added the ability to specify them as environment variables `CRD_READY_WAIT_INTERVAL` and `CRD_READY_WAIT_TIMEOUT` respectively. Per review by @zerg-junior and @Jan-M.	2018-07-16 16:20:46 +02:00
Oleksii Kliukin	e90a01050c	Switchover must wait for the inner goroutine before it returns. (#343 ) * Switchover must wait for the inner goroutine before it returns. Otherwise, two corner cases may happen: - waitForPodLabel writes to the podLabelErr channel that has been already closed by the outer routine - the outer routine exists and the caller subscribes to the pod the inner goroutine has already subscribed to, resulting in panic. The previous commit `fe47f9ebea` that touched that code added the cancellation channel, but didn't bother to actually wait for the goroutine to be cancelled. Per report and review from @valer-cara. Original issue: https://github.com/zalando-incubator/postgres-operator/issues/342	2018-07-16 11:50:35 +02:00

600 Commits