postgres-operator

Commit Graph

Author	SHA1	Message	Date
Sergey Dudoladov	2aeff096f7	Make ReplicaLoadBalancer a separate toggler	2018-03-02 13:35:25 +01:00
Sergey Dudoladov	2ef069ee93	Create/delete replica service regardless of load balancer setup	2018-02-27 17:10:49 +01:00
Oleksii Kliukin	2bb7e98268	update individual role secrets from infrastructure roles (#206) * Track origin of roles. * Propagate changes on infrastructure roles to corresponding secrets. When the password in the infrastructure role is updated, re-generate the secret for that role. Previously, the password for an infrastructure role was always fetched from the secret, making any updates to such role a no-op after the corresponding secret had been generated.	2018-02-23 17:24:04 +01:00
Dmitrii Dolgov	ef50b147c5	Use list of checks instead of a map	2018-02-23 14:24:33 +01:00
Dmitrii Dolgov	95d86c7600	Move container comparison logic to a separate function	2018-02-23 11:58:37 +01:00
Oleksii Kliukin	c4aab502b3	Remove Patroni leftover objects on cluster deletion. (#244) * Remove all endpoints and configmaps from Patroni when Patroni is running with Kubernetes support on cluster deletion.	2018-02-23 09:52:22 +01:00
Oleksii Kliukin	cca73e30b7	Make code around recreating pods and creating objects in the database less brittle (#213) There used to be a masterLess flag that was supposed to indicate whether the cluster it belongs to runs without the acting master by design. At some point, as we didn't really have support for such clusters, the flag has been misused to indicate there is no master in the cluster. However, that was not done consistently (a cluster without all pods running would never be masterless, even when the master is not among the running pods) and it was based on the wrong assumption that the masterless cluster will remain masterless until the next attempt to change that flag, ignoring the possibility of master coming up or some node doing a successful promotion. Therefore, this PR gets rid of that flag completely. When the cluster is running with 0 instances, there is obviously no master and it makes no sense to create any database objects inside the non-existing master. Therefore, this PR introduces an additional check for that. recreatePods were assuming that the roles of the pods recorded when the function has started will not change; for instance, terminated replica pods should start as replicas. Revisit that assumption by looking at the actual role of the re-spawned pods; that avoids a failover if some replica has promoted to the master role while being re-spawned. In addition, if the failover from the old master was unsuccessful, we used to stop and leave the old master running on an old pod, without recording this fact anywhere. This PR makes the failover failure emit a warning, but not stop recreating the last master pod; in the worst case, the running master will be terminated, however, this case is rather unlikely. As a side effect, make waitForPodLabel return the pod definition it waited for, avoiding extra API calls in recreatePods and movePodFromEndOfLifeNode	2018-02-22 10:42:05 +01:00
Sergey Dudoladov	f194a2ae5a	Introduce changes from the PR #200 by @alexeyklyukin	2018-02-07 14:02:32 +01:00
Manuel Gómez	bf4406d2a4	Consider container names in Statefulset diffs (#210) This includes a comparison on container names being equal in the decision of whether a Statefulset has been updated.	2018-01-16 12:06:11 +01:00
Oleksii Kliukin	9720ac1f7e	WIP: Hold the proper locks while examining the list of databases. Introduce a new lock called specMu lock to protect the cluster spec. This lock is held on update and sync, and when retrieving the spec in the API code. There is no need to acquire it for cluster creation and deletion: creation assigns the spec to the cluster before linking it to the controller, and deletion just removes the cluster from the list in the controller, both holding the global clustersMu Lock.	2017-12-22 13:06:11 +01:00
Manuel Gómez	15c278d4e8	Scalyr agent sidecar for log shipping (#190) * Scalyr agent sidecar for log shipping * Remove the default for the Scalyr image Now the image needs to be specified explicitly to enable log shipping to Scalyr. This removes the problem of having to generate the config file or publish our agent image repository. * Add configuration variable for Scalyr server URL Defaults to the EU address. * Alter style Newlines are cheap and make code easier to edit/refactor, but ok. * Fix StatefulSet comparison logic I broke it when I made the comparison consider all containers in the PostgreSQL pod.	2017-12-21 15:34:26 +01:00
Oleksii Kliukin	da0de8cff7	Make sure the statefulset that is deleted manually gets re-created. (#191) * Make sure the statefulset that is deleted manually gets re-created. Per report and analysis by Manuel Gomez. * Move the existence checks for other objects out of the Create functions. create{Object} for services, endpoints and PDBs refused to continue if there is a cached definition in the cluster, however, the only place where it makes sense is when creating a new cluster. Note that contrary to the statefulset this doesn't fix any issues, since those definitions were nullified correspondingly when the sync code detected there is no object present in the Kubernetes cluster.	2017-12-21 15:20:43 +01:00
Oleksii Kliukin	1c5451cd7d	Spelling fix.	2017-12-14 14:39:33 +01:00
Oleksii Kliukin	55dc12e512	Examine custom environment sources when syncing. When comparing statefulsets, make sure EnvFrom fields are compared as well.	2017-12-14 14:39:33 +01:00
Oleksii Kliukin	87bc47d8d0	Fixes for the case of re-creating the cluster after deletion. - make sure that the secrets for the system users (superuser, replication) are not deleted when the main cluster is. Therefore, we can re-create the cluster, potentially forcing Patroni to restore it from the backup and enable Patroni to connect, since it will use the old password, not the newly generated random one. - when syncing users, always check whether they are already in the DB. Previously, we did this only for the sync cluster case, but the new cluster could be actually the one restored from the backup by Patroni, having all or some of the users already in place. - delete endpoints last. Patroni uses the $clustername endpoint in order to store the leader related metadata. If we remove it before removing all pods, one of those pods running Patroni will re-create it and the next attempt to create the cluster with the same name will stumble on the existing endpoint. - Use db.Exec instead of db.Query for queries that expect no result. This also fixes the issue with the DB creation, since we didn't release an empty Row object it was not possible to create more than one database for a cluster.	2017-12-13 16:49:00 +01:00
Oleksii Kliukin	1fb8cf7ea0	Avoid overwriting critical users. (#172) * Avoid overwriting critical users. Disallow defining new users either in the cluster manifest, teams API or infrastructure roles with the names mentioned in the new protected_role_names parameter (list of comma-separated names) Additionally, forbid defining a user with the name matching either super_username or replication_username, so that we don't overwrite system roles required for correct working of the operator itself. Also, clear PostgreSQL roles on each sync first in order to avoid using the old definitions that are no longer present in the current manifest, infrastructure roles secret or the teams API.	2017-12-05 14:27:12 +01:00
Oleksii Kliukin	022ce29314	Make an error message more verbose.	2017-12-04 10:49:25 +01:00
Oleksii Kliukin	637921cdee	Tests for initHumanUsers and initRobotUsers. Change the Cluster class in the process to implement Teams API calls and Oauth token fetches as interfaces, so that we can mock them in the tests.	2017-12-04 10:49:25 +01:00
Oleksii Kliukin	611cfe96d6	Fix an issue when not assigning the merge result. Add some tests.	2017-12-04 10:49:25 +01:00
Oleksii Kliukin	831ebb1f32	Fix the error reporting.	2017-12-04 10:49:25 +01:00
Oleksii Kliukin	2e226dee26	Avoid overwriting infrastructure roles. When a role is defined in the infrastructure roles and the cluster manifest use the infrastructure role definition and add flags defined in the manifest. Previously the role has been overwritten by the definition from the manifest. Because a random password is generated for each role from the manifest the applications relying on the infrastructure role credentials from the infrastructure roles secret were unable to connect.	2017-12-04 10:49:25 +01:00
Oleksii Kliukin	975b21f633	Rename api roles configuration parameter. Change api_roles_configuration to team_api_role_configuration	2017-11-22 10:43:35 +01:00
Oleksii Kliukin	2352fc9a39	go fmt run	2017-11-22 10:43:35 +01:00
Oleksii Kliukin	415a7fdc4d	Allow global configuration options for API roles. Add options to the PgUser structure, potentially allowing to set per-role options in the cluster definition as well. Introduce api_roles_configuration operator option with the default of log_statement=all	2017-11-22 10:43:35 +01:00
Oleksii Kliukin	c25e849fe4	Fix a failure to create new statefulset at sync. Also do a fmt run.	2017-11-08 18:24:17 +01:00
Murat Kabilov	86803406db	use sync methods while updating the cluster	2017-11-03 12:00:43 +01:00
Oleksii Kliukin	ce960e892a	Create new databases and change owners of existing ones during sync. (#153) * Create new databases and change owners of existing ones during sync.	2017-11-02 17:46:33 +01:00
Oleksii Kliukin	7a76be7d3e	Minor fixes around PDB (pod-disruption-budget) syncing: (#147) - Call comparison function in the case of the sync as well as for update - Include full cluster name in PDB name - Assign cluster labels to the PDB object	2017-10-23 12:26:59 +02:00
Murat Kabilov	661b141849	Fix Pod Disruption Budget null pointer exception	2017-10-20 11:43:50 +02:00
Oleksii Kliukin	eba23279c8	Kube cluster upgrade	2017-10-19 10:49:42 +02:00
Murat Kabilov	6c4cb4e9da	Perform manual failover during the scale down	2017-10-16 17:41:23 +02:00
Jan Mussler	cec695d48e	Superuser toggle for team members Make superuser toggleable for team members. Add an "admin" role to team members if superuser is disabled.	2017-10-12 15:01:54 +02:00
Murat Kabilov	8d5faaa5a5	return idle status when worker has nothing to do	2017-10-11 15:42:20 +02:00
Murat Kabilov	83c8d6c419	Extend diagnostic api with worker status info	2017-10-11 12:26:09 +02:00
Murat Kabilov	a35e9c6119	move from tpr to crd	2017-10-06 15:12:08 +02:00
Jan Mussler	c4af0ac6a6	Update cluster.go	2017-10-05 10:58:23 +02:00
Jan M	4a1170855a	Adding '_' to allowed chars.	2017-10-05 10:53:19 +02:00
Murat Kabilov	48ec6b35b9	perform manual failover on pg cluster rolling upgrade	2017-10-04 16:56:47 +03:00
Murat Kabilov	00194d0130	create dbs on cluster create	2017-10-04 16:24:27 +03:00
Murat Kabilov	90b49a24ba	make postgresql roles public	2017-09-11 17:44:32 +02:00
Murat Kabilov	8aa11ecee2	Add patroni api client	2017-08-30 16:01:18 +02:00
Murat Kabilov	899c0bef45	Use warningf instead of warnf	2017-08-30 14:35:56 +02:00
Murat Kabilov	5967837875	pass the name of the status in the log message on set cluster status failure	2017-08-17 12:18:53 +02:00
Murat Kabilov	272d7e1bcf	rename service field to services as it contains service per role	2017-08-15 15:55:56 +02:00
Murat Kabilov	82f58b57d8	add cluster and controller methods for getting status	2017-08-15 12:11:06 +02:00
Murat Kabilov	5470f20be4	always pass a cluster name as a logger field	2017-08-15 10:29:18 +02:00
Murat Kabilov	e26db66cb5	start all the log messages with lowercase letters	2017-08-15 10:12:36 +02:00
Oleksii Kliukin	f15f93f479	Bugfix/close db connections (#78) Open and close DB connections on-demand. Previously, we used to leave the DB connection open while the cluster was registered with the operator, potentially resulting in dangling connections if the operator terminates abnormally. Small refactoring around the role syncing code.	2017-08-10 10:10:00 +02:00
Murat Kabilov	cf663cb841	Fix golint warnings	2017-08-01 16:08:56 +02:00
Murat Kabilov	1f8b37f33d	Make use of kubernetes client-go v4 * client-go v4.0.0-beta0 * remove unnecessary methods for tpr object * rest client: use interface instead of structure pointer * proper names for constants; some clean up for log messages * remove teams api client from controller and make it per cluster	2017-07-25 15:25:17 +02:00

106 Commits