postgres-operator

Commit Graph

Author	SHA1	Message	Date
zerg-junior	bb5ce6cbbe	Merge pull request #195 from zalando-incubator/databases-rest-endpoint Add a REST endpoint to list databases in all clusters	2018-01-09 11:53:32 +01:00
Oleksii Kliukin	8e99518eeb	Improve behavior on node decommissioning (#184) * Trigger the node migration on the lack of the readiness label. * Examine the node's readiness status on node add. Make sure we don't miss the not ready node, especially when the operator is killed during the migration.	2018-01-04 11:53:15 +01:00
Oleksii Kliukin	5c8bd04169	Sort database by name.	2017-12-22 15:48:13 +01:00
Oleksii Kliukin	6102b0368c	Merge remote-tracking branch 'origin/databases-rest-endpoint' into databases-rest-endpoint # Conflicts: # pkg/apiserver/apiserver.go # pkg/controller/status.go	2017-12-22 13:08:50 +01:00
Oleksii Kliukin	9720ac1f7e	WIP: Hold the proper locks while examining the list of databases. Introduce a new lock called specMu lock to protect the cluster spec. This lock is held on update and sync, and when retrieving the spec in the API code. There is no need to acquire it for cluster creation and deletion: creation assigns the spec to the cluster before linking it to the controller, and deletion just removes the cluster from the list in the controller, both holding the global clustersMu Lock.	2017-12-22 13:06:11 +01:00
Sergey Dudoladov	b8bf97ab76	Integrate comments from code reviews	2017-12-22 12:53:57 +01:00
Sergey Dudoladov	011458fb05	Add a REST endpoint to list databases in all clusters	2017-12-21 17:28:55 +01:00
zerg-junior	3c178f68df	Warn on infrastructure-roles.yaml format violations (#177) Emit a warning if there are unprocessed entries in the infrastructure-roles secret.	2017-12-15 17:21:41 +01:00
Oleksii Kliukin	dd0affc390	Tweak our reaction to the cluster upgrade process. Previously, the operator started to move the pods off the nodes to be decommissioned by watching the eol_node_label value. Every new postgres pod has been created with the anti-affinity to that label, making sure that the pods being moved won't land on another to-be-decommissioned node. The changes introduce another label that indicates the ready node. The new pod affinity will ensure that the pod is only scheduled to the node marked as ready, discarding the previous anti-affinity. That way the nodes can transition from the pending-decomission to the other statuses (drained, terminating) without having pods suddenly scaled to them. In addition, rename the label that triggers the start of the upgrade process to node_eol_label (for consistency with node_readiness_label) and set its default value to lifecycle-status:pending-decomission.	2017-11-30 14:11:49 +01:00
Murat Kabilov	86803406db	use sync methods while updating the cluster	2017-11-03 12:00:43 +01:00
Oleksii Kliukin	eba23279c8	Kube cluster upgrade	2017-10-19 10:49:42 +02:00
Murat Kabilov	6c4cb4e9da	Perform manual failover during the scale down	2017-10-16 17:41:23 +02:00
Murat Kabilov	3b32265258	Set status of the cluster on sync fail/success	2017-10-12 15:10:42 +02:00
Murat Kabilov	8d5faaa5a5	return idle status when worker has nothing to do	2017-10-11 15:42:20 +02:00
Murat Kabilov	83c8d6c419	Extend diagnostic api with worker status info	2017-10-11 12:26:09 +02:00
Murat Kabilov	32aa7270e6	Use round-robin strategy while assigning workers	2017-10-09 16:56:27 +02:00
Murat Kabilov	a35e9c6119	move from tpr to crd	2017-10-06 15:12:08 +02:00
Murat Kabilov	9a66e09b88	cluster history api endpoint	2017-09-26 14:30:45 +02:00
Murat Kabilov	f77852a152	store time of the cluster event	2017-09-26 13:17:23 +02:00
Murat Kabilov	4db5bd13d1	delete cluster key from the clusters list only when delete procedure is finished	2017-09-04 18:48:03 +02:00
Murat Kabilov	899c0bef45	Use warningf instead of warnf	2017-08-30 14:35:56 +02:00
Murat Kabilov	53ceede3cb	show worker queue size in the cluster status	2017-08-28 12:05:33 +02:00
Murat Kabilov	83760ebbef	discard cluster events from the queue on cluster delete; delete cluster from the clusters map before deleting cluster itself	2017-08-17 12:24:23 +02:00
Murat Kabilov	f2c23021bb	generate clusterEvent queue key in a separate function	2017-08-17 12:20:03 +02:00
Murat Kabilov	dad8e2f49f	make cluster event queue consumption non-blocking	2017-08-15 16:03:19 +02:00
Murat Kabilov	82d5583809	add diagnostic api http server	2017-08-15 12:20:09 +02:00
Murat Kabilov	51fdfb90f7	log cluster and controller events in the ringlog via logrus hook	2017-08-15 12:16:09 +02:00
Murat Kabilov	82f58b57d8	add cluster and controller methods for getting status	2017-08-15 12:11:06 +02:00
Murat Kabilov	58572bb43f	move controller config to the spec package	2017-08-15 11:41:46 +02:00
Murat Kabilov	5470f20be4	always pass a cluster name as a logger field	2017-08-15 10:29:18 +02:00
Murat Kabilov	e26db66cb5	start all the log messages with lowercase letters	2017-08-15 10:12:36 +02:00
Murat Kabilov	cf663cb841	Fix golint warnings	2017-08-01 16:08:56 +02:00
Murat Kabilov	c02a740e10	Fix setting debug logger level	2017-08-01 11:51:03 +02:00
Murat Kabilov	6183203f4d	fix cluster event queue processing	2017-07-31 10:30:49 +02:00
Murat Kabilov	2fe22ff614	Remove pod dispatcher	2017-07-27 14:16:49 +02:00
Murat Kabilov	3ad4b127c4	Fix graceful shutdown of goroutines on operator exit	2017-07-27 12:54:22 +02:00
Murat Kabilov	1f8b37f33d	Make use of kubernetes client-go v4 * client-go v4.0.0-beta0 * remove unnecessary methods for tpr object * rest client: use interface instead of structure pointer * proper names for constants; some clean up for log messages * remove teams api client from controller and make it per cluster	2017-07-25 15:25:17 +02:00
Oleksii Kliukin	4455f1b639	Feature/unit tests (#53) - Avoid relying on Clientset structure to call Kubernetes API functions. While Clientset is a convenient "catch-all" abstraction for calling REST API related to different Kubernetes objects, it's impossible to mock. Replacing it with the kubernetes.Interface would be quite straightforward, but would require an extra level of mocked interfaces, because of the versioning. Instead, a new interface is defined, which contains only the objects we need of the pre-defined versions. - Move KubernetesClient to k8sutil package. - Add more tests.	2017-07-24 16:56:46 +02:00
Oleksii Kliukin	e0dacd0ca9	Remove an unused export.	2017-06-08 16:17:01 +02:00
Murat Kabilov	e104a67260	Fix resync of the clusters	2017-06-08 11:51:48 +02:00
Oleksii Kliukin	bc0e9ab4bc	Add error checks per report from errcheck-ng	2017-06-08 10:41:44 +02:00
Oleksii Kliukin	dc36c4ca12	Implement replicaLoadBalancer boolean flag. (#38 ) The flag adds a replica service with the name cluster_name-repl and a DNS name that defaults to {cluster}-repl.{team}.{hostedzone}. The implementation converted Service field of the cluster into a map with one or two elements and deals with the cases when the new flag is changed on a running cluster (the update and the sync should create or delete the replica service). In order to pick up master and replica service and master endpoint when listing cluster resources. * Update the spec when updating the cluster.	2017-06-07 13:54:17 +02:00
Oleksii Kliukin	7b0ca31bfb	Implements EBS volume resizing #35 . In order to support volumes different from EBS and filesystems other than EXT2/3/4 the respective code parts were implemented as interfaces. Adding the new resize for the volume or the filesystem will require implementing the interface, but no other changes in the cluster code itself. Volume resizing first changes the EBS and the filesystem, and only afterwards is reflected in the Kubernetes "PersistentVolume" object. This is done deliberately to be able to check if the volume needs resizing by peeking at the Size of the PersistentVolume structure. We recheck, nevertheless, in the EBSVolumeResizer, whether the actual EBS volume size doesn't match the spec, since call to the AWS ModifyVolume is counted against the resize limit of once every 6 hours, even for those calls that shouldn't result in an actual resize (i.e. when the size matches the one for the running volume). As a collateral, split the constants into multiple files, move the volume code into a separate file and fix minor issues related to the error reporting.	2017-06-06 13:53:27 +02:00
Murat Kabilov	1fb05212a9	Refactor teams API package	2017-05-30 10:14:30 +02:00
Murat Kabilov	009db16c7c	Use queues for the pod events (#30)	2017-05-23 15:24:14 +02:00
Murat Kabilov	c470bd6646	reset cluster error on successful update or sync (#29)	2017-05-22 15:45:38 +02:00
Oleksii Kliukin	bc17897478	Run sync cluster when previous add failed. (#28)	2017-05-22 15:27:26 +02:00
Oleksii Kliukin	afce38f6f0	Fix error messages (#27) Use lowercase for kubernetes objects Use %v instead of %s for errors Start error messages with a lowercase letter.	2017-05-22 14:12:06 +02:00
Murat Kabilov	4acaf27a5d	Remove etcd requests (#25) update glide	2017-05-19 17:18:37 +02:00
Murat Kabilov	d34273543e	Fix the golint, gosimple warnings	2017-05-18 17:38:54 +02:00

91 Commits