postgres-operator

Commit Graph

Author	SHA1	Message	Date
Felix Kunde	5acc67ec04	reflect change in github url	2019-02-22 14:45:05 +01:00
teuto.net Netzdienste GmbH	26a7fdfa9f	Add Pod Anti Affinity (#489 ) * Add Pod Anti Affinity	2019-02-21 16:37:03 +01:00
Stephane T	d11b23bd71	Add inherited_labels (#459 ) * add support for inherited_labels Signed-off-by: Stephane Tang <hi@stang.sh> * update docs with inherited_labels Signed-off-by: Stephane Tang <hi@stang.sh>	2019-02-14 12:29:06 +01:00
Rafał Kupka	ba23de3d17	Pass PodEnvironmentConfigMap (#477 )	2019-02-04 12:24:49 +01:00
Armin Nesiren	6f6a599c90	Added possibility to add custom annotations to LoadBalancer service. (#461 ) * Added possibility to add custom annotations to LoadBalancer service.	2019-01-25 11:35:27 +01:00
Maxim Ivanov	1109c861fb	Report new Postgres CR error when previously incorrect one is being updated (#449 )	2019-01-18 13:36:44 +01:00
zerg-junior	26670408c4	Revert "Unify warnings about unmovable pods (#389 )" (#430 ) This reverts commit `4fa09e0dcb`. Reason: the reverted commit bloats the logs	2018-12-21 17:39:34 +01:00
zerg-junior	4fa09e0dcb	Unify warnings about unmovable pods (#389 ) * Unify warnings about unmovable pods * Log conditions that prevent master pod migration	2018-12-21 16:44:31 +01:00
zerg-junior	45c89b3da4	[WIP] Add set_memory_request_to_limit option (#406 ) * Add set_memory_request_to_limit option	2018-11-15 14:00:08 +01:00
Noah Kantrowitz	688d252752	Some tweaks to ensure compat with newer Go. (#383 )	2018-09-17 10:13:07 +02:00
Noah Kantrowitz	0b75a89920	Fix the casing of github.com/Sirupsen/logrus to match what the project itself uses. (#380 ) Dep enforces this.	2018-09-06 10:26:48 +02:00
Noah Kantrowitz	a4224f6063	Move CRD definitions into a formal API to allow access from other controllers. (#378 )	2018-08-31 11:20:02 +02:00
zerg-junior	25fa45fd58	[WIP] Grant 'superuser' to the members of Postgres admin teams (#371 ) Added support for superuser team in addition to the admin team that owns the postgres cluster.	2018-08-30 10:51:37 +02:00
Oleksii Kliukin	e1ed4b847d	Use code-generation for CRD API and deepcopy methods (#369 ) Client-go provides a https://github.com/kubernetes/code-generator package in order to provide the API to work with CRDs similar to the one available for built-in types, i.e. Pods, Statefulsets and so on. Use this package to generate deepcopy methods (required for CRDs), instead of using an external deepcopy package; we also generate APIs used to manipulate both Postgres and OperatorConfiguration CRDs, as well as informers and listers for the Postgres CRD, instead of using generic informers and CRD REST API; by using generated code we can get rid of some custom and obscure CRD-related code and use a better API. All generated code resides in /pkg/generated, with an exception of zz_deepcopy.go in apis/acid.zalan.do/v1 Rename postgres-operator-configuration CRD to OperatorConfiguration, since the former broke naming convention in the code-generator. Moved Postgresql, PostgresqlList, OperatorConfiguration and OperatorConfigurationList and other types used by them into Change the type of the Error field in the Postgresql crd to a string, so that client-go could generate a deepcopy for it. Use generated code to set status of CRD objects as well. Right now this is done with patch, however, Kubernetes 1.11 introduces the /status subresources, allowing us to set the status with the special updateStatus call in the future. For now, we keep the code that is compatible with earlier versions of Kubernetes. Rename postgresql.go to database.go and status.go to logs_and_api.go to reflect the purpose of each of those files. Update client-go dependencies. Minor reformatting and renaming.	2018-08-15 17:22:25 +02:00
Oleksii Kliukin	199aa6508c	Populate list of clusters in the controller at startup. (#364 ) Assign the list of clusters in the controller with the up-to-date list of Postgres manifests on Kubernetes during the startup. Node migration routines launched asynchronously to the cluster processing rely on an up-to-date list of clusters in the controller to detect clusters affected by the migration of the node and lock them when doing migration of master pods. Without the initial list the operator was subject to race conditions like the one described at https://github.com/zalando-incubator/postgres-operator/issues/363 Restructure the code to decouple list cluster function required by the postgresql informer from the one that emits cluster sync events. No extra work is introduced, since cluster sync already runs in a separate goroutine (clusterResync). Introduce explicit initial cluster sync at the end of acquireInitialListOfClusters instead of relying on an implicit one coming from list function of the PostgreSQL informer. Some minor refactoring. Review by @zerg-junior	2018-08-08 11:00:56 +02:00
Oleksii Kliukin	b06186eb41	Linter-induced code refactoring, run round 2. (#360 ) Run more linters in the gometalinter, i.e. deadcode, megacheck, nakedret, dup. More consistent code formatting, remove two dead functions, eliminate naked a bunch of naked returns, refactor a few functions to avoid code duplication.	2018-08-06 12:09:19 +02:00
Oleksii Kliukin	59f0c5551e	Allow configuring pod priority globally and per cluster. (#353 ) * Allow configuring pod priority globally and per cluster. Allow to specify pod priority class for all pods managed by the operator, as well as for those belonging to individual clusters. Controlled by the pod_priority_class_name operator configuration parameter and the podPriorityClassName manifest option. See https://kubernetes.io/docs/concepts/configuration/pod-priority-preemption/#priorityclass for the explanation on how to define priority classes since Kubernetes 1.8. Some import order changes are due to go fmt. Removal of OrphanDependents deprecated field. Code review by @zerg-junior	2018-08-03 14:03:37 +02:00
Oleksii Kliukin	ac7b132314	Refactoring inspired by gometalinter. (#357 ) Among other things, fix a few issues with deepcopy implementation.	2018-08-03 11:09:45 +02:00
Oleksii Kliukin	d2d3f21dc2	Client go upgrade v6 (#352 ) There are shortcuts in this code, i.e. we created the deepcopy function by using the deepcopy package instead of the generated code, that will be addressed once migrated to client-go v8. Also, some objects, particularly statefulsets, are still taken from v1beta, this will also be addressed in further commits once the changes are stabilized.	2018-08-01 11:08:01 +02:00
Oleksii Kliukin	f27833b5eb	Fix disabling database access and teams API via command-line options. (#351 )	2018-07-27 10:24:05 +02:00
Oleksii Kliukin	0181a1b5b1	Introduce a repair scan to fix failing clusters (#304 ) A repair is a sync scan that acts only on those clusters that indicate that the last add, update or sync operation on them has failed. It is supposed to kick in more frequently than the repair scan. The repair scan still remains to be useful to fix the consequences of external actions (i.e. someone deletes a postgres-related service by mistake) unbeknownst to the operator. The repair scan is controlled by the new repair_period parameter in the operator configuration. It has to be at least 2 times more frequent than a sync scan to have any effect (a normal sync scan will update both last synced and last repaired attributes of the controller, since repair is just a sync underneath). A repair scan could be queued for a cluster that is already being synced if the sync period exceeds the interval between repairs. In that case a repair event will be discarded once the corresponding worker finds out that the cluster is not failing anymore. Review by @zerg-junior	2018-07-24 11:21:45 +02:00
zerg-junior	417f13c0bd	Submit RBAC credentials during initial Event processing (#344 ) * During initial Event processing submit the service account for pods and bind it to a cluster role that allows Patroni to successfully start. The cluster role is assumed to be created by the k8s cluster administrator.	2018-07-19 16:40:40 +02:00
Oleksii Kliukin	3a9378d3b8	Allow configuring the operator via the YAML manifest. (#326 ) * Up until now, the operator read its own configuration from the configmap. That has a number of limitations, i.e. when the configuration value is not a scalar, but a map or a list. We use a custom code based on github.com/kelseyhightower/envconfig to decode non-scalar values out of plain text keys, but that breaks when the data inside the keys contains both YAML-special elememtns (i.e. commas) and complex quotes, one good example for that is search_path inside `team_api_role_configuration`. In addition, reliance on the configmap forced a flag structure on the configuration, making it hard to write and to read (see https://github.com/zalando-incubator/postgres-operator/pull/308#issuecomment-395131778). The changes allow to supply the operator configuration in a proper YAML file. That required registering a custom CRD to support the operator configuration and provide an example at manifests/postgresql-operator-default-configuration.yaml. At the moment, both old configmap and the new CRD configuration is supported, so no compatibility issues, however, in the future I'd like to deprecate the configmap-based configuration altogether. Contrary to the configmap-based configuration, the CRD one doesn't embed defaults into the operator code, however, one can use the manifests/postgresql-operator-default-configuration.yaml as a starting point in order to build a custom configuration. Since previously `ReadyWaitInterval` and `ReadyWaitTimeout` parameters used to create the CRD were taken from the operator configuration, which is not possible if the configuration itself is stored in the CRD object, I've added the ability to specify them as environment variables `CRD_READY_WAIT_INTERVAL` and `CRD_READY_WAIT_TIMEOUT` respectively. Per review by @zerg-junior and @Jan-M.	2018-07-16 16:20:46 +02:00
Oleksii Kliukin	16a710a99a	Avoid possible skipping SYNC events. OB1 bug in the condition deciding whether to sync.	2018-05-31 18:29:15 +02:00
Oleksii Kliukin	27c7245fed	Avoid terminating delete on errors. When there is an error happening upon deletion of the Kubernetes object belonging to the cluster being removed, it makes no sense to abort the deletion: the manifest will be removed anyway, therefore all the objects after the one we aborted at will stay forever.	2018-05-18 18:10:37 +02:00
Oleksii Kliukin	da4cc2705b	Use deepcopy to propagate the spec to clusters. Avoid sharing pointers to the same spec data between the informer and the clusters. The only catch is that the error field is cleared during deepcopy, since it is an interface that may contain private fields that cannot be copied, however, the error is only used when the manifest is parsed and before it is queued, therefore, we never refer to that field in the cluster structure.	2018-05-17 16:05:12 +02:00
Oleksii Kliukin	ebe50abccb	Make sure we never modify informer cached manifest. (#290 ) `987b434` introduced a new function that modifies the cluster spec in memory before the cluster processes it. Unfortunately, the instance being modified appeared to be the one stored internally in the PostgresInformer, resulting in those modifications to be propagated with futher cluster events and producing update loops in some occasions. This commit makes sure we copy the spec before putting it into the clusterEventQueues.	2018-05-16 18:23:31 +02:00
Oleksii Kliukin	987b43456b	Deprecate old LB options, fix endpoint sync. (#287 ) * Depreate old LB options, fix endpoint sync. - deprecate useLoadBalancer, replicaLoadBalancer from the manifest and enable_load_balancer from the operator configuration. The old operator configuration options become no-op with this commit. For the old manifest options, `useLoadBalancer` and `replicaLoadBalancer` are still consulted, but only in the absense of the new ones (enableMasterLoadBalancer and enableReplicaLoadBalancer). - Make sure the endpoint being created during the sync receives proper addresses subset. This is more critical for the replicas, as for the masters Patroni will normally re-create the endpoint before the operator. - Avoid creating the replica endpoint, since it will be created automatically by the corresponding service. - Update the README and unit tests. Code review by @mgomezch and @zerg-junior	2018-05-15 15:19:18 +02:00
Sergey Dudoladov	4255e702bc	Always empty account's namespace after parsing	2018-04-25 13:57:24 +02:00
Sergey Dudoladov	d99b553ec1	Convert default account definiton into JSON	2018-04-25 12:35:16 +02:00
Sergey Dudoladov	3d0ab40d64	Explicitly warn on account name mismatch	2018-04-24 15:31:22 +02:00
Sergey Dudoladov	485ec4b8ea	Move service account to Controller	2018-04-24 15:13:08 +02:00
Sergey Dudoladov	bd51d2922b	Turn ServiceAccount into struct value to avoid race conditon during account creation	2018-04-20 13:05:05 +02:00
Sergey Dudoladov	a5a65e93f4	Name service account consistenly	2018-04-19 16:15:52 +02:00
Sergey Dudoladov	23f893647c	Remove sync of pod service accounts	2018-04-19 15:48:58 +02:00
Sergey Dudoladov	214ae04aa7	Deploy service account for pod creation on demand	2018-04-18 16:20:20 +02:00
Oleksii Kliukin	26db91c53e	Improve infrastructure role definitions (#208 ) Enhance definitions of infrastructure roles by allowing membership in multiple roles, role options and per-role configuration to be specified in the infrastructure role configmap, which must have the same name as the infrastructure role secret. See manifests/infrastructure-roles-configmap.yaml for the examples and updated README for the description of different types of database roles supposed by the operator and their purposes. Change the logic of merging infrastructure roles with the manifest roles when they have the same name, to return the infrastructure role unchanged instead of merging. Previously, we used to propagate flags from the manifest role to the resulting infrastructure one, as there were no way to define flags for the infrastructure role; however, this is not the case anymore. Code review and tests by @erthalion	2018-04-04 17:21:36 +02:00
Oleksii Kliukin	2bb7e98268	update individual role secrets from infrastructure roles (#206 ) * Track origin of roles. * Propagate changes on infrastructure roles to corresponding secrets. When the password in the infrastructure role is updated, re-generate the secret for that role. Previously, the password for an infrastructure role was always fetched from the secret, making any updates to such role a no-op after the corresponding secret had been generated.	2018-02-23 17:24:04 +01:00
Oleksii Kliukin	f18bb6eaaa	Make errors in the cluster list function visible. Sometimes the operator does not pick up clusters right away when they are created. The change attempts to shed light on the reason behind that.	2018-02-22 16:45:10 +01:00
Sergey Dudoladov	66a3b6830e	Call fatalf if namespace to watch does not exist	2018-02-20 16:13:48 +01:00
Sergey Dudoladov	dcfc9925f6	Respond to code review	2018-02-20 14:43:02 +01:00
Sergey Dudoladov	e3d2434420	Use '*' as an alias to denote all namespaces	2018-02-16 15:20:26 +01:00
Sergey Dudoladov	088bf70e7d	Merge branch 'master' into support-many-namespaces	2018-02-16 15:06:10 +01:00
Sergey Dudoladov	ec7de38f9b	Make operator watch its own namespace instead of controller's one	2018-02-16 14:22:38 +01:00
Sergey Dudoladov	5e9a21456e	Remove the incorrect service account check	2018-02-15 16:33:53 +01:00
Sergey Dudoladov	155ae8d50f	Rename the function that checks service account existence	2018-02-15 11:14:13 +01:00
Sergey Dudoladov	d5d15b7546	Look for secrets in the deployed namespace	2018-02-14 15:37:30 +01:00
Sergey Dudoladov	06fd9e33f5	Watch the namespace where operator deploys to unless told otherwise	2018-02-13 18:17:47 +01:00
Sergey Dudoladov	4c23917d42	Watch all namespaces if the relevant param is empty string / 'default' if param is unset	2018-02-12 11:47:56 +01:00
Sergey Dudoladov	066f11cbbd	Streamline handling of the watched_namespace param/envvar	2018-02-09 11:39:56 +01:00

1 2 3 4

152 Commits