Commit Graph

37 Commits

Author SHA1 Message Date
Murat Kabilov 92d7fbf372 replace github.bus.zalan.do with github.cm/zalando-incubator 2017-05-12 11:50:16 +02:00
Murat Kabilov 1b82009151 Command exec inside the Pod method 2017-05-12 11:41:36 +02:00
Murat Kabilov fd449342e5 Use Kubernetes API instead of API group 2017-05-12 11:41:36 +02:00
Oleksii Kliukin 6983f444ed Periodically sync roles with the running clusters. (#102)
The sync adds or alters database roles based on the roles defined
in the cluster's TPR, Team API and operator's infrastructure roles.
At the moment, roles are not deleted, as it would be dangerous for
the robot roles in case TPR is misconfigured. In addition, ALTER
ROLE does not remove role options, i.e. SUPERUSER or CREATEROLE,
neither it removes role membership: only new options are added and
new role membership is granted. So far, options like NOSUPERUSER
and NOCREATEROLE won't be handed correctly, when mixed with the
non-negative counterparts, also NOLOGIN should be processed correctly.
The code assumes that only MD5 passwords are stored in the DB and
will likely break with the new SCRAM auth in PostgreSQL 10.

On the implementation side, create the new interface to abstract
roles merge and creation, move most of the role-based functionality
from cluster/pg into the new 'users' module, strip create user code
of special cases related to human-based users (moving them to init
instead) and fixed the password md5 generator to avoid processing
already encrypted passwords. In addition, moved the system roles
off the slice containing all other roles in order to avoid extra
efforts to avoid creating them.

Also, fix a leak in DB connections when the new connection is not
considered healthy and discarded without being closed. Initialize
the database during the sync phase before syncing users.
2017-05-12 11:41:35 +02:00
Murat Kabilov 2370659c69 Parallel cluster processing
Run operations concerning multiple clusters in parallel. Each cluster gets its
own worker in order to create, update, sync or delete clusters.  Each worker
acquires the lock on a cluster.  Subsequent operations on the same cluster
have to wait until the current one finishes.  There is a pool of parallel
workers, configurable with the `workers` parameter in the configmap and set by
default to 4. The cluster-related tasks  are assigned to the workers based on
a cluster name: the tasks for the same cluster will be always assigned to the
same worker. There is no blocking between workers, although there is a chance
that a single worker will become a bottleneck if too many clusters are
assigned to it; therefore, for large-scale deployments it might be necessary
to bump up workers from the default value.
2017-05-12 11:41:35 +02:00
Murat Kabilov a7c57874d5 Do not create roles if cluster is masterless
fix pod deletion
2017-05-12 11:41:34 +02:00
Murat Kabilov da438aab3a Use ConfigMap to store operator's config 2017-05-12 11:41:34 +02:00
Murat Kabilov 08c0e3b6dd Use unified type for the namespaced object names 2017-05-12 11:41:34 +02:00
Oleksii Kliukin 71b93b4cc2 Feature/infrastructure roles (#91)
* Add infrastructure roles configured globally.

Those are the roles defined in the operator itself. The operator's
configuration refers to the secret containing role names, passwords
and membership information. While they are referred to as roles, in
reality those are users.

In addition, improve the regex to filter out invalid users and
make sure user secret names are compatible with DNS name spec.

Add an example manifest for the infrastructure roles.
2017-05-12 11:41:33 +02:00
Murat Kabilov db53134cbd Skip syncing Pods 2017-05-12 11:41:33 +02:00
Murat Kabilov 101dc06acb Better logging for teams api calls 2017-05-12 11:41:32 +02:00
Murat Kabilov bb4fec25ae Fix deletion of the failed cluster; more debug messages 2017-05-12 11:41:32 +02:00
Murat Kabilov ce90a54cf9 create key in the cluster map on cluster creation failure 2017-05-12 11:41:32 +02:00
Murat Kabilov 852c5beae5 Check etcd key availability for the new cluster 2017-05-12 11:41:31 +02:00
Oleksii Kliukin 8268b07ad2 Set logger level per package instead of doing this globally 2017-05-12 11:41:30 +02:00
Oleksii Kliukin 3a4c6268be Increase log verbosity, namely for object updates.
- add a new environment variable for triggering debug log level
- show both new, old object and diff during syncs and updates
- use pretty package to pretty-print go structures
-
2017-05-12 11:41:29 +02:00
Murat Kabilov c2d2a67ad5 Get config from environment variables;
ignore pg major version change;
get rid of resources package;
2017-05-12 11:41:29 +02:00
Murat Kabilov 79a6726d4d Increase logging verbosity, restructure code 2017-05-12 11:41:28 +02:00
Oleksii Kliukin 48ba6adf8a Avoid calling Team API with an expired token.
Previously, the controller fetched the Oauth token once at start, so eventually the token would expire and the operator could not create new users. This commit makes the operator fetch the token before each call to the Teams API.
2017-05-12 11:41:28 +02:00
Murat Kabilov 6f7399b36f Sync clusters states
* move statefulset creation from cluster spec to the separate function
* sync cluster state with desired state;
* move out from arrays for cluster resources;
* recreate pods instead of deleting them in case of statefulset change
* check for master while creating cluster/updating pods
* simplify retryutil
* list pvc while listing resources
* name kubernetes resources with capital letter
* do rolling update in case of env variables change
2017-05-12 11:41:27 +02:00
Murat Kabilov 486c8ecb07 use neutral name for set cluster status function 2017-05-12 11:41:26 +02:00
Murat Kabilov 1c6e7ac2e7 loadBalancerSourceRanges update 2017-05-12 11:41:26 +02:00
Murat Kabilov fc127069ab remove unnecessary ControllerNamespace 2017-05-12 11:41:26 +02:00
Murat Kabilov 416dace289 get rid of arrays in the kuberesources;
use shorter form of checking for errors
2017-05-12 11:41:26 +02:00
Murat Kabilov ae77fa15e8 Pod Rolling update
introduce Pod events channel;
add parsing of the MaintenanceWindows section;
skip deleting Etcd key on cluster delete;
use external etcd host;
watch for tpr/pods in the namespace of the operator pod only;
2017-05-12 11:41:25 +02:00
Murat Kabilov dfde075c66 Use TPR object namespace while creating its objects 2017-05-12 11:37:09 +02:00
Murat Kabilov 6e2d64bd50 Create human users from teams api 2017-05-12 11:37:09 +02:00
Murat Kabilov 58506634c4 Create pg users 2017-05-12 11:37:09 +02:00
Murat Kabilov abb1173035 Code refactor 2017-05-12 11:37:09 +02:00
Murat Kabilov 75e6bfa55c makefile improvements 2017-05-12 11:37:07 +02:00
Oleksii Kliukin e96f8a80ee Option to run the operator out of cluster. 2017-05-08 12:10:27 +02:00
Oleksii Kliukin b3a9516bae Add a missing file. 2017-05-08 12:10:26 +02:00
Oleksii Kliukin e5e0e3a148 Use camelCase. 2017-05-08 12:10:26 +02:00
Oleksii Kliukin 38bc9da25a WIP: allow operator to run both in- and out- of cluster. 2017-05-08 12:10:26 +02:00
Murat Kabilov 5b5a64e55d Check if etcd service has its port exposed 2017-05-08 12:10:26 +02:00
Murat Kabilov d5a7683a38 some refactoring 2017-05-08 12:10:26 +02:00
Murat Kabilov 256ff37c19 refactor file tree structure 2017-05-08 12:10:25 +02:00