Commit Graph

1398 Commits

Author SHA1 Message Date
Oleksii Kliukin fb454b809e Merge pull request #3 from zalando-incubator/fix_license
Fix license, add .zappr.
2017-05-12 16:18:11 +02:00
Oleksii Kliukin db661414ee Move license to the root folder. 2017-05-12 16:16:07 +02:00
Oleksii Kliukin ac4757a9c8 On second thought, remove WIP as well
The status clearly defines the state of the operator development.
2017-05-12 15:00:05 +02:00
Oleksii Kliukin 169c8611c5 Still a WIP, but no longer a prototype 2017-05-12 14:58:43 +02:00
Oleksii Kliukin 73e7ac6135 Use zalando incubator instead of an old repo 2017-05-12 14:57:58 +02:00
Oleksii Kliukin d4962105ef Change the broken reference. 2017-05-12 14:56:22 +02:00
Oleksii Kliukin fb007b1622 Add .zappr configuration 2017-05-12 14:51:56 +02:00
Oleksii Kliukin 4d5d34b311 Fix the path when describing git clone 2017-05-12 12:24:53 +02:00
Oleksii Kliukin 5487aa1ad0 Fix a link 2017-05-12 12:21:29 +02:00
Oleksii Kliukin 26edbc0528 Use the correct suffix, convert to Markdown. 2017-05-12 12:20:42 +02:00
Oleksii Kliukin 19474823f2 Add MAINTAINERS and CONTRIBUTING guidelines. 2017-05-12 12:17:49 +02:00
Oleksii Kliukin 2c834412b3 Add a note about the project's development status. 2017-05-12 12:13:38 +02:00
Oleksii Kliukin 37c16190f1 Update README with correct URLS and ETCD installation instructions. 2017-05-12 12:10:53 +02:00
Murat Kabilov e694db9ba7 add MIT License 2017-05-12 11:55:37 +02:00
Murat Kabilov 92d7fbf372 replace github.bus.zalan.do with github.com/zalando-incubator 2017-05-12 11:50:16 +02:00
Murat Kabilov 1b82009151 Command exec inside the Pod method 2017-05-12 11:41:36 +02:00
Murat Kabilov 28a74622d7 Fix typo in the teams api json spec 2017-05-12 11:41:36 +02:00
Oleksii Kliukin 24638de665 Linux race-enabled builds inside Docker. (#123)
Cross-platform Go builds with Linux as the target platform and the race
detector enabled require linking against the Linux shared libraries and
having a cross-platform compiler installed, which is a pain. To avoid this,
we build everything in a Linux-native environment in a container.

Note that running such builds requires switching the operator
Docker base image from Alpine to something with a real glibc
(e.g. Ubuntu), since the target binary needs a dynamic linker.

Use the -a flag to rebuild all packages, since we might have
built them for the wrong platform.
2017-05-12 11:41:36 +02:00
Murat Kabilov 18700b9ef7 Optimize template constant 2017-05-12 11:41:36 +02:00
Murat Kabilov fd449342e5 Use Kubernetes API instead of API group 2017-05-12 11:41:36 +02:00
Oleksii Kliukin ec3f24c3ee Honor the "spec-by-example" manifest we have.
No longer ignore custom PostgreSQL and Patroni parameters and initdb
options. Since all Patroni parameters that are not under initdb or
pg_hba are specified as a plain map, there is no way to distinguish
those that should go into the bootstrap section from those that should
stay in the local configuration. As the example used only bootstrap
parameters, currently all such options go into the bootstrap section.

Also, the initdb options are represented as a map, while Patroni initdb
options are a list of either maps or strings (e.g. "data-checksums"
doesn't need an argument). For now, there is a workaround, but in the
future we might consider changing the spec.
2017-05-12 11:41:36 +02:00
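The map-to-list workaround mentioned in ec3f24c3ee could look roughly like the sketch below. It is not the operator's actual code: the function name `initdbOptions` and the convention that a value of "true" marks an argument-less flag are assumptions made for illustration.

```go
package main

import (
	"fmt"
	"sort"
)

// initdbOptions converts a flat map of initdb options into the list form
// Patroni expects: plain strings for flags, single-key maps for options
// that take a value.
// Assumption: a value of "true" marks a flag without an argument.
func initdbOptions(opts map[string]string) []interface{} {
	// Sort the keys so that the output order is deterministic.
	keys := make([]string, 0, len(opts))
	for k := range opts {
		keys = append(keys, k)
	}
	sort.Strings(keys)

	result := make([]interface{}, 0, len(keys))
	for _, k := range keys {
		if v := opts[k]; v == "true" {
			result = append(result, k)
		} else {
			result = append(result, map[string]string{k: v})
		}
	}
	return result
}

func main() {
	fmt.Println(initdbOptions(map[string]string{
		"encoding":       "UTF8",
		"data-checksums": "true",
	}))
}
```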
Murat Kabilov 34ae324fe9 Glide strip vendor 2017-05-12 11:41:36 +02:00
Oleksii Kliukin 6983f444ed Periodically sync roles with the running clusters. (#102)
The sync adds or alters database roles based on the roles defined
in the cluster's TPR, Team API and operator's infrastructure roles.
At the moment, roles are not deleted, as it would be dangerous for
the robot roles in case the TPR is misconfigured. In addition, ALTER
ROLE does not remove role options, e.g. SUPERUSER or CREATEROLE,
nor does it remove role membership: only new options are added and
new role membership is granted. So far, options like NOSUPERUSER
and NOCREATEROLE won't be handled correctly when mixed with their
non-negated counterparts, although NOLOGIN should be processed correctly.
The code assumes that only MD5 passwords are stored in the DB and
will likely break with the new SCRAM auth in PostgreSQL 10.

On the implementation side, create a new interface to abstract
role merging and creation, move most of the role-based functionality
from cluster/pg into the new 'users' module, strip the create-user code
of special cases related to human users (moving them to init
instead) and fix the password MD5 generator to avoid processing
already encrypted passwords. In addition, move the system roles
out of the slice containing all other roles, so that no extra
effort is needed to skip creating them.

Also, fix a leak in DB connections when the new connection is not
considered healthy and discarded without being closed. Initialize
the database during the sync phase before syncing users.
2017-05-12 11:41:35 +02:00
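The MD5 password fix described in 6983f444ed can be illustrated with a minimal sketch. PostgreSQL stores MD5 passwords as the string "md5" followed by the hex digest of the password concatenated with the user name. The function name `pgMD5Password` and the exact check for an already encrypted value are assumptions, not the operator's code.

```go
package main

import (
	"crypto/md5"
	"encoding/hex"
	"fmt"
	"strings"
)

const md5Prefix = "md5"

// pgMD5Password returns the password in PostgreSQL's MD5 storage format.
// A value that already looks encrypted (the prefix plus 32 hex characters)
// is returned unchanged, so it is never hashed a second time.
func pgMD5Password(user, password string) string {
	if strings.HasPrefix(password, md5Prefix) &&
		len(password) == len(md5Prefix)+md5.Size*2 {
		return password
	}
	sum := md5.Sum([]byte(password + user))
	return md5Prefix + hex.EncodeToString(sum[:])
}

func main() {
	once := pgMD5Password("robot_user", "secret")
	twice := pgMD5Password("robot_user", once)
	fmt.Println(once == twice) // hashing is applied only once
}
```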
Martin Linkhorst 411487e66d update annotation for ExternalDNS (#115) 2017-05-12 11:41:35 +02:00
Oleksii Kliukin 9f9a89185f Do a rolling update after creating a statefulset if pods were present. (#110)
Make sure we always re-create pods if we had to create the statefulset, even if the pods from the old statefulset were already there.
2017-05-12 11:41:35 +02:00
Oleksii Kliukin 49cb395aed Set ELB timeout annotation for the service. (#114)
By default the ELB terminates the idle connection after 60 seconds. Increase this interval to a more reasonable one of 1 h.
2017-05-12 11:41:35 +02:00
Murat Kabilov 2370659c69 Parallel cluster processing
Run operations concerning multiple clusters in parallel. Each cluster is
assigned a worker that creates, updates, syncs or deletes it. The worker
acquires a lock on the cluster, so subsequent operations on the same cluster
have to wait until the current one finishes. There is a pool of parallel
workers, configurable with the `workers` parameter in the configmap and set by
default to 4. The cluster-related tasks are assigned to the workers based on
the cluster name: the tasks for the same cluster will always be assigned to the
same worker. There is no blocking between workers, although there is a chance
that a single worker will become a bottleneck if too many clusters are
assigned to it; therefore, for large-scale deployments it might be necessary
to bump up workers from the default value.
2017-05-12 11:41:35 +02:00
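The name-based worker assignment described in 2370659c69 can be sketched as follows. The commit message only states that tasks for the same cluster always go to the same worker; hashing the name with CRC32 and taking it modulo the pool size is an assumption, as is the function name `workerIndex`.

```go
package main

import (
	"fmt"
	"hash/crc32"
)

// workerIndex maps a cluster name to one of `workers` queues.
// Assumption: a CRC32 hash of the name, modulo the pool size.
// The same name always yields the same index, so tasks for one
// cluster are serialized on a single worker.
func workerIndex(clusterName string, workers uint32) uint32 {
	return crc32.ChecksumIEEE([]byte(clusterName)) % workers
}

func main() {
	const workers = 4 // default of the `workers` parameter per the commit message
	for _, name := range []string{"acid-testcluster", "acid-testcluster", "teapot-db"} {
		fmt.Printf("%s -> worker %d\n", name, workerIndex(name, workers))
	}
}
```

With such a scheme an unlucky distribution of names can overload one worker, which is the bottleneck the commit message warns about.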
Oleksii Kliukin a9c6c4861c Use git: prefix for the git URL in scm-source.json (#112) 2017-05-12 11:41:35 +02:00
Oleksii Kliukin 1c4bce86df Avoid "bulk-comparing" pod resources during sync. (#109)
First attempt to fix bogus restarts due to the reported mismatch
of container resources where one of the resources is an empty struct,
while the other has all fields set to nil.

In addition, add the ability to set limits and requests per pod, as well as operator-level defaults.
2017-05-12 11:41:35 +02:00
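The mismatch behind 1c4bce86df is easy to reproduce in Go: `reflect.DeepEqual` treats a nil map and an empty map as different values. The sketch below compares resources field by field instead. The `Resources` type and the helper names are hypothetical stand-ins, not the Kubernetes API types the operator uses.

```go
package main

import (
	"fmt"
	"reflect"
)

// Resources is a simplified, hypothetical stand-in for container resources.
type Resources struct {
	Requests map[string]string
	Limits   map[string]string
}

// sameMap reports whether two maps hold the same entries,
// treating a nil map and an empty map as equal.
func sameMap(a, b map[string]string) bool {
	if len(a) != len(b) {
		return false
	}
	for k, v := range a {
		if w, ok := b[k]; !ok || w != v {
			return false
		}
	}
	return true
}

// sameResources compares resources field by field rather than in bulk.
func sameResources(a, b Resources) bool {
	return sameMap(a.Requests, b.Requests) && sameMap(a.Limits, b.Limits)
}

func main() {
	unset := Resources{}
	empty := Resources{Requests: map[string]string{}, Limits: map[string]string{}}
	fmt.Println(reflect.DeepEqual(unset, empty)) // bulk comparison reports a difference
	fmt.Println(sameResources(unset, empty))     // field-wise comparison does not
}
```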
Murat Kabilov 9b0d0d487c Use PATCH while updating Services and StatefulSets 2017-05-12 11:41:34 +02:00
Murat Kabilov 8026c69222 update default config param values 2017-05-12 11:41:34 +02:00
Murat Kabilov a7c57874d5 Do not create roles if cluster is masterless
fix pod deletion
2017-05-12 11:41:34 +02:00
Murat Kabilov da438aab3a Use ConfigMap to store operator's config 2017-05-12 11:41:34 +02:00
Oleksii Kliukin 47e3e29a56 Add version label to the cluster. (#96)
* Add version label to the cluster.

According to the STUPS team, the daemon that exports logs to Scalyr
stops the export if the version label is missing.

* Move label names to constants. 

* Run go fmt
2017-05-12 11:41:34 +02:00
Murat Kabilov 08c0e3b6dd Use unified type for the namespaced object names 2017-05-12 11:41:34 +02:00
Murat Kabilov 79fdba4ac7 make sure name of the cluster matches format {teamname}-{clustername} 2017-05-12 11:41:34 +02:00
Oleksii Kliukin 71b93b4cc2 Feature/infrastructure roles (#91)
* Add infrastructure roles configured globally.

Those are the roles defined in the operator itself. The operator's
configuration refers to the secret containing role names, passwords
and membership information. While they are referred to as roles, in
reality those are users.

In addition, improve the regex to filter out invalid users and
make sure user secret names are compatible with DNS name spec.

Add an example manifest for the infrastructure roles.
2017-05-12 11:41:33 +02:00
Murat Kabilov b8fba429df typo in service name 2017-05-12 11:41:33 +02:00
Murat Kabilov 3bd9b3b42f typo in config name 2017-05-12 11:41:33 +02:00
Murat Kabilov 16cc517106 Add name for the service port 2017-05-12 11:41:33 +02:00
Murat Kabilov dd2ed5ff9d Add team name to tpr object metadata name 2017-05-12 11:41:33 +02:00
Murat Kabilov db53134cbd Skip syncing Pods 2017-05-12 11:41:33 +02:00
Murat Kabilov 655f6dcadb make cluster resources private 2017-05-12 11:41:33 +02:00
Murat Kabilov 101dc06acb Better logging for teams api calls 2017-05-12 11:41:32 +02:00
Oleksii Kliukin 5b66d0adba Correct go json tags (extra space). 2017-05-12 11:41:32 +02:00
Murat Kabilov 322676a6b9 Skip deleting Pods and PVCs if failed to delete StatefulSet 2017-05-12 11:41:32 +02:00
Murat Kabilov bb4fec25ae Fix deletion of the failed cluster; more debug messages 2017-05-12 11:41:32 +02:00
Murat Kabilov ce90a54cf9 create key in the cluster map on cluster creation failure 2017-05-12 11:41:32 +02:00
Oleksii Kliukin 3b99ce3d2e Improve the diff in cluster resources.
- Use the branch of pretty with this feature fixed:
  https://github.com/kr/pretty/pull/42
- Add the Limit to the resources declaration to avoid dummy
  differences between statefulsets (where both Resource structures
  are empty, but in one case the fields are not mentioned, while
  in another they are assigned to empty values).
2017-05-12 11:41:32 +02:00
Oleksii Kliukin 455f91128f Move master/replica role names into the constants. 2017-05-12 11:41:32 +02:00