Command-line options --nodatabaseaccess and --noteamsapi disable
access to the Postgres database and all Teams API interaction,
respectively. This is useful for debugging when the operator runs
out of cluster (with the --outofcluster flag).
The same effect can be achieved by setting enable_db_access and/or
enable_teams_api to false.
Cross-platform Go builds with Linux as the target platform and the
race detector enabled require linking against Linux shared libraries
and having a cross-platform compiler installed, which is a pain. To
avoid this, we build everything in a Linux-native environment inside
a container.
Note that running such builds requires switching the operator
Docker base image from Alpine to something with a real glibc
(e.g. Ubuntu), since the target binary needs a dynamic linker.
Use the -a flag to rebuild all packages, since they might have
been built on the wrong platform.
No longer ignore custom PostgreSQL and Patroni parameters and initdb
options. Since all Patroni parameters that are not under initdb or
pg_hba are specified as a plain map, there is no way to distinguish
those that should go into the bootstrap section from those that should
stay in the local configuration. As the example used only bootstrap
parameters, currently all such options go into the bootstrap section.
Also, the initdb options are represented as a map, while Patroni initdb
options are a list of either maps or strings (e.g. "data-checksums"
doesn't need an argument). For now, there is a workaround, but in the
future we might consider changing the spec.
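
The map-to-list conversion mentioned above could look roughly like the
sketch below. It is not the operator's actual workaround: the function
name initdbOptions and the convention that a value of "true" marks an
option without an argument are assumptions made for illustration.

```go
package main

import (
	"fmt"
	"sort"
)

// initdbOptions converts a flat map of initdb options into the list form
// that Patroni expects: a bare string for options without an argument and
// a single-entry map for options that carry a value.
// Assumption (hypothetical convention): the value "true" marks a bare option.
func initdbOptions(opts map[string]string) []interface{} {
	keys := make([]string, 0, len(opts))
	for k := range opts {
		keys = append(keys, k)
	}
	sort.Strings(keys) // map iteration order is random; sort for stable output

	result := make([]interface{}, 0, len(keys))
	for _, k := range keys {
		if opts[k] == "true" {
			result = append(result, k)
		} else {
			result = append(result, map[string]string{k: opts[k]})
		}
	}
	return result
}

func main() {
	fmt.Println(initdbOptions(map[string]string{
		"encoding":       "UTF8",
		"data-checksums": "true",
	}))
}
```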
The sync adds or alters database roles based on the roles defined
in the cluster's TPR, Team API and operator's infrastructure roles.
At the moment, roles are not deleted, as it would be dangerous for
the robot roles in case the TPR is misconfigured. In addition, ALTER
ROLE does not remove role options, e.g. SUPERUSER or CREATEROLE,
nor does it remove role membership: only new options are added and
new role membership is granted. So far, options like NOSUPERUSER
and NOCREATEROLE won't be handled correctly when mixed with their
non-negated counterparts, although NOLOGIN should be processed correctly.
The code assumes that only MD5 passwords are stored in the DB and
will likely break with the new SCRAM auth in PostgreSQL 10.
On the implementation side, create a new interface to abstract
role merging and creation, move most of the role-based functionality
from cluster/pg into the new 'users' module, strip the create-user code
of special cases related to human users (moving them to init
instead) and fix the MD5 password generator to avoid processing
already encrypted passwords. In addition, move the system roles
off the slice containing all other roles, so that no extra
effort is needed to avoid creating them.
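
To illustrate the password fix, here is a minimal sketch of an MD5
generator that leaves already encrypted values alone. PostgreSQL stores
MD5 passwords as "md5" followed by the hex digest of the password
concatenated with the user name. The names pgUserPassword and
isMD5Encrypted, and the exact check for an encrypted value, are
assumptions and may differ from the operator's code.

```go
package main

import (
	"crypto/md5"
	"encoding/hex"
	"fmt"
	"strings"
)

const md5Prefix = "md5"

// isMD5Encrypted reports whether s has the shape of a PostgreSQL MD5
// password: the "md5" prefix followed by 32 hexadecimal characters.
func isMD5Encrypted(s string) bool {
	if len(s) != len(md5Prefix)+2*md5.Size || !strings.HasPrefix(s, md5Prefix) {
		return false
	}
	_, err := hex.DecodeString(s[len(md5Prefix):])
	return err == nil
}

// pgUserPassword returns the password in PostgreSQL's MD5 format.
// A value that already looks encrypted is returned unchanged, so that
// it is never hashed a second time.
func pgUserPassword(username, password string) string {
	if isMD5Encrypted(password) {
		return password
	}
	sum := md5.Sum([]byte(password + username))
	return md5Prefix + hex.EncodeToString(sum[:])
}

func main() {
	encrypted := pgUserPassword("robot_user", "secret")
	fmt.Println(encrypted)
	// Passing the encrypted value again yields the same string.
	fmt.Println(pgUserPassword("robot_user", encrypted) == encrypted)
}
```

Note that a plain-text password which happens to match the encrypted
shape would be passed through unhashed; this sketch accepts that
limitation.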
Also, fix a leak in DB connections that occurred when a new connection
was considered unhealthy and discarded without being closed. Initialize
the database during the sync phase before syncing users.
Run operations concerning multiple clusters in parallel. Each cluster gets its
own worker in order to create, update, sync or delete clusters. Each worker
acquires the lock on a cluster. Subsequent operations on the same cluster
have to wait until the current one finishes. There is a pool of parallel
workers, configurable with the `workers` parameter in the configmap and set by
default to 4. The cluster-related tasks are assigned to the workers based on
the cluster name: the tasks for the same cluster will always be assigned to the
same worker. There is no blocking between workers, although there is a chance
that a single worker will become a bottleneck if too many clusters are
assigned to it; therefore, for large-scale deployments it might be necessary
to bump up workers from the default value.
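
One way to get the name-based assignment described above is to hash the
cluster name and take the remainder modulo the pool size. This is a
sketch only: the function name workerIndex and the choice of CRC32 are
assumptions, not necessarily what the operator does.

```go
package main

import (
	"fmt"
	"hash/crc32"
)

// workerIndex maps a cluster name to one of `workers` workers. The same
// name always yields the same index, so operations on one cluster are
// serialized on a single worker. The caller must pass workers > 0.
func workerIndex(clusterName string, workers uint32) uint32 {
	return crc32.ChecksumIEEE([]byte(clusterName)) % workers
}

func main() {
	const workers = 4 // default pool size mentioned above
	for _, name := range []string{"acid-testcluster", "acid-prodcluster"} {
		fmt.Printf("%s -> worker %d\n", name, workerIndex(name, workers))
	}
}
```

Because the index depends only on the name, an uneven spread of names
can overload one worker, which is the bottleneck risk noted above.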
* Avoid "bulk-comparing" pod resources during sync.
First attempt to fix bogus restarts due to the reported mismatch
of container resources where one of the resources is an empty struct,
while the other has all fields set to nil.
In addition, add the ability to set limits and requests per pod, as
well as operator-level defaults.
* Add version label to the cluster.
According to the STUPS team, the daemon that exports logs to Scalyr
stops the export if the version label is missing.
* Move label names to constants.
* Run go fmt
* Add infrastructure roles configured globally.
Those are the roles defined in the operator itself. The operator's
configuration refers to the secret containing role names, passwords
and membership information. While they are referred to as roles, in
reality those are users.
In addition, improve the regex to filter out invalid users and
make sure user secret names are compatible with DNS name spec.
Add an example manifest for the infrastructure roles.
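
As an illustration of the user filtering and DNS-compatible secret
names mentioned in the last item, the sketch below uses a hypothetical
regex and a hypothetical helper dnsSafeName. The actual pattern and
naming scheme used by the operator are not given here and may differ.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// userNameRegexp is an assumed pattern: lowercase letters, digits,
// hyphens and underscores, starting and ending with a letter or digit.
var userNameRegexp = regexp.MustCompile(`^[a-z0-9]([-_a-z0-9]*[a-z0-9])?$`)

// isValidUsername reports whether name passes the assumed filter.
func isValidUsername(name string) bool {
	return userNameRegexp.MatchString(name)
}

// dnsSafeName replaces underscores, which are not allowed in DNS
// names, with hyphens so the result can be used in a secret name.
func dnsSafeName(user string) string {
	return strings.Replace(user, "_", "-", -1)
}

func main() {
	for _, user := range []string{"robot_user", "_invalid", "Admin"} {
		if !isValidUsername(user) {
			fmt.Printf("skipping invalid user %q\n", user)
			continue
		}
		fmt.Printf("%s -> %s\n", user, dnsSafeName(user))
	}
}
```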