Commit Graph

41 Commits

Author SHA1 Message Date
Yusuke Kuoka 15b402bb32 Make RunnerSet much more reliable with or without webhook 2022-03-02 19:03:20 +09:00
Yusuke Kuoka b8e65aa857 Prevent unnecessary ephemeral runner recreations 2022-02-20 13:45:42 +00:00
Pavel Smalenski 91102c8088
Add dockerEnv variable for RunnerDeployment (#912)
Resolves #878

Co-authored-by: Yusuke Kuoka <ykuoka@gmail.com>
2021-12-14 17:13:24 +09:00
Maxim Pogozhiy fce7d6d2a7
Add topologySpreadConstraints (#814) 2021-10-17 21:49:44 +01:00
Yusuke Kuoka b679a54196 Add missing //go:build tag on deepcopy source
But not sure why this is needed yet :)
2021-09-14 16:37:04 +09:00
Tarasovych 7008b0c257
feat: Organization RunnerDeployment with webhook-based autoscaling only for certain repositories (#766)
Resolves #765

Co-authored-by: Yusuke Kuoka <ykuoka@gmail.com>
2021-08-31 09:46:36 +09:00
Sam 0593125d96
Add dnsConfig to runner deployments (#764)
Resolves #761
2021-08-31 09:42:05 +09:00
Rolf Ahrenberg 14564c7b8e
Allow disabling /runner emptydir mounts and setting storage volume (#674)
* Allow disabling /runner emptydir mounts

* Support defining storage medium for emptydirs

* Fix typos
2021-07-15 06:29:58 +09:00
Jonathan Gonzalez V a277489003
Added support to enable and disable enableServiceLinks. (#628)
This option expose internally some `KUBERNETES_*` environment variables
that doesn't allow the runner to use KinD (Kubernetes in Docker) since it will
try to connect to the Kubernetes cluster where the runner it's running.

This option it's set by default to `true` in any Kubernetes deployment.

Signed-off-by: Jonathan Gonzalez V <jonathan.gonzalez@enterprisedb.com>
2021-06-22 17:27:26 +09:00
Yusuke Kuoka 9e4dbf497c
feat: RunnerSet backed by StatefulSet (#629)
* feat: RunnerSet backed by StatefulSet

Unlike a runner deployment, a runner set can manage a set of stateful runners by combining a statefulset and an admission webhook that mutates statefulset-managed pods with required envvars and registration tokens.

Resolves #613
Ref #612

* Upgrade controller-runtime to 0.9.0

* Bump Go to 1.16.x following controller-runtime 0.9.0

* Upgrade kubebuilder to 2.3.2 for updated etcd and apiserver following local setup

* Fix startup failure due to missing LeaderElectionID

* Fix the issue that any pods become unable to start once actions-runner-controller got failed after the mutating webhook has been registered

* Allow force-updating statefulset

* Fix runner container missing work and certs-client volume mounts and DOCKER_HOST and DOCKER_TLS_VERIFY envvars when dockerdWithinRunner=false

* Fix runnerset-controller not applying statefulset.spec.template.spec changes when there were no changes in runnerset spec

* Enable running acceptance tests against arbitrary kind cluster

* RunnerSet supports non-ephemeral runners only today

* fix: docker-build from root Makefile on intel mac

* fix: arch check fixes for mac and ARM

* ci: aligning test data format and patching checks

* fix: removing namespace in test data

* chore: adding more ignores

* chore: removing leading space in shebang

* Re-add metrics to org hra testdata

* Bump cert-manager to v1.1.1 and fix deploy.sh

Co-authored-by: toast-gear <15716903+toast-gear@users.noreply.github.com>
Co-authored-by: Callum James Tait <callum.tait@photobox.com>
2021-06-22 17:10:09 +09:00
Ameer Ghani 7523ea44f1
feat: allow specifying runtime class in runner spec (#580)
This allows using the `runtimeClassName` directive in the runner's spec.

One of the use-cases for this is Kata Containers, which use `runtimeClassName` in a pod spec as an indicator that the pod should run inside a Kata container. This allows us a greater degree of pod isolation.
2021-06-04 08:56:43 +09:00
Yusuke Kuoka cb14d7530b
Add HRA printer column "SCHEDULE" (#561)
Adds a column to help the operator see if they configured HRA.Spec.ScheduledOverrides correctly, in a form of "next override schedule recognized by the controller":

```
$ k get horizontalrunnerautoscaler
NAME                            MIN   MAX   DESIRED   SCHEDULE
actions-runner-aos-autoscaler   0     5     0
org                             0     5     0         min=0 time=2021-05-21 15:00:00 +0000 UTC
```

Ref https://github.com/actions-runner-controller/actions-runner-controller/issues/484
2021-05-22 08:29:53 +09:00
Yusuke Kuoka 0b88b246d3
Fix additionalPrinterColumns (#556)
This fixes human-readable output of `kubectl get` on `runnerdeployment`, `runnerreplicaset`, and `runner`.

Most notably, CURRENT and READY of runner replicasets are now computed and printed correctly. Runner deployments now have UP-TO-DATE and AVAILABLE instead of READY so that it is consistent with columns of K8s deployments.

A few fixes has been also made to runner deployment and runner replicaset controllers so that those numbers stored in Status objects are reliably updated and in-sync with actual values.

Finally, `AGE` columns are added to runnerdeployment, runnerreplicaset, runnner to make that more visible to users.

`kubectl get` outputs should now look like the below examples:

```
# Immediately after runnerdeployment updated/created
$ k get runnerdeployment
NAME                   DESIRED   CURRENT   UP-TO-DATE   AVAILABLE   AGE
example-runnerdeploy   0         0         0            0           8d
org-runnerdeploy       5         5         5            0           8d

# A few dozens of seconds after update/create all the runners are registered that "available" numbers increase
$ k get runnerdeployment
NAME                   DESIRED   CURRENT   UP-TO-DATE   AVAILABLE   AGE
example-runnerdeploy   0         0         0            0           8d
org-runnerdeploy       5         5         5            5           8d
```

```
$ k get runnerreplicaset
NAME                         DESIRED   CURRENT   READY   AGE
example-runnerdeploy-wnpf6   0         0         0       61m
org-runnerdeploy-fsnmr       2         2         0       8m41s
```

```
$ k get runner
NAME                                           ENTERPRISE   ORGANIZATION                REPOSITORY                                       LABELS                      STATUS    AGE
example-runnerdeploy-wnpf6-registration-only                                            actions-runner-controller/mumoshu-actions-test                               Running   61m
org-runnerdeploy-fsnmr-n8kkx                                actions-runner-controller                                                    ["mylabel 1","mylabel 2"]             21s
org-runnerdeploy-fsnmr-sq6m8                                actions-runner-controller                                                    ["mylabel 1","mylabel 2"]             21s
```

Fixes #490
2021-05-21 09:10:47 +09:00
Yusuke Kuoka 0e0f385f72
Experimental support for ScheduledOverrides (#515)
This adds the initial version of ScheduledOverrides to HorizontalRunnerAutoscaler.
`MinReplicas` overriding should just work.
When there are two or more ScheduledOverrides, the earliest one that matched is activated. Each ScheduledOverride can be recurring or one-time. If you have two or more ScheduledOverrides, only one of them should be one-time. And the one-time override should be the earliest item in the list to make sense.

Tests will be added in another commit. Logging improvements and additional observability in HRA.Status will also be added in yet another commits.

Ref #484
2021-05-03 23:31:17 +09:00
Yusuke Kuoka b3cae25741
Enhance HorizontalRunnerAutoscaler API for ScheduledOverrides (#514)
This adds types and CRD changes related to HorizontalRunnerAutoscaler for the upcoming ScheduledOverrides feature.

Ref #484
2021-05-03 22:31:54 +09:00
Thejas N 588872a316
feat: allow ephemeral runner to be optional (#498)
- Adds `ephemeral` option to `runner.spec` 
    
    ```
      ....
      template:
         spec:
             ephemeral: false
             repository: mumoshu/actions-runner-controller-ci
      ....
    ```
- `ephemeral` defaults to `true`
- `entrypoint.sh` in runner/Dockerfile modified to read `RUNNER_EPHEMERAL` flag
- Runner images are backward-compatible. `--once` is omitted only when the new envvar `RUNNER_EPHEMERAL` is explicitly set to `false`.

Resolves #457
2021-05-02 19:04:14 +09:00
Rolf Ahrenberg 6b77a2a5a8
feat: Docker registry mirror (#478)
Changes:

- Switched to use `jq` in startup.sh
- Enable docker registry mirror configuration which is useful when e.g. avoiding the Docker Hub rate-limiting

Check #478 for how this feature is tested and supposed to be used.
2021-04-25 14:04:01 +09:00
Manuel Jurado 37c2a62fa8
Allow to configure runner volume size limit (#436)
Enable the user to set a limit size on the volume of the runner to avoid some runner pod affecting other resources of the same cluster

Co-authored-by: Yusuke Kuoka <ykuoka@gmail.com>
2021-04-18 13:56:59 +09:00
Agoney Garcia-Deniz 2e551c9d0a
Add hostAliases to the runner spec (#456) 2021-04-17 17:04:52 +09:00
asoldino b42b8406a2
Add dockerVolumeMounts (#439)
Resolves #435
2021-04-06 10:10:10 +09:00
Yusuke Kuoka 3f23501b8e
Reduce "No runner matching the specified labels was found" errors while runner replacement (#392)
We occasionally encountered those errors while the underlying RunnerReplicaSet is being recreated/replaced on RunnerDeployment.Spec.Template update. It turned out to be due to that the RunnerDeployment controller was waiting for the runner pod becomes `Running`, intead of the new replacement runner to have registered to GitHub. This fixes that, by trying to Runner.Status.Phase to `Running` only after the runner in the runner pod appears to be registered.

A side-effect of this change is that runner controller would call more "ListRunners" GitHub Actions API. I've reviewed and improved the runner controller code and Runner CRD to make make the number of calls minimum. In most cases, ListRunners should be called only twice for each runner creation.
2021-03-16 10:52:30 +09:00
Yusuke Kuoka 8d3a83b07a
Add CheckRun.Names scale-up trigger configuration (#390)
This allows you to trigger autoscaling depending on check_run names(i.e. actions job names). If you are willing to differentiate scale amount only for a specific job, or want to scale only on a specific job, try this.
2021-03-14 10:21:42 +09:00
Brandon Kimbrough 2273b198a1
Add ability to set the MTU size of the docker in docker container (#385)
* adding abilitiy to set docker in docker MTU size

* safeguards to only set MTU env var if it is set
2021-03-12 08:44:49 +09:00
Hiroshi Muraoka 11e58fcc41
Manage runner with label (#355)
* Update RunnerDeploymentSpec to have Selector field

Signed-off-by: Hiroshi Muraoka <h.muraoka714@gmail.com>

* Update RunnerReplicaSetSpec to have Selector field

Signed-off-by: Hiroshi Muraoka <h.muraoka714@gmail.com>

* Add CloneSelectorAndAddLabel to add Selector field

Signed-off-by: Hiroshi Muraoka <h.muraoka714@gmail.com>

* Fix tests

Signed-off-by: Hiroshi Muraoka <h.muraoka714@gmail.com>

* Use label to find RunnerReplicaSet/Runner

Signed-off-by: binoue <banji-inoue@cybozu.co.jp>

* Update controller-gen versions in CRD

Signed-off-by: Hiroshi Muraoka <h.muraoka714@gmail.com>

* Update autoscaler to list Pods with labels

Signed-off-by: Hiroshi Muraoka <h.muraoka714@gmail.com>

* Add debug log

Signed-off-by: Hiroshi Muraoka <h.muraoka714@gmail.com>

* Modify RunnerDeployment tests

Signed-off-by: binoue <banji-inoue@cybozu.co.jp>

* Modify RunnerReplicaset test

Signed-off-by: binoue <banji-inoue@cybozu.co.jp>

* Modify integration test

Signed-off-by: Hiroshi Muraoka <h.muraoka714@gmail.com>

* Use RunnerDeployment Template Labels as the default selector for backward compatibility

* Fix labeling

Signed-off-by: Hiroshi Muraoka <h.muraoka714@gmail.com>

* Update func in Eventually to return (int, error)

Signed-off-by: Hiroshi Muraoka <h.muraoka714@gmail.com>

* Update RunnerDeployment controller not to use label selector

Signed-off-by: Hiroshi Muraoka <h.muraoka714@gmail.com>

* Fix potential replicaset controller breakage on replicaset created before v0.17.0

* Fix errors on existing runner replica sets

* Ensure RunnerReplicaSet Spec Selector addition does not break controller

* Ensure RunnerDeployment Template.Spec.Labels change does result in template hash change

* Fix comment

Co-authored-by: binoue <banji-inoue@cybozu.co.jp>
Co-authored-by: Yusuke Kuoka <ykuoka@gmail.com>
2021-03-05 10:15:39 +09:00
Yusuke Kuoka ab1c39de57
feat: HorizontalRunnerAutoscaler Webhook server (#282)
* feat: HorizontalRunnerAutoscaler Webhook server

This introduces a Webhook server that responds GitHub `check_run`, `pull_request`, and `push` events by scaling up matched HorizontalRunnerAutoscaler by 1 replica. This allows you to immediately add "resource slack" for future GitHub Actions job runs, without waiting next sync period to add insufficient runners.

This feature is highly inspired by https://github.com/philips-labs/terraform-aws-github-runner. terraform-aws-github-runner can manage one set of runners per deployment, where actions-runner-controller with this feature can manage as many sets of runners as you declare with HorizontalRunnerAutoscaler and RunnerDeployment pairs.

On each GitHub event received, the webhook server queries repository-wide and organizational runners from the cluster and searches for the single target to scale up. The webhook server tries to match HorizontalRunnerAutoscaler.Spec.ScaleUpTriggers[].GitHubEvent.[CheckRun|Push|PullRequest] against the event and if it finds only one HRA, it is the scale target. If none or two or more targets are found for repository-wide runners, it does the same on organizational runners.

Changes:

* Fix integration test
* Update manifests
* chart: Add support for github webhook server
* dockerfile: Include github-webhook-server binary
* Do not import unversioned go-github
* Update README
2021-02-07 17:37:27 +09:00
Shinnosuke Sawada 4371de9733
add dockerEnabled option (#191)
Add dockerEnabled option for users who does not need docker and want not to run privileged container.
if `dockerEnabled == false`, dind container not run, and there are no privileged container.

Do the same as closed #96
2020-11-16 09:41:12 +09:00
Yusuke Kuoka faaca10fba
Rename Runner.Spec.dockerWithinRunnerContainer to docker"d"WithinRunnerContainer (#134)
* Rename Runner.Spec.dockerWithinRunnerContainer to dockerdWithinRunnerContainer

Ref https://github.com/summerwind/actions-runner-controller/pull/126#issuecomment-712501790
2020-10-21 21:32:40 +09:00
Juho Saarinen af483d83da
Possibility to define resources for DIND container (#125)
Ref #119
2020-10-21 10:26:19 +09:00
Juho Saarinen 92920926fe
Configurable "runner and DinD in a single container" (#126) 2020-10-20 08:48:28 +09:00
Yusuke Kuoka ae30648985 feat: Use HorizontalRunnerAutoscaler for autoscaling 2020-07-27 20:33:44 +09:00
Yusuke Kuoka eca6917c6a feat: Organizational RunnerDeployment Autoscaling
Enhances #57 to add support for organizational runners.

As GitHub Actions does not have an appropriate API for this, this is the spec you need:

```
apiVersion: actions.summerwind.dev/v1alpha1
kind: RunnerDeployment
metadata:
  name: myrunners
spec:
  minReplicas: 1
  maxReplicas: 3
  autoscaling:
    metrics:
    - type: TotalNumberOfQueuedAndProgressingWorkflowRuns
      repositories:
      # Assumes that you have `github.com/myorg/myrepo1` repo
      - myrepo1
      - myrepo2
  template:
    spec:
      organization: myorg
```

It works by collecting "in_progress" and "queued" workflow runs for the repositories `myrepo1` and `myrepo2` to autoscale the number of replicas, assuming you have this organizational runner deployment only for those two repositories.

For example, if `myrepo1` had 1 `in_progress` and 2 `queued` workflow runs, and `myrepo2` had 4 `in_progress` and 8 `queued` workflow runs at the time of running the reconcilation loop on the runner deployment, it will scale replicas to 1 + 2 + 4 + 8 = 15.

Perhaps we might be better add a kind of "ratio" setting so that you can configure the controller to create e.g. 2x runners than demanded. But that's another story.

Ref #10
2020-07-03 09:12:47 +09:00
KUOKA Yusuke 5bb2694349
feat: Repository-wide RunnerDeployment Autoscaling (#57)
* feat: Repository-wide RunnerDeployment Autoscaling

This adds `maxReplicas` and `minReplicas` to the RunnerDeploymentSpec. If and only if both fields are set, the controller computes and sets desired `replicas` automatically depending on the demand.

The number of demanded runner replicas is computed by `queued workflow runs + in_progress workflow runs` for the repository. The support for organizational runners is not included.

Ref https://github.com/summerwind/actions-runner-controller/issues/10
2020-06-27 17:26:46 +09:00
Moto Ishizawa e889eaeb04 Add validation webhooks 2020-04-30 22:11:59 +09:00
Reinier Timmer 8c5b776807 support runner labels 2020-04-28 11:14:31 +02:00
Aleksandr Stepanov d4c849ee09
Add variants of PodTemplate spec fields into the Runner spec (#7)
Resolves #5
Fixes #11
Fixes #12

Changes:

* Added podtemplate spec

* Rework pod creation logic

* Added most using podspecs

* Added copy of podspec

* Fixed Github List method

* Fixed containers

* Added ability to override runner's containers

* Added ability to override runner's containers

* Added ability to override runner's containers

* Update controllers/runner_controller.go

Co-Authored-By: Moto Ishizawa <summerwind.jp@gmail.com>

* Remove optional restartpolicy

* Changed naming convention

Co-authored-by: Moto Ishizawa <summerwind.jp@gmail.com>
2020-03-20 22:50:50 +09:00
Yusuke Kuoka c19a1b3ffe Rename RunnerSet to RunnerReplicaSet
To hand over the name `RunnerSet` to the new StatefulSet-based implementation of that being developed at #4
2020-03-10 09:14:34 +09:00
Yusuke Kuoka 9d634d88ff feat: RunnerDeployment
Adds the initial version of RunnerDeployment that is intended to manage RunnerSets(#1), like Deployment manages ReplicaSets.

This is the initial version and therefore is bare bone. The only update strategy it supports is `Recreate`, which recreates the underlying RunnerSet when the runner template changes. I'd like to add `RollingUpdate` strategy once this is merged.

This depends on #1 so the diff contains that of #1, too. Please see only the latest commit for review.

Also see https://github.com/mumoshu/actions-runner-controller-ci/runs/471329823?check_suite_focus=true to confirm that `make tests` is passing after changes made in this commit.
2020-02-27 10:46:23 +09:00
Yusuke Kuoka d8d829b734 feat: RunnerSets
RunnerSet is basically ReplicaSet for Runners.

It is responsible for maintaining number of runners to match the desired one. That is, it creates missing runners from `.Spec.Template` and deletes redundant runners.

Similar to ReplicaSet, this does not support rolling update of runners on its own. We might want to later add `RunnerDeployment` for that. But that's another story.
2020-02-24 10:32:44 +09:00
Moto Ishizawa 829a167303 Add 'env' field to runner resource 2020-02-06 22:09:07 +09:00
Moto Ishizawa 6b392aedda Implement Runner resource 2020-01-28 21:56:09 +09:00
Moto Ishizawa 04a8e562c0 Initial commit 2020-01-28 15:03:23 +09:00