`PercentageRunnersBusy`, in combination with a secondary `TotalInProgressAndQueuedWorkflowRuns` metric, now enables scale-from-zero.
Please see the new `Autoscaling to/from 0` section in the updated documentation for how it works.
Resolves #522
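For illustration, a minimal sketch of a HorizontalRunnerAutoscaler combining the two metrics. The resource names, thresholds, and exact field spellings below are assumptions on my part; the `Autoscaling to/from 0` section of the documentation is authoritative.
```yaml
apiVersion: actions.summerwind.dev/v1alpha1
kind: HorizontalRunnerAutoscaler
metadata:
  name: example-hra
spec:
  scaleTargetRef:
    name: example-runnerdeploy
  minReplicas: 0
  maxReplicas: 5
  metrics:
  # Primary metric: scale based on the percentage of busy runners.
  - type: PercentageRunnersBusy
    scaleUpThreshold: '0.75'
    scaleDownThreshold: '0.25'
  # Secondary metric: used to scale from zero when there are no runners to measure.
  - type: TotalInProgressAndQueuedWorkflowRuns
    repositoryNames:
    - mumoshu/actions-runner-controller-ci
```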
- Adds `ephemeral` option to `runner.spec`
```
....
template:
  spec:
    ephemeral: false
    repository: mumoshu/actions-runner-controller-ci
....
```
- `ephemeral` defaults to `true`
- `entrypoint.sh` in runner/Dockerfile modified to read `RUNNER_EPHEMERAL` flag
- Runner images are backward-compatible. `--once` is omitted only when the new envvar `RUNNER_EPHEMERAL` is explicitly set to `false`.
Resolves #457
- You can now use `make acceptance/run` to run only a specific acceptance test case
- Add note about Ubuntu 20.04 users / snap-provided docker
- Add instruction to run Ginkgo tests
- Extract acceptance/load from acceptance/kind
- Make `acceptance/pull` not depend on `docker-build`, so that you can do `make docker-build acceptance/load` for faster image reload
This is an attempt to support scaling from/to zero.
The basic idea is that we create a one-off "registration-only" runner pod when a RunnerReplicaSet is scaled to zero, so that there is one "offline" runner, which enables GitHub Actions to queue jobs instead of discarding them.
GitHub Actions seems to throw away new jobs immediately when there are no runners at all. Generally, having runners in any status, `busy`, `idle`, or `offline`, prevents GitHub Actions from failing jobs. But retaining `busy` or `idle` runners means we need to keep runner pods running, which conflicts with our desire to scale to/from zero, hence we retain `offline` runners.
In this change, I enhanced the runnerreplicaset controller to create a registration-only runner at the very beginning of its reconciliation logic, only when a runnerreplicaset is scaled to zero. The runner controller creates the registration-only runner pod, waits for it to become "offline", and then removes the runner pod. The runner on GitHub stays `offline` until the runner resource on K8s is deleted. As we remove the registration-only runner pod as soon as it registers, this doesn't block cluster-autoscaler.
Related to #447
* Fix acceptance helm test not using newly built controller image
* Locally build runner image instead of pulling it
* Revert runner controller image pull policy to always
and add a line to the test deployment to use IfNotPresent
* Change runner repository from summerwind/action-runner to the owner of actions-runner-controller.
Also fix some Makefile formatting.
* Undo renaming acceptance/pull to docker-pull
* Some env var cleanup
Rename USERNAME to DOCKER_USER (it is still used for GitHub too, though)
Add RUNNER_NAME var (defaults to $DOCKER_USER/actions-runner)
Add TEST_REPO (defaults to $DOCKER_USER/actions-runner-controller)
Changes:
- Switched to `jq` in startup.sh
- Enable docker registry mirror configuration, which is useful e.g. for avoiding Docker Hub rate limiting
Check #478 for how this feature is tested and supposed to be used.
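As a sketch of what the configuration might look like, assuming the mirror is exposed via a `dockerRegistryMirror` field on the runner spec; that field name and the mirror URL are my assumptions, not confirmed by this change.
```yaml
apiVersion: actions.summerwind.dev/v1alpha1
kind: RunnerDeployment
metadata:
  name: example-runnerdeploy
spec:
  template:
    spec:
      repository: mumoshu/actions-runner-controller-ci
      # Assumed field name; points dockerd inside the runner pod at a pull-through mirror.
      dockerRegistryMirror: https://mirror.example.com
```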
* chore: adding Helm app version back
* chore: removing redundant values entry
* chore: bumping to newer version
* chore: bumping app version to latest
Co-authored-by: Yusuke Kuoka <ykuoka@gmail.com>
* Cache docker images locally
Cache the dind, runner, and kube-rbac-proxy docker images on the host and copy them onto the kind node instead of downloading them to the node directly.
* Also cache certmanager docker images
Images for the `actions-runner:v${VERSION}` and `actions-runner:latest` tags are upgraded to Ubuntu 20.04.
If you prefer not to have Ubuntu upgraded in the runner image in the future, migrate to the new tags suffixed with `-ubuntu-20.04`, like `actions-runner:v${VERSION}-ubuntu-20.04`.
We also keep publishing the existing Ubuntu 18.04 images under new `actions-runner:v${VERSION}-ubuntu-18.04` tags. Please use those if it turns out you have workflows that depend on Ubuntu 18.04.
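For reference, a pinned tag can be set via the `image` field of the runner spec; the snippet below is a sketch, and both the repository and the exact image name/tag are illustrative.
```yaml
apiVersion: actions.summerwind.dev/v1alpha1
kind: RunnerDeployment
metadata:
  name: example-runnerdeploy
spec:
  template:
    spec:
      repository: mumoshu/actions-runner-controller-ci
      # Replace with the concrete v${VERSION}-ubuntu-18.04 (or -ubuntu-20.04) tag you want to pin.
      image: summerwind/actions-runner:v${VERSION}-ubuntu-18.04
```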
Co-authored-by: Yusuke Kuoka <ykuoka@gmail.com>
* ... otherwise it will take 40 seconds (until a node is detected as unreachable) + 5 minutes (until pods are evicted from unreachable/crashed nodes)
* pods stuck in "Terminating" status on unreachable nodes will only be freed once #307 gets merged
* feat: HorizontalRunnerAutoscaler Webhook server
This introduces a Webhook server that responds to GitHub `check_run`, `pull_request`, and `push` events by scaling up the matched HorizontalRunnerAutoscaler by 1 replica. This allows you to immediately add "resource slack" for future GitHub Actions job runs, without waiting for the next sync period to add the missing runners.
This feature is highly inspired by https://github.com/philips-labs/terraform-aws-github-runner. terraform-aws-github-runner can manage one set of runners per deployment, whereas actions-runner-controller with this feature can manage as many sets of runners as you declare with HorizontalRunnerAutoscaler and RunnerDeployment pairs.
On each GitHub event received, the webhook server queries repository-wide and organizational runners from the cluster and searches for the single target to scale up. It tries to match `HorizontalRunnerAutoscaler.Spec.ScaleUpTriggers[].GitHubEvent.[CheckRun|Push|PullRequest]` against the event, and if exactly one HRA matches, that becomes the scale target. If no target, or two or more targets, are found among repository-wide runners, it repeats the search over organizational runners.
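As a sketch of how a scale-up trigger might be declared, assuming the YAML field names (`scaleUpTriggers`, `githubEvent`, `checkRun`) mirror the Go fields mentioned above; see the updated README for the authoritative schema.
```yaml
apiVersion: actions.summerwind.dev/v1alpha1
kind: HorizontalRunnerAutoscaler
metadata:
  name: example-hra
spec:
  scaleTargetRef:
    name: example-runnerdeploy
  scaleUpTriggers:
  # Scale the target RunnerDeployment up by 1 replica whenever a matching
  # check_run event is delivered to the webhook server.
  - githubEvent:
      checkRun:
        types: ["created"]
        status: "queued"
```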
Changes:
* Fix integration test
* Update manifests
* chart: Add support for github webhook server
* dockerfile: Include github-webhook-server binary
* Do not import unversioned go-github
* Update README
* when setting a GitHub Enterprise server URL without a namespace, an error occurs: `error: the server doesn't have a resource type "controller-manager"`
* setting default namespace "actions-runner-system" makes the example work out of the box
One of the pod recreation conditions has been modified to use a hash of the runner spec, so that the controller does not keep restarting pods mutated by admission webhooks. This naturally allows us, for example, to use IRSA for EKS, which requires its admission webhook to mutate the runner pod with additional, IRSA-related volumes, volume mounts, and env vars.
Resolves #200
* Adds RUNNER_GROUP argument to the runner registration
Adds the ability to register a runner to a predefined runner_group
Resolves #137
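A sketch of what registering to a group might look like, assuming the runner spec exposes it via a `group` field; the group name is a placeholder, and the README example referenced below is the authoritative form.
```yaml
apiVersion: actions.summerwind.dev/v1alpha1
kind: RunnerDeployment
metadata:
  name: example-runnerdeploy
spec:
  template:
    spec:
      # Placeholder: the runner group must already exist on GitHub.
      group: my-runner-group
      repository: mumoshu/actions-runner-controller-ci
```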
* Update README with runner group example
- Updates the README with instructions on how to add the runner to a group
- Fix code fencing for shell and yaml blocks in the README
- Use consistent bullet points (dash not asterisk)
* feat: Repository-wide RunnerDeployment Autoscaling
This adds `maxReplicas` and `minReplicas` to the RunnerDeploymentSpec. If and only if both fields are set, the controller computes and sets desired `replicas` automatically depending on the demand.
The desired number of runner replicas is computed as `queued workflow runs + in_progress workflow runs` for the repository. Support for organizational runners is not included.
Ref https://github.com/summerwind/actions-runner-controller/issues/10
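A minimal sketch of how the two new fields might be used on a RunnerDeployment; the resource names are illustrative, and the actual `replicas` value is computed and set by the controller within these bounds.
```yaml
apiVersion: actions.summerwind.dev/v1alpha1
kind: RunnerDeployment
metadata:
  name: example-runnerdeploy
spec:
  # Both fields must be set for autoscaling to kick in.
  minReplicas: 1
  maxReplicas: 5
  template:
    spec:
      repository: mumoshu/actions-runner-controller-ci
```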
I had observed the exact issue described for the 4th option in https://github.com/elastic/cloud-on-k8s/issues/1822, which resulted in actions-runner-controller being unable to create or update runners. This fixes that.
I've also updated the README to introduce RunnerDeployment and manually tested that it works after the fix.
---
`actions-runner-controller` has been failing while creating and updating runners:
```
2020-03-05T11:05:16.610+0900 ERROR controllers.Runner Failed to update runner {"runner": "default/example-runner", "error": "Runner.actions.summerwind.dev \"example-runner\" is invalid: []: Invalid value: map[string]interface {}{\"apiVersion\":\"actions.summerwind.dev/v1alpha1\", \"kind\":\"Runner\", \"metadata\":map[string]interface {}{\"creationTimestamp\":\"2020-03-05T02:05:16Z\", \"finalizers\":[]interface {}{\"runner.actions.summerwind.dev\"}, \"generation\":2, \"name\":\"example-runner\", \"namespace\":\"default\", \"resourceVersion\":\"911496\", \"selfLink\":\"/apis/actions.summerwind.dev/v1alpha1/namespaces/default/runners/example-runner\", \"uid\":\"48b62d07-ff2c-42d6-878c-d3f951202209\"}, \"spec\":map[string]interface {}{\"env\":interface {}(nil), \"image\":\"\", \"repository\":\"mumoshu/actions-runner-controller-ci\"}}: validation failure list:\nspec.env in body must be of type array: \"null\""}
github.com/go-logr/zapr.(*zapLogger).Error
/Users/c-ykuoka/go/pkg/mod/github.com/go-logr/zapr@v0.1.0/zapr.go:128
github.com/summerwind/actions-runner-controller/controllers.(*RunnerReconciler).Reconcile
/Users/c-ykuoka/p/actions-runner-controller/controllers/runner_controller.go:88
```
This seems like the exact issue seen in the 4th option in https://github.com/elastic/cloud-on-k8s/issues/1822
I also observed the same failure while the runnerset controller was trying to create/update runners:
```
Also while creating runner in the runnerset controller:
2020-03-05T11:15:01.223+0900 ERROR controller-runtime.controller Reconciler error {"controller": "runnerset", "request": "default/example-runnerset", "error": "Runner.actions.summerwind.dev \"example-runnersetgp56m\" is invalid: []: Invalid value: map[string]interface {}{\"apiVersion\":\"actions.summerwind.dev/v1alpha1\", \"kind\":\"Runner\", \"metadata\":map[string]interface {}{\"creationTimestamp\":\"2020-03-05T02:15:01Z\", \"generateName\":\"example-runnerset\", \"generation\":1, \"name\":\"example-runnersetgp56m\", \"namespace\":\"default\", \"ownerReferences\":[]interface {}{map[string]interface {}{\"apiVersion\":\"actions.summerwind.dev/v1alpha1\", \"blockOwnerDeletion\":true, \"controller\":true, \"kind\":\"RunnerSet\", \"name\":\"example-runnerset\", \"uid\":\"e26f7d01-3168-496d-931b-8e6f97b776ea\"}}, \"uid\":\"4ee490f5-9a8c-4f30-86f9-61dea799b972\"}, \"spec\":map[string]interface {}{\"env\":interface {}(nil), \"image\":\"\", \"repository\":\"mumoshu/actions-runner-controller-ci\"}}: validation failure list:\nspec.env in body must be of type array: \"null\""}
github.com/go-logr/zapr.(*zapLogger).Error
```
and while the runnerdeployment controller was trying to create/update runners.
I've fixed it so that the new `RunnerDeployment` example added to the README just works.