Commit Graph

189 Commits

Author SHA1 Message Date
Thomas Boop 0386c0734c
`containerMode` option to allow running jobs in k8s instead of docker (#1546)
* added containerMode=kubernetes env variables to the runner

* removed unused logging

* restored configs and charts

* restored makefile cert version and acceptance/run

* added workVolumeClaimTemplate in pod definition, including logic

* added claim template name based on the runner

* Apply suggestions from code review

update errors

* added concurrent cleanup before runner pod is deleted

* update manifests

* added retry after 30s if pod cleanup contains err

* added admission webhook check, made workVolumeClaimTemplate mandatory for k8s

* style changes and added comments

* added IsZero timestamp check for deleting runner-linked pods

* changed order of local variable to avoid copy if p is deleted

* removed docker from container mode k8s

* restored charts, config, makefile

* restored forked files back and not the ARC ones

* created PersistentVolume on containerMode k8s

* create pv only if storage class name is local-storage

* removed actions if storage class name is local-storage

* added service account validation if container mode kubernetes

* changed the coding style to match rest of the ARC

* added validation to the runnerdeployment webhook

* specified fields more precisely, added webhook validation to the replicaset as well

* remake manifests

* wrapped delete runner-linked-pods in kube mode

* fixed empty line

* fixed import

* makefile changes for hooks

* added cleanup secrets

* create manifests

* docs

* update access modes

* update dockerfile

* nit changes

* fixed dockerfile

* rewrite allowing reuse for runners and runnersets

* deepcopy forgot to stage

* changed privileged

* make manifests

* partly moved to finalizer, still need to apply finalizer first

* finalizer added if env variable used in container mode exists

* bump runner version

* error message moved from Error to Info on cleanup pods/secrets

* removed useless dereferencing, added transformation tests of workVolumeClaimTemplate

* Apply suggestions from code review

* Update controllers/utils_test.go

Co-authored-by: Thomas Boop <52323235+thboop@users.noreply.github.com>

* Update controllers/utils_test.go

Co-authored-by: Thomas Boop <52323235+thboop@users.noreply.github.com>

* add hook version to cli, update to 0.1.2

* Apply suggestions from code review

* Update controllers/utils_test.go

* Update runner/Makefile

* Fix missing secret permission and the error handling

* Fix a runnerpod reconciler finalizer to not trigger unnecessary retry

Co-authored-by: Nikola Jokic <nikola-jokic@github.com>
Co-authored-by: Nikola Jokic <97525037+nikola-jokic@users.noreply.github.com>
Co-authored-by: Yusuke Kuoka <ykuoka@gmail.com>
2022-06-28 14:12:40 +09:00
Arnaud abb8615796
Webhook server configuration with kustomize (#1312)
* webhook server configuration with kustomize

* Update README.md

* Update README.md

* Update README.md

Co-authored-by: Yusuke Kuoka <ykuoka@gmail.com>
2022-06-28 09:08:25 +09:00
Sam Weston bc7a3cab1b
Add priorityClassName to CRDs (#1513)
* Add pod priorityClassName to controller and crds

* Add missing bits in bases directory

* Regenerate crds
2022-06-28 08:45:19 +09:00
Vinícius Garcia 01c8dc237e
Fix example manifests for webhooks-based scaling (#1354)
* Fix example manifests for webhook based scaling

I tried running these on my k8s cluster and I got some easy to fix errors, so I am committing them here.

* Fix example manifests for webhook autoscaling with workflow_jobs

* Fix the explanation of how to set up webhooks on your cluster

* Replace unclear comment with actual code examples

There was a comment instructing users to add minReplicas and
maxReplicas to all the HRA yamls, so I just removed it and added
these attributes to the yamls themselves for clarity.

* Make clear that using the ingress example is just a suggestion

* Apply some text improvements suggested by @mumoshu

* Update examples so the webhook server is exposed on a NodePort

Co-authored-by: Yusuke Kuoka <ykuoka@gmail.com>

* Remove an unnecessary field from one of the examples

Co-authored-by: Yusuke Kuoka <ykuoka@gmail.com>

* Apply suggestion from @mumoshu

Co-authored-by: Yusuke Kuoka <ykuoka@gmail.com>

* Remove namespace fields from webhook autoscaler examples

This change was suggested by @mumoshu

* Apply final suggestion from @mumoshu

Co-authored-by: Callum Tait <15716903+toast-gear@users.noreply.github.com>
Co-authored-by: Yusuke Kuoka <ykuoka@gmail.com>
2022-06-07 08:33:09 +09:00
Yusuke Kuoka 3d88b9630a
doc: Add "people" section (#1498)
Ref #1497
2022-05-31 09:29:15 +09:00
Yusuke Kuoka ef3313d147
doc: Use RunnerSet to retain various cache by leveraging PV (#1464)
* doc: Use RunnerSet to retain various cache

In relation to #1286 and as a follow-up for #1340

* docs: clarify client vs daemon

* docs: better wording

* Separate RunnerSet examples for docker image layer caching

* Revert changes on testdata as it is going to be added via #1471 instead

* Update README.md

Co-authored-by: Callum Tait <15716903+toast-gear@users.noreply.github.com>

* fixup! Update README.md

* Remove the outdated RunnerSet limitation

Co-authored-by: Callum Tait <15716903+toast-gear@users.noreply.github.com>
2022-05-25 11:09:36 +09:00
Yusuke Kuoka 536692181b
docs: Add CII Best Practices badge to README (#1461)
Ref https://github.com/actions-runner-controller/actions-runner-controller/issues/1298
2022-05-19 10:16:39 +01:00
Yusuke Kuoka 3a7e8c844b
feat: Support arbitrarily setting `privileged: true` for runner container (#1383)
Resolves #1282
2022-05-12 09:25:51 +01:00
Jacob Gadikian 5e8cba82c2
docs: simplify wording (#1427)
clarify docs
2022-05-11 11:44:07 +01:00
Yusuke Kuoka c1e5829b03
refactor(runner): ability to opt-out of using --ephemeral / opt-in to legacy --once for GHES older than 3.3 (#1384)
* runner: Remove the ability to use the deprecated `--once` flag

Ref #1196

* runner: Ability to opt-out of using --ephemeral

Although we are going to eventually remove the ability to use the legacy --once flag as proposed in #1196, there might be folks still using legacy GHES versions 3.2 or earlier.

This commit removes the existing feature flag to opt-in for --ephemeral, while adding another feature flag RUNNER_FEATURE_FLAG_ONCE to opt-in for --once so that folks stuck in legacy GHES versions
can still use ARC.

Since this change, every user starts using --ephemeral by default. If they see any issues on a legacy GHES instance, RUNNER_FEATURE_FLAG_ONCE=true can be set to opt in to keep using --once, which gives them one more ARC release until they upgrade their GHES instance.

But beware, we won't support legacy GHES instances forever as it's going to be a maintenance nightmare. Please upgrade!
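The opt-in described above can be sketched as follows. This is only an illustration: the function name `runner_mode_flag` is hypothetical, the real runner entrypoint is not shown in this message, and the comparison against the literal string `true` is an assumption based on the example value given above.

```python
def runner_mode_flag(env):
    """Pick the runner mode flag from an environment mapping.

    Assumption: only the exact value "true" opts in to the legacy flag,
    mirroring the RUNNER_FEATURE_FLAG_ONCE=true example in the message.
    """
    if env.get("RUNNER_FEATURE_FLAG_ONCE", "") == "true":
        # Legacy behaviour for GHES 3.2 or earlier.
        return "--once"
    # Default since this change.
    return "--ephemeral"
```

With nothing set, the sketch returns `--ephemeral`, which matches the default described in the message.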

Ref #1196
2022-05-11 09:55:33 +01:00
toast-gear 46291c1823 docs: highlight the new scale down delay flag 2022-04-28 16:04:16 +01:00
toast-gear 61c5a112db docs: remove reference to cleared limitation 2022-04-28 10:39:11 +01:00
toast-gear 7bc08fbe7c docs: remove TotalNumberOfQueuedAndInProgressWorkflowRuns limitation 2022-04-28 10:36:12 +01:00
Callum Tait 9f7ea0c014
docs: highlight breaking changes are possible (#1310)
It's probably worth highlighting it's ver 0.X.X by design and that breaking changes are possible until we move it to 1.0.0

Co-authored-by: toast-gear <toast-gear@users.noreply.github.com>
2022-04-26 11:20:11 +09:00
Soham Banerjee e8ef84ab76
Removed the default githubEvent: {} (#1361)
Ref #1358
See also #1379
2022-04-24 13:39:59 +09:00
Callum Tait c3e280eadb
refactor: set sync period default to 1m (#1308)
Fixes: #1294

Co-authored-by: toast-gear <toast-gear@users.noreply.github.com>
Co-authored-by: Yusuke Kuoka <ykuoka@gmail.com>
2022-04-24 10:47:00 +09:00
Vinícius Garcia 9f254a2393
docs: run README files through Grammarly (#1353)
* Update README.md

* Run charts/actions-runner-controller/README.md through Grammarly

* Fix broken link as suggested by @toast-gear

Co-authored-by: Callum Tait <15716903+toast-gear@users.noreply.github.com>
2022-04-22 16:58:10 +01:00
Yusuke Kuoka 85dea9b67c
Merge pull request #1285 from actions-runner-controller/docs/runnersets
docs: add limitations to runnersets + reorder
2022-04-03 18:18:54 +09:00
Callum Tait 14f9e7229e
docs: highlight why persistent are not ideal 2022-04-01 15:49:15 +01:00
toast-gear eb02f6f26e docs: redundant words 2022-03-30 10:36:34 +01:00
toast-gear 7a750b9285 docs: wording 2022-03-30 10:35:32 +01:00
toast-gear d26c8d6529 docs: add autoscaling also causes problems 2022-03-30 10:26:08 +01:00
toast-gear fd0092d13f chore: new line for consistency 2022-03-30 10:02:33 +01:00
toast-gear 88d17c7988 docs: use the right font 2022-03-30 10:00:34 +01:00
toast-gear 98567dadc9 docs: fix index 2022-03-30 09:59:32 +01:00
toast-gear 7e8d80689b docs: add limitations to runnersets + reorder 2022-03-30 09:53:59 +01:00
Callum Tait d72c396ff1
docs: slight correction for a multi controller env 2022-03-29 16:57:58 +01:00
Michael Goodness a95983fb98
feat(kustomize): add github-webhook-server overlay (#1198)
* feat(kustomize): add github-webhook-server overlay

* chore(kustomize): add image to github-webhook-server overlay

* feat(kustomize): drop sync-period
2022-03-29 11:00:55 +01:00
Callum Tait 459beeafb9
docs: remove the nonsense 2022-03-27 14:15:42 +01:00
Jérôme Foray 1f8a23c129
fix(chart): add namespace selector to webhooks when in singleNamespace mode (#1237)
* fix(chart): add namespace selector to webhooks when in singleNamespace mode

* docs: expand multi controller setup

Co-authored-by: Callum Tait <15716903+toast-gear@users.noreply.github.com>
2022-03-27 11:52:39 +01:00
Callum Tait f28cecffe9
docs: various minor changes (#1250)
* docs: various minor changes

* docs: format fixes
2022-03-20 16:05:03 +00:00
Julien Tanay c06a806d75
Add note about having 100+ replicas (#1103) 2022-03-16 21:03:05 +00:00
Callum Tait a40793bb60
chore: bump chart to app 0.22.0 (#1232)
* chore: bump chart to app 0.22.0
2022-03-16 07:57:30 +00:00
Callum Tait 48a7b78bf3
docs: remove runnerset limitation (#1225)
This works well in testing now; it is no longer a limitation because ARC now creates a statefulset per runner
2022-03-16 09:08:41 +09:00
toast-gear 3beef84f30 docs: better sentences 2022-03-14 12:43:07 +00:00
toast-gear 76cc758d12 docs: minor consistency change 2022-03-14 12:37:57 +00:00
toast-gear ecf74e615e docs: bump versions and upgrade instructions 2022-03-14 10:23:36 +00:00
toast-gear bb19e85037 docs: various cleanups and re-orderings 2022-03-14 09:52:22 +00:00
Yusuke Kuoka c4b24f8366 Prevent static runners from terminating due to unregister timeout
The unregister timeout of 1 minute (no matter how long it is) can negatively impact the availability of a static runner that is constantly running workflow jobs, and of an ephemeral runner that runs a long-running job.
We deal with that by completely removing the unregistration timeout, so that regardless of the type of runner (static or ephemeral) it waits indefinitely until it is successfully unregistered before being terminated.
2022-03-13 07:26:36 +00:00
Yusuke Kuoka 051089733b Use --ephemeral by default
Ref https://github.com/actions-runner-controller/actions-runner-controller/issues/1189
2022-03-12 13:20:07 +00:00
yourmoonlight 132faa13a1
docs: fix the helm command for webhook installation (#1188)
* fix doc for installing the webhook server

* modify cmd with single set && add double quote for zsh users
2022-03-08 17:59:01 +00:00
Daniel 8a379ac94b
Add custom volume mount documentation (#1045)
one example for in-memory
and one example for NVME backed storage, also pointing out all the
current flaws/risks for that configuration
2022-03-03 09:13:42 +09:00
Callum Tait 4b0aa92286
docs: better wording 2022-03-01 08:56:30 +00:00
Callum Tait c69c8dd84d
docs: better runner group description 2022-03-01 08:54:24 +00:00
Felipe Galindo Sanchez d0d316252e
Option to consider runner group visibility on scale based on webhook (#1062)
This will work on GHES but not on GitHub Enterprise Cloud, due to the excessive GitHub API calls required.
More work is needed, like adding a cache layer to the GitHub client, to make it usable on GitHub Enterprise Cloud.

Fixes additional cases from https://github.com/actions-runner-controller/actions-runner-controller/pull/1012

If GitHub auth is provided in the webhooks controller then runner groups with custom visibility are supported. Otherwise, all runner groups will be assumed to be visible to all repositories

`getScaleUpTargetWithFunction()` will check if there is an HRA available with the following flow:

1. Search for **repository** HRAs - if one is found, the search ends here
2. Get available HRAs in k8s
3. Compute visible runner groups
  a. If GitHub auth is provided - get all the runner groups that are visible to the repository of the incoming webhook using GitHub API calls.  
  b. If GitHub auth is not provided - assume all runner groups are visible to all repositories
4. Search for **default organization** runners (a.k.a runners from organization's visible default runner group) with matching labels
5. Search for **default enterprise** runners (a.k.a runners from enterprise's visible default runner group) with matching labels
6. Search for **custom organization runner groups** with matching labels
7. Search for **custom enterprise runner groups** with matching labels
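
The search order above can be sketched roughly as follows. This is a simplified model, not the actual Go implementation of `getScaleUpTargetWithFunction()`: the `HRA` dataclass, `find_scale_up_target`, and the `(scope, group)` encoding of runner-group visibility are all hypothetical names introduced for illustration. It also assumes that label matching applies at every step, including the repository step.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable, Optional, Set, Tuple

DEFAULT_GROUP = ""  # hypothetical marker for "the default runner group"


@dataclass(frozen=True)
class HRA:
    name: str
    scope: str  # "repository", "organization" or "enterprise"
    group: str = DEFAULT_GROUP
    labels: FrozenSet[str] = frozenset()


def find_scale_up_target(
    hras: Iterable[HRA],
    job_labels: Iterable[str],
    visible_groups: Optional[Set[Tuple[str, str]]] = None,
) -> Optional[HRA]:
    """Return the first HRA matching the flow above, or None.

    visible_groups=None models "no GitHub auth": every group is assumed
    visible. Otherwise it is the set of (scope, group) pairs visible to
    the repository of the incoming webhook.
    """
    hras = list(hras)
    wanted = set(job_labels)

    def first(scope: str, custom_group: bool) -> Optional[HRA]:
        for hra in hras:
            if hra.scope != scope:
                continue
            if scope != "repository":
                # Keep only default or only custom groups for this step.
                if (hra.group != DEFAULT_GROUP) != custom_group:
                    continue
                # Skip groups the repository cannot see.
                if visible_groups is not None and (scope, hra.group) not in visible_groups:
                    continue
            if wanted <= hra.labels:
                return hra
        return None

    # Steps 1 and 4-7, in the order listed above.
    search_order = [
        ("repository", False),
        ("organization", False),
        ("enterprise", False),
        ("organization", True),
        ("enterprise", True),
    ]
    for scope, custom_group in search_order:
        hit = first(scope, custom_group)
        if hit is not None:
            return hit
    return None
```

Passing `visible_groups=None` corresponds to step 3b; passing an explicit set corresponds to step 3a, where the set would come from GitHub API calls that are out of scope for this sketch.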

Co-authored-by: Yusuke Kuoka <ykuoka@gmail.com>
2022-02-16 19:08:56 +09:00
Yusuke Kuoka 59437ef79f
Update README.md
Ref https://github.com/actions-runner-controller/actions-runner-controller/issues/1100#issuecomment-1032775144
2022-02-09 09:16:46 +09:00
Callum Tait eb53d238d1
docs: move istio to troubleshooting (#1097)
Co-authored-by: toast-gear <toast-gear@users.noreply.github.com>
2022-02-07 20:49:26 +00:00
Chris Bui 1b911749a6
feat: disable automatic runner updates (#1088)
* Add env variable to configure `disableupdate` flag

* Write test for entrypoint disable update

* Rename flag, update docs for DISABLE_RUNNER_UPDATE

* chore: bump runner version in makefile

Co-authored-by: Callum Tait <15716903+toast-gear@users.noreply.github.com>
2022-02-03 21:03:38 +00:00
Yusuke Kuoka 01301d3ce8
Stop creating registration-only runners on scale-to-zero (#1028)
Resolves #859
2022-01-07 09:56:21 +09:00
Hyeonmin Park 1a6e5719c3
test: Add tests with self-hosted label for #953 (#1030) 2022-01-07 08:50:26 +09:00