actions-runner-controller

Commit Graph

Author	SHA1	Message	Date
Sam Weston	bc7a3cab1b	Add priorityClassName to CRDs (#1513 ) * Add pod priorityClassName to controller and crds * Add missing bits in bases directory * Regenerate crds	2022-06-28 08:45:19 +09:00
Yusuke Kuoka	e2c8163b8c	Make webhook-based scale race-free (#1477 ) * Make webhook-based scale operation asynchronous This prevents race condition in the webhook-based autoscaler when it received another webhook event while processing another webhook event and both ended up scaling up the same horizontal runner autoscaler. Ref #1321 * Fix typos * Update rather than Patch HRA to avoid race among webhook-based autoscaler servers * Batch capacity reservation updates for efficient use of apiserver * Fix potential never-ending HRA update conflicts in batch update * Extract batchScaler out of webhook-based autoscaler for testability * Fix log levels and batch scaler hang on start * Correlate webhook event with scale trigger amount in logs * Fix log message	2022-06-27 18:31:48 +09:00
Callum Tait	84d16c1c12	revert: "Overhauled `startup.sh` Script (#1454 )" (#1561 ) This reverts commit `071898c96b`.	2022-06-23 12:39:32 +01:00
Richard Fussenegger	071898c96b	Overhauled `startup.sh` Script (#1454 ) This overhaul turns it into a shellcheck valid script with explicit error handling for all possible situations I could think of. This change takes https://github.com/actions-runner-controller/actions-runner-controller/pull/1409 into account and things can be merged in any order. There are a few important changes here to the logic: - The wait logic for checking if docker comes up was fundamentally flawed because it checks for the PID. Docker will always come up and thus become visible in the process list, just to immediately die when it encounters an issue, after which supervisor starts it again. This means that our check so far is flaky due to the `sleep 1` it might encounter a PID, or it might not, and the existence of the PID does not mean anything. The `docker ps` check we have in the `entrypoint.sh` script does not suffer from this as it checks for a feature of docker and not a PID. I thus entirely removed the PID check, and instead I am handing things over to our `entrypoint.sh` script by setting the environment variables correctly. - This change has an influence on the `docker0` interface MTU configuration, because the interface might or might not exist after we started docker. Hence, I changed this to a time boxed loop that tries for one minute to set up the interface's MTU. In case the command fails we log an error and continue with the run. - I changed the entire MTU handling by validating its value before configuring it, logging an error and continuing without if it is set incorrectly. This ensures that we are not going to send our users on a bug hunt. - The way we started supervisord did not make much sense to me. It sends itself into the background automatically, there is no need for us to do so with Bash. The decision to not fail on errors but continue is a deliberate choice, because I believe that running a build is more important than having a perfectly configured system. However, this strategy might also hide issues for all users who are not properly checking their logs. It also makes testing harder. Hence, we could change all error conditions from graceful to panicking. We should then align the exit codes across `startup.sh` and `entrypoint.sh` to ensure that every possible error condition has its own unique error code for easy debugging.	2022-06-23 09:37:01 +09:00
renovate[bot]	f24e2fa44e	chore(deps): update dependency actions/runner to v2.294.0	2022-06-22 21:45:32 +00:00
Callum Tait	3c7d3d6b57	ci: hardcode dockerhub username (#1555 )	2022-06-22 16:15:50 +01:00
Callum Tait	23f091d7fa	ci: don't login on a pr (#1554 ) * ci: don't login on a pr Co-authored-by: toast-gear <toast-gear@users.noreply.github.com>	2022-06-22 16:03:36 +01:00
Callum Tait	667764e027	chore: suggest gist first (#1539 )	2022-06-18 17:38:37 +09:00
Callum Tait	de693c4191	ci: runners trigger on push (#1549 ) * ci: runners trigger on push * ci: comments * ci: comments	2022-06-18 17:34:40 +09:00
Callum Tait	510fc9c834	ci: add GitHub packages to arc release (#1525 ) * ci: add GitHub packages to arc release * ci: use restrictive permissions	2022-06-15 11:37:19 +09:00
Callum Tait	7fd5e24961	chore: bump chart to app 0.24.1 (#1531 )	2022-06-15 11:34:55 +09:00
Yusuke Kuoka	9974b1a2b7	e2e: Enable buildx in more images (#1530 )	2022-06-14 09:29:30 +01:00
Yusuke Kuoka	bd91b73fd9	chore: update bug_report.yml (#1529 )	2022-06-14 09:29:06 +01:00
Callum Tait	a7ae910ee4	docs: add CRD disclaimer to bug report (#1516 )	2022-06-14 09:42:31 +09:00
Callum Tait	2733c36d0e	ci: publish controller canary to github packages (#1524 ) * ci: publish controller canary to github packages * ci: include image name	2022-06-14 09:10:13 +09:00
Yusuke Kuoka	0ef9a22cd4	Fix confusing PV controller log (#1526 ) Ref #1511	2022-06-14 08:35:04 +09:00
Renovate Bot	933b0c7888	chore(deps): update dependency actions/runner to v2.293.0	2022-06-13 17:09:29 +00:00
renovate[bot]	1b7ec33135	chore(deps): update actions/setup-python action to v4 (#1514 ) Co-authored-by: Renovate Bot <bot@renovateapp.com>	2022-06-13 14:07:52 +01:00
Callum Tait	a62882d243	ci: fix permissions (#1512 ) * ci: fix permissions * chore: change to trigger build * ci: add write permission to packages * ci: remove conditionals for docker logins * Update controllers/utils_test.go Co-authored-by: Yusuke Kuoka <ykuoka@gmail.com>	2022-06-09 10:25:56 +09:00
Callum Tait	0cd13fe51d	ci: align pipeline files and setups (#1484 ) * ci: align pipeline files and setups * ci: more changes * ci: various changes * ci: fix setup-helm action ref * ci: better pipeline name * ci: more format aligning * ci: more format aligning * ci: better job name * ci: supports multiple languages * ci: better pipeline and job names * ci: do a verb-noun thing for consistency * ci: use 'arc' when talking holistically * ci: add caching scope * ci: put canary in a scope * ci: fix syntax error * ci: better pipeline and job names * ci: better job name Co-authored-by: toast-gear <toast-gear@users.noreply.github.com>	2022-06-08 10:04:14 +09:00
Vinícius Garcia	01c8dc237e	Fix example manifests for webhooks-based scaling (#1354 ) * Fix example manifests for webhook based scaling I tried running these on my k8s cluster and I got some easy to fix errors, so I am committing them here. * Fix example manifests for webhook autoscaling with workflow_jobs * Fix the explanation on how to set up webhooks on your cluster * Replace unclear comment with actual code examples There was a comment instructing users to add minReplicas and maxReplicas to all the HRA yamls, so I just removed it and added these attributes to the yamls themselves for clarity. * Make clear that using the ingress example is just a suggestion * Apply some text improvements suggested by @mumoshu * Update examples so the webhook server is exposed on a NodePort Co-authored-by: Yusuke Kuoka <ykuoka@gmail.com> * Remove an unnecessary field from one of the examples Co-authored-by: Yusuke Kuoka <ykuoka@gmail.com> * Apply suggestion from @mumoshu Co-authored-by: Yusuke Kuoka <ykuoka@gmail.com> * Remove namespace fields from webhook autoscaler examples This change was suggested by @mumoshu * Apply final suggestion from @mumoshu Co-authored-by: Callum Tait <15716903+toast-gear@users.noreply.github.com> Co-authored-by: Yusuke Kuoka <ykuoka@gmail.com>	2022-06-07 08:33:09 +09:00
Yusuke Kuoka	7c4db63718	chart: Bump appVersion to 0.24.0 (#1505 )	2022-06-03 22:01:35 +09:00
Yusuke Kuoka	3d88b9630a	doc: Add "people" section (#1498 ) Ref #1497	2022-05-31 09:29:15 +09:00
Yusuke Kuoka	1152e6b31d	Add release note for v0.24.0 (#1493 )	2022-05-30 09:10:36 +09:00
renovate[bot]	ac27df8301	chore(deps): update dependency actions/runner to v2.292.0 (#1475 ) Co-authored-by: Renovate Bot <bot@renovateapp.com>	2022-05-27 09:49:46 +09:00
Yusuke Kuoka	9dd26168d6	Fix confusing logs from pv and pvc controllers (#1487 ) Ref https://github.com/actions-runner-controller/actions-runner-controller/issues/1321#issuecomment-1137431212	2022-05-26 18:21:23 +09:00
Yusuke Kuoka	18bfb28c0b	e2e: ARC_E2E_NO_CLEANUP to prevent cleanup (#1470 ) A small improvement to our E2E test suite which allows you to set `ARC_E2E_NO_CLEANUP=whatever` to let it prevent the kind cluster cleanup on successful test run, so that you can rerun it without waiting for the new kind cluster to come up.	2022-05-26 10:59:50 +09:00
renovate[bot]	84210e900b	chore(deps): update actions/setup-python digest to fff15a2 (#1458 ) Co-authored-by: Renovate Bot <bot@renovateapp.com>	2022-05-25 12:12:22 +09:00
Yusuke Kuoka	ef3313d147	doc: Use RunnerSet to retain various cache by leveraging PV (#1464 ) * doc: Use RunnerSet to retain various cache In relation to #1286 and as a follow-up for #1340 * docs: clarify client vs daemon * docs: better wording * Separate RunnerSet examples for docker image layer caching * Revert changes on testdata as it is going to be added via #1471 instead * Update README.md Co-authored-by: Callum Tait <15716903+toast-gear@users.noreply.github.com> * fixup! Update README.md * Remove the outdated RunnerSet limitation Co-authored-by: Callum Tait <15716903+toast-gear@users.noreply.github.com>	2022-05-25 11:09:36 +09:00
Yusuke Kuoka	c7eea169ad	test: Add test for runner with generic ephemeral volume as "work" (#1472 ) This adds the test to verify the runner pod generation logic for the case that you use a generic ephemeral volume as "work". It is almost an adaptation of the test cases written for RunnerSet in #1471, to RunnerDeployment and Runner.	2022-05-25 10:37:23 +09:00
Yusuke Kuoka	63be0223ad	fix: Avoid duplicate volume and mount name error for generic ephemeral volume as "work" (#1471 ) * fix: Avoid duplicate volume and mount name error for generic ephemeral volume as "work" While manually testing configurations being documented in #1464, I discovered that the use of dynamic ephemeral volume for "work" directory was not working correctly due to the validation error. This fixes the runner pod generation logic to not add the default volume and volume mount for "work" dir, so that the error disappears. Ref #1464 * e2e: Ensure work generic ephemeral volume to work as expected	2022-05-22 10:25:50 +09:00
Yusuke Kuoka	5bbea772f7	doc: enhance troubleshooting guide with the scale-to-zero issue (#1469 ) Ref https://github.com/actions-runner-controller/actions-runner-controller/issues/1057#issuecomment-1133439061	2022-05-21 19:00:40 +01:00
Callum Tait	2aa3f1e142	chore: remove stale app config (#1465 )	2022-05-21 14:19:41 +09:00
Yusuke Kuoka	3e988afc09	test: add fuzzing to the test suite (#1463 ) As a part of #1298, we add fuzzing based on Go test's fuzzing support to the test suite	2022-05-19 13:34:23 +01:00
Yusuke Kuoka	84210f3d2b	Bump Go to 1.18.2 (#1462 ) As a part of #1298, I'm going to use Go fuzzing which is available since Go 1.18. Co-authored-by: Callum Tait <15716903+toast-gear@users.noreply.github.com>	2022-05-19 10:33:31 +01:00
Yusuke Kuoka	536692181b	docs: Add CII Best Practices badge to README (#1461 ) Ref https://github.com/actions-runner-controller/actions-runner-controller/issues/1298	2022-05-19 10:16:39 +01:00
renovate[bot]	23403172cb	chore(deps): update dependency golang to v1.18.2 (#1229 ) Co-authored-by: Renovate Bot <bot@renovateapp.com>	2022-05-19 17:36:31 +09:00
renovate[bot]	8a8ec43364	chore(deps): update github/codeql-action action to v2.1.11 (#1455 ) Co-authored-by: Renovate Bot <bot@renovateapp.com>	2022-05-18 09:02:26 +09:00
Felipe Galindo Sanchez	78c01fd31d	cleanup some unused code and minor refactors (#1274 ) Co-authored-by: Yusuke Kuoka <ykuoka@gmail.com>	2022-05-16 18:38:32 +09:00
Bernardo Meurer	bf45aa9f6b	refactor(runner/entrypoint): don't mv externalstmp if it's not there (#1315 )	2022-05-16 18:37:37 +09:00
Callum Tait	b5aa1750bb	ci: match renovate with new dockerfile names (#1453 )	2022-05-16 18:15:44 +09:00
Richard Fussenegger	cdc9d20e7a	Renamed Runner Dockerfiles (#1248 ) Renamed the runner dockerfiles so that we have proper syntax highlighting for them, as well as a consistent way to map from the image name to the dockerfile. Added a `.dockerignore` file to avoid uploading things to the daemon that we never use.	2022-05-16 11:41:28 +09:00
Hyeonmin Park	8035d6d9f8	chart: Add extraPaths to Ingress of GitHub Webhook Server (#1129 ) * chart: Add extraPaths to Ingress of GitHub Webhook Server * Update charts/actions-runner-controller/templates/githubwebhook.ingress.yaml Co-authored-by: Yusuke Kuoka <ykuoka@gmail.com> * Prefix the toYaml expression to remove the extra newline before extra paths Co-authored-by: Yusuke Kuoka <ykuoka@gmail.com>	2022-05-16 11:34:56 +09:00
Callum Tait	65f7ee92a6	refactor: remove registration runner dead code (#1260 ) We had some dead code left over from the removal of registration runners. Registration runners were removed in #859 #1207 Co-authored-by: Yusuke Kuoka <ykuoka@gmail.com>	2022-05-16 11:23:39 +09:00
Matéo Mévollon	fca8a538db	docs: document the Docker MTU problem in troubleshooting guide (#1257 ) * docs: document the Docker MTU problem * Update TROUBLESHOOTING.md Co-authored-by: Yusuke Kuoka <ykuoka@gmail.com>	2022-05-16 11:13:05 +09:00
Nicholas Farley	95ddc77245	Allow customizing the controller webhook port (#1410 ) Closes #1314 Co-authored-by: Yusuke Kuoka <ykuoka@gmail.com>	2022-05-16 10:33:13 +09:00
Yusuke Kuoka	b5194fd75a	Enhance RunnerSet to optionally retain PVs across restarts (#1340 ) * Enhance RunnerSet to optionally retain PVs across restarts This is our initial attempt to bring back the ability to retain PVs across runner pod restarts when using RunnerSet. The implementation is composed of two new controllers, `runnerpersistentvolumeclaim-controller` and `runnerpersistentvolume-controller`. It all starts from our existing `runnerset-controller`. The controller now tries to mark any PVCs created by StatefulSets created for the RunnerSet. Once the controller has terminated the StatefulSets, their corresponding PVCs are cleaned up by `runnerpersistentvolumeclaim-controller`, then PVs are unbound from their corresponding PVCs by `runnerpersistentvolume-controller` so that they can be reused by future PVCs created for future StatefulSets that share the same StorageClass. Ref #1286 * Update E2E test suite to cover runner, docker, and go caching with RunnerSet + PVs Ref #1286	2022-05-16 09:26:48 +09:00
renovate[bot]	adf69bbea0	fix(deps): update module github.com/prometheus/client_golang to v1.12.2 (#1448 ) Co-authored-by: Renovate Bot <bot@renovateapp.com>	2022-05-16 09:19:55 +09:00
renovate[bot]	b43ef70ac6	fix(deps): update module github.com/bradleyfalzon/ghinstallation/v2 to v2.0.4 (#1452 ) Co-authored-by: Renovate Bot <bot@renovateapp.com>	2022-05-16 08:59:53 +09:00
Yusuke Kuoka	f1caebbaf0	Update codeql.yml (#1451 ) Give up pinning deps with commit IDs because PRs were unreviewable due to missing changelog and it sends PRs for every commit to the master/main branch of the deps, which is undesired. We only need updates for tagged releases!	2022-05-16 08:59:29 +09:00
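
The batching idea described in commit e2c8163b8c (#1477) can be illustrated with a minimal Go sketch. This is not the project's actual batchScaler: the type `scaleRequest`, the function `batchByHRA`, and the HRA names are hypothetical, and the real controller updates HorizontalRunnerAutoscaler resources through the Kubernetes API rather than printing. The sketch only shows why queuing webhook events and summing them per target means each HRA receives one update per batch instead of several competing ones.

```go
package main

import (
	"fmt"
	"sort"
)

// scaleRequest is a hypothetical stand-in for one webhook-triggered
// scale operation; it is not a type from actions-runner-controller.
type scaleRequest struct {
	HRA    string // name of the target HorizontalRunnerAutoscaler (illustrative)
	Amount int    // replicas to add (positive) or release (negative)
}

// batchByHRA sums the queued requests per target, so that every HRA
// needs only a single update per batch.
func batchByHRA(reqs []scaleRequest) map[string]int {
	totals := make(map[string]int)
	for _, r := range reqs {
		totals[r.HRA] += r.Amount
	}
	return totals
}

func main() {
	// Webhook handlers only enqueue; they never touch the HRA directly.
	queue := make(chan scaleRequest, 8)
	queue <- scaleRequest{HRA: "hra-a", Amount: 1}
	queue <- scaleRequest{HRA: "hra-a", Amount: 1}
	queue <- scaleRequest{HRA: "hra-b", Amount: 2}
	close(queue)

	// A single consumer drains the queue and builds one batch.
	var pending []scaleRequest
	for r := range queue {
		pending = append(pending, r)
	}

	totals := batchByHRA(pending)

	// Sort the names so the output order is deterministic.
	names := make([]string, 0, len(totals))
	for name := range totals {
		names = append(names, name)
	}
	sort.Strings(names)
	for _, name := range names {
		fmt.Printf("update %s once: %+d replicas\n", name, totals[name])
	}
}
```

Draining the queue from a single consumer is the design choice that removes the race in this sketch: two events for the same target can no longer read and write its state concurrently.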



1088 Commits
