actions-runner-controller

Commit Graph

Author	SHA1	Message	Date
Yusuke Kuoka	5030e075a9	dockerfile,e2e: Use buildx and cache mounts for faster rebuilds in E2E	2022-03-02 19:03:20 +09:00
Yusuke Kuoka	3115d71471	acceptance,e2e: Enhance deploy.sh to support more types of runnersets	2022-03-02 19:03:20 +09:00
Renovate Bot	c221b6e278	chore(deps): update actions/checkout action to v3	2022-03-02 11:05:16 +09:00
Renovate Bot	a8dbc8a501	fix(deps): update module github.com/prometheus/client_golang to v1.12.1	2022-03-02 10:56:53 +09:00
Renovate Bot	b1ac63683f	fix(deps): update module go.uber.org/zap to v1.21.0	2022-03-02 10:54:35 +09:00
Renovate Bot	10bc28af75	fix(deps): update module sigs.k8s.io/controller-runtime to v0.11.1	2022-03-02 10:52:43 +09:00
Renovate Bot	e23692b3bc	chore(deps): update actions/setup-python action to v3	2022-03-02 10:51:22 +09:00
renovate[bot]	e7f4a0e200	chore(deps): update actions/setup-go action to v3 (#1163) Co-authored-by: Renovate Bot <bot@renovateapp.com>	2022-03-02 10:51:01 +09:00
Yusuke Kuoka	828ddcd44e	Merge pull request #1151 from fgalind1/improve-logs logging: improve logs for scaling	2022-03-02 10:46:53 +09:00
Yusuke Kuoka	fc821fd473	Merge pull request #1168 from actions-runner-controller/docs/better-runner-group-description docs: better runner group description	2022-03-02 10:31:22 +09:00
Callum Tait	4b0aa92286	docs: better wording	2022-03-01 08:56:30 +00:00
Callum Tait	c69c8dd84d	docs: better runner group description	2022-03-01 08:54:24 +00:00
Renovate Bot	e42db00006	chore(deps): update dependency actions/runner to v2.288.1	2022-02-28 22:30:10 +00:00
Felipe Galindo Sanchez	eff0c7364f	Merge branch 'master' into improve-logs	2022-02-28 09:25:30 -08:00
Tingluo Huang	516695b275	Set UserAgent to 'actions-runner-controller' for all Http Client. (#1140) I can't find any requests made by user agent `actions-runner-controller` in GitHub.com's telemetry in the past 7 days. Turns out we only set user agent `actions-runner-controller` if we are configured to use BasicAuth which is not the case for most customers I think. I update the code a little bit to make sure it always set `actions-runner-controller` as UserAgent for the GitHub HttpClient in ARC. A further step would be somehow baking the ARC release version into the UserAgent as well.	2022-02-28 09:17:58 +09:00
Yusuke Kuoka	686d40c20d	Merge pull request #1127 from actions-runner-controller/github-api-cache Enhances ARC(both the controller-manager and github-webhook-server) to cache any GitHub API responses with HTTP GET and an appropriate Cache-Control header. Ref #920 ## Cache Implementation `gregjones/httpcache` has been chosen as a library to implement this feature, as it is as recommended in `go-github`'s documentation: https://github.com/google/go-github#conditional-requests `gregjones/httpcache` supports a number of cache backends like `diskcache`, `s3cache`, and so on: https://github.com/gregjones/httpcache#cache-backends We stick to the built-in in-memory cache as a starter. Probably this will never becomes an issue as long as various HTTP responses for all the GitHub API calls that ARC makes, list-runners, list-workflow-jobs, list-runner-groups, etc., doesn't overflow the in-memory cache. `httpcache` has an known unfixed issue that it doesn't update cache on chunked responses. But we assume that the APIs that we call doesn't use chunked responses. See #1503 for more information on that. ## Ephemeral runner pods are no longer recreated The addition of the cache layer resulted in a slow down of a scale-down process and a trade-off between making the runner pod termination process fragile to various race conditions(shorter grace period before runner deletion) or delaying runner pod deletion depending on how long the grace period is(longer grace period). A grace period needs to be at least longer than 60s (which is the same as cache duration of ListRunners API) to not prematurely delete a runner pod that was just created. But once I disabled automatic recreation of ephemeral runner pod, it turned out to be no more of an issue when it's being scaled via workflow_job webhook. Ephemeral runner resources are still automatically added on demand by RunnerDeployment via RunnerReplicaSet(I've added `EffectiveTime` fields to our CRDs but that's an implementation detail so let's omit). A good side-effect of disabling ephemeral runner pod recreations is that ARC will no longer create redundant ephemeral runners when used with webhook-based autoscaler. Basically, autoscaling still works as everyone might expect. It's just better than before overall.	2022-02-28 08:37:26 +09:00
Renovate Bot	f0fa99fc53	chore(deps): update dependency actions/runner to v2.288.0	2022-02-26 01:34:49 +00:00
Javier Sotelo	6b12413fdd	Add optional hostNetwork (#1035) Co-authored-by: jsotelo <javier.sotelo@viasat.com>	2022-02-23 20:11:40 +00:00
Felipe Galindo Sanchez	3abecd0f19	logging: improve logs for scaling	2022-02-23 08:29:13 -08:00
Callum Tait	7156ce040e	chore: bump chart (#1138)	2022-02-21 09:24:14 +00:00
Yusuke Kuoka	1463d4927f	acceptance,e2e: Let capacity reservation expired more later	2022-02-21 00:07:49 +00:00
Yusuke Kuoka	5bc16f2619	Enhance HRA capacity reservation update log	2022-02-21 00:06:26 +00:00
Yusuke Kuoka	b8e65aa857	Prevent unnecessary ephemeral runner recreations	2022-02-20 13:45:42 +00:00
Yusuke Kuoka	d4a9750e20	acceptance,e2e: Enhance E2E test and deploy.sh to support scaleDownDelaySeconds~ and minReplicas for HRA	2022-02-20 13:45:42 +00:00
Yusuke Kuoka	a6f0e0008f	Make unregistration timeout and retry delay configurable in integration tests	2022-02-20 12:05:34 +00:00
Yusuke Kuoka	79a31328a5	Stop recreating ephemeral runner pod Ref https://github.com/actions-runner-controller/actions-runner-controller/issues/911#issuecomment-1046161384	2022-02-20 04:42:19 +00:00
Yusuke Kuoka	4e6bfd8114	e2e: Add ability to toggle dockerdWithinRunnerContainer	2022-02-20 04:37:15 +00:00
Yusuke Kuoka	3c16188371	Introduce consistent timeouts for runner unregistration and runner pod deletion Enhances runner controller and runner pod controller to have consistent timeouts for runner unregistration and runner pod deletion, so that we are very much unlikely to terminate pods that are running any jobs.	2022-02-20 04:36:35 +00:00
Yusuke Kuoka	9e356b419e	chart: Add default-logs-container annotation to controller pods so that you can run `kubectl logs` on controller pods without the specifying the container name. It is especially useful when you want to run kubectl-logs on all ARC pods across controller-manager and github-webhook-server like: ``` kubectl -n actions-runner-system logs -l app.kubernetes.io/name=actions-runner-controller ``` That was previously impossible due to that the selector matches pods from both controller-manager and github-webhook-server and kubectl does not provide a way to specify container names for respective pods.	2022-02-19 12:22:53 +00:00
Yusuke Kuoka	f3ceccd904	acceptance: Improve deploy.sh to recreate ARC (not runner) pods on new test id So that one does not need to manually recreate ARC pods frequently.	2022-02-19 12:22:53 +00:00
Yusuke Kuoka	4b557dc54c	Add logging transport to log HTTP requests in log level -3 The log level -3 is the minimum log level that is supported today, smaller than debug(-1) and -2(used to log some HRA related logs). This commit adds a logging HTTP transport to log HTTP requests and responses to that log level. It implements http.RoundTripper so that it can log each HTTP request with useful metadata like `from_cache` and `ratelimit_remaining`. The former is set to `true` only when the logged request's response was served from ARC's in-memory cache. The latter is set to X-RateLimit-Remaining response header value if and only if the response was served by GitHub, not by ARC's cache.	2022-02-19 12:22:53 +00:00
Yusuke Kuoka	4c53e3aa75	Add GitHub API cache to avoid rate limit This will cache any GitHub API responses with correct Cache-Control header. `gregjones/httpcache` has been chosen as a library to implement this feature, as it is as recommended in `go-github`'s documentation: https://github.com/google/go-github#conditional-requests `gregjones/httpcache` supports a number of cache backends like `diskcache`, `s3cache`, and so on: https://github.com/gregjones/httpcache#cache-backends We stick to the built-in in-memory cache as a starter. Probably this will never becomes an issue as long as various HTTP responses for all the GitHub API calls that ARC makes, list-runners, list-workflow-jobs, list-runner-groups, etc., doesn't overflow the in-memory cache. `httpcache` has an known unfixed issue that it doesn't update cache on chunked responses. But we assume that the APIs that we call doesn't use chunked responses. See #1503 for more information on that. Ref #920	2022-02-19 12:22:53 +00:00
Tingluo Huang	0b9bef2c08	Try to unconfig runner before deleting the pod to recreate (#1125) There is a race condition between ARC and GitHub service about deleting runner pod. - The ARC use REST API to find a particular runner in a pod that is not running any jobs, so it decides to delete the pod. - A job is queued on the GitHub service side, and it sends the job to this idle runner right before ARC deletes the pod. - The ARC delete the runner pod which cause the in-progress job to end up canceled. To avoid this race condition, I am calling `r.unregisterRunner()` before deleting the pod. - `r.unregisterRunner()` will return 204 to indicate the runner is deleted from the GitHub service, we should be safe to delete the pod. - `r.unregisterRunner` will return 400 to indicate the runner is still running a job, so we will leave this runner pod as it is. TODO: I need to do some E2E tests to force the race condition to happen. Ref #911	2022-02-19 21:22:31 +09:00
Yusuke Kuoka	a5ed6bd263	Fix RunnerSet managed runner pods to terminate more gracefully (#1126) Make RunnerSet-managed runners as reliable as RunnerDeployment-managed runners. Ref https://github.com/actions-runner-controller/actions-runner-controller/issues/911#issuecomment-1042404460	2022-02-19 21:19:37 +09:00
Yusuke Kuoka	921f547200	fix: Do recreate runner pod on registration token update (#1087) Apparently, we've been missed taking an updated registration token into account when generating the pod template hash which is used to detect if the runner pod needs to be recreated. This shouldn't have been the end of the world since the runner pod is recreated on the next reconciliation loop anyway, but this change will make the pod recreation happen one reconciliation loop earlier so that you're less likely to get runner pods with outdated refresh tokens. Ref https://github.com/actions-runner-controller/actions-runner-controller/pull/1085#issuecomment-1027433365	2022-02-19 21:18:00 +09:00
Felipe Galindo Sanchez	9079c5d85f	fix: configure logger before trying to log (#1128) Log about GitHub client not being initialized is not seen as logger is configured after adding the log	2022-02-19 20:56:58 +09:00
Yusuke Kuoka	a9aea0bd9c	Fix issue that visible runner groups are printed as if empty in log	2022-02-19 14:43:41 +09:00
Yusuke Kuoka	fcf4778bac	Fix regression that prevented default organizational runner group from being scale target Fixes #1131	2022-02-19 14:43:41 +09:00
Yusuke Kuoka	eb0a4a9603	chart: Bump to 0.16.0 (with appVersion 0.21.0)	2022-02-18 01:57:37 +00:00
Yusuke Kuoka	b6151ebb8d	Fix release.yml upload artifacts to not fail due to outdated go (1.15)	2022-02-18 10:27:39 +09:00
Yusuke Kuoka	ba4bd7c0db	e2e,acceptance: Cover enterprise runners (#1124) Adds various code and changes I have used while testing #1062	2022-02-17 09:16:28 +09:00
Yusuke Kuoka	5b92c412a4	chart: Allow using different secrets for controller-manager and gh-webhook-server (#1122) * chart: Allow using different secrets for controller-manager and gh-webhook-server As it is entirely possible to do so because they are two different K8s deployments. It may provide better scalability because then each component gets its own GitHub API quota.	2022-02-17 09:16:16 +09:00
Yusuke Kuoka	e22d981d58	githubwebhookserver: Tweak log levels of various messages (#1123) Some of logs like `HRA keys indexed for HRA` were so excessive that it made testing and debugging the githubwebhookserver harder. This tries to fix that.	2022-02-17 09:15:26 +09:00
Yusuke Kuoka	a7b39cc247	acceptance: Avoid "metadata.annotations too long" errors on applying CRDs	2022-02-17 09:01:44 +09:00
Yusuke Kuoka	1e452358b4	acceptance: Do recreate the controller-manager secret on every deployment We had to manually remove the secret first to update the GitHub credentials used by the controller, which was cumbersome. Note that you still need to recreate the controller pods and the gh webhook server pods to let them remount the recreated secret.	2022-02-17 09:01:44 +09:00
Carlos Tadeu Panato Junior	92e133e007	ci: update helm to 3.8.0 and go to 1.17.7 (#1119) Signed-off-by: Carlos Panato <ctadeu@gmail.com>	2022-02-16 20:40:27 +09:00
Felipe Galindo Sanchez	d0d316252e	Option to consider runner group visibility on scale based on webhook (#1062) This will work on GHES but not GitHub Enterprise Cloud due to excessive GitHub API calls required. More work is needed, like adding a cache layer to the GitHub client, to make it usable on GitHub Enterprise Cloud. Fixes additional cases from https://github.com/actions-runner-controller/actions-runner-controller/pull/1012 If GitHub auth is provided in the webhooks controller then runner groups with custom visibility are supported. Otherwise, all runner groups will be assumed to be visible to all repositories `getScaleUpTargetWithFunction()` will check if there is an HRA available with the following flow: 1. Search for repository HRAs - if so it ends here 2. Get available HRAs in k8s 3. Compute visible runner groups a. If GitHub auth is provided - get all the runner groups that are visible to the repository of the incoming webhook using GitHub API calls. b. If GitHub auth is not provided - assume all runner groups are visible to all repositories 4. Search for default organization runners (a.k.a runners from organization's visible default runner group) with matching labels 5. Search for default enterprise runners (a.k.a runners from enterprise's visible default runner group) with matching labels 6. Search for custom organization runner groups with matching labels 7. Search for custom enterprise runner groups with matching labels Co-authored-by: Yusuke Kuoka <ykuoka@gmail.com>	2022-02-16 19:08:56 +09:00
Shu Ambat	b509eb4388	Update the helm chart app version (#1099)	2022-02-09 09:29:49 +09:00
Yusuke Kuoka	59437ef79f	Update README.md Ref https://github.com/actions-runner-controller/actions-runner-controller/issues/1100#issuecomment-1032775144	2022-02-09 09:16:46 +09:00
Ryo Sakamoto	a51fb90cd2	modify chart ingress (#1098) Signed-off-by: cw-sakamoto <sakamoto@chatwork.com>	2022-02-08 12:56:30 +09:00
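
Commits 516695b275 and 4b557dc54c above describe setting a fixed UserAgent on every GitHub HTTP client and logging each request through an `http.RoundTripper` with `from_cache` and `ratelimit_remaining` metadata. The following is only a rough sketch of that idea, not ARC's actual code: `loggingTransport`, `requestLog`, and `demo` are hypothetical names, the test server stands in for the GitHub API, and the check on the `X-From-Cache` response header assumes the cache layer marks cached responses the way `gregjones/httpcache` does.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
)

// requestLog is a hypothetical record of the metadata logged per request.
type requestLog struct {
	URL                string
	FromCache          bool
	RateLimitRemaining string
}

// loggingTransport (hypothetical) wraps another RoundTripper, sets a fixed
// User-Agent on every request, and reports metadata about each response.
type loggingTransport struct {
	next      http.RoundTripper
	userAgent string
	logf      func(requestLog)
}

func (t *loggingTransport) RoundTrip(req *http.Request) (*http.Response, error) {
	// A RoundTripper should not mutate the caller's request, so use a clone.
	r := req.Clone(req.Context())
	r.Header.Set("User-Agent", t.userAgent)

	resp, err := t.next.RoundTrip(r)
	if err != nil {
		return nil, err
	}

	entry := requestLog{URL: r.URL.String()}
	// Assumption: cached responses carry an X-From-Cache header.
	entry.FromCache = resp.Header.Get("X-From-Cache") != ""
	// Only report the rate limit when the response came from the server.
	if !entry.FromCache {
		entry.RateLimitRemaining = resp.Header.Get("X-RateLimit-Remaining")
	}
	t.logf(entry)
	return resp, nil
}

// demo sends one request to a local test server and returns the User-Agent
// the server saw together with the logged metadata.
func demo() (string, requestLog, error) {
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		// Echo the received User-Agent back so the caller can inspect it.
		w.Header().Set("X-Seen-User-Agent", r.UserAgent())
		w.Header().Set("X-RateLimit-Remaining", "4999")
		fmt.Fprint(w, "ok")
	}))
	defer srv.Close()

	var got requestLog
	client := &http.Client{Transport: &loggingTransport{
		next:      http.DefaultTransport,
		userAgent: "actions-runner-controller",
		logf:      func(e requestLog) { got = e },
	}}

	resp, err := client.Get(srv.URL)
	if err != nil {
		return "", requestLog{}, err
	}
	defer resp.Body.Close()
	return resp.Header.Get("X-Seen-User-Agent"), got, nil
}

func main() {
	ua, entry, err := demo()
	if err != nil {
		panic(err)
	}
	fmt.Printf("user_agent=%s from_cache=%t ratelimit_remaining=%s\n",
		ua, entry.FromCache, entry.RateLimitRemaining)
}
```

Placing the wrapper outermost, around any caching transport, would let it see both cached and uncached responses, which is what distinguishing `from_cache` requires.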

693 Commits, All Branches