We sometimes see integration tests fail because the runner replicas do not reach the expected number in a timely manner. After some investigation, this turned out to be caused by conflicting HRA updates from the webhook-based autoscaler and the HRA controller. This changes the controllers to use Patch instead of Update, making such conflicts less likely.
The HRA controller now also uses Patch when updating RunnerDeployment.
Overall, these changes should make webhook-based autoscaling more reliable thanks to fewer conflicts.
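Below is a minimal sketch of the Update-to-Patch switch using the controller-runtime client; the HRA import path, type name, and the mutation callback are assumptions for illustration, not the controller's actual code.

```go
package sketch

import (
	"context"

	ctrl "sigs.k8s.io/controller-runtime"
	"sigs.k8s.io/controller-runtime/pkg/client"

	"github.com/summerwind/actions-runner-controller/api/v1alpha1"
)

// patchHRA applies a mutation and writes it back with Patch. MergeFrom computes
// a merge patch relative to the copy taken before the mutation, so only the
// changed fields are sent and unrelated concurrent updates are far less likely
// to produce "the object has been modified" conflicts than a full Update.
func patchHRA(ctx context.Context, c client.Client, hra *v1alpha1.HorizontalRunnerAutoscaler, mutate func(*v1alpha1.HorizontalRunnerAutoscaler)) (ctrl.Result, error) {
	before := hra.DeepCopy()
	mutate(hra)
	if err := c.Patch(ctx, hra, client.MergeFrom(before)); err != nil {
		return ctrl.Result{}, err
	}
	return ctrl.Result{}, nil
}
```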
* Add replicaCount
* Add authSecret.existingSecret
* Set image.tag to null by default
* Implement ingress for the github webhook server
* Fix deprecated and secretName template
* Keep backward compatibility for .authSecret.enabled
* Support existingSecret for the github webhook secret
* Use the secretName template
* Set default secret names
* Do not use the app-version-based image tag
* Create and name a variable for secrets
Similar to #348 for #346, but for HRA.Spec.CapacityReservations, which is usually modified by the webhook-based autoscaler controller.
This patch addresses that by improving the webhook-based autoscaler controller to omit expired reservations when updating the HRA spec.
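A minimal sketch of the idea, with illustrative type and field names (the actual CapacityReservation API may differ, e.g. by using metav1.Time):

```go
package sketch

import "time"

// CapacityReservation is an illustrative stand-in for the HRA spec entry.
type CapacityReservation struct {
	Replicas       int
	ExpirationTime time.Time
}

// omitExpired drops reservations whose expiration has already passed, so the
// webhook-based autoscaler never writes stale reservations back into the HRA
// spec when it adds a new one.
func omitExpired(reservations []CapacityReservation, now time.Time) []CapacityReservation {
	var kept []CapacityReservation
	for _, r := range reservations {
		if r.ExpirationTime.After(now) {
			kept = append(kept, r)
		}
	}
	return kept
}
```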
`kubectl get horizontalrunnerautoscalers.actions.summerwind.dev` shows HRA.status.desiredReplicas as the DESIRED count. However, that value did not take capacityReservations into account, which resulted in an incorrect count when you used the webhook-based autoscaler or the capacityReservations API directly. This fixes that.
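A minimal sketch of the intended calculation, with illustrative names: the DESIRED count becomes the suggested replicas plus the replicas of all capacity reservations, clamped to the configured bounds.

```go
package sketch

// desiredReplicas sums the base suggestion and every reservation's replicas,
// then clamps the result to [minReplicas, maxReplicas].
func desiredReplicas(suggested, minReplicas, maxReplicas int, reservedReplicas []int) int {
	desired := suggested
	for _, n := range reservedReplicas {
		desired += n
	}
	if desired < minReplicas {
		desired = minReplicas
	}
	if desired > maxReplicas {
		desired = maxReplicas
	}
	return desired
}
```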
The controller had been writing confusing messages like the one below when the scale target was missing:
```
Found too many scale targets: It must be exactly one to avoid ambiguity. Either set WatchNamespace for the webhook-based autoscaler to let it only find HRAs in the namespace, or update Repository or Organization fields in your RunnerDeployment resources to fix the ambiguity.{"scaleTargets": ""}
```
This fixes that, while also improving many of the messages written during reconciliation so that the errors are more actionable.
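As an illustration only (not the controller's exact wording), the key is to distinguish the zero-target case from the multiple-target case so the suggested action matches what actually happened:

```go
package sketch

import "fmt"

// scaleTargetError returns distinct, actionable errors for "none found" and
// "too many found" instead of a single ambiguous message for both cases.
func scaleTargetError(targets []string) error {
	switch len(targets) {
	case 1:
		return nil
	case 0:
		return fmt.Errorf("no scale target found: ensure the Repository or Organization on your RunnerDeployment matches the webhook event, or set WatchNamespace so the autoscaler searches the right namespace")
	default:
		return fmt.Errorf("found %d scale targets (%v): it must be exactly one to avoid ambiguity; either set WatchNamespace or fix the Repository/Organization fields on your RunnerDeployments", len(targets), targets)
	}
}
```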
* if a new runner pod was scheduled to start up right before a registration token expired, it would not get a new registration token and would go into an infinite update loop (until #341 kicks in)
* if registration tokens are refreshed a little before they actually expire, pods that are just starting up are far more likely to get a working token (see the sketch after this list)
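A minimal sketch of the timing idea; the helper name and the buffer length are illustrative, not the controller's actual values:

```go
package sketch

import "time"

// tokenNeedsRefresh treats a registration token as due for renewal some buffer
// before its real expiration, so pods that are just starting up still receive
// a token with useful lifetime left.
func tokenNeedsRefresh(expiresAt, now time.Time) bool {
	const refreshBuffer = 30 * time.Minute
	return now.After(expiresAt.Add(-refreshBuffer))
}
```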
We occasionally see logs like the one below:
```
2021-02-24T02:48:26.769ZERRORFailed to update runner status{"runnerreplicaset": "testns-244ol/example-runnerdeploy-j5wzf", "error": "Operation cannot be fulfilled on runnerreplicasets.actions.summerwind.dev \"example-runnerdeploy-j5wzf\": the object has been modified; please apply your changes to the latest version and try again"}
github.com/go-logr/zapr.(*zapLogger).Error
/home/runner/go/pkg/mod/github.com/go-logr/zapr@v0.1.0/zapr.go:128
github.com/summerwind/actions-runner-controller/controllers.(*RunnerReplicaSetReconciler).Reconcile
/home/runner/work/actions-runner-controller/actions-runner-controller/controllers/runnerreplicaset_controller.go:207
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler
/home/runner/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.4.0/pkg/internal/controller/controller.go:256
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem
/home/runner/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.4.0/pkg/internal/controller/controller.go:232
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).worker
/home/runner/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.4.0/pkg/internal/controller/controller.go:211
k8s.io/apimachinery/pkg/util/wait.JitterUntil.func1
/home/runner/go/pkg/mod/k8s.io/apimachinery@v0.0.0-20190913080033-27d36303b655/pkg/util/wait/wait.go:152
k8s.io/apimachinery/pkg/util/wait.JitterUntil
/home/runner/go/pkg/mod/k8s.io/apimachinery@v0.0.0-20190913080033-27d36303b655/pkg/util/wait/wait.go:153
k8s.io/apimachinery/pkg/util/wait.Until
/home/runner/go/pkg/mod/k8s.io/apimachinery@v0.0.0-20190913080033-27d36303b655/pkg/util/wait/wait.go:88
2021-02-24T02:48:26.769ZERRORcontroller-runtime.controllerReconciler error{"controller": "testns-244olrunnerreplicaset", "request": "testns-244ol/example-runnerdeploy-j5wzf", "error": "Operation cannot be fulfilled on runnerreplicasets.actions.summerwind.dev \"example-runnerdeploy-j5wzf\": the object has been modified; please apply your changes to the latest version and try again"}
github.com/go-logr/zapr.(*zapLogger).Error
/home/runner/go/pkg/mod/github.com/go-logr/zapr@v0.1.0/zapr.go:128
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler
/home/runner/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.4.0/pkg/internal/controller/controller.go:258
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem
/home/runner/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.4.0/pkg/internal/controller/controller.go:232
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).worker
/home/runner/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.4.0/pkg/internal/controller/controller.go:211
k8s.io/apimachinery/pkg/util/wait.JitterUntil.func1
/home/runner/go/pkg/mod/k8s.io/apimachinery@v0.0.0-20190913080033-27d36303b655/pkg/util/wait/wait.go:152
k8s.io/apimachinery/pkg/util/wait.JitterUntil
/home/runner/go/pkg/mod/k8s.io/apimachinery@v0.0.0-20190913080033-27d36303b655/pkg/util/wait/wait.go:153
k8s.io/apimachinery/pkg/util/wait.Until
/home/runner/go/pkg/mod/k8s.io/apimachinery@v0.0.0-20190913080033-27d36303b655/pkg/util/wait/wait.go:88
```
These can be compacted into a single line, without the unhelpful stack trace and without double-logging the same error from both the logger and the controller.
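A minimal sketch of the compact form, assuming a recent controller-runtime client: conflicts are detected with the standard apierrors.IsConflict check, logged once on a single line, and retried via a clean requeue instead of being returned as an error (which is what triggers the second, stack-traced log entry).

```go
package sketch

import (
	"context"

	"github.com/go-logr/logr"
	apierrors "k8s.io/apimachinery/pkg/api/errors"
	ctrl "sigs.k8s.io/controller-runtime"
	"sigs.k8s.io/controller-runtime/pkg/client"
)

// updateStatus writes the status subresource and logs update conflicts as one
// informational line, requeueing instead of returning the error so the
// controller does not log the same failure a second time.
func updateStatus(ctx context.Context, c client.Client, log logr.Logger, obj client.Object) (ctrl.Result, error) {
	if err := c.Status().Update(ctx, obj); err != nil {
		if apierrors.IsConflict(err) {
			log.V(1).Info("Retrying status update due to conflict", "name", obj.GetName())
			return ctrl.Result{Requeue: true}, nil
		}
		return ctrl.Result{}, err
	}
	return ctrl.Result{}, nil
}
```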
Apparently I had mistakenly removed the `push` option from the workflow in #323, which resulted in the new runner build from #323 not being pushed. This fixes that.
* if a runner pod starts up with an invalid token, it goes into an infinite retry loop, appearing as RUNNING from the outside
* normally, this error situation is detected because no corresponding runner object exists in GitHub, and the pod is removed after the registration timeout
* if the GitHub runner object already existed before - e.g. because a finalizer was not properly run as part of a partial Kubernetes crash - the runner stays in a running state forever, and even updating the registration token will not kill the problematic pod
* introduce a RunnerOffline error that can be handled in the runner controller and the replicaset controller (a minimal sketch follows after this list)
* as runners are offline when a pod has completed and is marked for restart, only do additional restart checks if no restart was already decided, making the code a bit cleaner and saving GitHub API calls after each job completion
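A minimal sketch of what such an error type could look like; the real type, fields, and call sites in the controllers may differ:

```go
package sketch

import (
	"errors"
	"fmt"
)

// RunnerOffline signals that the runner is registered on GitHub but currently
// offline, which callers may treat differently from other API errors.
type RunnerOffline struct {
	RunnerName string
}

func (e *RunnerOffline) Error() string {
	return fmt.Sprintf("runner %q exists on GitHub but is offline", e.RunnerName)
}

// isRunnerOffline lets the runner and runnerreplicaset reconcilers branch on
// the offline condition without string-matching error messages.
func isRunnerOffline(err error) bool {
	var offline *RunnerOffline
	return errors.As(err, &offline)
}
```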
Changes:
1. Fix the length of the github-webhook-server port name (Kubernetes limits port names to 15 characters)
2. Add a cluster role binding for github-webhook-server
3. Remove --enable-leader-election from github-webhook-server
* Update runner to 2.277.1
* Update build-and-release-runners.yml
* integration test condition
Don't run integration tests when only updating the runner image
* fixup! integration test condition
Co-authored-by: Yusuke Kuoka <ykuoka@gmail.com>
* so far, only push events would trigger the DockerHub login step
* hence, attempts to release would fail because of a permission problem (tested locally)
* add an OR condition to also log in when a release gets published
I have heard from a user that they have hundreds of thousands of `status=completed` workflow runs in their repository, which effectively blocked TotalNumberOfQueuedAndInProgressWorkflowRuns from working because the excessive paginated requests hit the GitHub API rate limit.
This fixes that by splitting the list-workflow-runs call into two - one for `queued` and one for `in_progress` - which raises the minimum number of API calls from 1 to 2, but lets it work regardless of how many `completed` workflow runs remain.
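A minimal sketch of the split using go-github; the module version, pagination, and error handling are simplified and only illustrate the idea of querying each status separately:

```go
package sketch

import (
	"context"

	"github.com/google/go-github/v33/github"
)

// queuedAndInProgress issues one list call per status so that `completed` runs
// are never paginated through, at the cost of a second API call.
func queuedAndInProgress(ctx context.Context, c *github.Client, owner, repo string) (int, error) {
	total := 0
	for _, status := range []string{"queued", "in_progress"} {
		runs, _, err := c.Actions.ListRepositoryWorkflowRuns(ctx, owner, repo, &github.ListWorkflowRunsOptions{
			Status:      status,
			ListOptions: github.ListOptions{PerPage: 100},
		})
		if err != nil {
			return 0, err
		}
		total += runs.GetTotalCount()
	}
	return total, nil
}
```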