actions-runner-controller

Commit Graph

Author	SHA1	Message	Date
Yusuke Kuoka	f858e2e432	Add POC of GitHub Webhook Delivery Forwarder (#682) * Add POC of GitHub Webhook Delivery Forwarder * multi-forwarder and ctrl-c exiting and fix for non-working http post * Rename source files * Extract signal handling into a dedicated source file * Faster ctrl-c handling * Enable automatic creation of repo hook on startup * Add support for forwarding org hook deliveries * Set hook secret on hook creation via envvar (HOOK_SECRET) * Fix org hook support * Fix HOOK_SECRET for consistency * Refactor to prepare for custom log position provider * Refactor to extract inmemory log position provider * Add configmap-based log position provider * Rename githubwebhookdeliveryforwarder to hookdeliveryforwarder * Refactor to rename LogPositionProvider to Checkpointer and extract ConfigMap checkpointer into a dedicated pkg * Refactor to extract logger initialization * Add hookdeliveryforwarder README and bump go-github to unreleased ver	2021-07-14 10:18:55 +09:00
Yusuke Kuoka	f19e7ea8a8	chore: Upgrade go-github to v36 (#681)	2021-07-04 17:43:52 +09:00
Yusuke Kuoka	8b90b0f0e3	Clean up import list (#645) Resolves #644	2021-06-22 17:55:06 +09:00
Yusuke Kuoka	9e4dbf497c	feat: RunnerSet backed by StatefulSet (#629) * feat: RunnerSet backed by StatefulSet Unlike a runner deployment, a runner set can manage a set of stateful runners by combining a statefulset and an admission webhook that mutates statefulset-managed pods with required envvars and registration tokens. Resolves #613 Ref #612 * Upgrade controller-runtime to 0.9.0 * Bump Go to 1.16.x following controller-runtime 0.9.0 * Upgrade kubebuilder to 2.3.2 for updated etcd and apiserver following local setup * Fix startup failure due to missing LeaderElectionID * Fix the issue that any pods become unable to start once actions-runner-controller got failed after the mutating webhook has been registered * Allow force-updating statefulset * Fix runner container missing work and certs-client volume mounts and DOCKER_HOST and DOCKER_TLS_VERIFY envvars when dockerdWithinRunner=false * Fix runnerset-controller not applying statefulset.spec.template.spec changes when there were no changes in runnerset spec * Enable running acceptance tests against arbitrary kind cluster * RunnerSet supports non-ephemeral runners only today * fix: docker-build from root Makefile on intel mac * fix: arch check fixes for mac and ARM * ci: aligning test data format and patching checks * fix: removing namespace in test data * chore: adding more ignores * chore: removing leading space in shebang * Re-add metrics to org hra testdata * Bump cert-manager to v1.1.1 and fix deploy.sh Co-authored-by: toast-gear <15716903+toast-gear@users.noreply.github.com> Co-authored-by: Callum James Tait <callum.tait@photobox.com>	2021-06-22 17:10:09 +09:00
Yusuke Kuoka	25f5817a5e	Improve debug log in webhook-based autoscaling Adds some helpful debug log messages I have used while verifying #534	2021-05-11 15:49:03 +09:00
Yusuke Kuoka	f5c639ae28	Make webhook-based autoscaler github event logs more operator-friendly (#384) Adds fields like `pullRequest.base.ref` and `checkRun.status` that are useful for verifying the autoscaling behaviour without browsing GitHub. Ref https://github.com/summerwind/actions-runner-controller/issues/377#issuecomment-794175312	2021-03-10 09:40:44 +09:00
Yusuke Kuoka	1b8a656051	Use --watch-namespace flag to restrict the namespace to watch Ref https://github.com/summerwind/actions-runner-controller/issues/377#issuecomment-793172995	2021-03-09 09:46:21 +09:00
Rob Whitby	1753fa3530	handle GET requests in webhook hra (#378)	2021-03-09 08:46:27 +09:00
Yusuke Kuoka	584590e97c	Use patch instead of update to alleviate HRA conflict on webhook (#358) We sometimes see the integration test fail because runner replicas do not reach the expected number in a timely manner. After investigating a bit, this turned out to be due to conflicting HRA updates from the webhook-based autoscaler and the HRA controller. This changes the controllers to use Patch instead of Update to make conflicts less likely to happen. I have also updated the hra controller to use Patch when updating RunnerDeployment, too. Overall, these changes should make the webhook-based autoscaling more reliable due to fewer conflicts.	2021-02-26 10:17:09 +09:00
Yusuke Kuoka	e9eef04993	Fix old HRA capacity reservations not cleaned up (#354) Similar to #348 for #346, but for HRA.Spec.CapacityReservations usually modified by the webhook-based autoscaler controller. This patch tries to fix that by improving the webhook-based autoscaler controller to omit expired reservations on updating HRA spec.	2021-02-25 11:08:00 +09:00
Yusuke Kuoka	9890a90e69	Improve webhook-based autoscaler log (#352) The controller had been writing confusing messages like the below on missing scale target: ``` Found too many scale targets: It must be exactly one to avoid ambiguity. Either set WatchNamespace for the webhook-based autoscaler to let it only find HRAs in the namespace, or update Repository or Organization fields in your RunnerDeployment resources to fix the ambiguity.{"scaleTargets": ""} ``` This fixes that, while improving many kinds of messages written during reconciliation, so that the error message is more actionable.	2021-02-25 10:07:41 +09:00
Yusuke Kuoka	991535e567	Fix panic on webhook for user-owned repository (#344) * Fix panic on webhook for user-owned repository Follow-up for #282 and #334	2021-02-23 08:05:25 +09:00
Hidetake Iwata	b0e74bebab	Fix index key to find HRA in GitHub webhook handler	2021-02-20 21:25:23 +09:00
Yusuke Kuoka	ebc3970b84	Add integration test for autoscaling on check_run webhook event	2021-02-19 10:33:04 +09:00
Hidetake Iwata	1ddcf6946a	Fix nil pointer error on received check_run event (#331) * Reproduce nil pointer error on received check_run event * Fix nil pointer error on received check_run event	2021-02-18 20:22:36 +09:00
Yusuke Kuoka	7d024a6c05	Fix "duplicate metrics collector registration attempted" errors at startup (#317) I have seen this error a lot in our integration test. It turned out to be due to https://github.com/kubernetes-sigs/controller-runtime/issues/484 and is being fixed with this change.	2021-02-16 18:51:33 +09:00
Yusuke Kuoka	ab1c39de57	feat: HorizontalRunnerAutoscaler Webhook server (#282) * feat: HorizontalRunnerAutoscaler Webhook server This introduces a Webhook server that responds to GitHub `check_run`, `pull_request`, and `push` events by scaling up the matched HorizontalRunnerAutoscaler by 1 replica. This allows you to immediately add "resource slack" for future GitHub Actions job runs, without waiting for the next sync period to add the missing runners. This feature is highly inspired by https://github.com/philips-labs/terraform-aws-github-runner. terraform-aws-github-runner can manage one set of runners per deployment, whereas actions-runner-controller with this feature can manage as many sets of runners as you declare with HorizontalRunnerAutoscaler and RunnerDeployment pairs. On each GitHub event received, the webhook server queries repository-wide and organizational runners from the cluster and searches for the single target to scale up. The webhook server tries to match HorizontalRunnerAutoscaler.Spec.ScaleUpTriggers[].GitHubEvent.[CheckRun|Push|PullRequest] against the event and if it finds only one HRA, it is the scale target. If none or two or more targets are found for repository-wide runners, it does the same on organizational runners. Changes: * Fix integration test * Update manifests * chart: Add support for github webhook server * dockerfile: Include github-webhook-server binary * Do not import unversioned go-github * Update README	2021-02-07 17:37:27 +09:00
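
The scale-target selection described in the message of ab1c39de57 can be sketched roughly as below. This is a minimal illustration and not the project's code: the `HRA` struct, `acceptsEvent`, and `findScaleTarget` are hypothetical names, and the real HorizontalRunnerAutoscaler resource and trigger matching are richer than plain event-name comparison.

```go
package main

import "fmt"

// HRA is a hypothetical, simplified stand-in for a HorizontalRunnerAutoscaler.
// The real resource has a different, richer shape.
type HRA struct {
	Name         string
	Repository   string   // "owner/repo" for repository-wide runners
	Organization string   // set for organizational runners
	Events       []string // event names its scale-up triggers accept
}

// acceptsEvent reports whether any trigger of h matches the event name.
func acceptsEvent(h HRA, event string) bool {
	for _, e := range h.Events {
		if e == event {
			return true
		}
	}
	return false
}

// findScaleTarget looks for exactly one matching HRA among repository-wide
// runners first. If it finds none or more than one, it repeats the search
// among organizational runners. ok is false when neither search yields
// exactly one candidate.
func findScaleTarget(hras []HRA, owner, repo, event string) (target HRA, ok bool) {
	var repoWide, orgWide []HRA
	for _, h := range hras {
		if !acceptsEvent(h, event) {
			continue
		}
		switch {
		case h.Repository == owner+"/"+repo:
			repoWide = append(repoWide, h)
		case h.Repository == "" && h.Organization == owner:
			orgWide = append(orgWide, h)
		}
	}
	if len(repoWide) == 1 {
		return repoWide[0], true
	}
	if len(orgWide) == 1 {
		return orgWide[0], true
	}
	return HRA{}, false
}

func main() {
	hras := []HRA{
		{Name: "repo-hra", Repository: "acme/api", Events: []string{"push"}},
		{Name: "org-hra", Organization: "acme", Events: []string{"push", "check_run"}},
	}
	if t, ok := findScaleTarget(hras, "acme", "api", "check_run"); ok {
		fmt.Println("scale up", t.Name, "by 1 replica")
	}
}
```

Returning `ok == false` when the candidates are ambiguous mirrors the "exactly one to avoid ambiguity" rule quoted in the log message of 9890a90e69; a caller would log the event and skip scaling in that case.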

17 Commits