docs: updating to reflect the latest release (#844)

* docs: updating to reflect the latest release

* docs: further updates

* docs: format updates

* docs: correcting title sizes

* docs: correcting links

* docs: correcting the grammar of multi controllers

* docs: slightly less awkward English
This commit is contained in:
Callum Tait 2021-09-28 09:44:15 +01:00 committed by GitHub
parent 3f331e9a39
commit 24224613f3
No known key found for this signature in database
GPG Key ID: 4AEE18F83AFDEB23
1 changed files with 66 additions and 70 deletions

136
README.md
View File

@ -20,7 +20,7 @@ ToC:
- [Runner Deployments](#runnerdeployments) - [Runner Deployments](#runnerdeployments)
- [Note on scaling to/from 0](#note-on-scaling-tofrom-0) - [Note on scaling to/from 0](#note-on-scaling-tofrom-0)
- [Autoscaling](#autoscaling) - [Autoscaling](#autoscaling)
- [Faster Autoscaling with GitHub Webhook](#faster-autoscaling-with-github-webhook) - [Webhook Driven Scaling](#webhook-driven-scaling)
- [Autoscaling to/from 0](#autoscaling-tofrom-0) - [Autoscaling to/from 0](#autoscaling-tofrom-0)
- [Scheduled Overrides](#scheduled-overrides) - [Scheduled Overrides](#scheduled-overrides)
- [Runner with DinD](#runner-with-dind) - [Runner with DinD](#runner-with-dind)
@ -51,8 +51,8 @@ Subsequent to this, install the custom resource definitions and actions-runner-c
**Kubectl Deployment:** **Kubectl Deployment:**
```shell ```shell
# REPLACE "v0.18.2" with the version you wish to deploy # REPLACE "v0.20.1" with the version you wish to deploy
kubectl apply -f https://github.com/actions-runner-controller/actions-runner-controller/releases/download/v0.18.2/actions-runner-controller.yaml kubectl apply -f https://github.com/actions-runner-controller/actions-runner-controller/releases/download/v0.20.1/actions-runner-controller.yaml
``` ```
**Helm Deployment:** **Helm Deployment:**
@ -101,7 +101,7 @@ _Note: Links are provided further down to create an app for your logged in user
* Actions (read) * Actions (read)
* Administration (read / write) * Administration (read / write)
* Checks (read) (if you are going to use [Faster Autoscaling with GitHub Webhook](#faster-autoscaling-with-github-webhook)) * Checks (read) (if you are going to use [Webhook Driven Scaling](#webhook-driven-scaling))
* Metadata (read) * Metadata (read)
**Required Permissions for Organization Runners:**<br /> **Required Permissions for Organization Runners:**<br />
@ -114,7 +114,7 @@ _Note: Links are provided further down to create an app for your logged in user
* Self-hosted runners (read / write) * Self-hosted runners (read / write)
**Subscribe to events** **Subscribe to events**
* Check run (if you are going to use [Faster Autoscaling with GitHub Webhook](#faster-autoscaling-with-github-webhook)) * Check run (if you are going to use [Webhook Driven Scaling](#webhook-driven-scaling))
_Note: All API routes mapped to their permissions can be found [here](https://docs.github.com/en/rest/reference/permissions-required-for-github-apps) if you wish to review_ _Note: All API routes mapped to their permissions can be found [here](https://docs.github.com/en/rest/reference/permissions-required-for-github-apps) if you wish to review_
@ -214,9 +214,9 @@ Configure your values.yaml, see the chart's [README](./charts/actions-runner-con
By default the controller will look for runners in all namespaces, the watch namespace feature allows you to restrict the controller to monitoring a single namespace. This then lets you deploy multiple controllers in a single cluster. You may want to do this either because you wish to scale beyond the API rate limit of a single PAT / GitHub App configuration or you wish to support multiple GitHub organizations with runners installed at the organization level in a single cluster. By default the controller will look for runners in all namespaces, the watch namespace feature allows you to restrict the controller to monitoring a single namespace. This then lets you deploy multiple controllers in a single cluster. You may want to do this either because you wish to scale beyond the API rate limit of a single PAT / GitHub App configuration or you wish to support multiple GitHub organizations with runners installed at the organization level in a single cluster.
This feature is configured via the controller `--watch-namespace` flag. When a namespace is provided via this flag the controller will only monitor runners in that namespace. This feature is configured via the controller's `--watch-namespace` flag. When a namespace is provided via this flag, the controller will only monitor runners in that namespace.
If you plan on installing all instances of the controller stack into a single namespace you will need to make the names of the resources are unique for each stack. In the case of Helm this can be done by giving each a unique release name, or via the `fullnameOverride` properties. If you plan on installing all instances of the controller stack into a single namespace you will need to make the names of the resources unique to each stack. In the case of Helm this can be done by giving each install a unique release name, or via the `fullnameOverride` properties.
Alternatively, you can install each controller stack into its own unique namespace (relative to other controller stacks in the cluster), avoiding the need to uniquely prefix resources. Alternatively, you can install each controller stack into its own unique namespace (relative to other controller stacks in the cluster), avoiding the need to uniquely prefix resources.
@ -374,13 +374,11 @@ This, in combination with a correctly configured HorizontalRunnerAutoscaler, all
### Autoscaling ### Autoscaling
__**IMPORTANT : Due to limitations / a bug with GitHub's [routing engine](https://docs.github.com/en/actions/hosting-your-own-runners/using-self-hosted-runners-in-a-workflow#routing-precedence-for-self-hosted-runners) autoscaling does NOT work correctly with RunnerDeployments that target the enterprise level. Scaling activity works as expected however jobs fail to get assigned to the scaled out replicas. This was explored in issue [#470](https://github.com/actions-runner-controller/actions-runner-controller/issues/470). Once GitHub resolves the issue with their backend service we expect the solution to be able to support autoscaled enterprise runnerdeploments without any additional changes.**__ A `RunnerDeployment` can scale the number of runners between `minReplicas` and `maxReplicas` fields on either a pull based scaling metric as defined in the `metrics` attribute or a webhook event. Since GitHub's release of the [`workflow_job` webhook](https://docs.github.com/en/developers/webhooks-and-events/webhooks/webhook-events-and-payloads#workflow_job), webhook-based autoscaling is the preferred way of autoscaling, it offers more accurate quicker scaling compared to the pull based metrics and is easy to setup.
**NOTE: Once `workflow_job` webhook events are released on GitHub, the webhook-based autoscaling is the preferred way of autoscaling, because it is easy to configure and has the ability to accurately detect which runners to scale. See [Example 3: Scale on each `workflow_job` event](#example-3-scale-on-each-workflow_job-event)** The below section covers the pull based metrics which you may want to consider if your scaling demands are minor. To configure your webhook based scaling see the [Webhook Driven Scaling](#webhook-driven-scaling) section.
A `RunnerDeployment` (excluding enterprise runners) can scale the number of runners between `minReplicas` and `maxReplicas` fields based the chosen scaling metric as defined in the `metrics` attribute **Pull Driven Scaling Metrics**
**Scaling Metrics**
**TotalNumberOfQueuedAndInProgressWorkflowRuns** **TotalNumberOfQueuedAndInProgressWorkflowRuns**
@ -405,7 +403,7 @@ The scale out performance is controlled via the manager containers startup `--sy
Example `RunnerDeployment` backed by a `HorizontalRunnerAutoscaler`: Example `RunnerDeployment` backed by a `HorizontalRunnerAutoscaler`:
_Important!!! We no longer include the attribute `replicas` in our `RunnerDeployment` if we are configuring autoscaling!_ **_Important!!! We no longer include the attribute `replicas` in our `RunnerDeployment` if we are configuring autoscaling!_**
```yaml ```yaml
apiVersion: actions.summerwind.dev/v1alpha1 apiVersion: actions.summerwind.dev/v1alpha1
@ -460,7 +458,7 @@ The `HorizontalRunnerAutoscaler` will poll GitHub based on the configuration syn
Examples of each scaling type implemented with a `RunnerDeployment` backed by a `HorizontalRunnerAutoscaler`: Examples of each scaling type implemented with a `RunnerDeployment` backed by a `HorizontalRunnerAutoscaler`:
_Important!!! We no longer include the attribute `replicas` in our `RunnerDeployment` if we are configuring autoscaling!_ **_Important!!! We no longer include the attribute `replicas` in our `RunnerDeployment` if we are configuring autoscaling!_**
```yaml ```yaml
--- ---
@ -507,17 +505,12 @@ spec:
scaleDownDelaySecondsAfterScaleOut: 60 scaleDownDelaySecondsAfterScaleOut: 60
``` ```
#### Faster Autoscaling with GitHub Webhook #### Webhook Driven Scaling
__**IMPORTANT : Due to missing webhook events, webhook based scaling is not available for enterprise level RunnerDeployments. This was explored in issue [#470](https://github.com/actions-runner-controller/actions-runner-controller/issues/470).**__
> This feature is an ADVANCED feature which may require more work to set up.
> Please get prepared to put some time and effort to learn and leverage this feature!
`actions-runner-controller` has an optional Webhook server that receives GitHub Webhook events and scale `actions-runner-controller` has an optional Webhook server that receives GitHub Webhook events and scale
[`RunnerDeployments`](#runnerdeployments) by updating corresponding [`HorizontalRunnerAutoscalers`](#autoscaling). [`RunnerDeployments`](#runnerdeployments) by updating corresponding [`HorizontalRunnerAutoscalers`](#autoscaling).
Today, the Webhook server can be configured to respond GitHub `check_run`, `pull_request`, and `push` events Today, the Webhook server can be configured to respond GitHub `check_run`, `workflow_job`, `pull_request` and `push` events
by scaling up the matching `HorizontalRunnerAutoscaler` by N replica(s), where `N` is configurable within by scaling up the matching `HorizontalRunnerAutoscaler` by N replica(s), where `N` is configurable within
`HorizontalRunnerAutoscaler's` `Spec`. `HorizontalRunnerAutoscaler's` `Spec`.
@ -541,7 +534,7 @@ spec:
With the above example, the webhook server scales `myrunners` by `1` replica for 5 minutes on each `check_run` event With the above example, the webhook server scales `myrunners` by `1` replica for 5 minutes on each `check_run` event
with the type of `created` and the status of `queued` received. with the type of `created` and the status of `queued` received.
The primary benefit of autoscaling on Webhook compared to the standard autoscaling is that this one allows you to The primary benefit of autoscaling on Webhook compared to the standard autoscaling is that it is far quicker as it allows you to
immediately add "resource slack" for future GitHub Actions job runs. immediately add "resource slack" for future GitHub Actions job runs.
In contrast, the standard autoscaling requires you to wait next sync period to add In contrast, the standard autoscaling requires you to wait next sync period to add
@ -552,7 +545,8 @@ But doing so eventually results in the controller not being functional due to it
To enable this feature, you firstly need to install the webhook server. To enable this feature, you firstly need to install the webhook server.
Currently, only our Helm chart has the ability install it. Currently, only our Helm chart has the ability install it:
_[see the values documentation for all configuration options](https://github.com/actions-runner-controller/actions-runner-controller/blob/master/charts/actions-runner-controller/README.md)_
```console ```console
$ helm --upgrade install actions-runner-controller/actions-runner-controller \ $ helm --upgrade install actions-runner-controller/actions-runner-controller \
@ -568,10 +562,44 @@ Once you were able to confirm that the Webhook server is ready and running from
GitHub sending PING events to the Webhook server - create or update your `HorizontalRunnerAutoscaler` resources GitHub sending PING events to the Webhook server - create or update your `HorizontalRunnerAutoscaler` resources
by learning the following configuration examples. by learning the following configuration examples.
- [Example 1: Scale up on each `check_run` event](#example-1-scale-up-on-each-check_run-event) - [Example 1: Scale on each `workflow_job` event](#example-1-scale-on-each-workflow_job-event)
- [Example 2: Scale on each `pull_request` event against `develop` or `main` branches](#example-2-scale-on-each-pull_request-event-against-develop-or-main-branches) - [Example 2: Scale up on each `check_run` event](#example-2-scale-up-on-each-check_run-event)
- [Example 3: Scale on each `pull_request` event against a given set of branches](#example-3-scale-on-each-pull_request-event-against-a-given-set-of-branches)
##### Example 1: Scale up on each `check_run` event ##### Example 1: Scale on each `workflow_job` event
> This feature requires controller version => [v0.20.0](https://github.com/actions-runner-controller/actions-runner-controller/releases/tag/v0.20.0)
The most flexible webhook GitHub offers is the `workflow_job` webhook, it includes the `runs-on` information in the payload allowing scaling based on runner labels.
This webhook should cover most people's needs, please experiment with this webhook before considering the others.
```yaml
kind: RunnerDeployment
metadata:
name: myrunners
spec:
repository: example/myrepo
---
kind: HorizontalRunnerAutoscaler
spec:
scaleTargetRef:
name: myrunners
scaleUpTriggers:
- githubEvent: {}
duration: "30m"
```
You can configure your GitHub webhook settings to only include `Workflows Job` events, so that it sends us three kinds of `workflow_job` events per a job run.
Each kind has a `status` of `queued`, `in_progress` and `completed`.
With the above configuration, `actions-runner-controller` adds one runner for a `workflow_job` event whose `status` is `queued`. Similarly, it removes one runner for a `workflow_job` event whose `status` is `completed`.
Beware that a scale-down after a scale-up is deferred until `scaleDownDelaySecondsAfterScaleOut` elapses. Let's say you had configured `scaleDownDelaySecondsAfterScaleOut` of 60 seconds, 2 consequtive workflow jobs will result in immediately adding 2 runners. The 2 runners are removed only after 60 seconds. This basically gives you 60 seconds of a "grace period" that makes it possible for self-hosted runners to immediately run additional workflow jobs enqueued in that 60 seconds.
You must not include `spec.metrics` like `PercentageRunnersBusy` when using this feature, as it is unnecessary. That is, if you've configured the webhook for `workflow_job`, it should be enough for all your scale-out needs.
##### Example 2: Scale up on each `check_run` event
> Note: This should work almost like https://github.com/philips-labs/terraform-aws-github-runner > Note: This should work almost like https://github.com/philips-labs/terraform-aws-github-runner
@ -615,13 +643,15 @@ spec:
checkRun: checkRun:
types: ["created"] types: ["created"]
status: "queued" status: "queued"
# allow only certain repositories within your organization to trigger autoscaling # Optionally restrict autoscaling to being triggered by events from specific repositories within your organization still
# repositories: ["myrepo", "myanotherrepo"] # repositories: ["myrepo", "myanotherrepo"]
amount: 1 amount: 1
duration: "5m" duration: "5m"
``` ```
###### Example 2: Scale on each `pull_request` event against `develop` or `main` branches ##### Example 3: Scale on each `pull_request` event against a given set of branches
To scale up replicas of the runners for `example/myrepo` by 1 for 5 minutes on each `pull_request` against the `main` or `develop` branch you write manifests like the below:
```yaml ```yaml
kind: RunnerDeployment kind: RunnerDeployment
@ -645,34 +675,6 @@ spec:
See ["activity types"](https://docs.github.com/en/actions/reference/events-that-trigger-workflows#pull_request) for the list of valid values for `scaleUpTriggers[].githubEvent.pullRequest.types`. See ["activity types"](https://docs.github.com/en/actions/reference/events-that-trigger-workflows#pull_request) for the list of valid values for `scaleUpTriggers[].githubEvent.pullRequest.types`.
###### Example 3: Scale on each `workflow_job` event
> This feature depends on an unreleased GitHub feature
```yaml
kind: RunnerDeployment
metadata:
name: myrunners
spec:
repository: example/myrepo
---
kind: HorizontalRunnerAutoscaler
spec:
scaleTargetRef:
name: myrunners
scaleUpTriggers:
- githubEvent: {}
duration: "30m"
```
You can configure your GitHub webhook settings to only include `Workflows Job` events, so that it sends us three kinds of `workflow_job` events per a job run.
Each kind has a `status` of `queued`, `in_progress` and `completed`.
With the above configuration, `actions-runner-controller` adds one runner for a `workflow_job` event whose `status` is `queued`. Similarly, it removes one runner for a `workflow_job` event whose `status` is `completed`.
Beware that a scale-down after a scale-up is deferred until `scaleDownDelaySecondsAfterScaleOut` elapses. Let's say you had configured `scaleDownDelaySecondsAfterScaleOut` of 60 seconds, 2 consequtive workflow jobs will result in immediately adding 2 runners. The 2 runners are removed only after 60 seconds. This basically gives you 60 seconds of a "grace period" that makes it possible for self-hosted runners to immediately run additional workflow jobs enqueued in that 60 seconds.
You must not include `spec.metrics` like `PercentageRunnersBusy` when using this feature, as it is unnecessary. That is, if you've configured the webhook for `workflow_job`, it should be enough for all the scale-out need.
#### Autoscaling to/from 0 #### Autoscaling to/from 0
@ -680,17 +682,16 @@ You must not include `spec.metrics` like `PercentageRunnersBusy` when using this
Previously, we've discussed about [how to scale a RunnerDeployment to/from 0](#note-on-scaling-tofrom-0) Previously, we've discussed about [how to scale a RunnerDeployment to/from 0](#note-on-scaling-tofrom-0)
To automate the process of scaling to/from 0, you can use `HorizontalRunnerAutoscaler` with a caveat. To scale from 0 whilst still being able to provision runners as jobs are queued we must use the `HorizontalRunnerAutoscaler` with only certain metric configurations, only the below configurations support scaling from 0 whilst also being able to provision runners as jobs are queued:
That is, you need to choose one of the following configuration for metrics and triggers:
- `TotalNumberOfQueuedAndInProgressWorkflowRuns` - `TotalNumberOfQueuedAndInProgressWorkflowRuns`
- `PercentageRunnersBusy` + `TotalNumberOfQueuedAndInProgressWorkflowRuns` - `PercentageRunnersBusy` + `TotalNumberOfQueuedAndInProgressWorkflowRuns`
- `PercentageRunnersBusy` + Webhook-based autoscaling - `PercentageRunnersBusy` + Webhook-based autoscaling
- Webhook-based autoscaling only
This is due to that `PercentageRunnersBusy`, by its definition, needs one or more GitHub runners that can become `busy`, which cannot happen at all when you have 0 active runners. This is due to that `PercentageRunnersBusy`, by its definition, needs one or more GitHub runners that can become `busy` to be able to scale. If there isn't a runner to pick up a job and enter a `busy` state then the controller will never know to provision a runner to begin with as this metric is no knowledge of the jobs queued and is relying using the number of busy runners as a means for calculating the desired replica count.
If and only if HorizontalRunnerAutoscaler is configured to have a secondary metric of `TotalNumberOfQueuedAndInProgressWorkflowRuns` and the controller sees the primary metric of `PercentageRunnersBusy` returned 0 desired replicas, it uses the secondary metric for calculating the desired replicas once again. If a HorizontalRunnerAutoscaler is configured with a secondary metric of `TotalNumberOfQueuedAndInProgressWorkflowRuns` then be aware that the controller will check the primary metric of `PercentageRunnersBusy` first and will only use the secondary metric to calculate the desired replica count if the primary metric returns 0 desired replicas.
A correctly configured `TotalNumberOfQueuedAndInProgressWorkflowRuns` can return non-zero desired replicas even when there are no runners other than [registration-only runners](#note-on-scaling-tofrom-0), hence the `PercentageRunnersBusy` + `TotalNumberOfQueuedAndInProgressWorkflowRuns` configuration makes scaling from zero possible. A correctly configured `TotalNumberOfQueuedAndInProgressWorkflowRuns` can return non-zero desired replicas even when there are no runners other than [registration-only runners](#note-on-scaling-tofrom-0), hence the `PercentageRunnersBusy` + `TotalNumberOfQueuedAndInProgressWorkflowRuns` configuration makes scaling from zero possible.
@ -700,12 +701,10 @@ Similarly, Webhook-based autoscaling works regardless of there are active runner
> This feature requires controller version => [v0.19.0](https://github.com/actions-runner-controller/actions-runner-controller/releases/tag/v0.19.0) > This feature requires controller version => [v0.19.0](https://github.com/actions-runner-controller/actions-runner-controller/releases/tag/v0.19.0)
`Scheduled Overrides` allows you to configure HorizontalRunnerAutoscaler so that its Spec gets updated only during a certain period of time. `Scheduled Overrides` allows you to configure HorizontalRunnerAutoscaler so that its Spec gets updated only during a certain period of time. This feature is usually used for following scenarios:
usually, this feature is used for following scenarios: - You want to reduce your infrastructure costs by scaling your Kubernetes nodes down outside of business hours
- You want to scale for scheduled spikes in workloads
- You want to pay for your infrastructure cost running runners only in business hours
- You want to prepare for scheduled spikes in workloads
For the first scenario, you might consider configuration like the below: For the first scenario, you might consider configuration like the below:
@ -1021,10 +1020,7 @@ Note that there's no official Istio integration in actions-runner-controller. It
### Stateful Runners ### Stateful Runners
> This is a documentation about a unreleased version of actions-runner-controller. > This feature requires controller version => [v0.20.0](https://github.com/actions-runner-controller/actions-runner-controller/releases/tag/v0.20.0)
>
> It would be great if you could try building the latest controller image following https://github.com/actions-runner-controller/actions-runner-controller#contributing if you are eager to test it early and help
> developers by reporting any bugs :smile:
`actions-runner-controller` supports `RunnerSet` API that let you deploy stateful runners. A stateful runner is designed to be able to store some data persists across GitHub Actions workflow and job runs. You might find it useful, for example, to speed up your docker builds by persisting the docker layer cache. `actions-runner-controller` supports `RunnerSet` API that let you deploy stateful runners. A stateful runner is designed to be able to store some data persists across GitHub Actions workflow and job runs. You might find it useful, for example, to speed up your docker builds by persisting the docker layer cache.