246 lines
12 KiB
Markdown
246 lines
12 KiB
Markdown
# Autoscaling Runner Scale Sets mode
|
||
|
||
This new autoscaling mode brings numerous enhancements (described in the following sections) that will make your experience more reliable and secure.
|
||
|
||
## How it works
|
||
|
||

|
||
|
||
In addition to the increased reliability of the automatic scaling, we have worked on these improvements:
|
||
|
||
- No longer require cert-manager as a prerequisite for installing actions-runner-controller
|
||
- Reliable scale-up based on job demands and scale-down to zero runner pods
|
||
- Reduce API requests to `api.github.com`, no more API rate-limiting problems
|
||
- The GitHub Personal Access Token (PAT) or the GitHub App installation token is no longer passed to the runner pod for runner registration
|
||
- Maximum flexibility for customizing your runner pod template
|
||
|
||
### Demo
|
||
|
||
https://user-images.githubusercontent.com/568794/212668313-8946ddc5-60c1-461f-a73e-27f5e8c75720.mp4
|
||
|
||
## Setup
|
||
|
||
### Prerequisites
|
||
|
||
1. Create a K8s cluster, if not available.
|
||
- If you don't have a K8s cluster, you can install a local environment using minikube. See [installing minikube](https://minikube.sigs.k8s.io/docs/start/).
|
||
1. Install helm 3, if not available. See [installing Helm](https://helm.sh/docs/intro/install/).
|
||
|
||
### Install actions-runner-controller
|
||
|
||
1. Install actions-runner-controller using helm 3. For additional configuration options, see [values.yaml](https://github.com/actions/actions-runner-controller/blob/master/charts/gha-runner-scale-set-controller/values.yaml)
|
||
|
||
```bash
|
||
NAMESPACE="arc-systems"
|
||
helm install arc \
|
||
--namespace "${NAMESPACE}" \
|
||
--create-namespace \
|
||
oci://ghcr.io/actions/actions-runner-controller-charts/gha-runner-scale-set-controller \
|
||
--version 0.3.0
|
||
```
|
||
|
||
1. Generate a Personal Access Token (PAT) or create and install a GitHub App. See [Creating a personal access token](https://docs.github.com/en/github/authenticating-to-github/creating-a-personal-access-token) and [Creating a GitHub App](https://docs.github.com/en/developers/apps/creating-a-github-app).
|
||
- ℹ For the list of required permissions, see [Authenticating to the GitHub API](https://github.com/actions/actions-runner-controller/blob/master/docs/authenticating-to-the-github-api.md#authenticating-to-the-github-api).
|
||
|
||
1. You're ready to install the autoscaling runner set. For additional configuration options, see [values.yaml](https://github.com/actions/actions-runner-controller/blob/master/charts/gha-runner-scale-set/values.yaml)
|
||
- ℹ **Choose your installation name carefully**, you will use it as the value of `runs-on` in your workflow.
|
||
- ℹ **We recommend you choose a unique namespace in the following steps**. As a good security measure, it's best to have your runner pods created in a different namespace than the one containing the manager and listener pods.
|
||
|
||
```bash
|
||
# Using a Personal Access Token (PAT)
|
||
INSTALLATION_NAME="arc-runner-set"
|
||
NAMESPACE="arc-runners"
|
||
GITHUB_CONFIG_URL="https://github.com/<your_enterprise/org/repo>"
|
||
GITHUB_PAT="<PAT>"
|
||
helm install "${INSTALLATION_NAME}" \
|
||
--namespace "${NAMESPACE}" \
|
||
--create-namespace \
|
||
--set githubConfigUrl="${GITHUB_CONFIG_URL}" \
|
||
--set githubConfigSecret.github_token="${GITHUB_PAT}" \
|
||
oci://ghcr.io/actions/actions-runner-controller-charts/gha-runner-scale-set --version 0.3.0
|
||
```
|
||
|
||
```bash
|
||
# Using a GitHub App
|
||
INSTALLATION_NAME="arc-runner-set"
|
||
NAMESPACE="arc-runners"
|
||
GITHUB_CONFIG_URL="https://github.com/<your_enterprise/org/repo>"
|
||
GITHUB_APP_ID="<GITHUB_APP_ID>"
|
||
GITHUB_APP_INSTALLATION_ID="<GITHUB_APP_INSTALLATION_ID>"
|
||
GITHUB_APP_PRIVATE_KEY="<GITHUB_APP_PRIVATE_KEY>"
|
||
helm install arc-runner-set \
|
||
--namespace "${NAMESPACE}" \
|
||
--create-namespace \
|
||
--set githubConfigUrl="${GITHUB_CONFIG_URL}" \
|
||
--set githubConfigSecret.github_app_id="${GITHUB_APP_ID}" \
|
||
--set githubConfigSecret.github_app_installation_id="${GITHUB_APP_INSTALLATION_ID}" \
|
||
--set githubConfigSecret.github_app_private_key="${GITHUB_APP_PRIVATE_KEY}" \
|
||
oci://ghcr.io/actions/actions-runner-controller-charts/gha-runner-scale-set --version 0.3.0
|
||
```
|
||
|
||
1. Check your installation. If everything went well, you should see the following:
|
||
|
||
```bash
|
||
$ helm list -n "${NAMESPACE}"
|
||
|
||
NAME NAMESPACE REVISION UPDATED STATUS CHART APP VERSION
|
||
arc arc-systems 1 2023-01-18 10:03:36.610534934 +0000 UTC deployed gha-runner-scale-set-controller-0.3.0 preview
|
||
arc-runner-set arc-systems 1 2023-01-18 10:20:14.795285645 +0000 UTC deployed gha-runner-scale-set-0.3.0 0.3.0
|
||
```
|
||
|
||
```bash
|
||
$ kubectl get pods -n "${NAMESPACE}"
|
||
|
||
NAME READY STATUS RESTARTS AGE
|
||
arc-gha-runner-scale-set-controller-8c74b6f95-gr7zr 1/1 Running 0 20m
|
||
arc-runner-set-6cd58d58-listener 1/1 Running 0 21s
|
||
```
|
||
|
||
1. In a repository, create a simple test workflow as follows. The `runs-on` value should match the helm installation name you used in the previous step.
|
||
|
||
```yaml
|
||
name: Test workflow
|
||
on:
|
||
workflow_dispatch:
|
||
|
||
jobs:
|
||
test:
|
||
runs-on: arc-runner-set
|
||
steps:
|
||
- name: Hello world
|
||
run: echo "Hello world"
|
||
```
|
||
|
||
1. Run the workflow. You should see the runner pod being created and the workflow being executed.
|
||
|
||
```bash
|
||
$ kubectl get pods -A
|
||
|
||
NAMESPACE NAME READY STATUS RESTARTS AGE
|
||
arc-systems arc-gha-runner-scale-set-controller-8c74b6f95-gr7zr 1/1 Running 0 27m
|
||
arc-systems arc-runner-set-6cd58d58-listener 1/1 Running 0 7m52s
|
||
arc-runners arc-runner-set-rmrgw-runner-p9p5n 1/1 Running 0 21s
|
||
```
|
||
|
||
### Upgrade to newer versions
|
||
|
||
Upgrading actions-runner-controller requires a few extra steps because CRDs will not be automatically upgraded (this is a helm limitation).
|
||
|
||
1. Uninstall the autoscaling runner set first
|
||
|
||
```bash
|
||
INSTALLATION_NAME="arc-runner-set"
|
||
NAMESPACE="arc-runners"
|
||
helm uninstall "${INSTALLATION_NAME}" --namespace "${NAMESPACE}"
|
||
```
|
||
|
||
1. Wait for all the pods to drain
|
||
|
||
1. Pull the new helm chart, unpack it and update the CRDs. When applying this step, don't forget to replace `<PATH>` with the path of the `gha-runner-scale-set-controller` helm chart:
|
||
|
||
```bash
|
||
helm pull oci://ghcr.io/actions/actions-runner-controller-charts/gha-runner-scale-set-controller \
|
||
--version 0.3.0 \
|
||
--untar && \
|
||
kubectl replace -f <PATH>/gha-runner-scale-set-controller/crds/
|
||
```
|
||
|
||
1. Reinstall actions-runner-controller using the steps from the previous section
|
||
|
||
## Troubleshooting
|
||
|
||
### Check the logs
|
||
|
||
You can check the logs of the controller pod using the following command:
|
||
|
||
```bash
|
||
# Controller logs
|
||
kubectl logs -n "${NAMESPACE}" -l app.kubernetes.io/name=gha-runner-scale-set-controller
|
||
```
|
||
|
||
```bash
|
||
# Runner set listener logs
|
||
kubectl logs -n "${NAMESPACE}" -l actions.github.com/scale-set-namespace=arc-systems -l actions.github.com/scale-set-name=arc-runner-set
|
||
```
|
||
|
||
### Naming error: `Name must have up to characters`
|
||
|
||
We are using some of the resources generated names as labels for other resources. Resource names have a max length of `263 characters` while labels are limited to `63 characters`. Given this constraint, we have to limit the resource names to `63 characters`.
|
||
|
||
Since part of the resource name is defined by you, we have to impose a limit on the amount of characters you can use for the installation and namespace names.
|
||
|
||
If you see these errors, you have to use shorter installation or namespace names.
|
||
|
||
```bash
|
||
Error: INSTALLATION FAILED: execution error at (gha-runner-scale-set/templates/autoscalingrunnerset.yaml:5:5): Name must have up to 45 characters
|
||
|
||
Error: INSTALLATION FAILED: execution error at (gha-runner-scale-set/templates/autoscalingrunnerset.yaml:8:5): Namespace must have up to 63 characters
|
||
```
|
||
|
||
### If you installed the autoscaling runner set, but the listener pod is not created
|
||
|
||
Verify that the secret you provided is correct and that the `githubConfigUrl` you provided is accurate.
|
||
|
||
### Access to the path `/home/runner/_work/_tool` is denied error
|
||
|
||
You might see this error if you're using kubernetes mode with persistent volumes. This is because the runner container is running with a non-root user and is causing a permissions mismatch with the mounted volume.
|
||
|
||
To fix this, you can either:
|
||
|
||
1. Use a volume type that supports `securityContext.fsGroup` (`hostPath` volumes don't support it, `local` volumes do as well as other types). Update the `fsGroup` of your runner pod to match the GID of the runner. You can do that by updating the `gha-runner-scale-set` helm chart values to include the following:
|
||
|
||
```yaml
|
||
spec:
|
||
securityContext:
|
||
fsGroup: 123
|
||
containers:
|
||
- name: runner
|
||
image: ghcr.io/actions/actions-runner:<VERSION> # Replace <VERSION> with the version you want to use
|
||
command: ["/home/runner/run.sh"]
|
||
```
|
||
|
||
1. If updating the `securityContext` of your runner pod is not a viable solution, you can workaround the issue by using `initContainers` to change the mounted volume's ownership, as follows:
|
||
|
||
```yaml
|
||
template:
|
||
spec:
|
||
initContainers:
|
||
- name: kube-init
|
||
image: ghcr.io/actions/actions-runner:latest
|
||
command: ["sudo", "chown", "-R", "1001:123", "/home/runner/_work"]
|
||
volumeMounts:
|
||
- name: work
|
||
mountPath: /home/runner/_work
|
||
containers:
|
||
- name: runner
|
||
image: ghcr.io/actions/actions-runner:latest
|
||
command: ["/home/runner/run.sh"]
|
||
```
|
||
|
||
## Changelog
|
||
|
||
### v0.3.0
|
||
|
||
#### Major changes
|
||
|
||
1. Runner pods are more similar to hosted runners [#2348](https://github.com/actions/actions-runner-controller/pull/2348)
|
||
1. Add support for self-signed CA certificates [#2268](https://github.com/actions/actions-runner-controller/pull/2268)
|
||
1. Fixed trailing slashes in config URLs breaking installations [#2381](https://github.com/actions/actions-runner-controller/pull/2381)
|
||
1. Fixed a bug where the listener pod would ignore proxy settings from env [#2366](https://github.com/actions/actions-runner-controller/pull/2366)
|
||
1. Added runner set name field making it optionally configurable [#2279](https://github.com/actions/actions-runner-controller/pull/2279)
|
||
1. Name and namespace labels of listener pod have been split [#2341](https://github.com/actions/actions-runner-controller/pull/2341)
|
||
1. Added chart name constraints validation on AutoscalingRunnerSet install [#2347](https://github.com/actions/actions-runner-controller/pull/2347)
|
||
|
||
### v0.2.0
|
||
|
||
#### Major changes
|
||
|
||
1. Added proxy support for the controller and the runner pods, see the new helm chart fields [#2286](https://github.com/actions/actions-runner-controller/pull/2286)
|
||
1. Added the abiilty to provide a pre-defined kubernetes secret for the auto scaling runner set helm chart [#2234](https://github.com/actions/actions-runner-controller/pull/2234)
|
||
1. Enhanced security posture by removing un-required permissions for the manager-role [#2260](https://github.com/actions/actions-runner-controller/pull/2260)
|
||
1. Enhanced our logging by returning an error when a runner group is defined in the values file but it's not created in GitHub [#2215](https://github.com/actions/actions-runner-controller/pull/2215)
|
||
1. Fixed helm charts issues that were preventing the use of DinD [#2291](https://github.com/actions/actions-runner-controller/pull/2291)
|
||
1. Fixed a bug that was preventing runner scale from being removed from the backend when they were deleted from the cluster [#2255](https://github.com/actions/actions-runner-controller/pull/2255) [#2223](https://github.com/actions/actions-runner-controller/pull/2223)
|
||
1. Fixed bugs with the helm chart definitions preventing certain values from being set [#2222](https://github.com/actions/actions-runner-controller/pull/2222)
|
||
1. Fixed a bug that prevented the configuration of a runner group for a runner scale set [#2216](https://github.com/actions/actions-runner-controller/pull/2216)
|