# Autoscaling Runner Scale Sets mode

This new autoscaling mode brings numerous enhancements (described in the following sections) that will make your experience more reliable and secure.

## How it works

![arc_hld_v1 drawio (1)](https://user-images.githubusercontent.com/568794/212665433-2d1f3d6e-0ba8-4f02-9d1b-27d00c49abd1.png)

In addition to the increased reliability of the automatic scaling, we have worked on these improvements:

- cert-manager is no longer required as a prerequisite for installing actions-runner-controller
- Reliable scale-up based on job demands and scale-down to zero runner pods
- Reduced API requests to `api.github.com`; no more API rate-limiting problems
- The GitHub Personal Access Token (PAT) or the GitHub App installation token is no longer passed to the runner pod for runner registration
- Maximum flexibility for customizing your runner pod template

### Demo

https://user-images.githubusercontent.com/568794/212668313-8946ddc5-60c1-461f-a73e-27f5e8c75720.mp4

## Setup

### Prerequisites

1. Create a K8s cluster, if not available.
    - If you don't have a K8s cluster, you can install a local environment using minikube. See [installing minikube](https://minikube.sigs.k8s.io/docs/start/).
1. Install helm 3, if not available. See [installing Helm](https://helm.sh/docs/intro/install/).

### Install actions-runner-controller

1. Install actions-runner-controller using helm 3. For additional configuration options, see [values.yaml](https://github.com/actions/actions-runner-controller/blob/master/charts/gha-runner-scale-set-controller/values.yaml).

    ```bash
    NAMESPACE="arc-systems"
    helm install arc \
        --namespace "${NAMESPACE}" \
        --create-namespace \
        oci://ghcr.io/actions/actions-runner-controller-charts/gha-runner-scale-set-controller \
        --version 0.3.0
    ```

1. Generate a Personal Access Token (PAT) or create and install a GitHub App. See [Creating a personal access token](https://docs.github.com/en/github/authenticating-to-github/creating-a-personal-access-token) and [Creating a GitHub App](https://docs.github.com/en/developers/apps/creating-a-github-app).
    - ℹ For the list of required permissions, see [Authenticating to the GitHub API](https://github.com/actions/actions-runner-controller/blob/master/docs/authenticating-to-the-github-api.md#authenticating-to-the-github-api).

1. You're ready to install the autoscaling runner set. For additional configuration options, see [values.yaml](https://github.com/actions/actions-runner-controller/blob/master/charts/gha-runner-scale-set/values.yaml).
    - ℹ **Choose your installation name carefully**: you will use it as the value of `runs-on` in your workflow.
    - ℹ **We recommend you choose a unique namespace in the following steps**. As a good security measure, it's best to have your runner pods created in a different namespace than the one containing the manager and listener pods.
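    - ℹ As an alternative to passing every option with `--set` in the commands below, the same configuration can be kept in a values file and passed with `-f`. This is a minimal sketch for the PAT case, using only the `githubConfigUrl` and `githubConfigSecret.github_token` fields shown in this guide; check the chart's values.yaml (linked above) before relying on any other fields:

      ```yaml
      # Example values file for the gha-runner-scale-set chart (the file name is your choice)
      githubConfigUrl: "https://github.com/"   # URL of your repository, organization, or enterprise
      githubConfigSecret:
        github_token: ""                       # your Personal Access Token (PAT)
      ```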
    ```bash
    # Using a Personal Access Token (PAT)
    INSTALLATION_NAME="arc-runner-set"
    NAMESPACE="arc-runners"
    GITHUB_CONFIG_URL="https://github.com/"  # URL of your repository, organization, or enterprise
    GITHUB_PAT=""                            # your Personal Access Token
    helm install "${INSTALLATION_NAME}" \
        --namespace "${NAMESPACE}" \
        --create-namespace \
        --set githubConfigUrl="${GITHUB_CONFIG_URL}" \
        --set githubConfigSecret.github_token="${GITHUB_PAT}" \
        oci://ghcr.io/actions/actions-runner-controller-charts/gha-runner-scale-set --version 0.3.0
    ```

    ```bash
    # Using a GitHub App
    INSTALLATION_NAME="arc-runner-set"
    NAMESPACE="arc-runners"
    GITHUB_CONFIG_URL="https://github.com/"  # URL of your repository, organization, or enterprise
    GITHUB_APP_ID=""                         # GitHub App ID
    GITHUB_APP_INSTALLATION_ID=""            # GitHub App installation ID
    GITHUB_APP_PRIVATE_KEY=""                # contents of the GitHub App private key (PEM)
    helm install "${INSTALLATION_NAME}" \
        --namespace "${NAMESPACE}" \
        --create-namespace \
        --set githubConfigUrl="${GITHUB_CONFIG_URL}" \
        --set githubConfigSecret.github_app_id="${GITHUB_APP_ID}" \
        --set githubConfigSecret.github_app_installation_id="${GITHUB_APP_INSTALLATION_ID}" \
        --set githubConfigSecret.github_app_private_key="${GITHUB_APP_PRIVATE_KEY}" \
        oci://ghcr.io/actions/actions-runner-controller-charts/gha-runner-scale-set --version 0.3.0
    ```

1. Check your installation. If everything went well, you should see the following:

    ```bash
    $ helm list -n "${NAMESPACE}"

    NAME            NAMESPACE    REVISION  UPDATED                                  STATUS    CHART                                   APP VERSION
    arc             arc-systems  1         2023-01-18 10:03:36.610534934 +0000 UTC  deployed  gha-runner-scale-set-controller-0.3.0   preview
    arc-runner-set  arc-systems  1         2023-01-18 10:20:14.795285645 +0000 UTC  deployed  gha-runner-scale-set-0.3.0              0.3.0
    ```

    ```bash
    $ kubectl get pods -n "${NAMESPACE}"

    NAME                                                  READY   STATUS    RESTARTS   AGE
    arc-gha-runner-scale-set-controller-8c74b6f95-gr7zr   1/1     Running   0          20m
    arc-runner-set-6cd58d58-listener                      1/1     Running   0          21s
    ```

1. In a repository, create a simple test workflow as follows. The `runs-on` value should match the helm installation name you used in the previous step.

    ```yaml
    name: Test workflow
    on:
      workflow_dispatch:

    jobs:
      test:
        runs-on: arc-runner-set
        steps:
          - name: Hello world
            run: echo "Hello world"
    ```

1. Run the workflow. You should see the runner pod being created and the workflow being executed.

    ```bash
    $ kubectl get pods -A

    NAMESPACE     NAME                                                  READY   STATUS    RESTARTS   AGE
    arc-systems   arc-gha-runner-scale-set-controller-8c74b6f95-gr7zr   1/1     Running   0          27m
    arc-systems   arc-runner-set-6cd58d58-listener                      1/1     Running   0          7m52s
    arc-runners   arc-runner-set-rmrgw-runner-p9p5n                     1/1     Running   0          21s
    ```

### Upgrade to newer versions

Upgrading actions-runner-controller requires a few extra steps because CRDs will not be automatically upgraded (this is a helm limitation).

1. Uninstall the autoscaling runner set first:

    ```bash
    INSTALLATION_NAME="arc-runner-set"
    NAMESPACE="arc-runners"
    helm uninstall "${INSTALLATION_NAME}" --namespace "${NAMESPACE}"
    ```

1. Wait for all the pods to drain.

1. Pull the new helm chart, unpack it, and update the CRDs. When applying this step, don't forget to replace `<PATH>` with the directory where you unpacked the `gha-runner-scale-set-controller` helm chart:

    ```bash
    helm pull oci://ghcr.io/actions/actions-runner-controller-charts/gha-runner-scale-set-controller \
        --version 0.3.0 \
        --untar && \
    kubectl replace -f <PATH>/gha-runner-scale-set-controller/crds/
    ```

1. Reinstall actions-runner-controller using the steps from the previous section.
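After reinstalling, it can be worth confirming that the CRDs were actually replaced and that the controller came back up before queueing new workflows. A minimal sketch, assuming the controller's CRDs are registered under the `actions.github.com` API group and using the `arc-systems` namespace from the installation steps:

```bash
# List the CRDs installed by the controller chart
kubectl get crds | grep actions.github.com

# Confirm the new release and a healthy controller pod
helm list -n arc-systems
kubectl get pods -n arc-systems
```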
## Troubleshooting

### Check the logs

You can check the logs of the controller and listener pods using the following commands:

```bash
# Controller logs
kubectl logs -n "${NAMESPACE}" -l app.kubernetes.io/name=gha-runner-scale-set-controller
```

```bash
# Runner set listener logs
kubectl logs -n "${NAMESPACE}" \
    -l actions.github.com/scale-set-namespace=arc-systems \
    -l actions.github.com/scale-set-name=arc-runner-set
```

### Naming error: `Name must have up to n characters`

We use some of the resources' generated names as labels for other resources. Resource names have a max length of `253 characters`, while labels are limited to `63 characters`. Given this constraint, we have to limit resource names to `63 characters`. Since part of the resource name is defined by you, we also have to impose a limit on the number of characters you can use for the installation and namespace names. If you see one of the following errors, you have to use shorter installation or namespace names.

```bash
Error: INSTALLATION FAILED: execution error at (gha-runner-scale-set/templates/autoscalingrunnerset.yaml:5:5): Name must have up to 45 characters
Error: INSTALLATION FAILED: execution error at (gha-runner-scale-set/templates/autoscalingrunnerset.yaml:8:5): Namespace must have up to 63 characters
```

### If you installed the autoscaling runner set, but the listener pod is not created

Verify that the secret you provided is correct and that the `githubConfigUrl` you provided is accurate.

### Access to the path `/home/runner/_work/_tool` is denied error

You might see this error if you're using kubernetes mode with persistent volumes. This is because the runner container runs as a non-root user, which causes a permissions mismatch with the mounted volume.

To fix this, you can either:

1. Use a volume type that supports `securityContext.fsGroup` (`hostPath` volumes don't support it, while `local` volumes and most other types do). Update the `fsGroup` of your runner pod to match the GID of the runner. You can do that by updating the `gha-runner-scale-set` helm chart values to include the following:

    ```yaml
    spec:
      securityContext:
        fsGroup: 123
      containers:
        - name: runner
          image: ghcr.io/actions/actions-runner:<VERSION> # Replace with the version you want to use
          command: ["/home/runner/run.sh"]
    ```

1. If updating the `securityContext` of your runner pod is not a viable solution, you can work around the issue by using `initContainers` to change the ownership of the mounted volume, as follows:

    ```yaml
    template:
      spec:
        initContainers:
          - name: kube-init
            image: ghcr.io/actions/actions-runner:latest
            command: ["sudo", "chown", "-R", "1001:123", "/home/runner/_work"]
            volumeMounts:
              - name: work
                mountPath: /home/runner/_work
        containers:
          - name: runner
            image: ghcr.io/actions/actions-runner:latest
            command: ["/home/runner/run.sh"]
    ```
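If none of the cases above match your situation, inspecting the scale set resources directly often narrows the problem down (for example, to the secret or `githubConfigUrl` issues mentioned earlier). A minimal sketch, assuming the CRDs register an `autoscalingrunnersets` resource under the `actions.github.com` group and using the namespaces and installation name from this guide:

```bash
# Show the scale set resources and their status across all namespaces
kubectl get autoscalingrunnersets -A

# Describe the scale set installed in this guide to see its spec and recent events
kubectl describe autoscalingrunnersets -n arc-runners arc-runner-set

# Tail the controller logs for authentication or configuration errors
kubectl logs -n arc-systems -l app.kubernetes.io/name=gha-runner-scale-set-controller --tail=100
```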
## Changelog

### v0.3.0

#### Major changes

1. Runner pods are more similar to hosted runners [#2348](https://github.com/actions/actions-runner-controller/pull/2348)
1. Added support for self-signed CA certificates [#2268](https://github.com/actions/actions-runner-controller/pull/2268)
1. Fixed trailing slashes in config URLs breaking installations [#2381](https://github.com/actions/actions-runner-controller/pull/2381)
1. Fixed a bug where the listener pod would ignore proxy settings from env [#2366](https://github.com/actions/actions-runner-controller/pull/2366)
1. Added a runner set name field, making it optionally configurable [#2279](https://github.com/actions/actions-runner-controller/pull/2279)
1. Name and namespace labels of the listener pod have been split [#2341](https://github.com/actions/actions-runner-controller/pull/2341)
1. Added chart name constraints validation on AutoscalingRunnerSet install [#2347](https://github.com/actions/actions-runner-controller/pull/2347)

### v0.2.0

#### Major changes

1. Added proxy support for the controller and the runner pods; see the new helm chart fields [#2286](https://github.com/actions/actions-runner-controller/pull/2286)
1. Added the ability to provide a pre-defined kubernetes secret for the autoscaling runner set helm chart [#2234](https://github.com/actions/actions-runner-controller/pull/2234)
1. Enhanced security posture by removing unneeded permissions for the manager-role [#2260](https://github.com/actions/actions-runner-controller/pull/2260)
1. Enhanced our logging by returning an error when a runner group is defined in the values file but doesn't exist in GitHub [#2215](https://github.com/actions/actions-runner-controller/pull/2215)
1. Fixed helm chart issues that were preventing the use of DinD [#2291](https://github.com/actions/actions-runner-controller/pull/2291)
1. Fixed a bug that was preventing runner scale sets from being removed from the backend when they were deleted from the cluster [#2255](https://github.com/actions/actions-runner-controller/pull/2255) [#2223](https://github.com/actions/actions-runner-controller/pull/2223)
1. Fixed bugs in the helm chart definitions that prevented certain values from being set [#2222](https://github.com/actions/actions-runner-controller/pull/2222)
1. Fixed a bug that prevented the configuration of a runner group for a runner scale set [#2216](https://github.com/actions/actions-runner-controller/pull/2216)