
Autoscaling Runner Scale Sets mode

This new autoscaling mode brings numerous enhancements (described in the following sections) that will make your experience more reliable and secure.

How it works

(Architecture overview diagram: arc_hld_v1.drawio)

In addition to the increased reliability of the automatic scaling, we have worked on these improvements:

  • cert-manager is no longer required as a prerequisite for installing actions-runner-controller
  • Reliable scale-up based on job demand, and scale-down to zero runner pods
  • Fewer API requests to api.github.com, eliminating API rate-limiting problems
  • The GitHub Personal Access Token (PAT) or the GitHub App installation token is no longer passed to the runner pod for runner registration
  • Maximum flexibility for customizing your runner pod template

Demo

Watch the walkthrough

This link takes you to YouTube for a short walkthrough of the Autoscaling Runner Scale Sets mode.

Setup

Prerequisites

  1. Create a K8s cluster, if not available.
    • If you don't have a K8s cluster, you can install a local environment using minikube. See installing minikube.
  2. Install helm 3, if not available. See installing Helm.
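Before continuing, it can help to confirm both prerequisites are in place. A minimal sanity check (hedged: the exact output format of these commands varies between versions):

```shell
# Verify helm 3 and cluster access before installing anything.
helm version --short       # should report a v3.x version
kubectl version --client   # client version of kubectl
kubectl cluster-info       # confirms the cluster API server is reachable
```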

Install actions-runner-controller

  1. Install actions-runner-controller using helm 3. For additional configuration options, see values.yaml

    NAMESPACE="arc-systems"
    helm install arc \
        --namespace "${NAMESPACE}" \
        --create-namespace \
        oci://ghcr.io/actions/actions-runner-controller-charts/gha-runner-scale-set-controller
    
  2. Generate a Personal Access Token (PAT) or create and install a GitHub App. See Creating a personal access token and Creating a GitHub App.

  3. You're ready to install the autoscaling runner set. For additional configuration options, see values.yaml

    • Choose your installation name carefully; you will use it as the value of runs-on in your workflows.
    • We recommend you choose a unique namespace in the following steps. As a good security measure, it's best to create your runner pods in a different namespace than the one containing the manager and listener pods.
    # Using a Personal Access Token (PAT)
    INSTALLATION_NAME="arc-runner-set"
    NAMESPACE="arc-runners"
    GITHUB_CONFIG_URL="https://github.com/<your_enterprise/org/repo>"
    GITHUB_PAT="<PAT>"
    helm install "${INSTALLATION_NAME}" \
        --namespace "${NAMESPACE}" \
        --create-namespace \
        --set githubConfigUrl="${GITHUB_CONFIG_URL}" \
        --set githubConfigSecret.github_token="${GITHUB_PAT}" \
        oci://ghcr.io/actions/actions-runner-controller-charts/gha-runner-scale-set
    
    # Using a GitHub App
    INSTALLATION_NAME="arc-runner-set"
    NAMESPACE="arc-runners"
    GITHUB_CONFIG_URL="https://github.com/<your_enterprise/org/repo>"
    GITHUB_APP_ID="<GITHUB_APP_ID>"
    GITHUB_APP_INSTALLATION_ID="<GITHUB_APP_INSTALLATION_ID>"
    GITHUB_APP_PRIVATE_KEY="<GITHUB_APP_PRIVATE_KEY>"
    helm install "${INSTALLATION_NAME}" \
        --namespace "${NAMESPACE}" \
        --create-namespace \
        --set githubConfigUrl="${GITHUB_CONFIG_URL}" \
        --set githubConfigSecret.github_app_id="${GITHUB_APP_ID}" \
        --set githubConfigSecret.github_app_installation_id="${GITHUB_APP_INSTALLATION_ID}" \
        --set githubConfigSecret.github_app_private_key="${GITHUB_APP_PRIVATE_KEY}" \
        oci://ghcr.io/actions/actions-runner-controller-charts/gha-runner-scale-set
    
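If you'd rather not pass credentials on the helm command line, the v0.2.0 changelog below notes support for a pre-defined Kubernetes secret (#2234). A sketch of that approach, assuming the secret carries the same keys used by the --set values above (github_token, or the github_app_* trio) and that the chart is given the secret's name via githubConfigSecret:

```shell
# Create the secret yourself, then reference it by name instead of passing
# the PAT to helm. The secret name "arc-runner-set-secret" is illustrative.
NAMESPACE="arc-runners"
kubectl create namespace "${NAMESPACE}"
kubectl create secret generic arc-runner-set-secret \
    --namespace "${NAMESPACE}" \
    --from-literal=github_token="<PAT>"
helm install arc-runner-set \
    --namespace "${NAMESPACE}" \
    --set githubConfigUrl="https://github.com/<your_enterprise/org/repo>" \
    --set githubConfigSecret=arc-runner-set-secret \
    oci://ghcr.io/actions/actions-runner-controller-charts/gha-runner-scale-set
```

This keeps the token out of your shell history and out of helm release values.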
  4. Check your installation. If everything went well, you should see the following:

    $ helm list -A
    
    NAME            NAMESPACE       REVISION        UPDATED                                 STATUS          CHART                                      APP VERSION
    arc             arc-systems     1               2023-01-18 10:03:36.610534934 +0000 UTC deployed        gha-runner-scale-set-controller-0.4.0      preview
    arc-runner-set  arc-runners     1               2023-01-18 10:20:14.795285645 +0000 UTC deployed        gha-runner-scale-set-0.4.0                 0.4.0
    
    $ kubectl get pods -n arc-systems
    
    NAME                                                  READY   STATUS    RESTARTS   AGE
    arc-gha-runner-scale-set-controller-8c74b6f95-gr7zr   1/1     Running   0          20m
    arc-runner-set-6cd58d58-listener                      1/1     Running   0          21s
    
  5. In a repository, create a simple test workflow as follows. The runs-on value should match the helm installation name you used in the previous step.

    name: Test workflow
    on:
      workflow_dispatch:
    jobs:
      test:
        runs-on: arc-runner-set
        steps:
          - name: Hello world
            run: echo "Hello world"
    
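One way to trigger the workflow_dispatch event (assuming the workflow file above is committed to the repository and the GitHub CLI is installed and authenticated for it):

```shell
# Dispatch the workflow created in the previous step, then watch for the
# ephemeral runner pod to appear. Names match the earlier steps.
gh workflow run "Test workflow"
kubectl get pods -n arc-runners --watch
```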
  6. Run the workflow. You should see the runner pod being created and the workflow being executed.

    $ kubectl get pods -A
    
    NAMESPACE     NAME                                                  READY   STATUS    RESTARTS      AGE
    arc-systems   arc-gha-runner-scale-set-controller-8c74b6f95-gr7zr   1/1     Running   0             27m
    arc-systems   arc-runner-set-6cd58d58-listener                      1/1     Running   0             7m52s
    arc-runners   arc-runner-set-rmrgw-runner-p9p5n                     1/1     Running   0             21s
    

Upgrade to newer versions

Upgrading actions-runner-controller requires a few extra steps because CRDs will not be automatically upgraded (this is a helm limitation).

  1. Uninstall the autoscaling runner set first

    INSTALLATION_NAME="arc-runner-set"
    NAMESPACE="arc-runners"
    helm uninstall "${INSTALLATION_NAME}" --namespace "${NAMESPACE}"
    
  2. Wait for all the pods to drain

  3. Pull the new helm chart, unpack it and update the CRDs. When applying this step, don't forget to replace <PATH> with the path of the gha-runner-scale-set-controller helm chart:

    helm pull oci://ghcr.io/actions/actions-runner-controller-charts/gha-runner-scale-set-controller \
        --untar && \
        kubectl replace -f <PATH>/gha-runner-scale-set-controller/crds/
    
  4. Reinstall actions-runner-controller using the steps from the previous section
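The steps above can be sketched as a single sequence, assuming the release names from the setup section (note that `helm pull --untar` unpacks the chart into ./gha-runner-scale-set-controller in the current directory, which is the <PATH> the step above refers to):

```shell
# Upgrade sequence sketch; adjust names to match your installation.
INSTALLATION_NAME="arc-runner-set"
RUNNER_NAMESPACE="arc-runners"
CONTROLLER_NAMESPACE="arc-systems"

# 1. Uninstall the autoscaling runner set and wait for its pods to drain
helm uninstall "${INSTALLATION_NAME}" --namespace "${RUNNER_NAMESPACE}"
kubectl wait pods -n "${RUNNER_NAMESPACE}" --all --for=delete --timeout=120s

# 2. Pull the new chart and update the CRDs (helm will not upgrade them)
helm pull oci://ghcr.io/actions/actions-runner-controller-charts/gha-runner-scale-set-controller --untar
kubectl replace -f ./gha-runner-scale-set-controller/crds/

# 3. Reinstall the controller, then the runner set per the setup section
helm upgrade --install arc \
    --namespace "${CONTROLLER_NAMESPACE}" \
    oci://ghcr.io/actions/actions-runner-controller-charts/gha-runner-scale-set-controller
```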

Troubleshooting

I'm using the charts from the master branch and the controller is not working

The master branch is highly unstable! We offer no guarantees that the charts in the master branch will work at any given time. If you're using the charts from the master branch, you should expect to encounter issues. Please use the latest release instead.

Controller pod is running but the runner set listener pod is not

First inspect the logs of the controller and look for errors. If there are none and the runner set listener pod is still not running, make sure the controller pod can reach the Kubernetes API server in your cluster.

You'll see something similar to the following in the logs of the controller pod:

$ kubectl logs <controller_pod_name> -c manager
17:35:28.661069       1 request.go:690] Waited for 1.032376652s due to client-side throttling, not priority and fairness, request: GET:https://10.0.0.1:443/apis/monitoring.coreos.com/v1alpha1?timeout=32s
2023-03-15T17:35:29Z    INFO    starting manager

If you have a proxy configured or you're using a sidecar proxy that's automatically injected (think Istio), you need to make sure it's configured appropriately to allow traffic from the controller container (manager) to the Kubernetes API server.

Check the logs

You can check the logs of the controller pod using the following command:

# Controller logs
kubectl logs -n "${NAMESPACE}" -l app.kubernetes.io/name=gha-runner-scale-set-controller
# Runner set listener logs
kubectl logs -n "${NAMESPACE}" -l actions.github.com/scale-set-namespace=arc-systems -l actions.github.com/scale-set-name=arc-runner-set

Naming error: "Name must have up to n characters"

We use some of the generated resource names as labels on other resources. Kubernetes resource names can be up to 253 characters long, while labels are limited to 63 characters. Given this constraint, we have to limit resource names to 63 characters.

Since part of the resource name is defined by you, we have to impose a limit on the number of characters you can use for the installation and namespace names.

If you see these errors, you have to use shorter installation or namespace names.

Error: INSTALLATION FAILED: execution error at (gha-runner-scale-set/templates/autoscalingrunnerset.yaml:5:5): Name must have up to 45 characters

Error: INSTALLATION FAILED: execution error at (gha-runner-scale-set/templates/autoscalingrunnerset.yaml:8:5): Namespace must have up to 63 characters
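A hypothetical pre-flight check (not part of the chart) that mirrors the limits reported by the errors above — release names up to 45 characters, namespaces up to 63 — could catch this before running helm install:

```shell
# check_names: validate installation and namespace name lengths up front.
check_names() {
  name="$1"
  ns="$2"
  if [ "${#name}" -gt 45 ]; then
    echo "error: installation name exceeds 45 characters"
    return 1
  fi
  if [ "${#ns}" -gt 63 ]; then
    echo "error: namespace exceeds 63 characters"
    return 1
  fi
  echo "ok"
}

check_names "arc-runner-set" "arc-runners"   # prints "ok"
```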

If you installed the autoscaling runner set, but the listener pod is not created

Verify that the secret and the githubConfigUrl you provided are correct.

Access to the path /home/runner/_work/_tool is denied error

You might see this error if you're using kubernetes mode with persistent volumes. This happens because the runner container runs as a non-root user, which causes a permissions mismatch with the mounted volume.

To fix this, you can either:

  1. Use a volume type that supports securityContext.fsGroup (hostPath volumes don't support it; local volumes and most other types do). Update the fsGroup of your runner pod to match the GID of the runner user. You can do that by updating the gha-runner-scale-set helm chart values to include the following:

    template:
      spec:
        securityContext:
          fsGroup: 123
        containers:
        - name: runner
          image: ghcr.io/actions/actions-runner:<VERSION> # Replace <VERSION> with the version you want to use
          command: ["/home/runner/run.sh"]
    
  2. If updating the securityContext of your runner pod is not a viable solution, you can workaround the issue by using initContainers to change the mounted volume's ownership, as follows:

    template:
      spec:
        initContainers:
        - name: kube-init
          image: ghcr.io/actions/actions-runner:latest
          command: ["sudo", "chown", "-R", "1001:123", "/home/runner/_work"]
          volumeMounts:
          - name: work
            mountPath: /home/runner/_work
        containers:
        - name: runner
          image: ghcr.io/actions/actions-runner:latest
          command: ["/home/runner/run.sh"]
    

Changelog

v0.4.0

⚠️ Warning

This release contains a major change related to the way permissions are applied to the manager (#2276 and #2363).

Please evaluate these changes carefully before upgrading.

Major changes

  1. Surface EphemeralRunnerSet stats to AutoscalingRunnerSet #2382
  2. Improved security posture by removing list/watch secrets permission from manager cluster role #2276
  3. Improved security posture by delaying role/rolebinding creation to gha-runner-scale-set during installation #2363
  4. Improved security posture by supporting watching a single namespace from the controller #2374
  5. Added labels to AutoscalingRunnerSet subresources to allow easier inspection #2391
  6. Fixed bug preventing env variables from being specified #2450
  7. Enhanced quickstart troubleshooting guides #2435
  8. Fixed a bug so the extra dind container is ignored when the container mode type is "dind" #2418
  9. Added additional cleanup finalizers #2433
  10. gha-runner-scale-set listener pod inherits the ImagePullPolicy from the manager pod #2477
  11. Treat .ghe.com domain as hosted environment #2480

v0.3.0

Major changes

  1. Runner pods are more similar to hosted runners #2348
  2. Add support for self-signed CA certificates #2268
  3. Fixed trailing slashes in config URLs breaking installations #2381
  4. Fixed a bug where the listener pod would ignore proxy settings from env #2366
  5. Added runner set name field making it optionally configurable #2279
  6. Name and namespace labels of listener pod have been split #2341
  7. Added chart name constraints validation on AutoscalingRunnerSet install #2347

v0.2.0

Major changes

  1. Added proxy support for the controller and the runner pods, see the new helm chart fields #2286
  2. Added the ability to provide a pre-defined Kubernetes secret for the autoscaling runner set helm chart #2234
  3. Enhanced security posture by removing un-required permissions for the manager-role #2260
  4. Enhanced our logging by returning an error when a runner group defined in the values file does not exist in GitHub #2215
  5. Fixed helm charts issues that were preventing the use of DinD #2291
  6. Fixed a bug that was preventing runner scale sets from being removed from the backend when they were deleted from the cluster #2255 #2223
  7. Fixed bugs with the helm chart definitions preventing certain values from being set #2222
  8. Fixed a bug that prevented the configuration of a runner group for a runner scale set #2216