diff --git a/README.md b/README.md
index 96ed5712..6b164e8f 100644
--- a/README.md
+++ b/README.md
@@ -473,7 +473,6 @@ Under the hood, `RunnerSet` relies on Kubernetes's `StatefulSet` and Mutating We
 **Limitations**
 
 * For autoscaling the `RunnerSet` kind only supports pull driven scaling or the `workflow_job` event for webhook driven scaling.
-* Whilst `RunnerSets` support all runner modes as well as autoscaling, currently PVs are **NOT** automatically cleaned up as they are still bound to their respective PVCs when a runner is deleted by the controller. This has **major** implications when using `RunnerSets` in the standard runner mode, `ephemeral: true`, see [persistent runners](#persistent-runners) for more details. As a result of this, using the default ephemeral configuration or implementing autoscaling for your `RunnerSets`, you will get a build-up of PVCs and PVs without some sort of custom solution for cleaning up.
 
 ### Persistent Runners
 
@@ -1168,7 +1167,8 @@ spec:
 You can configure your own custom volume mounts. For example to have the work/docker data in memory or on NVME SSD, for
 i/o intensive builds. Other custom volume mounts should be possible as well, see [kubernetes documentation](https://kubernetes.io/docs/concepts/storage/volumes/)
 
-**RAM Disk Runner**
+#### RAM Disk
+
 Example how to place the runner work dir, docker sidecar and /tmp within the runner onto a ramdisk.
 ```yaml
 kind: RunnerDeployment
@@ -1194,7 +1194,8 @@ spec:
       emphemeral: true # recommended to not leak data between builds.
 ```
 
-**NVME SSD Runner**
+#### NVME SSD
+
 In this example we provide NVME backed storage for the workdir, docker sidecar and /tmp within the runner.
 Here we use a working example on GKE, which will provide the NVME disk at /mnt/disks/ssd0. We will be placing the respective volumes in subdirs here and in order to be able to run multiple runners we will use the pod name as a prefix for subdirectories. Also the disk will fill up over time and disk space will not be freed until the node is removed.
 
@@ -1242,6 +1243,125 @@ spec:
       emphemeral: true # VERY important. otherwise data inside the workdir and /tmp is not cleared between builds
 ```
 
+#### Docker image layers caching
+
+> **Note**: Ensure that the volume mount is added to the container that is running the Docker daemon.
+
+`docker` stores pulled and built image layers in the [daemon's (not the client's)](https://docs.docker.com/get-started/overview/#docker-architecture) [local storage area](https://docs.docker.com/storage/storagedriver/#sharing-promotes-smaller-images), which is usually `/var/lib/docker`.
+
+By leveraging `RunnerSet`'s dynamic PV provisioning feature and your CSI driver, you can let ARC maintain a pool of PVs that are
+reused across runner pods to retain `/var/lib/docker`.
+
+_Be sure to add the volume mount to the container that is supposed to run the docker daemon._
+
+By default, ARC creates a sidecar container named `docker` within the runner pod for running the docker daemon.
In that case,
+add the volume mount to that container, so that the manifest looks like:
+
+```yaml
+kind: RunnerSet
+metadata:
+  name: example
+spec:
+  template:
+    spec:
+      containers:
+      - name: docker
+        volumeMounts:
+        - name: var-lib-docker
+          mountPath: /var/lib/docker
+  volumeClaimTemplates:
+  - metadata:
+      name: var-lib-docker
+    spec:
+      accessModes:
+      - ReadWriteOnce
+      resources:
+        requests:
+          storage: 10Mi
+      storageClassName: var-lib-docker
+```
+
+With `dockerdWithinRunnerContainer: true`, you need to add the volume mount to the `runner` container.
+
+#### Go module and build caching
+
+By default, `Go` caches builds under `$HOME/.cache/go-build` and downloaded modules under `$HOME/go/pkg/mod`.
+The module cache dir can be customized by setting `GOMODCACHE`, so by setting it to somewhere under `$HOME/.cache`,
+we can have a single PV host both the build and module caches, which might reduce Go module download and build times.
+
+```yaml
+kind: RunnerSet
+metadata:
+  name: example
+spec:
+  template:
+    spec:
+      containers:
+      - name: runner
+        env:
+        - name: GOMODCACHE
+          value: "/home/runner/.cache/go-mod"
+        volumeMounts:
+        - name: cache
+          mountPath: "/home/runner/.cache"
+  volumeClaimTemplates:
+  - metadata:
+      name: cache
+    spec:
+      accessModes:
+      - ReadWriteOnce
+      resources:
+        requests:
+          storage: 10Mi
+      storageClassName: cache
+```
+
+#### PV-backed runner work directory
+
+ARC works by automatically creating runner pods for running [`actions/runner`](https://github.com/actions/runner) and [running `config.sh`](https://docs.github.com/en/actions/hosting-your-own-runners/adding-self-hosted-runners#adding-a-self-hosted-runner-to-a-repository), which you would otherwise have to run manually.
+
+`config.sh` is the script provided by `actions/runner` to pre-configure the runner process before it is started. One of the options provided by `config.sh` is `--work`,
+which specifies the working directory in which the runner runs your workflow jobs.
+
+The volume and partition that host the work directory should have several, or even dozens of, GBs of free space for your workflow jobs to use.
+
+By default, ARC uses `/runner/_work` as the work directory, which is backed by Kubernetes's `emptyDir`. [`emptyDir` is usually backed by a directory created within a host's volume](https://kubernetes.io/docs/concepts/storage/volumes/#emptydir), somewhere under `/var/lib/kubelet/pods`. Therefore
+your host's volume that is backing `/var/lib/kubelet/pods` must have enough free space to serve all the concurrent runner pods that might be deployed onto your host at the same time.
+
+So if you see a job failure seemingly due to "disk full", you very likely need to reconfigure your host to have more free space.
+
+If you can't rely on the host's volume, consider using `RunnerSet` and backing the work directory with an ephemeral PV.
+
+Kubernetes 1.23 or later supports [generic ephemeral volumes](https://kubernetes.io/docs/concepts/storage/ephemeral-volumes/#generic-ephemeral-volumes), which are designed for this exact use case. They are defined in the Pod spec API, so they aren't currently available for `RunnerDeployment`. `RunnerSet` is based on Kubernetes' `StatefulSet`, which mostly embeds the Pod spec under `spec.template.spec`, so you can use them there.
+
+```yaml
+kind: RunnerSet
+metadata:
+  name: example
+spec:
+  template:
+    spec:
+      containers:
+      - name: runner
+        volumeMounts:
+        - mountPath: /runner/_work
+          name: work
+      - name: docker
+        volumeMounts:
+        - mountPath: /runner/_work
+          name: work
+      volumes:
+      - name: work
+        ephemeral:
+          volumeClaimTemplate:
+            spec:
+              accessModes: [ "ReadWriteOnce" ]
+              storageClassName: "runner-work-dir"
+              resources:
+                requests:
+                  storage: 10Gi
+```
+
 ### Runner Labels
 
 To run a workflow job on a self-hosted runner, you can use the following syntax in your workflow: