doc: Use RunnerSet to retain various cache by leveraging PV (#1464)
* doc: Use RunnerSet to retain various cache

  In relation to #1286 and as a follow-up for #1340

* docs: clarify client vs daemon
* docs: better wording
* Separate RunnerSet examples for docker image layer caching
* Revert changes on testdata as it is going to be added via #1471 instead
* Update README.md

  Co-authored-by: Callum Tait <15716903+toast-gear@users.noreply.github.com>

* fixup! Update README.md
* Remove the outdated RunnerSet limitation

Co-authored-by: Callum Tait <15716903+toast-gear@users.noreply.github.com>
parent
c7eea169ad
commit
ef3313d147
126
README.md
|
|
@ -473,7 +473,6 @@ Under the hood, `RunnerSet` relies on Kubernetes's `StatefulSet` and Mutating We
|
||||||
**Limitations**
|
**Limitations**
|
||||||
|
|
||||||
* For autoscaling the `RunnerSet` kind only supports pull driven scaling or the `workflow_job` event for webhook driven scaling.
|
* For autoscaling the `RunnerSet` kind only supports pull driven scaling or the `workflow_job` event for webhook driven scaling.
|
||||||
* Whilst `RunnerSets` support all runner modes as well as autoscaling, currently PVs are **NOT** automatically cleaned up as they are still bound to their respective PVCs when a runner is deleted by the controller. This has **major** implications when using `RunnerSets` in the standard runner mode, `ephemeral: true`, see [persistent runners](#persistent-runners) for more details. As a result of this, using the default ephemeral configuration or implementing autoscaling for your `RunnerSets`, you will get a build-up of PVCs and PVs without some sort of custom solution for cleaning up.
|
|
||||||
|
|
||||||
### Persistent Runners
|
### Persistent Runners
|
||||||
|
|
||||||
|
|
@ -1168,7 +1167,8 @@ spec:
|
||||||
You can configure your own custom volume mounts. For example to have the work/docker data in memory or on NVME SSD, for
|
You can configure your own custom volume mounts. For example to have the work/docker data in memory or on NVME SSD, for
|
||||||
i/o intensive builds. Other custom volume mounts should be possible as well, see [kubernetes documentation](https://kubernetes.io/docs/concepts/storage/volumes/)
|
i/o intensive builds. Other custom volume mounts should be possible as well, see [kubernetes documentation](https://kubernetes.io/docs/concepts/storage/volumes/)
|
||||||
|
|
||||||
**RAM Disk Runner**<br />
|
#### RAM Disk
|
||||||
|
|
||||||
Example how to place the runner work dir, docker sidecar and /tmp within the runner onto a ramdisk.
|
Example how to place the runner work dir, docker sidecar and /tmp within the runner onto a ramdisk.
|
||||||
```yaml
|
```yaml
|
||||||
kind: RunnerDeployment
|
kind: RunnerDeployment
|
||||||
|
|
@ -1194,7 +1194,8 @@ spec:
|
||||||
ephemeral: true # recommended to not leak data between builds.
|
ephemeral: true # recommended to not leak data between builds.
|
||||||
```
|
```
|
||||||
|
|
||||||
**NVME SSD Runner**<br />
|
#### NVME SSD
|
||||||
|
|
||||||
In this example we provide NVME backed storage for the workdir, docker sidecar and /tmp within the runner.
|
In this example we provide NVME backed storage for the workdir, docker sidecar and /tmp within the runner.
|
||||||
Here we use a working example on GKE, which will provide the NVME disk at /mnt/disks/ssd0. We will be placing the respective volumes in subdirs here and in order to be able to run multiple runners we will use the pod name as a prefix for subdirectories. Also the disk will fill up over time and disk space will not be freed until the node is removed.
|
Here we use a working example on GKE, which will provide the NVME disk at /mnt/disks/ssd0. We will be placing the respective volumes in subdirs here and in order to be able to run multiple runners we will use the pod name as a prefix for subdirectories. Also the disk will fill up over time and disk space will not be freed until the node is removed.
|
||||||
|
|
||||||
|
|
@ -1242,6 +1243,125 @@ spec:
|
||||||
ephemeral: true # VERY important. otherwise data inside the workdir and /tmp is not cleared between builds
|
ephemeral: true # VERY important. otherwise data inside the workdir and /tmp is not cleared between builds
|
||||||
```
|
```
|
||||||
|
|
||||||
|
#### Docker image layers caching
|
||||||
|
|
||||||
|
> **Note**: Ensure that the volume mount is added to the container that is running the Docker daemon.
|
||||||
|
|
||||||
|
`docker` stores pulled and built image layers in the [daemon's (not the client's)](https://docs.docker.com/get-started/overview/#docker-architecture) [local storage area](https://docs.docker.com/storage/storagedriver/#sharing-promotes-smaller-images), which is usually at `/var/lib/docker`.
|
||||||
|
|
||||||
|
By leveraging RunnerSet's dynamic PV provisioning feature and your CSI driver, you can let ARC maintain a pool of PVs that are reused across runner pods to retain `/var/lib/docker`.
|
||||||
|
|
||||||
|
_Be sure to add the volume mount to the container that is supposed to run the docker daemon._
|
||||||
|
|
||||||
|
By default, ARC creates a sidecar container named `docker` within the runner pod to run the docker daemon. In that case, the volume mount belongs on the `docker` container, so the manifest looks like:
|
||||||
|
|
||||||
|
```yaml
kind: RunnerSet
metadata:
  name: example
spec:
  template:
    spec:
      containers:
      # The `docker` sidecar runs the Docker daemon, so the volume is mounted here.
      - name: docker
        volumeMounts:
        - name: var-lib-docker
          mountPath: /var/lib/docker
  # One PVC per runner pod is created from this template.
  volumeClaimTemplates:
  - metadata:
      name: var-lib-docker
    spec:
      accessModes:
      - ReadWriteOnce
      resources:
        requests:
          storage: 10Mi
      storageClassName: var-lib-docker
```
|
||||||
|
|
||||||
|
With `dockerdWithinRunnerContainer: true`, you need to add the volume mount to the `runner` container.
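
As a rough sketch of that variant, the manifest might look like the following. It assumes the same `var-lib-docker` volume and storage class names used in the sidecar example above, and that your runner image is one that can run the docker daemon itself; adjust both to your setup.

```yaml
kind: RunnerSet
metadata:
  name: example
spec:
  # No `docker` sidecar is created; dockerd runs inside the `runner` container.
  dockerdWithinRunnerContainer: true
  template:
    spec:
      containers:
      # The daemon lives in this container, so the volume is mounted here.
      - name: runner
        volumeMounts:
        - name: var-lib-docker
          mountPath: /var/lib/docker
  volumeClaimTemplates:
  - metadata:
      name: var-lib-docker
    spec:
      accessModes:
      - ReadWriteOnce
      resources:
        requests:
          storage: 10Mi # example value carried over from the sidecar example
      storageClassName: var-lib-docker # assumed to exist in your cluster
```

The only differences from the sidecar example are the `dockerdWithinRunnerContainer` flag and the container that carries the mount.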
|
||||||
|
|
||||||
|
#### Go module and build caching
|
||||||
|
|
||||||
|
Go caches builds under `$HOME/.cache/go-build` and downloaded modules under `$HOME/go/pkg/mod` by default. The module cache directory can be customized by setting `GOMODCACHE`, so by pointing it somewhere under `$HOME/.cache`, a single PV can host both the build cache and the module cache, which might reduce Go module download and build times.
|
||||||
|
|
||||||
|
```yaml
kind: RunnerSet
metadata:
  name: example
spec:
  template:
    spec:
      containers:
      - name: runner
        env:
        # Move the module cache under $HOME/.cache so one volume covers both caches.
        - name: GOMODCACHE
          value: "/home/runner/.cache/go-mod"
        volumeMounts:
        - name: cache
          mountPath: "/home/runner/.cache"
  # One PVC per runner pod is created from this template.
  volumeClaimTemplates:
  - metadata:
      name: cache
    spec:
      accessModes:
      - ReadWriteOnce
      resources:
        requests:
          storage: 10Mi
      storageClassName: cache
```
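
To check that the caches actually land on the PV, a workflow step along these lines can print the effective cache locations. This is only a sketch: the workflow name, job name, and `runs-on` label are placeholders, and it assumes the Go toolchain is already installed on the runner image.

```yaml
# Hypothetical workflow file, e.g. .github/workflows/go-cache-check.yaml
name: go-cache-check
on: workflow_dispatch
jobs:
  check:
    runs-on: self-hosted # placeholder; use the labels of your RunnerSet
    steps:
    # Prints the build cache and module cache directories Go will use.
    - run: go env GOCACHE GOMODCACHE
```

If the mount and the `GOMODCACHE` variable took effect, both printed paths should sit under `/home/runner/.cache`.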
|
||||||
|
|
||||||
|
#### PV-backed runner work directory
|
||||||
|
|
||||||
|
ARC works by automatically creating runner pods for running [`actions/runner`](https://github.com/actions/runner) and [running `config.sh`](https://docs.github.com/en/actions/hosting-your-own-runners/adding-self-hosted-runners#adding-a-self-hosted-runner-to-a-repository), which you would otherwise have to run manually without ARC.
|
||||||
|
|
||||||
|
`config.sh` is the script provided by `actions/runner` to pre-configure the runner process before it is started. One of the options provided by `config.sh` is `--work`, which specifies the working directory in which the runner runs your workflow jobs.
|
||||||
|
|
||||||
|
The volume and the partition that host the work directory should have several to dozens of GB of free space, since your workflow jobs might use that much.
|
||||||
|
|
||||||
|
By default, ARC uses `/runner/_work` as the work directory, which is backed by a Kubernetes `emptyDir` volume. [`emptyDir` is usually backed by a directory created within a host's volume](https://kubernetes.io/docs/concepts/storage/volumes/#emptydir), somewhere under `/var/lib/kubelet/pods` with the default kubelet root directory. Therefore, the host volume backing `/var/lib/kubelet/pods` must have enough free space to serve all the runner pods that might be deployed onto that host at the same time.
|
||||||
|
|
||||||
|
So, in case you see a job failure seemingly due to "disk full", it's very likely you need to reconfigure your host to have more free space.
|
||||||
|
|
||||||
|
In case you can't rely on the host's volume, consider using `RunnerSet` and backing the work directory with an ephemeral PV.
|
||||||
|
|
||||||
|
Kubernetes 1.23 or later supports [generic ephemeral volumes](https://kubernetes.io/docs/concepts/storage/ephemeral-volumes/#generic-ephemeral-volumes), which are designed for this exact use case. The feature is defined in the Pod spec API, so it isn't currently available for `RunnerDeployment`. `RunnerSet` is based on Kubernetes' `StatefulSet`, which embeds most of the Pod spec under `spec.template.spec`, so you can declare the ephemeral volume there.
|
||||||
|
|
||||||
|
```yaml
kind: RunnerSet
metadata:
  name: example
spec:
  template:
    spec:
      containers:
      - name: runner
        volumeMounts:
        - mountPath: /runner/_work
          name: work
      # The docker sidecar mounts the same volume at the same path.
      - name: docker
        volumeMounts:
        - mountPath: /runner/_work
          name: work
      volumes:
      # Generic ephemeral volume: the PVC is created and deleted with the pod.
      - name: work
        ephemeral:
          volumeClaimTemplate:
            spec:
              accessModes: [ "ReadWriteOnce" ]
              storageClassName: "runner-work-dir"
              resources:
                requests:
                  storage: 10Gi
```
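
The manifest above refers to a storage class named `runner-work-dir`, which has to exist in your cluster. A minimal sketch of such a class is shown below; the provisioner is an assumption (the AWS EBS CSI driver is used purely as an illustration), so substitute whichever CSI driver your cluster runs.

```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: runner-work-dir
# Assumption: replace with the provisioner name of your own CSI driver.
provisioner: ebs.csi.aws.com
# Delete the backing volume once the ephemeral PVC is removed with the pod.
reclaimPolicy: Delete
# Provision the volume only after the pod is scheduled, so it is created
# in a location the pod's node can reach.
volumeBindingMode: WaitForFirstConsumer
```

`WaitForFirstConsumer` is a design choice rather than a requirement; it helps avoid provisioning a volume in a zone where the runner pod cannot be scheduled.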
|
||||||
|
|
||||||
### Runner Labels
|
### Runner Labels
|
||||||
|
|
||||||
To run a workflow job on a self-hosted runner, you can use the following syntax in your workflow:
|
To run a workflow job on a self-hosted runner, you can use the following syntax in your workflow:
|
||||||
|
|
|
||||||