doc: Use RunnerSet to retain various cache by leveraging PV (#1464)
* doc: Use RunnerSet to retain various cache

  In relation to #1286 and as a follow-up for #1340

* docs: clarify client vs daemon

* docs: better wording

* Separate RunnerSet examples for docker image layer caching

* Revert changes on testdata as it is going to be added via #1471 instead

* Update README.md

  Co-authored-by: Callum Tait <15716903+toast-gear@users.noreply.github.com>

* fixup! Update README.md

* Remove the outdated RunnerSet limitation

Co-authored-by: Callum Tait <15716903+toast-gear@users.noreply.github.com>
This commit is contained in:

parent c7eea169ad
commit ef3313d147

126  README.md
@@ -473,7 +473,6 @@ Under the hood, `RunnerSet` relies on Kubernetes's `StatefulSet` and Mutating We
**Limitations**

* For autoscaling, the `RunnerSet` kind only supports pull-driven scaling or the `workflow_job` event for webhook-driven scaling.
* Whilst `RunnerSets` support all runner modes as well as autoscaling, currently PVs are **NOT** automatically cleaned up, as they are still bound to their respective PVCs when a runner is deleted by the controller. This has **major** implications when using `RunnerSets` in the standard runner mode, `ephemeral: true`; see [persistent runners](#persistent-runners) for more details. As a result, if you use the default ephemeral configuration or implement autoscaling for your `RunnerSets`, you will get a build-up of PVCs and PVs unless you add some custom clean-up solution.

### Persistent Runners

@@ -1168,7 +1167,8 @@ spec:
You can configure your own custom volume mounts, for example to keep the work/docker data in memory or on an NVME SSD for
I/O-intensive builds. Other custom volume mounts should be possible as well; see the [Kubernetes documentation](https://kubernetes.io/docs/concepts/storage/volumes/).

**RAM Disk Runner**<br />
#### RAM Disk

Example of how to place the runner work dir, docker sidecar, and /tmp within the runner onto a RAM disk.
```yaml
kind: RunnerDeployment
@@ -1194,7 +1194,8 @@ spec:
      ephemeral: true # recommended to avoid leaking data between builds
```

**NVME SSD Runner**<br />
#### NVME SSD

In this example we provide NVME-backed storage for the work dir, docker sidecar, and /tmp within the runner.
This is a working example on GKE, which provides the NVME disk at `/mnt/disks/ssd0`. We place the respective volumes in subdirectories there, and to be able to run multiple runners we use the pod name as a prefix for the subdirectories. Note that the disk will fill up over time and disk space will not be freed until the node is removed.

@@ -1242,6 +1243,125 @@ spec:
    ephemeral: true # VERY important. Otherwise data inside the work dir and /tmp is not cleared between builds
```

#### Docker image layer caching

> **Note**: Ensure that the volume mount is added to the container that is running the Docker daemon.

`docker` stores pulled and built image layers in the [daemon's (not the client's)](https://docs.docker.com/get-started/overview/#docker-architecture) [local storage area](https://docs.docker.com/storage/storagedriver/#sharing-promotes-smaller-images), which is usually at `/var/lib/docker`.

By leveraging RunnerSet's dynamic PV provisioning feature and your CSI driver, you can let ARC maintain a pool of PVs that are
reused across runner pods to retain `/var/lib/docker`.

_Be sure to add the volume mount to the container that is supposed to run the docker daemon._

By default, ARC creates a sidecar container named `docker` within the runner pod to run the docker daemon. In that case, the `docker` container is the one that needs the volume mount, so the manifest looks like:

```yaml
kind: RunnerSet
metadata:
  name: example
spec:
  template:
    spec:
      containers:
      - name: docker
        volumeMounts:
        - name: var-lib-docker
          mountPath: /var/lib/docker
  volumeClaimTemplates:
  - metadata:
      name: var-lib-docker
    spec:
      accessModes:
      - ReadWriteOnce
      resources:
        requests:
          storage: 10Mi
      storageClassName: var-lib-docker
```

With `dockerdWithinRunnerContainer: true`, you need to add the volume mount to the `runner` container instead, as shown in the sketch below.

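A minimal sketch of that variant, reusing the same `var-lib-docker` claim template and assuming `dockerdWithinRunnerContainer` sits at the `RunnerSet` spec level as in other `RunnerSet` examples (the name `example-dind` is illustrative):

```yaml
kind: RunnerSet
metadata:
  name: example-dind
spec:
  # dockerd runs inside the runner container in this mode,
  # so /var/lib/docker must be mounted into the runner container
  dockerdWithinRunnerContainer: true
  template:
    spec:
      containers:
      - name: runner
        volumeMounts:
        - name: var-lib-docker
          mountPath: /var/lib/docker
  volumeClaimTemplates:
  - metadata:
      name: var-lib-docker
    spec:
      accessModes:
      - ReadWriteOnce
      resources:
        requests:
          storage: 10Mi
      storageClassName: var-lib-docker
```
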
#### Go module and build caching

`Go` caches builds under `$HOME/.cache/go-build` and downloaded modules under `$HOME/go/pkg/mod`.
The module cache dir can be customized by setting `GOMODCACHE`, so by pointing it to somewhere under `$HOME/.cache`,
we can have a single PV host both the build and module caches, which might improve Go module download and build times.

```yaml
kind: RunnerSet
metadata:
  name: example
spec:
  template:
    spec:
      containers:
      - name: runner
        env:
        - name: GOMODCACHE
          value: "/home/runner/.cache/go-mod"
        volumeMounts:
        - name: cache
          mountPath: "/home/runner/.cache"
  volumeClaimTemplates:
  - metadata:
      name: cache
    spec:
      accessModes:
      - ReadWriteOnce
      resources:
        requests:
          storage: 10Mi
      storageClassName: cache
```

#### PV-backed runner work directory

ARC works by automatically creating runner pods that run [`actions/runner`](https://github.com/actions/runner) and [run `config.sh`](https://docs.github.com/en/actions/hosting-your-own-runners/adding-self-hosted-runners#adding-a-self-hosted-runner-to-a-repository), which you would otherwise have to run manually without ARC.

`config.sh` is the script provided by `actions/runner` to pre-configure the runner process before it is started. One of the options provided by `config.sh` is `--work`,
which specifies the working directory where the runner runs your workflow jobs.

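As an illustration only, here is a minimal sketch of relocating that directory via the runner spec's `workDir` field, assuming your ARC version exposes it (it is forwarded to `config.sh --work`); the name and repository are made up for the example:

```yaml
kind: RunnerDeployment
metadata:
  name: example-custom-workdir
spec:
  template:
    spec:
      repository: example/myrepo      # illustrative repository
      workDir: /runner/custom-work    # forwarded to config.sh --work; ARC's default is /runner/_work
```
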
The volume and the partition that hosts the work directory should have several to dozens of GB of free space that might be used by your workflow jobs.

By default, ARC uses `/runner/_work` as the work directory, which is backed by Kubernetes's `emptyDir`. [`emptyDir` is usually backed by a directory created within a host's volume](https://kubernetes.io/docs/concepts/storage/volumes/#emptydir), somewhere under `/var/lib/kubelet/pods`. Therefore
the host volume backing `/var/lib/kubelet/pods` must have enough free space to serve all the concurrent runner pods that might be deployed onto your host at the same time.

So, if you see a job failure that seems to be due to "disk full", it's very likely that you need to reconfigure your host to have more free space.

If you can't rely on the host's volume, consider using `RunnerSet` and backing the work directory with an ephemeral PV.

Kubernetes 1.23 or greater supports [generic ephemeral volumes](https://kubernetes.io/docs/concepts/storage/ephemeral-volumes/#generic-ephemeral-volumes), which are designed for exactly this use-case. They are defined in the Pod spec API, so they aren't currently available for `RunnerDeployment`. `RunnerSet`, however, is based on Kubernetes' `StatefulSet`, which mostly embeds the Pod spec under `spec.template.spec`, so you can define them there:

```yaml
kind: RunnerSet
metadata:
  name: example
spec:
  template:
    spec:
      containers:
      - name: runner
        volumeMounts:
        - mountPath: /runner/_work
          name: work
      - name: docker
        volumeMounts:
        - mountPath: /runner/_work
          name: work
      volumes:
      - name: work
        ephemeral:
          volumeClaimTemplate:
            spec:
              accessModes: [ "ReadWriteOnce" ]
              storageClassName: "runner-work-dir"
              resources:
                requests:
                  storage: 10Gi
```

### Runner Labels

To run a workflow job on a self-hosted runner, you can use the following syntax in your workflow:
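For example, a minimal sketch using standard GitHub Actions workflow syntax, where `custom-runner` stands in for whatever label you configured on your runners:

```yaml
jobs:
  build:
    # target self-hosted runners that also carry the custom label
    runs-on: [self-hosted, custom-runner]
    steps:
    - uses: actions/checkout@v3
    - run: echo "Running on a self-hosted ARC runner"
```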