# Summary
- add lifecycle, terminationGracePeriodSeconds, and loadBalancerSourceRanges to metrics server
- these were missed when copying from the other webhook server
- original PR adding them to the other webhook server is here https://github.com/actions/actions-runner-controller/pull/2305
Co-authored-by: Yusuke Kuoka <ykuoka@gmail.com>
In response to https://github.com/actions/actions-runner-controller/issues/2212, the ARC helm chart is missing the ClusterRoleBinding and ClusterRole for the ActionsMetricsServer, resulting in missing permissions.
This also fixes the labels of the ActionsMetricsServer Service, as the ServiceMonitor selects it by those labels.
Co-authored-by: Yusuke Kuoka <ykuoka@gmail.com>
Some requests send the HTTP method in lowercase (verified with curl, and it is the default for AWS ALB health check requests), but the Go HTTP library constant `MethodGet` is uppercase.
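A minimal sketch of a case-insensitive method check that would accept such requests. The helper name `isGetMethod` is hypothetical and is not taken from the ARC code base; it only illustrates comparing against `http.MethodGet` without regard to case.

```go
package main

import (
	"fmt"
	"net/http"
	"strings"
)

// isGetMethod reports whether method is GET, ignoring case.
// Hypothetical helper for illustration; not the actual ARC implementation.
func isGetMethod(method string) bool {
	return strings.EqualFold(method, http.MethodGet)
}

func main() {
	for _, m := range []string{"GET", "get", "Get", "POST"} {
		fmt.Printf("%q is GET: %v\n", m, isGetMethod(m))
	}
}
```

A plain `r.Method == http.MethodGet` comparison would reject the lowercase `get` sent by such clients, which is the mismatch described above.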
I observed that 100% of canceled jobs in my runner pool were not causing scale down events. This PR fixes that.
The problem was caused by #2119.
#2119 ignores certain webhook events in order to fix #2118. However, #2119 overdoes it and filters out valid job cancellation events. This PR uses stricter filtering and adds visibility for future troubleshooting.
<details><summary>Example cancellation event</summary>
This is the redacted top portion of a valid cancellation event my runner pool received and ignored.
```json
{
"action": "completed",
"workflow_job": {
"id": 12848997134,
"run_id": 4738060033,
"workflow_name": "slack-notifier",
"head_branch": "auto-update/slack-notifier-0.5.1",
"run_url": "https://api.github.com/repos/nuru/<redacted>/actions/runs/4738060033",
"run_attempt": 1,
"node_id": "CR_kwDOB8Xtbc8AAAAC_dwjDg",
"head_sha": "55bada8f3d0d3e12a510a1bf34d0c3e169b65f89",
"url": "https://api.github.com/repos/nuru/<redacted>/actions/jobs/12848997134",
"html_url": "https://github.com/nuru/<redacted>/actions/runs/4738060033/jobs/8411515430",
"status": "completed",
"conclusion": "cancelled",
"created_at": "2023-04-19T00:03:12Z",
"started_at": "2023-04-19T00:03:42Z",
"completed_at": "2023-04-19T00:03:42Z",
"name": "build (arm64)",
"steps": [
],
"check_run_url": "https://api.github.com/repos/nuru/<redacted>/check-runs/12848997134",
"labels": [
"self-hosted",
"arm64"
],
"runner_id": 0,
"runner_name": "",
"runner_group_id": 0,
"runner_group_name": ""
},
```
</details>
Starting with ARC v0.27.2, we changed the `docker.sock` path from `/var/run/docker.sock` to `/var/run/docker/docker.sock`. That broke some container-based actions because the `docker.sock` path is hard-coded in various places.
Even `actions/runner` seems to use `/var/run/docker.sock` for building container-based actions and for service containers.
This PR fixes that by moving the sock file back to its previous location.
Once this is merged, users stuck on ARC v0.27.1 (including those who upgraded to v0.27.2 or v0.27.3 and then reverted to v0.27.1 because of #2519) should be able to upgrade to the upcoming v0.27.4.
Resolves #2519
Resolves #2538
#2490 has been happening since v0.27.2 for non-dind runners based on Ubuntu 20.04 runner images. It does not affect Ubuntu 22.04 non-dind runners (i.e. runners with dockerd sidecars) or Ubuntu 20.04/22.04 dind runners (i.e. runners without dockerd sidecars). However, since many folks are presumably still using Ubuntu 20.04 non-dind runners, we should fix it.
This change tries to fix it by defaulting to the docker group id 1001 used by Ubuntu 20.04 runners, and using gid 121 for Ubuntu 22.04 runners. We use the image tag to determine which Ubuntu version the runner is based on. The algorithm is simple: we assume the image is Ubuntu-22.04-based if the image tag contains "22.04".
This might be a breaking change for folks who have already upgraded to Ubuntu 22.04 runners using their own custom runner images. Note again that we rely on the image tag to detect Ubuntu 22.04 runner images and use the proper docker gid, so folks using our official Ubuntu 22.04 runner images are not affected. Since it is a breaking change for the others, I have added a remedy:
ARC got a new flag, `--docker-gid`, which defaults to `1001` but can be set to `121` or whatever gid the operator/admin likes. This can be set to `--docker-gid=121`, for example, if you are using your own custom runner image based on Ubuntu 22.04 and the image tag does not contain "22.04".
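The selection logic described above can be sketched roughly as follows. The function names `imageTag` and `dockerGIDFor` and the example image references are hypothetical and not taken from the ARC code base. The sketch also assumes that a "22.04" tag always selects gid 121 and that the `--docker-gid` value applies otherwise; the actual precedence in ARC may differ.

```go
package main

import (
	"fmt"
	"strings"
)

// imageTag returns the tag part of an image reference, or "" if it has none.
// A colon before the last slash is a registry port, not a tag separator.
func imageTag(image string) string {
	// Drop a digest suffix such as "@sha256:..." if present.
	if i := strings.Index(image, "@"); i >= 0 {
		image = image[:i]
	}
	lastSlash := strings.LastIndex(image, "/")
	lastColon := strings.LastIndex(image, ":")
	if lastColon > lastSlash {
		return image[lastColon+1:]
	}
	return ""
}

// dockerGIDFor picks the docker group id for a runner image.
// defaultGID stands in for the value of --docker-gid (1001 by default).
// Assumption: a tag containing "22.04" always maps to gid 121.
func dockerGIDFor(image string, defaultGID int) int {
	if strings.Contains(imageTag(image), "22.04") {
		return 121
	}
	return defaultGID
}

func main() {
	fmt.Println(dockerGIDFor("example/runner:ubuntu-22.04", 1001))
	fmt.Println(dockerGIDFor("example/runner:ubuntu-20.04", 1001))
	// Custom 22.04-based image whose tag lacks "22.04": rely on --docker-gid=121.
	fmt.Println(dockerGIDFor("example/custom-runner:latest", 121))
}
```

The last call corresponds to the custom-image case: the tag gives no hint about the Ubuntu version, so the gid comes entirely from the flag value.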
Fixes #2490