Describe slow/failure to boot dind sidecar in TROUBLESHOOTING.md (#1972)

* Update TROUBLESHOOTING.md

Describe solution for slowly starting dind containers

* Fix typeos

* Fix typeos
This commit is contained in:
Jakub Bielawski 2022-11-03 12:58:40 +01:00 committed by GitHub
parent 2110fa5122
commit 362b6c10a3
No known key found for this signature in database
GPG Key ID: 4AEE18F83AFDEB23
1 changed files with 10 additions and 0 deletions

View File

@ -290,3 +290,13 @@ HRA doesn't scale the RunnerDeployment to zero, even though you did configure HR
You very likely have some dangling workflow jobs stuck in `queued` or `in_progress` as seen in [#1057](https://github.com/actions-runner-controller/actions-runner-controller/issues/1057#issuecomment-1133439061).
Manually call [the "list workflow runs" API](https://docs.github.com/en/rest/actions/workflow-runs#list-workflow-runs-for-a-repository), and [remove the dangling workflow job(s)](https://docs.github.com/en/rest/actions/workflow-runs#delete-a-workflow-run).
## Slow / failure to boot dind sidecar (default runner)
**Problem**
If you noticed that it takes several minutes for sidecar dind container to be created or it exits with with error just after being created it might indicate that you are experiencing disk performance issue. You might see message `failed to reserve container name` when scaling up multiple runners at once. When you ssh on kubernetes node that problematic pods were scheduled on you can use tools like `atop`, `htop` or `iotop` to check IO usage and cpu time percentage used on iowait. If you see that disk usage is high (80-100%) and iowaits are taking a significant chunk of you cpu time (normally it should not be higher than 10%) it means that performance is being bottlenecked by slow disk.
**Solution**
The solution is to switch to using faster storage, if you are experiencing this issue you are probably using hdd, switch to ssh fixed the problem in my case. Most cloud providers have a list of storage options to use just pick something faster that your current disk, for on prem clusters you will need to invest in some ssds.