actions-runner-controller

History

Yusuke Kuoka 686d40c20d Merge pull request #1127 from actions-runner-controller/github-api-cache Enhances ARC(both the controller-manager and github-webhook-server) to cache any GitHub API responses with HTTP GET and an appropriate Cache-Control header. Ref #920 ## Cache Implementation `gregjones/httpcache` has been chosen as a library to implement this feature, as it is as recommended in `go-github`'s documentation: https://github.com/google/go-github#conditional-requests `gregjones/httpcache` supports a number of cache backends like `diskcache`, `s3cache`, and so on: https://github.com/gregjones/httpcache#cache-backends We stick to the built-in in-memory cache as a starter. Probably this will never becomes an issue as long as various HTTP responses for all the GitHub API calls that ARC makes, list-runners, list-workflow-jobs, list-runner-groups, etc., doesn't overflow the in-memory cache. `httpcache` has an known unfixed issue that it doesn't update cache on chunked responses. But we assume that the APIs that we call doesn't use chunked responses. See #1503 for more information on that. ## Ephemeral runner pods are no longer recreated The addition of the cache layer resulted in a slow down of a scale-down process and a trade-off between making the runner pod termination process fragile to various race conditions(shorter grace period before runner deletion) or delaying runner pod deletion depending on how long the grace period is(longer grace period). A grace period needs to be at least longer than 60s (which is the same as cache duration of ListRunners API) to not prematurely delete a runner pod that was just created. But once I disabled automatic recreation of ephemeral runner pod, it turned out to be no more of an issue when it's being scaled via workflow_job webhook. Ephemeral runner resources are still automatically added on demand by RunnerDeployment via RunnerReplicaSet(I've added `EffectiveTime` fields to our CRDs but that's an implementation detail so let's omit). A good side-effect of disabling ephemeral runner pod recreations is that ARC will no longer create redundant ephemeral runners when used with webhook-based autoscaler. Basically, autoscaling still works as everyone might expect. It's just better than before overall.	2022-02-28 08:37:26 +09:00
..
.ci	ci: disable chart version bump check (#870 )	2021-10-04 20:39:38 +01:00
actions-runner-controller	Merge pull request #1127 from actions-runner-controller/github-api-cache	2022-02-28 08:37:26 +09:00

Merge pull request #1127 from actions-runner-controller/github-api-cache

Enhances ARC(both the controller-manager and github-webhook-server) to cache any GitHub API responses with HTTP GET and an appropriate Cache-Control header.

Ref #920

## Cache Implementation

`gregjones/httpcache` has been chosen as a library to implement this feature, as it is as recommended in `go-github`'s documentation:

https://github.com/google/go-github#conditional-requests

`gregjones/httpcache` supports a number of cache backends like `diskcache`, `s3cache`, and so on:

https://github.com/gregjones/httpcache#cache-backends

We stick to the built-in in-memory cache as a starter. Probably this will never becomes an issue as long as various HTTP responses for all the GitHub API calls that ARC makes, list-runners, list-workflow-jobs, list-runner-groups, etc., doesn't overflow the in-memory cache.

`httpcache` has an known unfixed issue that it doesn't update cache on chunked responses. But we assume that the APIs that we call doesn't use chunked responses. See #1503 for more information on that.

## Ephemeral runner pods are no longer recreated

The addition of the cache layer resulted in a slow down of a scale-down process and a trade-off between making the runner pod termination process fragile to various race conditions(shorter grace period before runner deletion) or delaying runner pod deletion depending on how long the grace period is(longer grace period). A grace period needs to be at least longer than 60s (which is the same as cache duration of ListRunners API) to not prematurely delete a runner pod that was just created.

But once I disabled automatic recreation of ephemeral runner pod, it turned out to be no more of an issue when it's being scaled via workflow_job webhook.

Ephemeral runner resources are still automatically added on demand by RunnerDeployment via RunnerReplicaSet(I've added `EffectiveTime` fields to our CRDs but that's an implementation detail so let's omit). A good side-effect of disabling ephemeral runner pod recreations is that ARC will no longer create redundant ephemeral runners when used with webhook-based autoscaler.

Basically, autoscaling still works as everyone might expect. It's just better than before overall.

2022-02-28 08:37:26 +09:00

.ci

ci: disable chart version bump check (#870 )

2021-10-04 20:39:38 +01:00

actions-runner-controller

Merge pull request #1127 from actions-runner-controller/github-api-cache

2022-02-28 08:37:26 +09:00