diff --git a/adrs/2022-10-17-runner-image.md b/docs/adrs/2022-10-17-runner-image.md similarity index 100% rename from adrs/2022-10-17-runner-image.md rename to docs/adrs/2022-10-17-runner-image.md diff --git a/adrs/2022-10-27-runnerscaleset-lifetime.md b/docs/adrs/2022-10-27-runnerscaleset-lifetime.md similarity index 100% rename from adrs/2022-10-27-runnerscaleset-lifetime.md rename to docs/adrs/2022-10-27-runnerscaleset-lifetime.md diff --git a/adrs/2022-11-04-crd-api-group-name.md b/docs/adrs/2022-11-04-crd-api-group-name.md similarity index 100% rename from adrs/2022-11-04-crd-api-group-name.md rename to docs/adrs/2022-11-04-crd-api-group-name.md diff --git a/adrs/2022-12-05-adding-labels-k8s-resources.md b/docs/adrs/2022-12-05-adding-labels-k8s-resources.md similarity index 100% rename from adrs/2022-12-05-adding-labels-k8s-resources.md rename to docs/adrs/2022-12-05-adding-labels-k8s-resources.md diff --git a/adrs/2022-12-27-pick-the-right-runner-to-scale-down.md b/docs/adrs/2022-12-27-pick-the-right-runner-to-scale-down.md similarity index 100% rename from adrs/2022-12-27-pick-the-right-runner-to-scale-down.md rename to docs/adrs/2022-12-27-pick-the-right-runner-to-scale-down.md diff --git a/adrs/2023-02-02-automate-runner-updates.md b/docs/adrs/2023-02-02-automate-runner-updates.md similarity index 98% rename from adrs/2023-02-02-automate-runner-updates.md rename to docs/adrs/2023-02-02-automate-runner-updates.md index 393e7899..c3bb5c4d 100644 --- a/adrs/2023-02-02-automate-runner-updates.md +++ b/docs/adrs/2023-02-02-automate-runner-updates.md @@ -2,7 +2,7 @@ **Date**: 2023-02-02 -**Status**: Proposed +**Status**: Done ## Context diff --git a/adrs/2023-02-10-limit-manager-role-permission.md b/docs/adrs/2023-02-10-limit-manager-role-permission.md similarity index 99% rename from adrs/2023-02-10-limit-manager-role-permission.md rename to docs/adrs/2023-02-10-limit-manager-role-permission.md index 803a4ec9..d327b4f8 100644 --- a/adrs/2023-02-10-limit-manager-role-permission.md +++ b/docs/adrs/2023-02-10-limit-manager-role-permission.md @@ -2,7 +2,7 @@ **Date**: 2023-02-10 -**Status**: Pending +**Status**: Done ## Context diff --git a/docs/adrs/2023-03-17-workflow-improvements.md b/docs/adrs/2023-03-17-workflow-improvements.md new file mode 100644 index 00000000..38d611aa --- /dev/null +++ b/docs/adrs/2023-03-17-workflow-improvements.md @@ -0,0 +1,84 @@ +# Improve ARC workflows for autoscaling runner sets + +**Date**: 2023-03-17 + +**Status**: Done + +## Context + +In the [actions-runner-controller](https://github.com/actions/actions-runner-controller) +repository we essentially have two projects living side by side: the "legacy" +actions-runner-controller and the new one GitHub is supporting +(gha-runner-scale-set). To hasten progress we relied on existing workflows and +added some of our own (e.g.: end-to-end tests). We now got to a point where it's +sort of confusing what does what and why, not to mention the increased running +times of some those workflows and some GHA-related flaky tests getting in the +way of legacy ARC and viceversa. The three main areas we want to cover are: Go +code, Kubernetes manifests / Helm charts and E2E tests. + +## Go code + +At the moment we have three workflows that validate Go code: + +- [golangci-lint](https://github.com/actions/actions-runner-controller/blob/34f3878/.github/workflows/golangci-lint.yaml): + this is a collection of linters that currently runs on all PRs and push to + master +- [Validate ARC](https://github.com/actions/actions-runner-controller/blob/01e9dd3/.github/workflows/validate-arc.yaml): + this is a bit of a catch-all workflow, other than Go tests this also validates + Kubernetes manifests, runs `go generate`, `go fmt` and `go vet` +- [Run CodeQL](https://github.com/actions/actions-runner-controller/blob/a095f0b66aad5fbc8aa8d7032f3299233e4c84d2/.github/workflows/run-codeql.yaml) + +### Proposal + +I think having one `Go` workflow that collects everything-Go would help a ton with +reliability and understandability of what's going on. This shouldn't be limited +to the GHA-supported mode as there are changes that even if made outside the GHA +code base could affect us (such as a dependency update). +This workflow should only run on changes to `*.go` files, `go.mod` and `go.sum`. +It should have these jobs, aiming to cover all existing functionality and +eliminate some duplication: + +- `test`: run all Go tests in the project. We currently use the `-short` and + `-coverprofile` flags: while `-short` is used to skip [old ARC E2E + tests](https://github.com/actions/actions-runner-controller/blob/master/test/e2e/e2e_test.go#L85-L87), + `-coverprofile` is adding to the test time without really giving us any value + in return. We should also start using `actions/setup-go@v4` to take advantage + of caching (it would speed up our tests by a lot) or enable it on `v3` if we + have a strong reason not to upgrade. We should keep ignoring our E2E tests too + as those will be run elsewhere (either use `Short` there too or ignoring the + package like we currently do). As a dependency for tests this needs to run + `make manifests` first: we should fail there and then if there is a diff. +- `fmt`: we currently run `go fmt ./...` as part of `Validate ARC` but do + nothing with the results. We should fail in case of a diff. We don't need + caching for this job. +- `lint`: this corresponds to what's currently the `golanci-lint` workflow (this + also covers `go vet` which currently happens as part of `Validate ARC too`) +- `generate`: the current behaviour for this is actually quite risky, we + generate our code in `Validate ARC` workflow and use the results to run the + tests but we don't validate that up to date generate code is checked in. This + job should run `go generate` and fail on a diff. +- `vulncheck`: **EDIT: this is covered by CodeQL** the Go team is maintaining [`govulncheck`](https://go.dev/blog/vuln), a tool to recursively + analyzing all function calls in Go code and spot vulnerabilities on the call + stack. + +## Kubernetes manifests / Helm charts + +We have [recently separated](https://github.com/actions/actions-runner-controller/commit/bd9f32e3540663360cf47f04acad26e6010f772e) +Helm chart validation and we validate up-to-dateness of manifests as part of `Go +/ test`. + +## End to end tests + +These tests are giving us really good coverage and should be one of the main +actors when it comes to trusting our releases. Two improvements that could be +done here are: + +- renaming the workflow to `GHA E2E`: since renaming our resources the `gha` + prefix has been used to identify things related to the mode GitHub supports + and these jobs strictly validate the GitHub mode _only_. Having a shorter name + allows for more readability of the various scenarios (e.g. `GHA E2E / + single-namespace-setup`). +- the test currently monitors and validates the number of pods spawning during + the workflow but not the outcome of the workflow. While not necessary to look + at pods specifics, we should at least guarantee that the workflow can + successfully conclude. diff --git a/adrs/2023-04-14-adding-labels-k8s-resources.md b/docs/adrs/2023-04-14-adding-labels-k8s-resources.md similarity index 97% rename from adrs/2023-04-14-adding-labels-k8s-resources.md rename to docs/adrs/2023-04-14-adding-labels-k8s-resources.md index 737e0653..6a2eb184 100644 --- a/adrs/2023-04-14-adding-labels-k8s-resources.md +++ b/docs/adrs/2023-04-14-adding-labels-k8s-resources.md @@ -86,4 +86,4 @@ Or for example if they're having problems specifically with runners: This way users don't have to understand ARC moving parts but we still have a way to target them specifically if we need to. -[^1]: [ADR 2022-12-05](2022-12-05-adding-labels-k8s-resources.md) +[^1]: Supersedes [ADR 2022-12-05](2022-12-05-adding-labels-k8s-resources.md) \ No newline at end of file diff --git a/adrs/yyyy-mm-dd-TEMPLATE.md b/docs/adrs/yyyy-mm-dd-TEMPLATE.md similarity index 100% rename from adrs/yyyy-mm-dd-TEMPLATE.md rename to docs/adrs/yyyy-mm-dd-TEMPLATE.md