Commit Graph

47 Commits

Author SHA1 Message Date
Yusuke Kuoka c74ad6195f
Fix runners to do their best to gracefully stop on pod eviction (#1759)
Ref #1535
Ref #1581

Signed-off-by: Yusuke Kuoka <ykuoka@gmail.com>
2022-11-01 20:30:10 +09:00
Yusuke Kuoka 2dd13b4a19
runner: Address all shellcheck findings (#1854)
I am about to revisit #1517, #1454, #1561, and #1560 as a part of our on-going effort for a major enhancement to the runner entrypoints being made in #1759.

This change updates and reintroduces #1517 contributed by @CASABECI in a way it becomes applicable to today's code-base.
2022-10-04 20:30:27 +09:00
Felipe Galindo Sanchez 11cb9b7882
feat: allow to discover runner statuses (#1268)
* feat: allow to discover runner statuses

* fix manifests

* Bump runner version to 2.289.1 which includes the hooks support

* Add feedback from review

* Update reference to newRunnerPod

* Fix TestNewRunnerPodFromRunnerController and make hooks file names job specific

* Fix additional TestNewRunnerPod test

* Cover additional feedback from review

* fix rbac manager role

* Add permissions to service account for container mode if not provided

* Rename flag to runner.statusUpdateHook.enabled and fix needsServiceAccount

Co-authored-by: Yusuke Kuoka <ykuoka@gmail.com>
2022-07-10 15:11:29 +09:00
Callum Tait e3deb0d752
chore: move runner docker check (#1548) 2022-06-30 11:31:50 +09:00
Yusuke Kuoka 9b28e633c1
Drop support for --once (#1580)
Ref #1196
2022-06-29 21:49:52 +09:00
Bernardo Meurer bf45aa9f6b
refactor(runner/entrypoint): don't mv externalstmp if it's not there (#1315) 2022-05-16 18:37:37 +09:00
Yusuke Kuoka c1e5829b03
refactor(runner): ability to opt-out of using --ephemeral / opt-in to legacy --once for GHES older than 3.3 (#1384)
* runner: Remove the ability to use the deprecated `--once` flag

Ref #1196

* runner: Ability to opt-out of using --ephemeral

Although we are going to eventually remove the ability to use the legacy --once flag as proposed in #1196, there might be folks still using legacy GHES versions 3.2 or earlier.

This commit removes the existing feature flag to opt-in for --ephemeral, while adding another feature flag RUNNER_FEATURE_FLAG_ONCE to opt-in for --once so that folks stuck in legacy GHES versions
can still use ARC.

Since this change every user starts using --ephemeral by default. If they see any issues on legacy GHES instance, RUNNER_FEATURE_FLAG_ONCE=true can be set to opt-in to keep using --once, which gives one more ARC release until they upgrade their GHES instance.

But beware, we won't support legacy GHES instances forever as it's going to be a maintenance nightmare. Please upgrade!

Ref #1196
2022-05-11 09:55:33 +01:00
Richard Fussenegger 8db071c4ba
Improved Bash Logger (#1246)
* Improved Bash Logger

This is a first step towards having robust Bash scripts in the runner images. The changes _could_ be considered breaking, depending on our backwards compatibility definition.

* Fixed Log Formatting Issues

Co-authored-by: Callum Tait <15716903+toast-gear@users.noreply.github.com>
2022-04-12 22:02:06 +01:00
Rolf Ahrenberg 7124451cea
chore: fix typo (#1316) [skip ci] 2022-04-08 17:32:01 +01:00
Bernardo Meurer e46df413a1
refactor(runner/entrypoint): check for externalstmp (#1277)
* refactor(runner/entrypoint): check for externalstmp [skip ci]

Co-authored-by: Callum Tait <15716903+toast-gear@users.noreply.github.com>
2022-03-30 12:18:18 +01:00
Callum Tait 2cb04ddde7
* feat: move to new run.sh container friendly file (#1244)
* fix: unit tests were very broken

Co-authored-by: toast-gear <toast-gear@users.noreply.github.com>
2022-03-22 19:02:51 +00:00
Richard Fussenegger 532a2bb2a9
feat: remove registration-only runner logic from entrypoint (#1249)
Closes #1207
2022-03-22 18:33:14 +00:00
Richard Fussenegger a68eede616
feat: copy dotfiles from asset to service dir (#1136)
* feat: copy dotfiles from asset to service dir

* Fixed `UNITTEST` Condition

* Load `/etc/environment`

See https://github.com/actions/runner/issues/1703 for context on this change.
2022-03-18 07:40:52 +00:00
toast-gear c4c6e833a7 chore: add deprecation warning 2022-03-14 12:35:07 +00:00
Chris Bui 1b911749a6
feat: disable automatic runner updates (#1088)
* Add env variable to configure `disablupdate` flag

* Write test for entrypoint disable update

* Rename flag, update docs for DISABLE_RUNNER_UPDATE

* chore: bump runner version in makefile

Co-authored-by: Callum Tait <15716903+toast-gear@users.noreply.github.com>
2022-02-03 21:03:38 +00:00
Callum Tait f09a974ac2
chore: change to trigger build (#1079)
* chore: change to trigger build

Co-authored-by: toast-gear <toast-gear@users.noreply.github.com>
2022-01-28 21:57:53 +00:00
cspargo 9d5a562407
fix: use copy instead of move (#1066)
* fix: use copy instead of move

Co-authored-by: Colin Spargo <cspargo@users.noreply.github.com>
2022-01-28 21:24:52 +00:00
Callum Tait ad48851dc9
feat: expose if docker is enabled and wait for docker to be ready (#962)
Resolves #897
Resolves #915
2021-12-29 10:23:35 +09:00
Sebastien Le Digabel a98729b08b Adding github action for entrypoint unit test
... and adding safety mechanism in UNITTEST handling.
2021-09-06 08:51:28 +09:00
Sebastien Le Digabel ec0915ce7c Adding some unit testing for entrypoint.sh
The unit tests are simulating a run for entrypoint. It creates some
dummy config.sh and runsvc.sh and makes sure the logic behind
entrypoint.sh is correct.

Unfortunately the entrypoint.sh contains some sections that are not
mockable so I had to put some logic in there too.

Testing includes for now:
- the normal scenario
- the normal non-ephemeral scenario
- the configuration failure scenario

Also tested the entrypoint.sh on a real runner, still works as expected.
2021-09-06 08:51:28 +09:00
Sebastien Le Digabel d355f05ac0 Adding retry after config and formatted logging
Adding a basic retry loop during configuration. If configuration fails,
the runner will just straight into a retry loop and will continuously
fail until it dies after a while.

This change will retry 10 times and will exit if the configuration
wasn't successful.

Also, changed the logging format, adding a bit of color in the event of
success or failure.
2021-09-06 08:51:28 +09:00
Rob Bos fb66b28569
Change `move` command to `copy` to prevent issues (#716)
Prevents issues when /runner and /runnertmp are in different devices

Fixes #686
2021-08-11 09:53:42 +09:00
Yusuke Kuoka fabead8c8e
feat: Workflow job based ephemeral runner scaling (#721)
This add support for two upcoming enhancements on the GitHub side of self-hosted runners, ephemeral runners, and `workflow_jow` events. You can't use these yet.

**These features are not yet generally available to all GitHub users**. Please take this pull request as a preparation to make it available to actions-runner-controller users as soon as possible after GitHub released the necessary features on their end.

**Ephemeral runners**:

The former, ephemeral runners, is basically the reliable alternative to `--once`, which we've been using when you enabled `ephemeral: true` (default in actions-runner-controller).

`--once` has been suffering from a race issue #466. `--ephemeral` fixes that.

To enable ephemeral runners with `actions/runner`, you give `--ephemeral` to `config.sh`. This updated version of `actions-runner-controller` does it for you, by using `--ephemeral` instead of `--once` when you set `RUNNER_FEATURE_FLAG_EPHEMERAL=true`.

Please read the section `Ephemeral Runners` in the updated version of our README for more information.

Note that ephemeral runners is not released on GitHub yet. And `RUNNER_FEATURE_FLAG_EPHEMERAL=true` won't work at all until the feature gets released on GitHub. Stay tuned for an announcement from GitHub!

**`workflow_job` events**:

`workflow_job` is the additional webhook event that corresponds to each GitHub Actions workflow job run. It provides `actions-runner-controller` a solid foundation to improve our webhook-based autoscale.

Formerly, we've been exploiting webhook events like `check_run` for autoscaling. However, as none of our supported events has included `labels`, you had to configure an HRA to only match relevant `check_run` events. It wasn't trivial.

In contrast, a `workflow_job` event payload contains `labels` of runners requested. `actions-runner-controller` is able to automatically decide which HRA to scale by filtering the corresponding RunnerDeployment by `labels` included in the webhook payload. So all you need to use webhook-based autoscale will be to enable `workflow_job` on GitHub and expose actions-runner-controller's webhook server to the internet.

Note that the current implementation of `workflow_job` support works in two ways, increment, and decrement. An increment happens when the webhook server receives` workflow_job` of `queued` status. A decrement happens when it receives `workflow_job` of `completed` status. The latter is used to make scaling-down faster so that you waste money less than before. You still don't suffer from flapping, as a scale-down is still subject to `scaleDownDelaySecondsAfterScaleOut `.

Please read the section `Example 3: Scale on each `workflow_job` event` in the updated version of our README for more information on its usage.
2021-08-11 09:52:04 +09:00
toast-gear 743e6d6202
feat: bump runner version (#705)
* feat: bump runner version

* feat: remove deprecated env var

* docs: updating the docs

Co-authored-by: Callum James Tait <callum.tait@photobox.com>
2021-07-30 19:58:04 +09:00
toast-gear 82d1be7791
chore: deprecate STARTUP_DELAY (#678)
* chore: deprecate STARTUP_DELAY

* chore: adding better comments

* chore: whitespace correction
2021-07-03 11:51:07 +01:00
Yusuke Kuoka 8b90b0f0e3
Clean up import list (#645)
Resolves #644
2021-06-22 17:55:06 +09:00
Shubham Gopale 1084a37174
We are exiting if its a registration-only runner (#641) 2021-06-22 17:26:03 +09:00
Tim Birkett a93fd21f21
feat: add STARTUP_DELAY to entrypoint.sh (#592)
Ref #591 

Co-authored-by: Yusuke Kuoka <ykuoka@gmail.com>
2021-06-04 08:57:59 +09:00
Thejas N 588872a316
feat: allow ephemeral runner to be optional (#498)
- Adds `ephemeral` option to `runner.spec` 
    
    ```
      ....
      template:
         spec:
             ephemeral: false
             repository: mumoshu/actions-runner-controller-ci
      ....
    ```
- `ephemeral` defaults to `true`
- `entrypoint.sh` in runner/Dockerfile modified to read `RUNNER_EPHEMERAL` flag
- Runner images are backward-compatible. `--once` is omitted only when the new envvar `RUNNER_EPHEMERAL` is explicitly set to `false`.

Resolves #457
2021-05-02 19:04:14 +09:00
Yusuke Kuoka dbd7b486d2
feat: Support for scaling from/to zero (#465)
This is an attempt to support scaling from/to zero.

The basic idea is that we create a one-off "registration-only" runner pod on RunnerReplicaSet being scaled to zero, so that there is one "offline" runner, which enables GitHub Actions to queue jobs instead of discarding those.

GitHub Actions seems to immediately throw away the new job when there are no runners at all. Generally, having runners of any status, `busy`, `idle`, or `offline` would prevent GitHub actions from failing jobs. But retaining `busy` or `idle` runners means that we need to keep runner pods running, which conflicts with our desired to scale to/from zero, hence we retain `offline` runners.

In this change, I enhanced the runnerreplicaset controller to create a registration-only runner on very beginning of its reconciliation logic, only when a runnerreplicaset is scaled to zero. The runner controller creates the registration-only runner pod, waits for it to become "offline", and then removes the runner pod. The runner on GitHub stays `offline`, until the runner resource on K8s is deleted. As we remove the registration-only runner pod as soon as it registers, this doesn't block cluster-autoscaler.

Related to #447
2021-05-02 16:11:36 +09:00
Florian Braun 5b7807d54b
Quote vars in entrypoint.sh to prevent unwanted argument split (#420)
Prevents arguments from being split when e.g. the RUNNER_GROUP variable contains spaces (which is legit. One can create such groups in GitHub).

I've seen that all workers with group names that contain no spaces can register successfully, while all workers with groups that contain spaces will not register.

Furthermore, I suppose also other chars can be used here to inject arbitrary commands in an unsupported way via e.g. pipe symbol.

Quoting the vars correctly should prevent that and allow for e.g. group names and runner labels with spaces and other bash reserved characters.
2021-03-31 10:09:08 +09:00
Johannes Nicolai 8c0f3dfc79
Set runner group for runners with enterprise scope (#376)
* so far, runner group parameter is only set for runners with org scope
* now set group for enterprise runners as well
* removed null check for org scope as either org or enterprise will be set
2021-03-08 09:18:23 +09:00
Jesse Haka 28e80a2d28
Add support for enterprise runners (#290)
* Add support for enterprise runners

* update docs
2021-02-05 09:31:06 +09:00
Yusuke Kuoka ace95d72ab
Fix self-update failuers due to /runner/externals mount (#253)
* Fix self-update failuers due to /runner/externals mount

Fixes #252

* Tested Self-update Fixes (#269)

Adding fixes to #253 as confirmed and tested in https://github.com/summerwind/actions-runner-controller/issues/264#issuecomment-764549833 by @jolestar, @achedeuzot and @hfuss 🙇 🍻

Co-authored-by: Hayden Fuss <wifu1234@gmail.com>
2021-01-24 10:58:35 +09:00
Reinier Timmer ee8fb5a388
parametrized working directory (#185)
* parametrized working directory

* manifests v3.0
2020-11-25 08:55:26 +09:00
Erik Nobel 4e93879b8f
[BUG?]: Create mountpoint for /externals/ (#203)
* runner/controller: Add externals directory mount point

* Runner: Create hack for moving content of /runner/externals/ dir

* Externals dir Mount: mount examples for '__e/node12/bin/node' not found error
2020-11-25 08:53:47 +09:00
Reinier Timmer bc35bdfa85
fixed label argument in entrypoint (#162)
* fixed label argument in entrypoint

* Removed quotes from RUNNER_GROUP_ARG
2020-11-11 08:44:51 +09:00
Dan Webb dcf8524b5c
Adds RUNNER_GROUP argument to the runner registration (#157)
* Adds RUNNER_GROUP argument to the runner registration

Adds the ability to register a runner to a predefined runner_group

Resolves #137

* Update README with runner group example

- Updates the README with instructions of how to add the runner to a
  group
- Fix code fencing for shell and yaml blocks in the README
- Use consistent bullet points (dash not asterisk)
2020-11-10 17:15:54 +09:00
Juho Saarinen 40c5050978
Added support for other than public GitHub URL (#146)
Refactoring a bit
2020-10-28 22:15:53 +09:00
Yury Tsarev b79ea980b8
Use self update ready entrypoint (#99)
* Use self update ready entrypoint

* Add --once support for runsvc.sh

Run `cd runner; NAME=$DOCKER_USER/actions-runner TAG=dev make docker-build docker-push`,
`kubectl apply -f release/actions-runner-controller.yaml`,
then update the runner image(not the controller image) by updating e.g. `Runner.Spec.Image` to `$DOCKER_USER/actions-runner:$TAG`, for testing.

Co-authored-by: Yusuke Kuoka <ykuoka@gmail.com>
2020-10-05 08:58:20 +09:00
Reinier Timmer 8c5b776807 support runner labels 2020-04-28 11:14:31 +02:00
Reinier Timmer 75d15ee91b backwards compatibility of dockerfile 2020-04-28 11:14:31 +02:00
Reinier Timmer fb35dd4131 support for organization runners 2020-04-28 11:14:31 +02:00
Moto Ishizawa f2d3ca672f Unset environment variables for runner config 2020-02-06 22:15:26 +09:00
Moto Ishizawa c66916a4ee Use dumb-init to handle signal properly 2020-02-06 18:47:50 +09:00
Moto Ishizawa d0d6238963 Run once 2020-01-31 19:29:29 +09:00
Moto Ishizawa cea4d084e4 Add runner container image 2020-01-28 21:56:54 +09:00