Commit Graph

23 Commits

Author SHA1 Message Date
Nikolay Edigaryev 40f58e4aee
More RPC-related logs (#136)
* More RPC-related logs

* Notifier should be set before we use it in the scheduler
2023-09-27 20:16:00 +04:00
Nikolay Edigaryev 3d0e375ede
Don't stop and delete VMs that failed to clone (#125)
* NewVM() never returns an error

* Don't stop and delete VMs that failed to clone
2023-09-13 19:39:10 +04:00
Nikolay Edigaryev bb3d6edcd5
Fix Tart VM IP detection in bridged mode (#124) 2023-09-12 08:52:21 +00:00
Nikolay Edigaryev 6759618f28
orchard create vm: support --image-pull-policy=Always (#110) 2023-07-26 17:43:14 +04:00
Nikolay Edigaryev d57d18d380
Support for sharing files with the host system (#103)
* Support for sharing files with the host system

* Integration tests

* Added back TestVMGarbageCollection comment
2023-07-04 18:10:53 +04:00
Nikolay Edigaryev 3c3b8e8180
Do not treat controller registration error as fatal (#100) 2023-06-29 19:29:32 +04:00
Nikolay Edigaryev 60e564da88
Implement restart policy for VMs (#83)
* Implement restart policy for VMs

* Do not update VM.Resource, we only use it as a read-only specification

* Err()/setErr(): use atomic.Pointer instead of sync.Mutex
2023-04-24 19:30:08 +04:00
Fedor Korotkov dd5e588eb0
Support Bridged Network (#78)
* Support Bridged Network

Inspired by https://github.com/cirruslabs/tart/issues/473

* Fixed tests
2023-04-20 15:04:07 +04:00
Nikolay Edigaryev 4eafec99a5
Fail VMs if the worker had crashed/is unhealthy (#70)
* Fail VMs if the worker had crashed/is unhealthy

* OnDiskName: properly handle cases when VM's name contains hyphens

* Worker: introduce Offline() method and check it before scheduling

* tart.List(): use Tart's JSON output

* OnDiskName: remove empty parts check

* Scheduler: move health-checking logic to a separate function

* Only fail "running" VMs

* Only fail orphaned VMs if they're in terminal state

* Integration tests

* Run healthCheckingLoopIteration() before schedulingLoopIteration()

* Worker: sync on-disk VMs only once at start
2023-04-03 16:47:49 +04:00
Fedor Korotkov f152043f19
Reactive Scheduling (#67)
Previously, we had two main loops: a controller loop to assign VMs and a worker loop to start VMs. Each loop ran on a fixed interval, every N seconds.

This change introduces a mechanism for reactively requesting loop execution:

 1. The controller loop will be executed upon VM creation to try to schedule the VM immediately.
 2. A worker will be notified upon a VM assignment, and the worker loop will be requested to sync immediately.

 Fixes #31
2023-03-28 20:51:41 +04:00
Nikolay Edigaryev cb39836ee0
Resources support (#63)
* Resources support

* Ability to provide VM and worker resources via the CLI

* orchard dev: always listen on :6120

* orchard dev: support --resources

* REST API: provide resource defaults when creating VM

* OpenAPI: document "resources" field

* orchard dev: serve Swagger API documentation on /v1/

* Integration guide
2023-03-27 17:30:54 +04:00
Fedor Korotkov 362ea85b4f
Always require a client for running a worker (#52)
* Always require a client for running a worker

* Actually validate roles

* Delete worker

Fixes #46

* Update internal/worker/worker.go

Co-authored-by: Nikolay Edigaryev <edigaryev@gmail.com>
2023-03-24 17:44:20 +04:00
Nikolay Edigaryev af074f499d
Remove UID for now and use machine ID to differentiate workers (#48)
* Remove UID for now and use machine ID to differentiate workers

* Rename MetadataWorkerKey back to MetadataWorkerNameKey
2023-03-23 23:38:54 +04:00
Fedor Korotkov cdf5c5eb00
Simplified bootstrapping of a cluster (#40)
* Simplified bootstrapping of a cluster

Introduced a new convention: a pre-defined `bootstrap-admin` account for `orchard controller run`. Providing `ORCHARD_BOOTSTRAP_ADMIN_TOKEN` will auto-create such a user for easier configuration. `bootstrap-admin` can be used to create other service accounts on the first run and can be disposed of afterwards.

Also changed `orchard worker run` to expect the controller URL as the only parameter and a bootstrap token passed via an argument, instead of using a context that might not have been created.

* Missing error check
2023-03-22 23:43:37 +04:00
Nikolay Edigaryev 10f56bb5e3
Introduce "orchard ssh" and "orchard vnc" commands (#36)
* proxy.Connections(): handle "use of closed network connection" error

* Controller: less strict timeouts that work nicely for WebSockets

* Worker: only attempt to connect to gRPC once our UID is known

* Introduce "orchard ssh" and "orchard vnc" commands

* Worker: prevent context leak by moving logic into a separate function

* Fix linter errors

* Port forwarding integration test

* Check for "uname -mo" output
2023-03-21 14:58:24 -04:00
Fedor Korotkov fb3056d3ae
Refactorings to improve readability (#35) 2023-03-17 06:11:28 -04:00
Fedor Korotkov 3ecf98c039
Support `startup`/`shutdown` scripts (#33)
* Support `startup`/`shutdown` scripts

Fixes #26

* Fixed Go modules after rebase

* Fixes after rebase
2023-03-14 22:15:54 +04:00
Nikolay Edigaryev 47fef47d1c
Port forwarding support (#30)
* Port forwarding support

* .golangci.yml: remove and replace deprecated and archived linters

* Client: pass credentials when calling WebSocket API methods

* API: require ServiceAccountRoleComputeWrite role for port forwarding

* Use Buf

* Rename Poll() RPC method to Watch()

* Split Rendezvous into two parts: Watcher and Proxy (#32)

* Split Rendezvous into two parts: Watcher and Proxy

* Implement Proxy cancellation

* Use Protocol Buffers structure directly in Watcher

* Fix TestWatcher after switching to Protocol Buffers structure

* portForwardVM(): ensure we also check for gin's context
2023-03-14 11:31:13 -04:00
Fedor Korotkov 0582108ea6
Events Entity (#28)
* Generic Events

We can try to use these generic events for script execution and for storing the output logs in events of the `log` kind.

* Lint issues

* Cleanup events upon VM deletion

* Basic integration test

* Run an actual VM in tests

* Apply suggestions from code review

Co-authored-by: Nikolay Edigaryev <edigaryev@gmail.com>

* Use POST

* Make newEventKey private

* Append events in batches

* Lint issues

* Private `scopePrefix`
2023-03-13 08:04:17 -04:00
Fedor Korotkov 165662bb0a
Better state syncing and other improvements (#24) 2023-03-01 11:42:16 -05:00
Nikolay Edigaryev a7264370f5
Introduce "controller init" and generate self-signed X.509 certificate (#17) 2023-02-04 11:40:07 +04:00
Nikolay Edigaryev 6bcc02d815
Use golangci-lint (#15) 2023-01-31 22:22:28 +04:00
Nikolay Edigaryev 92e8732d46
Initial version of the Orchard orchestration system (#3)
* Initial version of the Orchard orchestration system

* Update README.md

Co-authored-by: Fedor Korotkov <fedor.korotkov@gmail.com>
2023-01-26 23:46:23 +04:00