* Startup script: implement retries for connection-related operations
* assert.Equal → assert.Contains
* Wait for at least 1,000 lines of logs
* Join slice of strings before calling assert.Contains()
* TestHostDirs: use require.Contains() instead of require.EqualValues()
* TestHostDirs: wait for at least 4 log lines
* proxy.Connections(): require io.ReadWriteCloser instead of net.Conn
* Orchard Controller: implement an SSH server that acts as a jump host
* Issue a warning if the name used will be invalid in the future
* Further restrict uppercase characters in names in the future
The rationale is similar to https://github.com/kubernetes/kubernetes/issues/71140.
We won't want to munge the user's input and introduce subtle bugs doing
lowercase comparisons.
* Change event prefix to preserve order under load
When there are a lot of events streamed from a worker, it's possible to have two batches coming for the same timestamp (which is a timestamp of the event on the worker). This way the existing logic would mess up the order because `index` and the random number doesn't guarantee the order.
To fix this I've changed the format of the prefix for the event to include tro things:
1. Timestamp in nanoseconds of the injection time on the controller so two sequential batches will have guaranteed order unless they are processed within a nanosecond.
2. Made the `index` being fixed length with trailing zeros, so they are properly lexicographically sorted (`000001`, `000002`, ...).
* No need to disable linting
* Implement restart policy for VMs
* Do not update VM.Resource, we only use it as a read-only specification
* Err()/setErr(): use atomic.Pointer instead of sync.Mutex
* Fail VMs if the worker had crashed/is unhealthy
* OnDiskName: properly handle cases when VM's name contains hyphens
* Worker: introduce Offline() method and check it before scheduling
* tart.List(): use Tart's JSON output
* OnDiskName: remove empty parts check
* Scheduler: move health-checking logic to a separate function
* Only fail "running" VMs
* Only fail orphaned VMs if they're in terminal state
* Integration tests
* Run healthCheckingLoopIteration() before schedulingLoopIteration()
* Worker: sync on-disk VMs only once at start
* Resources support
* Ability to provide VM and worker resources via the CLI
* orchard dev: always listen on :6120
* orchard dev: support --resources
* REST API: provide resource defaults when creating VM
* OpenAPI: document "resources" field
* orchard dev: serve Swagger API documentation on /v1/
* Integration guide
* proxy.Connections(): handle "use of closed network connection" error
* Controller: less strict timeouts that work nicely for WebSockets
* Worker: only attempt connect to the gRPC once our UID is known
* Introduce "orchard ssh" and "orchard vnc" commands
* Worker: prevent context leak by moving logic into a separate function
* Fix linter errors
* Port forwarding integration test
* Check for "uname -mo" output
* Generic Events
We can try to use these generic events for script execution and storing of the output logs in events with `log` kind.
* Lint issues
* Cleanup events upon VM deletion
* Basic integration test
* Run an actual VM in tests
* Apply suggestions from code review
Co-authored-by: Nikolay Edigaryev <edigaryev@gmail.com>
* Use POST
* Make newEventKey private
* Append events in batches
* Lint issues
* Private `scopePrefix`
---------
Co-authored-by: Nikolay Edigaryev <edigaryev@gmail.com>