Commit Graph

131 Commits

Author SHA1 Message Date
Nikolay Edigaryev bafcf6fac2
Simplify state reconciliation and support changing Softnet settings (#364)
* Simplify state reconciliation and support changing Softnet settings

* Remove unused "updateFunc" parameter from syncOnDiskVMs()

* Don't take an address of a loop variable

* ensure → ensures

* updateVMState(): don't forget to update VMState

* Introduce TestSpecUpdateSoftnet integration test

* Update OpenAPI specification to include generation/observedGeneration
2025-11-06 20:56:31 +04:00
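The "Don't take an address of a loop variable" item in #364 above points at a well-known Go pitfall: before Go 1.22, `&v` inside `for _, v := range items` yielded the same address on every iteration, and even in newer Go versions it points at a per-iteration copy rather than at the slice element. A minimal sketch of the usual fix, indexing into the slice, follows; the `vm` type and `pointersByIndex` helper are hypothetical and are not taken from the Orchard code base.

```go
package main

import "fmt"

// vm is a hypothetical stand-in for a per-VM record; it is not Orchard's type.
type vm struct {
	Name string
}

// pointersByIndex returns pointers to the slice elements themselves, so every
// pointer is distinct and writes through it are visible in the original slice.
func pointersByIndex(vms []vm) []*vm {
	out := make([]*vm, 0, len(vms))
	for i := range vms {
		out = append(out, &vms[i]) // address of the element, not of a loop copy
	}
	return out
}

func main() {
	vms := []vm{{Name: "a"}, {Name: "b"}}
	ptrs := pointersByIndex(vms)
	ptrs[1].Name = "renamed"
	fmt.Println(vms[0].Name, vms[1].Name) // prints "a renamed"
}
```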
Nikolay Edigaryev 08e9dfbbfe
Support "tart run"'s --net-softnet-allow and --net-softnet-block (#361)
* Support "tart run"'s --net-softnet-allow and --net-softnet-block

* Use ghcr.io/cirruslabs/macos-tahoe-base:latest by default
2025-10-27 23:07:43 +04:00
Nikolay Edigaryev af221cf3c1
Support for prefixed Orchard Controller API URLs (#355)
* Support for prefixed Orchard Controller API URLs

* Fix Swagger UI

* Remove spurious "fmt" import

* Use url.URL in order to correctly calculate API path for Swagger UI
2025-10-06 20:04:47 +04:00
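The `url.URL` item in #355 above can be illustrated with the standard library's `(*url.URL).JoinPath`, which appends path elements while keeping an existing prefix. This is only a sketch under assumptions: the `apiURL` helper and the `example.com/orchard` base are hypothetical, and the actual change may compute the path differently.

```go
package main

import (
	"fmt"
	"net/url"
)

// apiURL appends path elements to a base controller URL, preserving any
// path prefix that the base URL already carries.
func apiURL(base string, elem ...string) (string, error) {
	u, err := url.Parse(base)
	if err != nil {
		return "", err
	}
	return u.JoinPath(elem...).String(), nil
}

func main() {
	s, err := apiURL("https://example.com/orchard", "v1", "vms")
	if err != nil {
		panic(err)
	}
	fmt.Println(s) // prints "https://example.com/orchard/v1/vms"
}
```

By contrast, resolving an absolute reference such as `/v1/vms` against the base URL would discard the `/orchard` prefix, which is the kind of mistake a prefixed deployment has to avoid.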
Nikolay Edigaryev 6d23548d81
API spec: document VM object more thoroughly (#354)
* API spec: document VM object more thoroughly

* Describe hostDirs and signify that it's worker-local in docs
2025-10-06 18:22:57 +04:00
Nikolay Edigaryev c5e0d68a3d
API: introduce ability to watch a VM (#351)
* API: introduce ability to watch a VM

* Document ?watch=true for GET /vms/{name} in the OpenAPI specification

* WatchVM: ensure that goroutine is terminated on early return with error

* WatchVM: close channels on goroutine exit

* WatchVM: ensure that we wait for the goroutine after additional barriers

* WatchVM: ignore unexpected keys instead of throwing an error

* WatchVM: perform context-aware writes to a bounded channel

* WatchVM: don't forget to close errCh on goroutine exit too

* WatchVM: don't close readyCh in goroutine to avoid ambiguity

* WatchVM: filter out spurious KVs that signify VM deletion
2025-10-03 21:34:53 +04:00
Nikolay Edigaryev cdece3149b
orchard create vm: do not enable --nested by default (#348) 2025-09-29 17:37:28 +04:00
Nikolay Edigaryev 43e21c7963
orchard create vm: "--nested" flag to enable nested virtualization (#346) 2025-09-26 19:42:44 +04:00
Nikolay Edigaryev 873efb24e7
ghcr.io/cirruslabs/macos-sequoia-base:latest for everything (#344) 2025-09-25 20:43:53 +04:00
Nikolay Edigaryev 56260e7667
Worker: automatically scrape logical cores and memory size (#341) 2025-09-17 00:13:42 +04:00
Nikolay Edigaryev f5aa04e98b
orchard controller run: introduce configurable --worker-offline-timeout (#342) 2025-09-17 00:10:39 +04:00
Nikolay Edigaryev 26668f2cbd
orchard controller run: introduce --experimental-disable-db-compression (#336) 2025-08-19 17:31:18 +04:00
Nikolay Edigaryev 39fbbbc2a6
Disable Prometheus metrics by default (#331) 2025-07-17 00:58:13 +04:00
Nikolay Edigaryev ed7921ce16
Fix websocket.(*Conn).timeoutLoop goroutine leak (#329) 2025-07-11 15:23:50 +04:00
Nikolay Edigaryev ae7cdd8628
orchard controller run: introduce "--listen-pprof" command-line argument (#326)
* orchard controller run: introduce "--pprof" command-line flag

* --pprof → --listen-pprof

* Log pprof HTTP server error, if any
2025-06-26 20:15:10 +04:00
Nikolay Edigaryev 7957a9b95a
Try "tart ip --resolver=agent" first when using "--net-bridged" (#323) 2025-06-19 17:36:56 +04:00
Nikolay Edigaryev 76f0672759
spf13/cobra: don't use PersistentFlags() (#319) 2025-05-26 19:58:37 +04:00
Nikolay Edigaryev a37a8914cd
orchard controller run: introduce --experimental-ping-interval (#316)
* orchard controller run: introduce --experimental-ping-interval

* Ensure that --experimental-ping-interval is always larger than 5s
2025-05-15 21:14:17 +04:00
Nikolay Edigaryev d52aa91927
Controller: periodically send PINGs on all WebSocket connections (#315) 2025-05-15 18:43:52 +04:00
Nikolay Edigaryev 507db0fcfe
orchard create vm: introduce --disk-size command-line argument (#313) 2025-04-29 18:21:46 +04:00
Nikolay Edigaryev 40f222c408
Worker: fix "failed to retrieve Orchard's home directory path" (#309)
When running through launchd and no HOME is set.
2025-04-17 21:57:04 +04:00
Nikolay Edigaryev 0a3d9c6d1c
BadgerDB: periodically perform garbage collection (#307)
* BadgerDB: periodically perform garbage collection

* GC every hour
2025-04-16 00:44:04 +04:00
Nikolay Edigaryev e3e585778c
Worker: do not block RPCv2 when performing forwarding ports and resolving IPs (#306)
* Worker: do not block RPCv2 when performing actions

* Do not block RPCv1 with handleGetIP() too
2025-04-16 00:18:02 +04:00
Nikolay Edigaryev 3c2de83ea7
Orchard Worker: don't forget to use localnetworkhelper in RPC and RPCv2 (#304)
* Orchard Worker: don't forget to use localnetworkhelper in RPC and RPCv2

* Fix integration tests by not requiring an empty vm.StatusMessage
2025-04-11 00:15:13 +04:00
Nikolay Edigaryev abcfee677d
Work around Sequoia's "Local Network" permission with a helper process (#302)
* Work around Sequoia's "Local Network" permission with a helper process

* README.md: macOS 15 (Sequoia) warning

* Make "orchard dev" unix-specific too, otherwise Release fails

* Fix typo in "localNetworkHerlper"

* Slightly improve the macOS 15 (Sequoia) note

* orchard worker run: better documentation for --user

* Make sure privilege dropping is the first step we do in runWorker()
2025-04-10 18:01:19 +04:00
Nikolay Edigaryev c24db17aa5
Use VM status message to reflect pulling, cloning, configuring, etc. (#298) 2025-04-03 18:08:13 +04:00
Nikolay Edigaryev 599ac40a90
orchard ssh vm: prevent busy loop in remote terminal resize goroutine (#297) 2025-04-02 14:07:52 +00:00
Nikolay Edigaryev 9919117b9b
orchard controller run: create a default bootstrap context (#291)
* orchard controller run: create a default bootstrap context

* Dockerfile: correct AS casing

* Fix typo in BootstrapContextName
2025-03-27 18:48:04 +04:00
Nikolay Edigaryev 7d340d6908
.golangci.yml: support golangci-lint 2.0 (#289) 2025-03-24 23:58:47 +04:00
gsakun 705bf8bd83
add insecure-no-tls flag (#281)
* support enable tls flag

* modify tls enable control flag

Co-authored-by: Nikolay Edigaryev <edigaryev@gmail.com>

* Optimize message print

* Avoid unrelated changes to the bootstrap message

* Consistent command-line argument order

* Extra spacing

* No need to shadow controllerCert

---------

Co-authored-by: Nikolay Edigaryev <edigaryev@gmail.com>
2025-03-22 00:09:24 +04:00
Nikolay Edigaryev 39243978ed
orchard context create: ask for service account name and token (#282)
If not provided either via --bootstrap-token or via
--service-account-{name,token}.
2025-03-20 02:21:44 +04:00
Nikolay Edigaryev 59007020f4
Controller: enable experimental RPC v2 by default (#280)
* Controller: enable experimental RPC v2 by default

* Ensure mutual exclusiveness for --{,no-}experimental-rpc-v2

* Check earlier
2025-03-18 21:28:01 +04:00
Nikolay Edigaryev d5cd08fcce
Controller: advertise ALPN (#279) 2025-03-18 18:55:45 +04:00
dependabot[bot] c70eb068d4
Bump go.opentelemetry.io/otel/sdk/metric from 1.27.0 to 1.34.0 (#257)
* Bump go.opentelemetry.io/otel/sdk/metric from 1.27.0 to 1.34.0

Bumps [go.opentelemetry.io/otel/sdk/metric](https://github.com/open-telemetry/opentelemetry-go) from 1.27.0 to 1.34.0.
- [Release notes](https://github.com/open-telemetry/opentelemetry-go/releases)
- [Changelog](https://github.com/open-telemetry/opentelemetry-go/blob/main/CHANGELOG.md)
- [Commits](https://github.com/open-telemetry/opentelemetry-go/compare/v1.27.0...v1.34.0)

---
updated-dependencies:
- dependency-name: go.opentelemetry.io/otel/sdk/metric
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>

* opentelemetry: add TestConfigure

---------

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Nikolay Edigaryev <edigaryev@gmail.com>
2025-02-20 02:19:11 +04:00
Nikolay Edigaryev 818f4288c2
Controller API: correctly detect WebSocket closure in Watch RPC (#259) 2025-02-20 02:00:57 +04:00
Nikolay Edigaryev 2c8d36ef70
Always randomize MAC address (#251)
* Always randomize MAC address

* Worker: check DHCP lease time and print a warning if it's unconfigured

* Further improve the explanation

* Add two leases example to the explanation

* Add an example of the resulting /var/db/dhcpd_leases
2025-02-13 12:35:12 +00:00
Nikolay Edigaryev 2aae818f78
Worker: prefer assigned CPU/memory to CPU/memory (#250)
* Worker: prefer assigned CPU/memory to CPU/memory

* orchard get worker: show default CPU, default memory and labels
2025-02-13 16:23:47 +04:00
Nikolay Edigaryev ee3c0f91f2
Startup script: implement retries for connection-related operations (#249)
* Startup script: implement retries for connection-related operations

* assert.Equal → assert.Contains

* Wait for at least 1,000 lines of logs

* Join slice of strings before calling assert.Contains()

* TestHostDirs: use require.Contains() instead of require.EqualValues()

* TestHostDirs: wait for at least 4 log lines
2025-02-12 18:11:12 +04:00
Nikolay Edigaryev 4794f2a5b6
orchard create vm: introduce --random-serial command-line argument (#248) 2025-02-12 18:00:13 +04:00
Nikolay Edigaryev 61d7d34ea4
RPC v2: fix Ping() hanging due to PONG not being processed (#247) 2025-02-07 22:05:09 +04:00
Nikolay Edigaryev 8dd74db446
Worker notification improvements (#246)
* OpenAPI: document all default "wait" values

* Re-use waitContext instead of instantiating it anew
2025-02-07 00:38:04 +04:00
Nikolay Edigaryev 722d5a8eaf
Avoid including " and $ characters in bootstrap admin's token (#245)
* Avoid including " and $ characters in bootstrap admin's token

* Avoid fallthrough
2025-02-06 21:37:42 +04:00
Fedor Korotkov 86f0afb5a3
Small timeout for worker notification (#242)
* Small timeout for worker notification

It seems that at the moment, if a worker re-establishes its notify stream (for example, when the network flips or a proxy breaks the connection), we can see "no worker registered with this name" errors.

This change makes the Notifier wait for 30 seconds before failing, because at the time of calling `Notifier#Notify` we know that such a worker exists.

P.S. Not sure if we need to make the timeout configurable.

* Wait via context

* Make sure all `context`s for `Notify` are time-bounded

* Lint issues
2025-02-06 17:30:09 +00:00
Nikolay Edigaryev 26c8808506
Support scheduling by labels (#244) 2025-02-06 18:05:36 +04:00
Nikolay Edigaryev 581de320b9
Allow creating VMs with implicit CPU and memory (#243)
* Allow creating VMs with implicit CPU and memory

* Clarify why cpu/memory can be 0 a bit better

* Controller(API): don't forget to update DefaultCPU and DefaultMemory

* Add an integration test for implicit CPU and memory
2025-02-06 00:50:01 +04:00
Nikolay Edigaryev 88fba8004d
Introduce WebSocket-based RPC v2 (#239)
* Introduce WebSocket-based RPC v2

* go test: add -ldflags="-B gobuildid"

* No need to change the "controller.workerNotifier.Notify()" error message

* No need to modify Protocol Buffers/gRPC generated code

* rpcWatch(): explain that connection shouldn't normally be closed

* Avoid "port forwarding failed: " repetition in error messages

* Improve comments and avoid repetition in IP resolution errors
2025-01-30 17:33:32 +04:00
Nikolay Edigaryev 077252f6d4
Prevent goroutine leak when Close()'ing *grpc_net_conn.Conn (#237) 2025-01-23 18:17:14 +04:00
Nikolay Edigaryev 1fce915d67
API: only overwrite specific worker fields when worker already exists (#236)
* API: only overwrite specific worker fields when worker already exists

* Don't forget to return when creating new worker

* Return updated worker when updating the worker
2025-01-16 16:42:17 +04:00
Nikolay Edigaryev 08769e00b4
Worker: do not consider on-disk VMs syncing error as fatal (#230) 2024-12-11 19:56:00 +04:00
Nikolay Edigaryev d7b6f477e1
Never list workers in Update()/storeUpdate() transactions (#228)
* POST /v1/workers: do not list workers in a single update txn

* schedulingLoopIteration(): do not list workers in a single update txn

* .golangci.yml: remove mentions of fully deprecated linters
2024-12-05 16:59:50 +04:00
Nikolay Edigaryev d94690176e
Schedule opportunistically and more granularly (#225)
* Schedule opportunistically and more granularly

To avoid transaction conflicts.

* Measure scheduling loop iteration duration and log it at debugging level

* Use "continue NextWorker" instead of just "continue" for clarity
2024-12-03 14:11:48 +00:00