Added support for Darwin backend using vfkit/vmnet-helper #15
Draft
jedhamzawi wants to merge 34 commits into
Lay the foundation for macOS VM support so abox can run on developer machines that don't have libvirt/KVM. Uses vfkit as a subprocess, mirroring the existing virsh CLI-wrapping pattern on Linux.

- Create internal/backend/darwin/ with stub Backend, VMManager, NetworkManager, and DiskManager (registered as "macos" backend)
- Split platform-specific code with //go:build tags for backend imports, privilege helper, peer credentials, and runtime dirs
- Add DirectPrivilegeClient for macOS (user-owned storage dirs don't need root escalation)
- Add platform-specific RuntimeDir fallback (os.TempDir on macOS)

Co-Authored-By: Claude Opus 4.6 (200K context) <noreply@anthropic.com>
Create internal/vfkit/ to wrap the vfkit CLI binary, mirroring how internal/libvirt/ wraps virsh. vfkit runs as a foreground process (unlike libvirtd), so this package handles process lifecycle directly. - Add Commander abstraction for testable command execution - Add VMConfig type and BuildArgs for vfkit CLI argument construction (EFI boot, virtio-blk, virtio-net, virtio-serial, REST API) - Add process lifecycle: StartVM (detached background), StopVM (SIGTERM→poll→SIGKILL), ForceStopVM, IsRunning via PID files - Add REST API client for vfkit state queries and graceful ACPI shutdown - Add unit tests for argument building, PID file handling, port allocation Co-Authored-By: Claude Opus 4.6 (200K context) <noreply@anthropic.com>
Implement all DiskManager methods for the darwin backend (Phase 3), mirroring the libvirt implementation. Uses DirectPrivilegeClient for user-owned storage at ~/Library/Application Support/abox/images/. - Implement Create, Delete, EnsureBaseImage, Import, Export using qemu-img and the PrivilegeClient interface - Add rebaseDisk, flattenDisk, copyDiskFile helpers matching libvirt - Add backend_test.go with ResourceNames and StorageDir tests for parity with libvirt backend test coverage Co-Authored-By: Claude Opus 4.6 (200K context) <noreply@anthropic.com>
- Implement all 11 VMManager interface methods (Create, Start, Stop, ForceStop, Remove, Exists, IsRunning, State, GetIP, GetUUID, Redefine) - Store VM UUID and REST API port in BackendConfig for state persistence - Look up VM IP via ARP table with macOS zero-stripping MAC normalization - Stop uses REST API graceful shutdown with signal fallback and PID cleanup - Update DryRun to show actual vfkit CLI args - Rename backend from "macos" to "vfkit" for consistency with "libvirt" - Add "vfkit" to valid backends in validation - Add unit tests for all helpers and ARP output parsing Co-Authored-By: Claude Opus 4.6 (200K context) <noreply@anthropic.com>
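The ARP lookup with zero-stripped MAC normalization described above can be sketched roughly as follows. This is a minimal illustration, not the branch's code: the names `normalizeMAC` and `lookupIP` are hypothetical, and the sample line assumes the usual `arp -an` shape on macOS, where each octet is printed without its leading zero.

```go
package main

import (
	"fmt"
	"strings"
)

// normalizeMAC lowercases a MAC address and pads every single-digit octet
// to two hex digits, so "52:54:0:a:b:1" becomes "52:54:00:0a:0b:01".
func normalizeMAC(mac string) string {
	parts := strings.Split(strings.ToLower(strings.TrimSpace(mac)), ":")
	for i, p := range parts {
		if len(p) == 1 {
			parts[i] = "0" + p
		}
	}
	return strings.Join(parts, ":")
}

// lookupIP scans arp-style output for an entry whose MAC matches and
// returns the IP from that line. Assumed line shape:
//   ? (IP) at MAC on IFACE ...
func lookupIP(arpOutput, mac string) (string, bool) {
	want := normalizeMAC(mac)
	for _, line := range strings.Split(arpOutput, "\n") {
		fields := strings.Fields(line)
		if len(fields) < 4 || fields[2] != "at" {
			continue
		}
		if normalizeMAC(fields[3]) != want {
			continue
		}
		return strings.Trim(fields[1], "()"), true
	}
	return "", false
}

func main() {
	out := "? (192.168.64.2) at 52:54:0:a:b:1 on bridge100 ifscope [bridge]\n"
	ip, ok := lookupIP(out, "52:54:00:0A:0B:01")
	fmt.Println(ip, ok)
}
```

Normalizing both sides before comparing means the configured MAC can be stored in its canonical padded form while still matching what the host reports.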
- Extract iptables/UFW firewall functions from start, stop, and remove commands into platform-specific files using build tags (start_firewall_linux.go, stop_firewall_linux.go, remove_firewall_linux.go) - Add darwin no-op stubs ready for Phase 6 pfctl implementation - Fix NetworkManager.IsActive to return true (vmnet is always available) - Add NetworkManager interface compliance and behavior tests Phase 5 of macOS backend: unblocks `abox start` past firewall setup. Next blocker is Phase 6 (dnsfilter bind address + pfctl redirect). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Implements Phase 6 (DNS/HTTP filtering via pfctl) and Phase 6.5 (macOS privilege helper architecture) for the darwin/vfkit backend. Phase 6 — DNS filtering on macOS: - Add vmnet gateway detection (internal/vmnet/) to use real vmnet subnet (192.168.64.0/24) instead of abox subnet pool - Add SubnetProvider backend interface for vmnet-managed networking - Platform-specific filter listen addresses: dnsfilter binds to 127.0.0.1 on macOS (pfctl redirects), httpfilter binds to 0.0.0.0 - PfctlClient wrapping privilege helper RPCs for DNS redirect rules using PF anchors (abox/<name>) for per-instance isolation - Post-boot firewall setup (setupPostBootFirewall) since vmnet assigns VM IPs dynamically via DHCP - Stop/remove cleanup flushes pfctl anchors Phase 6.5 — macOS privilege helper: - Extract shared scaffolding from helper.go into helper_common.go (RunHelper, token auth, audit interceptor, command resolution) - Platform-specific PrivilegeServer implementations: helper_linux.go (iptables/UFW/TOCTOU-safe file ops) and helper_darwin.go (pfctl RPCs and simpler file ops for user-owned storage) - Add PfctlEnable/PfctlLoadAnchor/PfctlFlushAnchor RPCs to proto - Remove DirectClient (dead code from Phase 1 stopgap) — macOS now uses the real privilege helper via sudo, same architecture as Linux - Register privilege-helper subcommand on darwin - Move Linux-specific path validation to validation_linux.go - Allow spaces in path validation (macOS Application Support paths) Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
vfkit (Apple Virtualization.framework) only supports raw disk images, not qcow2. This rewrites the darwin DiskManager to use raw format with APFS copy-on-write clones for space-efficient instance disks. - Add DiskFormat field to Instance config and DiskFormatProvider optional interface for backends requiring non-default disk format - Rewrite darwin DiskManager: EnsureBaseImage converts qcow2→raw once, Create uses APFS cp -c clone + os.Truncate, Import/Export handle format conversion between raw and portable qcow2 archives - Use direct file operations instead of privilege helper for darwin disk ops (storage is user-owned ~/Library/Application Support/abox/) - Add platform-aware cloud-init ISO install (direct copy on macOS, privilege helper on Linux) to fix root-owned file permission errors - Atomic base image conversion via temp file + rename to prevent races - Update base remove command to scan for both disk.qcow2 and disk.raw Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
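The "temp file + rename" step above is a standard pattern; a minimal sketch is below. The helper name `writeAtomic` is hypothetical, and the `produce` callback stands in for whatever performs the real conversion (for example a `qemu-img convert -f qcow2 -O raw` invocation, which is assumed here rather than shown).

```go
package main

import (
	"errors"
	"fmt"
	"os"
	"path/filepath"
)

// writeAtomic lets produce fill a temp file in dst's directory, then renames
// it into place. Readers either see no file or a complete one, never a
// half-written image.
func writeAtomic(dst string, produce func(tmpPath string) error) error {
	tmp, err := os.CreateTemp(filepath.Dir(dst), filepath.Base(dst)+".tmp-*")
	if err != nil {
		return err
	}
	tmpPath := tmp.Name()
	if err := tmp.Close(); err != nil {
		os.Remove(tmpPath)
		return err
	}
	if err := produce(tmpPath); err != nil {
		os.Remove(tmpPath) // do not leave a partial image behind
		return err
	}
	if err := os.Rename(tmpPath, dst); err != nil {
		os.Remove(tmpPath)
		return err
	}
	return nil
}

// demo runs one failing and one successful conversion in a scratch directory.
func demo() (content string, failedLeftNothing bool, err error) {
	dir, err := os.MkdirTemp("", "abox-atomic-")
	if err != nil {
		return "", false, err
	}
	defer os.RemoveAll(dir)

	bad := filepath.Join(dir, "bad.raw")
	failErr := writeAtomic(bad, func(string) error { return errors.New("convert failed") })
	_, statErr := os.Stat(bad)
	failedLeftNothing = failErr != nil && errors.Is(statErr, os.ErrNotExist)

	good := filepath.Join(dir, "base.raw")
	err = writeAtomic(good, func(tmp string) error {
		return os.WriteFile(tmp, []byte("raw-bytes"), 0o644)
	})
	if err != nil {
		return "", failedLeftNothing, err
	}
	data, err := os.ReadFile(good)
	return string(data), failedLeftNothing, err
}

func main() {
	fmt.Println(demo())
}
```

Creating the temp file in the destination directory keeps the rename on one filesystem, which is what makes it atomic.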
Phase 7 of the macOS backend: Apple Silicon requires arm64 guests (Virtualization.framework cannot emulate x86_64). Previously all image providers hardcoded amd64, so `abox base pull ubuntu-24.04` on an M-series Mac downloaded an amd64 image and the VM exited immediately at boot.

This also shifts the qcow2→raw conversion from create-time to pull-time, since vfkit only reads raw format; doing the conversion during `abox base pull` avoids a redundant repack step on darwin and makes subsequent creates (APFS clone from user cache to backend dir) effectively free.

- Add hostArch() helper and thread arch through image providers: Ubuntu filters by catalog arch, Debian/AlmaLinux parameterize URL and filename construction (AlmaLinux maps amd64→x86_64, arm64→aarch64)
- Add config.UserBaseImageExt()/UserBaseImageName() helpers returning ".raw" on darwin and ".qcow2" on linux, applied to both user cache and backend storage paths
- Rewrite darwin EnsureBaseImage: no longer converts format, just APFS-clones user-cache raw into backend storage dir
- Platform-aware convertImage in `abox base pull`: darwin runs qcow2→raw conversion and cleans up partial dst on failure; linux retains the qcow2→qcow2 normalization with rename fallback
- Fix base remove to use platform-aware extension for backend dir so darwin instances don't orphan .raw copies; rename libvirtImage → backendImage in the process
- Update list, prune, and tab-completion to use the extension helper
- Add filename-helper tests for Debian/AlmaLinux and an arm64 path for Ubuntu's catalog parser

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
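The architecture and extension parameterization in this commit might look something like the sketch below. The function names `almaArch` and `baseImageExt` are illustrative only (the commit names `hostArch()` and `config.UserBaseImageExt()` but does not show their signatures); the mappings themselves are the ones the message states.

```go
package main

import (
	"fmt"
	"runtime"
)

// hostArch reports the Go architecture name of the running host.
func hostArch() string { return runtime.GOARCH }

// almaArch maps Go architecture names to the names AlmaLinux uses in
// image filenames, per the commit message (amd64→x86_64, arm64→aarch64).
func almaArch(goArch string) (string, error) {
	switch goArch {
	case "amd64":
		return "x86_64", nil
	case "arm64":
		return "aarch64", nil
	default:
		return "", fmt.Errorf("unsupported architecture %q", goArch)
	}
}

// baseImageExt returns the base-image extension for an OS: raw on darwin
// (vfkit only reads raw), qcow2 elsewhere.
func baseImageExt(goos string) string {
	if goos == "darwin" {
		return ".raw"
	}
	return ".qcow2"
}

func main() {
	arch, err := almaArch(hostArch())
	fmt.Println(arch, err, baseImageExt(runtime.GOOS))
}
```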
- Use strings.SplitSeq in parseARPOutput to avoid allocating the intermediate slice (modernize/stringsseq) - Log the unreadable-config error in Remove and expand the comment explaining idempotent behavior (nilerr) Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- Add TrafficInterceptor on the darwin backend backed by a filter marker file, so status/doctor can report accurate state without a privilege client. - Expand PfctlClient from DNS-only redirect to a full per-instance rule set: DNS rdr + DHCP/HTTP-proxy/ICMP allows + default-deny outbound, matching libvirt's nwfilter semantics. - Rename AddDNSRedirect -> ApplyInstanceRules; update tests for the new ruleset and marker lifecycle. - Wire setupPostBootFirewall on darwin to apply the new rule set after vmnet assigns the VM IP. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- Add ensureAnchorReferences/removeAnchorReferences in privilege helper that insert `rdr-anchor "abox/*"` and `anchor "abox/*"` next to Apple's com.apple/* anchors so pf actually evaluates per-instance abox rules. Without this, Phase 8 rules were loaded but never fired. - Use marker-anchored insertion (no parser, no fenced block) keyed off the standard Apple anchor lines. Refuse to edit and emit a clear error if those markers are absent (custom or MDM-managed pf.conf). - Move EnsureEnabled to pre-boot (setupHostFirewall) so the one-time `pfctl -f` reload happens before vmnet installs runtime rules for the VM; running it post-boot wiped vmnet's in-memory rules and left the guest unreachable until the next bring-up. - Add `abox teardown-pf` command and a Makefile `uninstall` target that calls it as the inverse of auto-wiring. - Add `abox doctor` host check that distinguishes "not wired yet" from "custom pf.conf needs manual edit" so the user gets an actionable hint. - Surface the wiring notice client-side via a before/after HasAnchorReferences probe — the helper's stderr is captured to a per-instance log file by PrivilegeClientFor and never reaches the terminal. - New PfctlTeardownConfig RPC + token wrapper. ActionPfctlWireAnchors / ActionPfctlUnwireAnchors audit constants. - 25 new unit tests covering pfconf helpers, the doctor's three branches, and the new TeardownConfig RPC client. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
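The marker-anchored insertion could be approximated as below. This is a sketch under assumptions: it treats pf.conf as plain lines, matches Apple's `rdr-anchor "com.apple/*"` and `anchor "com.apple/*"` lines exactly after trimming whitespace, and returns the new text rather than writing any file. The real helper's signature and error wording are not shown in the commit.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

const (
	appleRdr    = `rdr-anchor "com.apple/*"`
	appleAnchor = `anchor "com.apple/*"`
	aboxRdr     = `rdr-anchor "abox/*"`
	aboxAnchor  = `anchor "abox/*"`
)

var errNoMarkers = errors.New("pf.conf: Apple anchor lines not found; manual edit required")

// ensureAnchorReferences inserts the abox anchor lines directly after the
// matching Apple anchor lines. It is idempotent and refuses to edit a
// config that lacks the Apple markers.
func ensureAnchorReferences(conf string) (string, error) {
	lines := strings.Split(conf, "\n")
	var hasRdr, hasAnchor, sawAppleRdr, sawAppleAnchor bool
	for _, l := range lines {
		switch strings.TrimSpace(l) {
		case aboxRdr:
			hasRdr = true
		case aboxAnchor:
			hasAnchor = true
		case appleRdr:
			sawAppleRdr = true
		case appleAnchor:
			sawAppleAnchor = true
		}
	}
	if !sawAppleRdr || !sawAppleAnchor {
		return "", errNoMarkers
	}
	out := make([]string, 0, len(lines)+2)
	for _, l := range lines {
		out = append(out, l)
		switch strings.TrimSpace(l) {
		case appleRdr:
			if !hasRdr {
				out = append(out, aboxRdr)
			}
		case appleAnchor:
			if !hasAnchor {
				out = append(out, aboxAnchor)
			}
		}
	}
	return strings.Join(out, "\n"), nil
}

func main() {
	conf := appleRdr + "\n" + appleAnchor + "\n"
	wired, err := ensureAnchorReferences(conf)
	fmt.Print(wired, err, "\n")
}
```

Placing each abox line next to its Apple counterpart keeps translation anchors ahead of filter anchors, which is the ordering pf expects.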
- Split checkdeps into platform-specific lists: - Linux: virsh, iptables, libvirt group checks (unchanged) - darwin: vfkit, qemu-img, xorriso, pfctl, sudo + brew install hints - Gate libvirt privilege helpers (InLibvirtGroup, InLibvirtQemuGroup, CanAccessLibvirtImages) to //go:build linux; keep InGroup shared - Split doctor's Statfs helper per platform, eliminating the //nolint:gosec rationale that was only correct on Linux - Add guardMonitor(inst) that rejects monitor.enabled: true on darwin with a clear error (Tetragon requires a Linux host with eBPF) - Split quickstart helptopic; darwin variant documents brew prereqs, abox teardown-pf, and snapshot/monitor unsupported - Guard make build-helper / install-helper against darwin so the setuid helper targets fail fast with a clear message instead of confusing groupadd/build-constraint errors - Make the doctor DNS-failure hint platform-aware (iptables on Linux, pfctl on macOS) Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- Add docs/macos.md covering install, PF anchors, and platform limitations - Update CLAUDE.md overview to note macOS (vfkit) support - Cross-link macOS sections from quickstart, requirements, filtering, privilege-helper, and troubleshooting - Document macOS-specific log files, pfctl commands, and sudo-only privilege escalation Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…ment - New internal/vmnethelper/ wraps the nirs/vmnet-helper CLI (darwin-only); foundational piece for Phase 11's per-VM bridge migration - HelperConfig + pure BuildArgs (shared mode, --enable-isolation, --interface-id, optional sudo -n prefix) - Start attaches the caller's socketpair fd via cmd.ExtraFiles[0], reads start JSON from an os.Pipe with a 5s SetReadDeadline, resolves the bridge via ifconfig, writes a PID file, detaches via Release - Stop/ForceStop/IsRunning/ReadPID/CleanupPIDFile mirror vfkit; isHelperProcess matches both "vmnet-helper" and "sudo" comms - BridgeInterfaceForGateway scans ifconfig with 50ms × 20 retry - ResolveBinaryPath: env override → brew paths → /opt → LookPath - NeedsSudo caches an sw_vers probe; ≤25 needs sudo, 26+ runs bare - 35 unit tests, no vmnet-helper binary required Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- vfkit.VMConfig gains NetFD int; exported NetFDChild = 3 const pins the ExtraFiles[0]-to-child-fd-3 invariant - vfkit.BuildArgs emits virtio-net,fd=N,mac=... instead of ,nat,mac=...; NetFD rendered verbatim, StartVM enforces it equals NetFDChild - vfkit.StartVM signature grows netFD *os.File (mirrors vmnethelper.Start); attaches via cmd.ExtraFiles, closes parent-side on every return path (validation error, log-open error, Start error, or success) so callers don't reason about fd ownership per-branch - darwin VMManager.Start passes nil placeholder with a TODO pointing at Phase 11.3, which wires the real socketpair between vmnet-helper and vfkit. Between 11.2 and 11.3, abox start fails fast on darwin with a clear "vfkit: netFD is required" error; builds and unit tests stay green - Tests updated to set NetFD: NetFDChild on every fixture that sets a MAC; added TestBuildArgs_NetFD table test Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
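The fd invariant mentioned here follows from how Go's os/exec works: entry i of `cmd.ExtraFiles` becomes file descriptor 3+i in the child. A small sketch of rendering and checking the device argument is below. The function names are hypothetical, and for brevity the check and the rendering are shown together, whereas the commit says BuildArgs renders verbatim and StartVM enforces the value.

```go
package main

import (
	"errors"
	"fmt"
)

// NetFDChild is the child-side descriptor number for cmd.ExtraFiles[0]:
// os/exec maps ExtraFiles[i] to fd 3+i in the child process.
const NetFDChild = 3

// netDeviceArg renders the virtio-net device argument for an fd-backed NIC.
func netDeviceArg(netFD int, mac string) string {
	return fmt.Sprintf("virtio-net,fd=%d,mac=%s", netFD, mac)
}

// checkNetFD rejects configurations whose NetFD does not match the
// descriptor the child will actually receive.
func checkNetFD(netFD int, mac string) error {
	if mac == "" {
		return errors.New("vfkit: MAC is required")
	}
	if netFD != NetFDChild {
		return fmt.Errorf("vfkit: NetFD must be %d, got %d", NetFDChild, netFD)
	}
	return nil
}

func main() {
	mac := "52:54:00:aa:bb:cc"
	if err := checkNetFD(NetFDChild, mac); err != nil {
		fmt.Println(err)
		return
	}
	// In a real launcher the socketpair end would be attached with
	// cmd.ExtraFiles = []*os.File{netFD} before cmd.Start().
	fmt.Println("--device", netDeviceArg(NetFDChild, mac))
}
```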
…bnets) Make `abox start` boot a sandboxed VM on macOS by connecting vfkit to a per-instance vmnet-helper process over a datagram socketpair. Replaces the phase-11.3 TODO that left VMManager.Start calling vfkit.StartVM(cfg, nil). - vm.go: Start creates an AF_UNIX SOCK_DGRAM socketpair, launches vmnet-helper in host mode pinned to the instance's /24, reconciles the handed-out gateway against the cloud-init gateway, starts vfkit on the other end, and persists the resolved bridgeN. Stop/ForceStop/Remove tear the helper down (vfkit first, then helper). - backend.go + subnet.go: NetworkDefaults allocates deterministic per-VM /24s from the 192.168.128.x host-mode pool (avoids Docker/OrbStack/Podman 192.168.64.x), replacing the old shared-mode detection. internal/vmnet deleted (now unused). - vmnethelper: HelperConfig gains StartAddress/EndAddress/SubnetMask; BuildArgs emits --start-address/--end-address/--subnet-mask. Export ChildSocketFD. - pfctl: add a per-bridge `block drop quick on <bridge> inet6 all` rule (vmnet has no NAT66 off switch) with bridge-name validation; thread the bridge through ApplyInstanceRules/buildInstanceRules/validatePfctlArgs. - check-deps: per-dependency custom check so vmnet-helper (off-PATH in libexec) lists inline with the other deps. - docs/macos.md: document host-mode per-VM networking, the IPv6 block, and vmnet-helper as a required dependency. Verified on-hardware (macOS 26.5.1, Apple Silicon): VMs boot, get 192.168.128.x/129.x addresses, IPv6 egress is dead despite a guest v6 address, cross-VM traffic is blocked by pf default-deny, and stop/remove leave no orphaned helpers or bridges. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
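One way to get deterministic per-VM /24s is to hash the instance name into a pool and probe past subnets already in use. The sketch below is only an illustration of that idea: the pool bounds (`poolBase`, `poolSize`), the FNV hash, and returning an error on exhaustion are all assumptions, since the commit does not describe the allocator's internals.

```go
package main

import (
	"errors"
	"fmt"
	"hash/fnv"
)

const (
	poolBase = 128 // third octet of the first /24 in the pool (assumed)
	poolSize = 64  // number of /24s in the pool (assumed)
)

// allocSubnet picks a third octet for name. The same name always starts at
// the same slot; slots present in inUse are skipped by linear probing.
func allocSubnet(name string, inUse map[int]bool) (int, error) {
	h := fnv.New32a()
	h.Write([]byte(name))
	start := int(h.Sum32() % poolSize)
	for i := 0; i < poolSize; i++ {
		octet := poolBase + (start+i)%poolSize
		if !inUse[octet] {
			return octet, nil
		}
	}
	return 0, errors.New("subnet pool exhausted")
}

// subnetCIDR formats the /24 for a third octet.
func subnetCIDR(octet int) string {
	return fmt.Sprintf("192.168.%d.0/24", octet)
}

func main() {
	octet, err := allocSubnet("dev", nil)
	if err != nil {
		fmt.Println(err)
		return
	}
	fmt.Println(subnetCIDR(octet))
}
```

Determinism matters here because the cloud-init gateway is fixed at create time and has to agree with whatever the helper hands out at start.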
Tip-to-tip merge of sandialabs/abox main (37 commits) into the vfkit/darwin backend branch. Brings in Go 1.26, golangci-lint v2.12.1 gate, the httpfilter HTTP/2 MITM rework, multi-distro (Fedora/RHEL/AlmaLinux) support, and the checkdeps install-hint refactor. Conflicts resolved treating upstream as the source of truth, adapting the darwin backend to match its paradigms (and parameterizing where upstream assumed a single architecture): - images: adopt upstream's providerUbuntu/archAMD64 constants; add archARM64 and keep our hostArch() parameterization so arm64 catalogs resolve on macOS. - privilege: carry upstream's multi-distro qemuDiskGroups / InLibvirtQemuGroup / CanAccessLibvirtImages and the libvirtImagesDir const into the build-tagged *_linux files; keep cross-platform InGroup in the shared file. helper_linux picks up upstream's const-extraction + slices.Backward cleanups. - checkdeps: adopt upstream PR sandialabs#6 (hint field on the dependency table, named command constants, table-driven shared installHint, multi-distro hints, firewalld warning) while preserving our linux/darwin split and the custom check func for vmnet-helper. warnFirewalld is now a platform hook (no-op on darwin). The hint test is tagged linux-only; a darwin parity test added. Also cleared v2.12.x lint findings across the darwin files (goconst constants in vmnethelper/checkdeps, net.IP formatting in backend/darwin, errors.New in vfkit, stale nolint in peercred, import grouping + gosec note in *_linux). Build, unit tests (-race), and lint are green on Go 1.26 for darwin and linux (amd64 + arm64). Filter e2e on macOS hardware still pending (httpfilter rework). Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
abox mount/unmount rely on FUSE/SSHFS, which has no good fit on macOS (macFUSE is a kernel extension requiring Reduced Security). Register the commands per-platform via addPlatformFileCommands (Linux registers them, darwin is a no-op), so `abox mount` returns "unknown command" on macOS. Drop the sshfs dependency from the macOS check-deps table and remove the macFUSE/sshfs-mac install hints from the macOS docs and quickstart. The Linux mount/unmount packages and docs are unchanged. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
macOS 26's hardened Code Signing Monitor rejects Go's internal-linker
"linker-signed" ad-hoc signature at exec, killing the process with
SIGKILL ("Code Signature Invalid") even though `codesign -v` accepts the
binary at rest. Re-signing with `codesign --force --sign -` produces a
plain ad-hoc signature (flags 0x2) the monitor accepts.
Document the symptom and manual fix under troubleshooting's
macOS-specific issues for binaries built with a bare `go build`.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Add `abox remote add [<name>] <instance>:<path>`, which registers a git remote (in the current host clone) that routes through `abox ssh` using git's built-in `ext::` transport. The remote reuses the instance's scoped SSH key and per-instance known_hosts and resolves the VM's current IP on every operation, so nothing is written to host SSH config and the remote survives instance restarts and IP drift. This gives macOS and Linux a first-class file-sync path to replace the Linux-only SSHFS mount. git blocks the `ext::` transport by default (built-in `ext` policy is `never`), so the command also sets `protocol.ext.allow=user` in the local repo config, which permits direct user fetch/pull/push while still blocking ext:: inside recursive/submodule fetches. It leaves config alone if ext is already permitted. - Factor `parseRemotePath` from scp into shared `sshutil.ParseRemotePath` - Add `ActionRemoteAdd` audit constant - Register under the Access command group - Document in docs/vm-access.md (+ macos.md, README, darwin help/comments) Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
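Parsing the `<instance>:<path>` argument is the piece shared with scp. A minimal sketch is below; the real `sshutil.ParseRemotePath` signature is not shown in the commit, so the lowercase name and return shape here are assumptions.

```go
package main

import (
	"fmt"
	"strings"
)

// parseRemotePath splits "<instance>:<path>" at the first colon. Later
// colons stay in the path. Both halves must be non-empty.
func parseRemotePath(arg string) (string, string, error) {
	instance, path, ok := strings.Cut(arg, ":")
	if !ok || instance == "" || path == "" {
		return "", "", fmt.Errorf("invalid remote %q: want <instance>:<path>", arg)
	}
	return instance, path, nil
}

func main() {
	instance, path, err := parseRemotePath("dev:/home/user/project")
	fmt.Println(instance, path, err)
}
```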
Author
TODO:
Name the macOS backend after its driver (vfkit) to mirror the libvirt backend, which is named for its driver rather than the OS. The registered backend Name was already "vfkit"; this aligns the Go package/directory and the comments that referred to the macOS backend as "darwin". OS-meaning constructs are deliberately left as-is: //go:build darwin constraints, _darwin.go file suffixes, and comments that genuinely refer to the macOS operating system. Also fix a Linux assumption leaking into cross-platform code: importcmd hardcoded a ".qcow2" base-image extension instead of routing through config.UserBaseImageName, which resolves to ".raw" on macOS. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…ux validator Two audit findings in the privilege helper (F1, F5): The darwin helper's root file-op RPCs (Chmod/MkdirAll/RemoveAll/CopyFile, plus QemuImgCreate paths) applied only a character allowlist and a bare ".." check, so a token-holding caller could operate on any path as root. Add darwin validators mirroring the Linux confinement: paths must resolve (symlinks included) under /Users/<name>/Library/Application Support/abox, with <name>'s home ownership verified against the allowed UID, and RemoveAll additionally guarded by a minimum component depth. User-owned storage justifies dropping Linux's TOCTOU pinning, not containment. Also drop the dead os.Stat in Chmod (F29). safePathChars had gained a literal space for the macOS storage path, but the file is shared, silently loosening the Linux setuid helper's checks and breaking the linux-tagged test that pins the strict behavior. Split the constant per platform: linux keeps the no-space set, darwin permits the space "Application Support" requires. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
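The containment check described above (resolve symlinks, require the result to sit under the storage root, and require a minimum depth for destructive operations) can be sketched as follows. The name `confine` is hypothetical, the home-ownership UID check is omitted, and the sketch only handles paths that already exist because `filepath.EvalSymlinks` needs them to.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"strings"
)

// confine resolves symlinks in both root and p, then requires the resolved
// path to sit at least minDepth components below the resolved root.
func confine(root, p string, minDepth int) (string, error) {
	realRoot, err := filepath.EvalSymlinks(root)
	if err != nil {
		return "", err
	}
	resolved, err := filepath.EvalSymlinks(p)
	if err != nil {
		return "", err
	}
	rel, err := filepath.Rel(realRoot, resolved)
	if err != nil {
		return "", err
	}
	sep := string(filepath.Separator)
	if rel == ".." || strings.HasPrefix(rel, ".."+sep) {
		return "", fmt.Errorf("%q resolves outside %q", p, root)
	}
	depth := 0
	if rel != "." {
		depth = len(strings.Split(rel, sep))
	}
	if depth < minDepth {
		return "", fmt.Errorf("%q is too close to the storage root (depth %d < %d)", p, depth, minDepth)
	}
	return resolved, nil
}

// demo builds a scratch tree with a symlink that escapes the root.
func demo() (insideOK, escapeRejected, rootRejected bool, err error) {
	base, err := os.MkdirTemp("", "abox-confine-")
	if err != nil {
		return
	}
	defer os.RemoveAll(base)
	root := filepath.Join(base, "abox")
	inside := filepath.Join(root, "images", "dev")
	outside := filepath.Join(base, "elsewhere")
	if err = os.MkdirAll(inside, 0o755); err != nil {
		return
	}
	if err = os.MkdirAll(outside, 0o755); err != nil {
		return
	}
	link := filepath.Join(root, "sneaky")
	if err = os.Symlink(outside, link); err != nil {
		return
	}
	_, e1 := confine(root, inside, 2)
	_, e2 := confine(root, link, 1)
	_, e3 := confine(root, root, 1)
	return e1 == nil, e2 != nil, e3 != nil, nil
}

func main() {
	fmt.Println(demo())
}
```

Comparing resolved paths rather than raw strings is what stops a symlink inside the storage directory from pointing a root-owned operation somewhere else.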
Audit finding F2: on darwin all egress enforcement (DNS redirect, IPv6 block, default deny) lives in the per-instance pf anchor, applied post-boot once the VM IP is known. If waitForIP timed out, start warn-returned and the VM ran with zero filtering, and nothing ever re-applied the rules. Treat a missing IP as fatal: setupPostBootFirewall now errors, and runStart force-stops the VM before surfacing an actionable error, so a running-but-unfiltered instance can't be left behind. On the already-running path, re-apply the pf rules when the filter marker is missing (recoverFirewall, no-op on linux), so a plain `abox start` repairs a lost anchor instead of only recovering daemons. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
…-helpers Two lifecycle audit findings (F3, F4): Stop issued the REST/ACPI shutdown and then immediately SIGTERMed vfkit, whose own handler hard-stops the guest after ~5s — making the stop command's 60s grace poll dead code and risking unsynced guest writes. Stop now waits up to 55s (ctx-cancellable) for vfkit to exit on its own after a successful REST stop, falling back to SIGTERM/SIGKILL only when the request fails or the window elapses. vmnet-helper teardown keeps its after-vfkit ordering. vmnet-helper never auto-exits when vfkit dies, and Start blindly launched a second helper over the orphan, which hung reading the start JSON and left the instance unstartable with no CLI recovery. Start now detects a running helper while vfkit is down and stops it (force-kill on failure) before launching fresh. Helper-stop failures in Stop/ForceStop are logged at Warn instead of Debug — a leaked root process is worth surfacing. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
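The revised stop ordering (request a graceful shutdown, wait for the process to exit on its own, and only then fall back to signals) can be sketched as below. All names are illustrative; the callbacks stand in for the REST request, the PID liveness check, and the SIGTERM/SIGKILL path, and the window is a parameter so the example can run quickly instead of waiting 55 seconds.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// waitForExit polls isRunning until it reports false, the window elapses,
// or ctx is cancelled. It returns true only if the process exited.
func waitForExit(ctx context.Context, isRunning func() bool, window, interval time.Duration) bool {
	ctx, cancel := context.WithTimeout(ctx, window)
	defer cancel()
	ticker := time.NewTicker(interval)
	defer ticker.Stop()
	for {
		if !isRunning() {
			return true
		}
		select {
		case <-ctx.Done():
			return false
		case <-ticker.C:
		}
	}
}

// stopVM asks for a graceful stop and signals only if the request fails or
// the process is still alive when the window ends.
func stopVM(ctx context.Context, requestStop func() error, isRunning func() bool,
	signalStop func(), window, interval time.Duration) (graceful bool) {
	if err := requestStop(); err == nil {
		if waitForExit(ctx, isRunning, window, interval) {
			return true
		}
	}
	signalStop()
	return false
}

// demo exercises the graceful path and the fallback path with fakes.
func demo() (graceful, fellBack bool) {
	polls := 0
	exitsSoon := func() bool { polls++; return polls < 3 }
	graceful = stopVM(context.Background(), func() error { return nil },
		exitsSoon, func() {}, time.Second, 5*time.Millisecond)

	signaled := false
	stopVM(context.Background(), func() error { return errors.New("REST unreachable") },
		func() bool { return true }, func() { signaled = true },
		time.Second, 5*time.Millisecond)
	return graceful, signaled
}

func main() {
	fmt.Println(demo())
}
```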
…ng privileged On macOS, `abox create` prompted for sudo up front even though the vfkit backend performs no privileged operations at create time (user-owned storage, APFS clone, no-op network create) — blocking non-interactive create and contradicting the documented flow (first prompt at `abox start`). Add an optional backend.CreatePrivilegeProvider capability (mirroring SubnetProvider): backends that don't implement it default to requiring privileges, so Linux/libvirt is unchanged. vfkit reports false; create then skips privilege acquisition entirely, including the failure-cleanup path, which removes the user-owned disk dir directly. Addresses audit finding F7. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
`abox tap` passed inst.Bridge to tcpdump -i, but on darwin that is the abstract resource name (abox-<name>), never a host interface — tcpdump failed with "no such device" while check-deps advertised tap support. Split interface resolution per platform: Linux keeps inst.Bridge; darwin resolves the bridgeN interface that VMManager.Start persisted into BackendConfig["bridge"], with a clear error if the instance predates it. Also platform-gate the escalation check (getcap is Linux-only; darwin probes /dev/bpf0, treating EBUSY as accessible) and the remediation hints (apt/setcap on Linux; sudo//dev/bpf/ChmodBPF on macOS), and note tap's sudo requirement in docs/macos.md. Addresses audit finding F6. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
…tale backend note

Audit findings F8, F19, F20, F21:

- F8: the privilege helper exits per CLI command; repeat sudo prompts are suppressed by sudo's credential cache, not helper persistence (macos.md, privilege-helper.md, troubleshooting.md)
- F19: add vmnet-helper (separate tap, installs off PATH, ABOX_VMNET_HELPER_PATH override) to requirements.md, quickstart.md, and the in-binary darwin quickstart help
- F20: fix the macos.md storage layout table: instance disks live under ~/Library/Application Support/abox/images/instances/<name>/, not ~/.local/share
- F21: CLAUDE.md no longer claims libvirt is the only backend; add the vfkit/vmnethelper packages to the diagram and key-files table

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
…port, liveness check, fatal bridge persist Audit findings F11-F14: - F12: PID files move from $TMPDIR (purged by macOS after ~3 idle days, leaving live VMs unmanageable) to <instance>/run/, with a read-fallback to the legacy location so already-running VMs survive the upgrade - F14: allocate the vfkit REST port fresh at every Start instead of reusing the Create-time allocation (allocate-then-store TOCTOU), and persist the current value for State/Stop - F13: verify vfkit liveness (PID + REST VMState) within a 2s grace window after launch; on instant death, tear down vfkit + vmnet-helper and surface the vfkit.log tail instead of reporting success - F11: a failed config.Save of the bridge interface after launch is now fatal (tear down VM + helper) — previously it warned and returned nil, leaving a running VM with zero pf egress filtering and no self-heal Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
…shot export/import, never shrink disks

Audit findings F9, F10, F15:

- F9: abox import now resolves networking through the shared backend.ResolveNetwork helper (also used by create), so imported macOS instances land in the vfkit host-mode pool instead of getting a 10.10.x.0/24 subnet and hardcoded IP from the Linux pool that trips the start-time determinism guard
- F10: backends whose disk export is always self-contained (vfkit/raw) declare it via the new SelfContainedExporter capability; export rejects --snapshot up front instead of writing a full image with a lying Snapshot:true manifest (which a Linux import would unsafely rebase, silently corrupting unallocated clusters); vfkit import detects a qcow2 backing-file reference via qemu-img info and fails with guidance instead of an opaque convert error
- F15: DiskManager.Create only grows the cloned disk; a --disk smaller than the base image now errors with the required minimum instead of silently truncating the filesystem tail and GPT backup header

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
… root anchor Audit finding F17: PfctlLoadAnchor wrote caller-supplied RulesContent verbatim and ran pfctl -a abox/<name> -f as root, with all rule-content safety client-side in the unprivileged firewall package. The anchor namespaces where rules live, not what they match, so a token-holding caller could load arbitrary pf text (rdr hijacks of host traffic, nested anchor/load anchor/include). The helper now validates every line token-for-token against the exact rule shapes the legitimate generator emits (the server-side equivalent of re-deriving from structured fields, within the existing RPC message): IPv4-only addresses, bridgeN interface names, unprivileged redirect ports, rdr targets pinned to 127.0.0.1:53-redirect form, and a character allowlist that excludes file paths and metacharacters. Acceptance tests feed real generator output through the validator to guard against drift; 28 malicious payloads are rejected. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
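Token-for-token validation against known rule shapes could look roughly like the sketch below. The two shapes are stand-ins: the IPv6 block line follows the form quoted in an earlier commit, while the `rdr` line is a guess at what a DNS redirect might look like, not the generator's actual output. The placeholder names and helper functions are likewise hypothetical.

```go
package main

import (
	"fmt"
	"net/netip"
	"regexp"
	"strconv"
	"strings"
)

var bridgeRE = regexp.MustCompile(`^bridge[0-9]{1,4}$`)

// ruleShapes lists the only line shapes accepted (assumed for illustration).
var ruleShapes = []string{
	"block drop quick on <bridge> inet6 all",
	"rdr pass on <bridge> inet proto udp from <ip4> to any port 53 -> 127.0.0.1 port <port>",
}

// validToken checks one token against a literal or a placeholder.
func validToken(shapeTok, tok string) bool {
	switch shapeTok {
	case "<bridge>":
		return bridgeRE.MatchString(tok)
	case "<ip4>":
		a, err := netip.ParseAddr(tok)
		return err == nil && a.Is4()
	case "<port>":
		if len(tok) > 5 || strings.Trim(tok, "0123456789") != "" {
			return false
		}
		n, err := strconv.Atoi(tok)
		return err == nil && n >= 1024 && n <= 65535 // unprivileged only
	default:
		return shapeTok == tok
	}
}

// matchesShape compares a line to a shape, token by token.
func matchesShape(line, shape string) bool {
	got, want := strings.Fields(line), strings.Fields(shape)
	if len(got) != len(want) {
		return false
	}
	for i := range want {
		if !validToken(want[i], got[i]) {
			return false
		}
	}
	return true
}

// validateRules rejects content containing any line that fits no shape.
func validateRules(content string) error {
	for n, line := range strings.Split(content, "\n") {
		if strings.TrimSpace(line) == "" {
			continue
		}
		ok := false
		for _, shape := range ruleShapes {
			if matchesShape(line, shape) {
				ok = true
				break
			}
		}
		if !ok {
			return fmt.Errorf("line %d: rule does not match any allowed shape", n+1)
		}
	}
	return nil
}

func main() {
	fmt.Println(validateRules("block drop quick on bridge100 inet6 all\n"))
	fmt.Println(validateRules("load anchor \"x\" from \"/etc/evil\"\n"))
}
```

Because every token is either a fixed literal or a narrowly validated value, anything the generator would never emit (includes, nested anchors, other interfaces) fails by construction rather than by blacklist.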
…-only hints Audit finding F16: audit events written via log/syslog succeed at write time on macOS but appear nowhere — not in the unified log, not ASL, not /var/log/system.log — and 'abox logs' pointed users at journalctl, which doesn't exist there. - macOS audit events now go to a rotating file at ~/.local/share/abox/logs/audit.log (logutil.RotateWriter, 10 MB x 3), since CGO is disabled and os_log is unreachable from pure Go; Linux keeps syslog, including the historical routing of the default logger's INFO+ output through it (journalctl -t abox is unchanged) - 'abox logs' help now shows the platform-appropriate retrieval path via logging.AuditLogHint() - doctor's libvirt-specific 'nwfilter' label and 'check nwfilter rules' hints are platform-gated: macOS reports the pf anchor instead - documented the macOS audit log location in troubleshooting.md (The tap command's Linux-only hints named by the audit were already platform-gated by c20a90b.) Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
…; drop dead commander clone Audit findings F18, F22: - F18: delete internal/vfkit/commander.go — a byte-identical clone of internal/libvirt/commander.go with zero references (process.go uses exec.Command directly, restapi.go uses net/http); package doc comment moved to vfkit.go - F22: the highest-risk new areas had zero coverage. Add tests for vfkit process lifecycle (stale-PID handling that never signals an innocent live process, wrong-process do-not-kill, IsRunning, CleanupPIDFile), the REST API via httptest (VMState, stop requests, port allocation), vmnet-helper process management (sudo-vs-direct comm matching, killChild, readLineWithDeadline timeout/EOF), and the deterministic host-subnet allocator (determinism, collision skip, non-overlap, pool bounds, exhaustion fallback) The only seam added is a package-level lookupComm var (defaulting to the real ps-based lookup) in vfkit and vmnethelper, so kill/keep decisions are testable without depending on what happens to be running. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
…ed filter daemons on macOS IsAboxProcess read /proc/<pid>/exe, which does not exist on macOS, so it returned false for every PID on darwin: daemon recovery judged live dns/http daemons dead, deleted their socket/PID files, and wedged on bind-address-in-use, while abox stop's signal fallback silently skipped its SIGTERM. - Split IsAboxProcess into lifecycle_linux.go (existing /proc impl) and lifecycle_darwin.go (ps -o comm= behind a seam; CGO is disabled) and return (bool, error) so callers can distinguish confirmed-not-abox from unverifiable. - checkAlreadyRunning no longer deletes socket/PID files of a running process whose identity cannot be verified; it errors instead. - New darwin-only reclaimOrphanedFilterDaemon finds a live "abox dns/http serve <name>" via pgrep even when the PID file is stale or missing (e.g. after a $TMPDIR purge), re-verifies identity before signaling, then SIGTERM/SIGKILLs it so a fresh daemon can bind. Linux variant is a no-op. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
vfkit answers POST /vm/state with 202 Accepted, but postVMState required exactly 200, so every RequestStop was misread as a failure: the stop path skipped the 55s graceful window and fell straight to SIGTERM, where vfkit's own signal handler force-stops the guest after ~5s — cutting off in-flight shutdown work. Observed on hardware as a 6.5s stop that killed a guest mid-shutdown despite a 30s blocking stop job. Accept any 2xx, and pin the success-path test mocks to 202 so they match real vfkit instead of masking the regression. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
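The fix amounts to treating the whole 2xx range as success. A minimal sketch with an httptest server is below; `postVMState` here is a reconstruction with an assumed signature, and the JSON body shape is assumed from vfkit's documented REST API rather than taken from this branch.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"
	"net/http/httptest"
)

// postVMState sends a state-change request and accepts any 2xx response,
// so a 202 Accepted is not mistaken for a failure.
func postVMState(client *http.Client, baseURL, state string) error {
	body, err := json.Marshal(map[string]string{"state": state})
	if err != nil {
		return err
	}
	resp, err := client.Post(baseURL+"/vm/state", "application/json", bytes.NewReader(body))
	if err != nil {
		return err
	}
	defer resp.Body.Close()
	if resp.StatusCode < 200 || resp.StatusCode > 299 {
		return fmt.Errorf("vfkit: POST /vm/state returned %s", resp.Status)
	}
	return nil
}

// demo checks a 202 mock (matching real vfkit per the commit) and a 500 mock.
func demo() (accepted202, rejected500 bool) {
	serve := func(code int) *httptest.Server {
		return httptest.NewServer(http.HandlerFunc(
			func(w http.ResponseWriter, _ *http.Request) { w.WriteHeader(code) }))
	}
	ok := serve(http.StatusAccepted)
	defer ok.Close()
	bad := serve(http.StatusInternalServerError)
	defer bad.Close()
	return postVMState(ok.Client(), ok.URL, "Stop") == nil,
		postVMState(bad.Client(), bad.URL, "Stop") != nil
}

func main() {
	fmt.Println(demo())
}
```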
Adding as a draft PR to fix later. Just wanted to add ASAP for visibility – this exists so hopefully nobody duplicates work! It's also fully functional and I'm using an abox on Mac as I finish up this branch.
I still need to manually review all the LLM code.