Add global.vsock with full/system-probe modes (system-probe <=> micro VM VSock) #3186

Open
lebauce wants to merge 1 commit into main from lebauce/vsock-for-microvm-only
Conversation

@lebauce lebauce commented Jun 23, 2026

Contributor

What

Introduces a real global.vsock configuration section so that scoping VSock to the system-probe ⇄ micro VM channel is opt-in, while keeping the legacy "everything over VSock" behavior available and backward compatible.

Configuration

global:
  vsock:
    enabled: true        # turns VSock on
    mode: system-probe   # "full" (default) or "system-probe"
  • full (default) — all Agent components communicate over VSock. Reproduces the legacy useVSock behavior: DD_VSOCK_ADDR=host, DD_REMOTE_AGENT_REGISTRY_ENABLED=false, the host auth volume on the node agent, and the CWS runtime-security channel over VSock (DD_RUNTIME_SECURITY_CONFIG_SOCKET=vsock:5020, DD_RUNTIME_SECURITY_CONFIG_EVENT_GRPC_SERVER=security-agent) on all CWS containers.

  • system-probe — VSock is scoped to the host system-probe only. It hosts a remote runtime-security event server over VSock so the system-probe inside a micro VM can forward events to it:

    • system-probe container: DD_RUNTIME_SECURITY_CONFIG_EVENT_GRPC_SERVER=system-probe, DD_RUNTIME_SECURITY_CONFIG_SOCKET=vsock:5020
    • core agent / security agent: keep the regular unix socket (/var/run/sysprobe/runtime-security.sock)

    This matches the agent's own event_grpc_server="system-probe" branch in pkg/security/module/cws.go, documented as the remote event server "for remote system-probes (e.g., in micro VMs via vsock)". event_grpc_server is a process-role selector; the vsock: address belongs on socket.
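
The per-mode environment variables described above can be sketched as a small Go function. This is a minimal illustration under stated assumptions, not the operator's code: `cwsEnvVars`, the container-name strings, and the mode string spellings are hypothetical, and setting `DD_RUNTIME_SECURITY_CONFIG_SOCKET` to the unix socket path for the other containers is one reading of "keep the regular unix socket".

```go
package main

import "fmt"

// Assumed spellings of the two modes named in this PR.
const (
	modeFull        = "full"
	modeSystemProbe = "system-probe"
)

const (
	envSocket    = "DD_RUNTIME_SECURITY_CONFIG_SOCKET"
	envEventGRPC = "DD_RUNTIME_SECURITY_CONFIG_EVENT_GRPC_SERVER"
	vsockAddr    = "vsock:5020"
	unixSocket   = "/var/run/sysprobe/runtime-security.sock"
)

// cwsEnvVars is a hypothetical helper: it returns the runtime-security
// env vars for one CWS container under the given VSock mode.
func cwsEnvVars(mode, container string) map[string]string {
	switch mode {
	case modeFull:
		// Legacy behavior: every CWS container uses the VSock channel.
		return map[string]string{envSocket: vsockAddr, envEventGRPC: "security-agent"}
	case modeSystemProbe:
		if container == "system-probe" {
			// The address goes on socket, the process role on event_grpc_server.
			return map[string]string{envSocket: vsockAddr, envEventGRPC: "system-probe"}
		}
		// Assumption: other containers point at the regular unix socket.
		return map[string]string{envSocket: unixSocket}
	default:
		// Unknown mode: nothing VSock-specific to set.
		return nil
	}
}

func main() {
	for _, c := range []string{"system-probe", "security-agent", "agent"} {
		fmt.Println(c, cwsEnvVars(modeSystemProbe, c))
	}
}
```

Keeping the role and the address in separate variables is what lets the system-probe mode change only the system-probe container while the other containers stay on the unix socket.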

Backward compatibility

  • global.useVSock is kept, marked deprecated, and maps to vsock.enabled with the Full mode.
  • When the global.vsock section is set, useVSock is ignored.
  • Resolution is centralized in GlobalConfig.GetVSockConfig(), consumed by both the global node-agent code and the CWS feature.
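
The precedence rules in this list can be sketched as follows. This is only an illustration of the described behavior: the struct layouts, the pointer-typed fields, the enum string values, and the free function `resolveVSock` (standing in for `GlobalConfig.GetVSockConfig()`) are assumptions, not the code in this PR.

```go
package main

import "fmt"

// VSockMode values are assumed spellings of the two modes.
type VSockMode string

const (
	VSockModeFull        VSockMode = "full"
	VSockModeSystemProbe VSockMode = "system-probe"
)

// VSockConfig and GlobalConfig use the field names listed in the PR;
// their exact types here are assumptions.
type VSockConfig struct {
	Enabled *bool
	Mode    VSockMode
}

type GlobalConfig struct {
	UseVSock *bool // deprecated
	VSock    *VSockConfig
}

func ptr(b bool) *bool { return &b }

// resolveVSock returns whether VSock is enabled and in which mode.
// When the vsock section is set, the deprecated useVSock is ignored.
func resolveVSock(g *GlobalConfig) (bool, VSockMode) {
	if g == nil {
		return false, VSockModeFull
	}
	if g.VSock != nil {
		enabled := g.VSock.Enabled != nil && *g.VSock.Enabled
		mode := g.VSock.Mode
		if mode == "" {
			mode = VSockModeFull // full is the default mode
		}
		return enabled, mode
	}
	// Legacy path: useVSock maps to enabled with the full mode.
	if g.UseVSock != nil && *g.UseVSock {
		return true, VSockModeFull
	}
	return false, VSockModeFull
}

func main() {
	legacy := &GlobalConfig{UseVSock: ptr(true)}
	fmt.Println(resolveVSock(legacy))

	both := &GlobalConfig{
		UseVSock: ptr(true),
		VSock:    &VSockConfig{Enabled: ptr(true), Mode: VSockModeSystemProbe},
	}
	fmt.Println(resolveVSock(both))
}
```

Centralizing the decision in one resolver means the node-agent code and the CWS feature cannot disagree about which setting wins.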

Changes

  • API (datadogagent_types.go): added VSock *VSockConfig to GlobalConfig, the VSockConfig struct (Enabled, Mode), and the VSockMode enum (Full / SystemProbe); deprecated UseVSock.
  • vsock.go: GlobalConfig.GetVSockConfig() resolver with the back-compat precedence.
  • global/agent.go: full-mode VSock env vars / auth volume gated on enabled && mode == Full.
  • cws/feature.go: CWS runtime-security env vars branch on the VSock mode, with the address on socket and the role on event_grpc_server.
  • Regenerated deepcopy, CRDs, and docs.
  • Tests: global test covers deprecated useVSock, vsock.enabled (Full default), and SystemProbe mode; CWS test covers Full (deprecated useVSock) and SystemProbe modes.

🤖 Generated with Claude Code

datadog-datadog-prod-us1-2 Bot commented Jun 23, 2026

⚠️ Warnings

🚦 2 Pipeline jobs failed

pull request linter | Check Milestone (GitHub Actions)

pull request linter | build (GitHub Actions)

ℹ️ Info

🎯 Code Coverage (details)
Patch Coverage: 94.12%
Overall Coverage: 44.97% (+0.32%)

This comment will be updated automatically if new data arrives.
🔗 Commit SHA: 616fc8b

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 77087447f9

 managers.EnvVar().AddEnvVarToContainers(containersForEnvVars, &corev1.EnvVar{
 	Name:  DDRuntimeSecurityConfigEventGRPCServer,
-	Value: "security-agent",
+	Value: "vsock:5020",

P1 Badge Keep event gRPC server as a process selector

When global.useVSock is enabled for CWS, this writes DD_RUNTIME_SECURITY_CONFIG_EVENT_GRPC_SERVER=vsock:5020. That Agent setting selects which Agent process sends runtime-security events and activity dumps (for example security-agent or system-probe); the vsock:5020 address is not a valid process selector. In micro-VM CWS deployments this leaves the host system-probe configured with an invalid event sender, so events forwarded from the guest can fail to be delivered as intended.
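
The distinction the review draws, role on `event_grpc_server` and address on `socket`, can be expressed as a small check. This is a hypothetical sketch: `checkEventGRPCServer` does not exist in the operator or the Agent, and the set of roles contains only the two examples named in the comment, so the real Agent may accept others.

```go
package main

import (
	"fmt"
	"strings"
)

// knownRoles holds only the two selectors mentioned in the review comment.
var knownRoles = map[string]bool{
	"security-agent": true,
	"system-probe":   true,
}

// checkEventGRPCServer is a hypothetical guard: it rejects values that
// look like a socket address instead of a process role.
func checkEventGRPCServer(value string) error {
	if strings.HasPrefix(value, "vsock:") {
		return fmt.Errorf("event_grpc_server %q looks like an address; set it on socket instead", value)
	}
	if !knownRoles[value] {
		return fmt.Errorf("event_grpc_server %q is not a known process role", value)
	}
	return nil
}

func main() {
	for _, v := range []string{"vsock:5020", "system-probe"} {
		fmt.Printf("%s: %v\n", v, checkEventGRPCServer(v))
	}
}
```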

codecov-commenter commented Jun 23, 2026

Codecov Report

❌ Patch coverage is 94.33962% with 3 lines in your changes missing coverage. Please review.
✅ Project coverage is 44.14%. Comparing base (92788bf) to head (616fc8b).
⚠️ Report is 9 commits behind head on main.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| api/datadoghq/v2alpha1/datadogagent_validation.go | 72.72% | 3 Missing ⚠️ |

Additional details and impacted files

@@            Coverage Diff             @@
##             main    #3186      +/-   ##
==========================================
+ Coverage   44.03%   44.14%   +0.10%     
==========================================
  Files         377      378       +1     
  Lines       30713    30758      +45     
==========================================
+ Hits        13525    13578      +53     
+ Misses      16300    16294       -6     
+ Partials      888      886       -2     
| Flag | Coverage Δ |
|---|---|
| unittests | 44.14% <94.33%> (+0.10%) ⬆️ |

Flags with carried forward coverage won't be shown.

| Files with missing lines | Coverage Δ |
|---|---|
| api/datadoghq/v2alpha1/datadogagent_types.go | 100.00% <ø> (+100.00%) ⬆️ |
| api/datadoghq/v2alpha1/vsock.go | 100.00% <100.00%> (ø) |
| ...nal/controller/datadogagent/feature/cws/feature.go | 81.53% <100.00%> (+5.90%) ⬆️ |
| internal/controller/datadogagent/global/agent.go | 85.56% <100.00%> (ø) |
| api/datadoghq/v2alpha1/datadogagent_validation.go | 60.00% <72.72%> (+60.00%) ⬆️ |

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 92788bf...616fc8b.

@lebauce lebauce requested a review from a team as a code owner June 24, 2026 12:31
@lebauce lebauce changed the title Scope useVSock to CWS system-probe <=> micro VM communication only Add global.vsock with Full/SystemProbe modes (system-probe <=> micro VM VSock) Jun 24, 2026
@lebauce lebauce force-pushed the lebauce/vsock-for-microvm-only branch 2 times, most recently from 8c93eb7 to 6f416b8 Compare June 24, 2026 13:37
…VM VSock)

Introduces a real global.vsock configuration section so that scoping VSock to
the system-probe <=> micro VM channel is opt-in, while keeping the legacy
"everything over VSock" behavior available and backward compatible.

Configuration:

  global:
    vsock:
      enabled: true        # turns VSock on
      mode: system-probe   # "full" (default) or "system-probe"

- Full (default): all Agent components communicate over VSock. Reproduces the
  legacy useVSock behavior (DD_VSOCK_ADDR=host, remote agent registry disabled,
  host auth volume on the node agent, and the CWS runtime-security channel over
  VSock with SOCKET=vsock:5020 and EVENT_GRPC_SERVER=security-agent on all CWS
  containers).
- SystemProbe: VSock is scoped to the host system-probe only. It hosts a remote
  runtime-security event server over VSock so the system-probe inside a micro VM
  can forward events to it (system-probe container: EVENT_GRPC_SERVER=system-probe,
  SOCKET=vsock:5020); the core and security agents keep the regular unix socket.
  This matches the agent's event_grpc_server="system-probe" branch, documented as
  the remote event server for remote system-probes in micro VMs via vsock. The
  event_grpc_server config is a process-role selector; the vsock address belongs
  on the socket config.

global.useVSock is kept, deprecated, and maps to vsock.enabled with the Full
mode. When the vsock section is set, useVSock is ignored. Resolution is
centralized in GlobalConfig.GetVSockConfig(), consumed by both the global
node-agent code and the CWS feature.

SystemProbe mode requires features.cws.directSendFromSystemProbe when CWS is
enabled, since the host system-probe no longer exposes the unix socket the
security-agent connects to; this is enforced by ValidateDatadogAgent.

Regenerates deepcopy, CRDs, and docs.
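
The validation rule stated in the commit message above can be sketched like this. It is a simplified stand-in, not `ValidateDatadogAgent` itself: the flat `spec` struct, the function name `validateVSock`, the mode string, and the error text are all hypothetical.

```go
package main

import (
	"errors"
	"fmt"
)

// spec flattens the settings involved in the rule; the real CRD nests them
// under global.vsock and features.cws.
type spec struct {
	VSockEnabled              bool
	VSockMode                 string // assumed spelling: "system-probe"
	CWSEnabled                bool
	DirectSendFromSystemProbe bool
}

// validateVSock rejects system-probe mode with CWS enabled unless
// directSendFromSystemProbe is set, because the host system-probe no longer
// exposes the unix socket the security-agent connects to.
func validateVSock(s spec) error {
	if !s.VSockEnabled || s.VSockMode != "system-probe" {
		return nil
	}
	if s.CWSEnabled && !s.DirectSendFromSystemProbe {
		return errors.New("global.vsock system-probe mode requires features.cws.directSendFromSystemProbe when CWS is enabled")
	}
	return nil
}

func main() {
	bad := spec{VSockEnabled: true, VSockMode: "system-probe", CWSEnabled: true}
	fmt.Println(validateVSock(bad))

	good := bad
	good.DirectSendFromSystemProbe = true
	fmt.Println(validateVSock(good))
}
```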
@lebauce lebauce force-pushed the lebauce/vsock-for-microvm-only branch from 6f416b8 to 616fc8b Compare June 24, 2026 13:39
@lebauce lebauce changed the title Add global.vsock with Full/SystemProbe modes (system-probe <=> micro VM VSock) Add global.vsock with full/system-probe modes (system-probe <=> micro VM VSock) Jun 24, 2026
@lebauce lebauce added the enhancement and qa/skip-qa labels Jun 24, 2026
3 participants