[CONTP-1448] Add Windows node support#3154
Open
zhuminyi wants to merge 1 commit into
When spec.override.windowsNodeAgent is present in the DatadogAgent CR, the
operator creates a second DaemonSet (datadog-agent-windows) targeting Windows
nodes alongside the existing Linux one. Each DaemonSet targets only its own OS
via nodeSelector + the node.kubernetes.io/os=windows:NoSchedule toleration, so
in a mixed cluster every node gets exactly one agent pod of the right type. The
Linux DaemonSet is unchanged; the feature is opt-in (absent key = no-op).
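The targeting rule above can be sketched in a few lines of Go. This is a minimal illustration, not the operator's code: `Toleration`, `Taint`, `PodScheduling`, `windowsScheduling`, and `schedulable` are hypothetical local stand-ins for the Kubernetes types and scheduler checks, kept dependency-free so the example compiles on its own.

```go
package main

// Toleration, Taint and PodScheduling are simplified stand-ins for the
// Kubernetes corev1 types (assumption: only the fields needed here).
type Toleration struct {
	Key, Operator, Value, Effect string
}

type Taint struct {
	Key, Value, Effect string
}

type PodScheduling struct {
	NodeSelector map[string]string
	Tolerations  []Toleration
}

// windowsScheduling returns the selector and toleration named in the text.
func windowsScheduling() PodScheduling {
	return PodScheduling{
		NodeSelector: map[string]string{"kubernetes.io/os": "windows"},
		Tolerations: []Toleration{{
			Key:      "node.kubernetes.io/os",
			Operator: "Equal",
			Value:    "windows",
			Effect:   "NoSchedule",
		}},
	}
}

// tolerates reports whether any "Equal" toleration matches the taint exactly.
func tolerates(ts []Toleration, t Taint) bool {
	for _, tol := range ts {
		if tol.Operator == "Equal" && tol.Key == t.Key &&
			tol.Value == t.Value && tol.Effect == t.Effect {
			return true
		}
	}
	return false
}

// schedulable is a simplified model of the two checks that matter here:
// every nodeSelector entry must match a node label, and every NoSchedule
// taint on the node must be tolerated.
func schedulable(p PodScheduling, labels map[string]string, taints []Taint) bool {
	for k, v := range p.NodeSelector {
		if labels[k] != v {
			return false
		}
	}
	for _, t := range taints {
		if t.Effect == "NoSchedule" && !tolerates(p.Tolerations, t) {
			return false
		}
	}
	return true
}
```

Under this model a pod built from `windowsScheduling()` lands only on nodes labeled `kubernetes.io/os=windows`, and a pod without the toleration is rejected by a Windows node carrying the taint, which is why a mixed cluster ends up with one agent pod of the right type per node.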
API / image:
- api: add WindowsNodeAgentComponentName = "windowsNodeAgent"
- api: add AgentWindows *DaemonSetStatus to DDAI + DDA status, with an
agent-windows printer column (regenerated CRDs, deepcopy, openapi)
- images: add GetLatestWindowsAgentImage() -> agent:X.Y.Z-servercore
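The `-servercore` tag convention can be illustrated with a small helper. This is a sketch under assumptions: `ensureServercoreTag` is a hypothetical name, not the PR's function, and the handling of untagged or digest-pinned images is a guess at reasonable behavior rather than what the operator does.

```go
package main

import "strings"

const servercoreSuffix = "-servercore"

// ensureServercoreTag appends "-servercore" to an image tag when it is missing.
// Assumption: images without an explicit tag, or pinned by digest, are
// returned unchanged because there is no tag to rewrite.
func ensureServercoreTag(image string) string {
	if strings.Contains(image, "@") {
		return image // digest-pinned reference
	}
	lastSlash := strings.LastIndex(image, "/")
	colon := strings.LastIndex(image, ":")
	if colon <= lastSlash {
		// No tag: any colon belongs to a registry host:port prefix.
		return image
	}
	if strings.HasSuffix(image, servercoreSuffix) {
		return image // already a servercore tag
	}
	return image + servercoreSuffix
}
```

With an illustrative version number, `ensureServercoreTag("gcr.io/datadoghq/agent:7.50.0")` yields `gcr.io/datadoghq/agent:7.50.0-servercore`, and calling it again on the result is a no-op.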
Windows DaemonSet builder (component/agent/windows.go):
- nodeSelector kubernetes.io/os=windows + Windows taint toleration
- servercore image; core agent + trace agent (+ process agent) only
- PowerShell init container creates an empty datadog.yaml + auth/ dir in a
shared emptyDir mounted at C:/ProgramData/Datadog by all containers, so the
IPC auth token written by the core agent is visible to the trace agent
- no Linux securityContext; DD_AUTH_TOKEN_FILE_PATH overridden to a Windows path
- StripLinuxOnlySettings (allowlist): keeps only core/trace/process containers
and the Windows init container; drops any volume mount with a Linux ("/") path,
unreferenced volumes, Unix-socket env vars, Linux securityContext fields,
hostPID/hostIPC, and AppArmor annotations for removed containers
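The allowlist approach described for StripLinuxOnlySettings can be sketched as follows. The types are simplified stand-ins for the Kubernetes pod spec, `stripLinuxOnly` is a hypothetical name, and the container names in `allowedContainers` are assumptions for illustration; securityContext and annotation handling are omitted.

```go
package main

import "strings"

type EnvVar struct{ Name, Value string }

type VolumeMount struct{ Name, MountPath string }

type Container struct {
	Name         string
	Env          []EnvVar
	VolumeMounts []VolumeMount
}

type Pod struct {
	Containers []Container
	Volumes    []string // volume names only, for brevity
	HostPID    bool
	HostIPC    bool
}

// allowedContainers is the allowlist; the names are assumed, not from the PR.
var allowedContainers = map[string]bool{
	"agent": true, "trace-agent": true, "process-agent": true,
}

// isLinuxOnlyEnv flags env vars that point at Unix sockets or Linux hosts.
func isLinuxOnlyEnv(name string) bool {
	return strings.Contains(name, "SOCKET") ||
		name == "DOCKER_HOST" || name == "DD_VSOCK_ADDR"
}

// stripLinuxOnly builds a new pod containing only allowlisted containers,
// without "/"-prefixed mounts, socket env vars, or unreferenced volumes.
// HostPID and HostIPC are left at their zero value (false).
func stripLinuxOnly(p Pod) Pod {
	out := Pod{}
	used := map[string]bool{}
	for _, c := range p.Containers {
		if !allowedContainers[c.Name] {
			continue
		}
		kept := Container{Name: c.Name}
		for _, e := range c.Env {
			if !isLinuxOnlyEnv(e.Name) {
				kept.Env = append(kept.Env, e)
			}
		}
		for _, m := range c.VolumeMounts {
			if strings.HasPrefix(m.MountPath, "/") {
				continue // Linux path
			}
			kept.VolumeMounts = append(kept.VolumeMounts, m)
			used[m.Name] = true
		}
		out.Containers = append(out.Containers, kept)
	}
	for _, v := range p.Volumes {
		if used[v] {
			out.Volumes = append(out.Volumes, v)
		}
	}
	return out
}
```

Building a fresh pod from what is explicitly permitted, rather than deleting known-bad fields, means that a newly added Linux-only feature is dropped by default instead of leaking into the Windows DaemonSet.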
Reconciler (controller_reconcile_windows_agent.go):
- OS-aware feature gating: only Windows-supported features run ManageNodeAgent
- EnsureWindowsIntakeReachable forces APM/DogStatsD non-local traffic AFTER the
feature loop so it isn't clobbered (Windows has no Unix socket)
- guards: FIPS and EDS are unsupported and surface a WindowsAgentReconcile
condition; Linux-disabled still reconciles Windows
- ensureWindowsDaemonSetAbsent cleans up the Windows DS (owner-scoped by
component + part-of labels) on opt-out / Disabled / FIPS / EDS
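The guard and cleanup rules above amount to a small decision table, sketched below. All names here (`WindowsInputs`, `Decision`, `decide`, `ownedWindowsDaemonSet`) are hypothetical, the condition messages are illustrative, and the label keys and the `agent-windows` value used for the ownership check are assumptions; the source only says cleanup is scoped by component and part-of labels.

```go
package main

type WindowsAction int

const (
	ActionReconcile    WindowsAction = iota // create or update the Windows DaemonSet
	ActionEnsureAbsent                      // delete it if it exists
)

// WindowsInputs deliberately has no "Linux agent disabled" field: per the
// text, disabling the Linux agent does not stop Windows reconciliation.
type WindowsInputs struct {
	OptedIn  bool // spec.override.windowsNodeAgent is present
	Disabled bool // the Windows component is explicitly disabled
	FIPS     bool
	EDS      bool
}

type Decision struct {
	Action    WindowsAction
	Condition string // non-empty when an unsupported setup should be surfaced
}

// decide maps the inputs to an action, checking guards in a fixed order.
func decide(in WindowsInputs) Decision {
	switch {
	case !in.OptedIn:
		return Decision{Action: ActionEnsureAbsent}
	case in.FIPS:
		return Decision{ActionEnsureAbsent, "FIPS is not supported for the Windows agent"}
	case in.EDS:
		return Decision{ActionEnsureAbsent, "EDS is not supported for the Windows agent"}
	case in.Disabled:
		return Decision{Action: ActionEnsureAbsent}
	default:
		return Decision{Action: ActionReconcile}
	}
}

// ownedWindowsDaemonSet sketches owner-scoped cleanup: only a DaemonSet whose
// labels match both the component and the owning DatadogAgent is deleted.
// The label keys are assumed for illustration.
func ownedWindowsDaemonSet(labels map[string]string, partOf string) bool {
	return labels["app.kubernetes.io/component"] == "agent-windows" &&
		labels["app.kubernetes.io/part-of"] == partOf
}
```

Because every non-reconcile branch resolves to the same "ensure absent" action, the cleanup has to be idempotent: on a Linux-only cluster it runs on each reconcile and should find nothing to delete.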
Tests: builder, strip (allowlist + socket/hostPID/AppArmor), image, and
reconciler (no-op, FIPS, EDS, disable, owner-scoped cleanup, status routing).
Example manifest: examples/datadogagent/datadog-agent-with-windows-nodes.yaml
Validated on GKE (Windows Server 2019, WINDOWS_LTSC_CONTAINERD): core + trace
agent Running, status surfaced, no Linux artifacts leak even with NPM/CSPM/SBOM
enabled.
Known limitation: the component=agent local APM/DogStatsD services do not route
to Windows pods (labeled agent-windows); Windows workload->agent traffic needs
hostPort until a Windows-specific local service is added. LogCollection is not
yet supported (needs Windows host log-path mounts).
joepeeples
approved these changes
Jun 23, 2026
What does this PR do?
Adds opt-in Windows node Agent support via spec.override.windowsNodeAgent. When configured, the Operator creates a Windows-targeted Agent DaemonSet alongside the existing Linux Agent DaemonSet.
Motivation
Enable Datadog Agent deployment on mixed Linux/Windows Kubernetes clusters managed by the Datadog Operator.
Key Changes:
internal/controller/datadogagent/component/agent/windows.go
New Windows Agent DaemonSet builder and Windows-specific pod sanitization. Handles Windows node targeting, init config, safe container allowlisting, Linux-only field stripping, -servercore image handling, non-local APM/DogStatsD traffic, and Windows log collection mounts.
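The "init config" step can be pictured as an init container whose command is a short PowerShell script. The sketch below only builds that command; `windowsInitCommand` is a hypothetical helper, and the exact script the PR uses is not shown in the source, so the `New-Item` invocations are an assumption about one way to create the empty config file and auth directory.

```go
package main

import "fmt"

// windowsConfigDir is the shared directory mounted by all containers.
const windowsConfigDir = "C:/ProgramData/Datadog"

// windowsInitCommand returns a container command that creates the auth
// directory and an empty datadog.yaml under configDir.
// Assumption: the image provides Windows PowerShell as "powershell".
func windowsInitCommand(configDir string) []string {
	script := fmt.Sprintf(
		"New-Item -ItemType Directory -Force -Path '%s/auth' | Out-Null; "+
			"New-Item -ItemType File -Force -Path '%s/datadog.yaml' | Out-Null",
		configDir, configDir)
	return []string{"powershell", "-Command", script}
}
```

Because the directory is backed by a volume shared across containers, anything the init container or the core agent writes there is also visible to the trace agent.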
internal/controller/datadogagentinternal/controller_reconcile_windows_agent.go
New Windows Agent reconciler. Handles opt-in behavior, unsupported FIPS/EDS cleanup, disabled cleanup, Windows container selection, feature filtering, pod sanitization, image normalization, intake reachability, log mounts, and DaemonSet create/update.
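Ordering matters for the intake reachability step: it has to run after the features have contributed their settings, or a feature could overwrite it. The sketch below models that with plain env var slices. `Feature`, `setEnv`, and `buildWindowsEnv` are hypothetical names; `DD_APM_NON_LOCAL_TRAFFIC` and `DD_DOGSTATSD_NON_LOCAL_TRAFFIC` are the Agent's standard non-local-traffic settings, assumed here to be what the reconciler forces.

```go
package main

type EnvVar struct{ Name, Value string }

// Feature models one feature's contribution to the container environment.
type Feature func(env []EnvVar) []EnvVar

// setEnv overwrites the named variable if present, otherwise appends it.
func setEnv(env []EnvVar, name, value string) []EnvVar {
	for i := range env {
		if env[i].Name == name {
			env[i].Value = value
			return env
		}
	}
	return append(env, EnvVar{Name: name, Value: value})
}

// buildWindowsEnv runs every feature first, then forces non-local traffic
// last so that no feature can clobber it.
func buildWindowsEnv(features []Feature) []EnvVar {
	var env []EnvVar
	for _, f := range features {
		env = f(env)
	}
	env = setEnv(env, "DD_APM_NON_LOCAL_TRAFFIC", "true")
	env = setEnv(env, "DD_DOGSTATSD_NON_LOCAL_TRAFFIC", "true")
	return env
}
```

If the two `setEnv` calls were moved above the loop, a feature that sets `DD_APM_NON_LOCAL_TRAFFIC=false` would win, and Windows workloads, which have no Unix socket to fall back on, could not reach the agent.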
internal/controller/datadogagentinternal/controller_reconcile_agent.go
Chains Windows reconciliation into the existing node Agent flow so Linux and Windows DaemonSets stay in sync across normal, disabled, and EDS paths.
reconcileV2WindowsAgent is called on every DDAI reconcile, so for a Linux-only DDA, ensureWindowsDaemonSetAbsent is called on each reconcile.
API/status generated files
Adds status.agentWindows and the agent-windows printer column for both DatadogAgent and DatadogAgentInternal. Also fixes status handling so AgentWindows is preserved and compared correctly.
examples/datadogagent/datadog-agent-with-windows-nodes.yaml
Adds an example Windows configuration and documents current limitations, especially that local APM/DogStatsD services do not yet route to Windows pods.
Gaps addressed
- Gap: […] tolerations: []. Every Kubernetes distribution (GKE, EKS, AKS, kubeadm) automatically applies node.kubernetes.io/os=windows:NoSchedule to Windows nodes via kubelet, so no Windows pod is ever scheduled.
  Fix: NewDefaultWindowsAgentPodTemplateSpec adds the node.kubernetes.io/os=windows:NoSchedule toleration + nodeSelector os=windows (windows.go). reconcileV2WindowsAgent builds a dedicated datadog-agent-windows DaemonSet, chained after the Linux DS on every reconcile path (controller_reconcile_windows_agent.go, controller_reconcile_agent.go).
- Gap: […] agent:X.Y.Z, a Linux binary. Windows requires a separate image (agent:X.Y.Z-servercore) compiled for Windows APIs. No Windows image reference exists anywhere in the codebase.
  Fix: GetLatestWindowsAgentImage + EnsureWindowsServercoreImage coerce the default agent image to the -servercore tag (pkg/images/images.go, windows.go).
- Gap: securityContext: […] SYS_ADMIN, NET_RAW, etc.), seccomp profiles, and readOnlyRootFilesystem are all rejected by the Windows container runtime.
  Fix: StripLinuxOnlySettings (allowlist) clears capabilities, seccomp, SELinux, AppArmor, readOnlyRootFilesystem, hostPID/hostIPC, and uses a nil podSecurityContext (windows.go).
- Gap: […] /proc, /sys/fs/cgroup, /etc/passwd, and Unix sockets at /var/run do not exist on Windows. The pod would fail to start.
  Fix: the StripLinuxOnlySettings allowlist drops every mount/volume with a /-prefixed path and every *SOCKET* / DOCKER_HOST / DD_VSOCK_ADDR env var; AddWindowsLogCollectionVolumes adds the Windows C:/ log hostPaths (windows.go).
- Gap: system-probe/security-agent always injected: […] liveProcessCollection or CSPM are enabled.
  Fix: windowsContainersFromFeatures + the windowsSupportedFeatures allowlist keep only core/trace/process-agent; the strip allowlist removes any other container (controller_reconcile_windows_agent.go, windows.go).
- Gap: […] bash and Linux paths. Windows has no bash. The trace-agent also requires a config file at C:\ProgramData\Datadog\datadog.yaml, which the Windows image does not include.
  Fix: the init-config-windows init container creates C:\ProgramData\Datadog\datadog.yaml + auth dir; all Linux init containers are stripped (windowsInitContainers in windows.go).
- Gap: […] DatadogAgent CRD to opt in to Windows or configure a Windows-specific image, resources, or tolerations.
  Fix: spec.override.windowsNodeAgent component (WindowsNodeAgentComponentName) with override fields (image, resources, etc.) (api/datadoghq/v2alpha1/datadogagent_types.go).
Additional Notes
Windows support is currently limited to the core Agent, trace Agent, and process Agent. Linux-only containers, mounts, security settings, and socket/env configuration are stripped from the Windows DaemonSet.
Current limitations:
- […] global.useFIPSAgent.
Describe your test plan
Enable Linux-only features, then inspect the Windows DaemonSet — the allowlist guarantees it stays clean:
Checklist
- […] bug, enhancement, refactoring, documentation, tooling, and/or dependencies
- […] qa/skip-qa label