
[CASCL-1386] Add evict-legacy-nodes subcommand to drain non-Datadog node groups #3026

Open
L3n41c wants to merge 29 commits into main from lenaic/CASCL-1386-evict-legacy-nodes

Conversation

@L3n41c

@L3n41c L3n41c commented May 18, 2026

Member

Summary

Adds kubectl datadog autoscaling cluster evict-legacy-nodes, which migrates workloads off non-Datadog-managed node groups (cluster-autoscaler ASGs, EKS managed node groups, user-created Karpenter NodePools, standalone EC2 instances) onto the Datadog-managed Karpenter NodePools created by install, then scales the legacy groups to zero.

Jira: CASCL-1386
Builds on: CASCL-1304

Highlights

  • Re-classifies the cluster via clusterinfo.Classify and updates the dd-cluster-info ConfigMap before any destructive work.
  • Scales the legacy cluster-autoscaler Deployment to 0 replicas as step 1 (opt-out via --skip-cluster-autoscaler).
  • Creates temporary maxUnavailable: 1 PodDisruptionBudgets for workloads without one. Cleanup is label-based (app.kubernetes.io/managed-by=kubectl-datadog + autoscaling.datadoghq.com/temporary-pdb=true), so a SIGKILL'd run is reaped by the next invocation.
  • Evicts in parallel by manager type, with errors aggregated rather than aborting other types:
    • ASG → cordon + evict + UpdateAutoScalingGroup(0,0,0) (only when all nodes drained — avoids AZ-rebalance and MinSize > DesiredCapacity hazards).
    • EKS managed node group → UpdateNodegroupConfig(0,0,0), then waits for the K8s nodes carrying eks.amazonaws.com/nodegroup=<name> to disappear so the EKS-side drain finishes before temp PDBs are removed (sentinel error `errEKSDrainIncomplete`).
    • Karpenter user NodePool → cordon + evict (NodePool spec left alone, by design).
    • Standalone EC2 → cordon + evict + TerminateInstances.
  • Pre-flight refuses when no Datadog-managed NodePool exists; warns on user NodePool weight conflicts that could cause evicted pods to land back on user-managed nodes.
  • Every K8s read-modify-write is wrapped in retry.RetryOnConflict.
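The label-based cleanup of temporary PDBs can be illustrated with a minimal sketch. The two label keys and values come from the description above; the function signature and the use of a plain label map (instead of a real `PodDisruptionBudget` object) are assumptions made to keep the example self-contained.

```go
package main

import "fmt"

// Label keys and values quoted from the PR description.
const (
	labelManagedBy    = "app.kubernetes.io/managed-by"
	labelTemporaryPDB = "autoscaling.datadoghq.com/temporary-pdb"
)

// isTemporaryPDB is a hypothetical helper: it reports whether a PDB's labels
// mark it as a temporary PDB created by the plugin. Because the decision
// relies only on labels, a later run can find and delete PDBs left behind
// by a run that was killed before its own cleanup.
func isTemporaryPDB(labels map[string]string) bool {
	return labels[labelManagedBy] == "kubectl-datadog" &&
		labels[labelTemporaryPDB] == "true"
}

func main() {
	leftover := map[string]string{
		labelManagedBy:    "kubectl-datadog",
		labelTemporaryPDB: "true",
	}
	userOwned := map[string]string{labelManagedBy: "Helm"}

	fmt.Println(isTemporaryPDB(leftover))  // true
	fmt.Println(isTemporaryPDB(userOwned)) // false
}
```

Requiring both labels keeps user-owned PDBs out of the cleanup even if they happen to share one of the two.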

Out of scope (by design)

  • Modifying user-managed Karpenter NodePool specs (only their existing nodes are drained).
  • Deleting ASGs, EKS managed node groups, or NodePools — the command scales to 0; ownership and deletion belong to whoever provisioned them (Terraform, Helm, eksctl, etc.).
  • Migrating Fargate-hosted pods (Fargate has its own lifecycle).

Test plan

  • Unit tests with -race (go test -race ./cmd/kubectl-datadog/autoscaling/cluster/evict/...).
  • make lint — 0 issues.
  • make kubectl-datadog builds.
  • Pre-commit review passed: Codex (6 rounds), code-reviewer agent (2 rounds), /simplify (2 rounds), cross-validation (1 cycle).
  • Smoke test on a real EKS cluster with a CA + ASG + EKS MNG + user Karpenter NodePool + workloads.

🤖 Generated with Claude Code

… node groups

Introduces `kubectl datadog autoscaling cluster evict-legacy-nodes`, which
drains workloads from cluster-autoscaler ASGs, EKS managed node groups,
user-created Karpenter NodePools and standalone EC2 instances onto the
Datadog-managed Karpenter NodePools created by `install`, then scales the
legacy groups to zero.

Highlights:
- Re-classifies the cluster (reuses `clusterinfo.Classify`) and updates the
  `dd-cluster-info` ConfigMap before any destructive work.
- Scales the legacy cluster-autoscaler Deployment to 0 replicas (opt-out
  via `--skip-cluster-autoscaler`).
- Creates temporary `maxUnavailable: 1` PodDisruptionBudgets for workloads
  without one; cleanup is label-based, so a SIGKILL'd run is reaped on the
  next invocation.
- Evicts in parallel by manager type:
    * ASG: cordon + evict + UpdateAutoScalingGroup(0,0,0) (only when all
      nodes drained; avoids AZ-rebalance and MinSize/DesiredCapacity
      hazards).
    * EKS managed node group: UpdateNodegroupConfig(0,0,0), then waits
      for the K8s nodes to disappear so the EKS-side drain finishes
      before temp PDBs are removed.
    * Karpenter user NodePool: cordon + evict (NodePool spec untouched).
    * Standalone EC2: cordon + evict + TerminateInstances.
- Pre-flight refuses when no Datadog-managed NodePool exists; warns on
  user NodePool weight conflicts.
- Every K8s read-modify-write wrapped in retry.RetryOnConflict.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@codecov-commenter

codecov-commenter commented May 18, 2026


Codecov Report

❌ Patch coverage is 64.27673% with 284 lines in your changes missing coverage. Please review.
✅ Project coverage is 45.11%. Comparing base (20ecb9e) to head (89442cb).
⚠️ Report is 70 commits behind head on main.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| ...d/kubectl-datadog/autoscaling/cluster/evict/run.go | 15.78% | 96 Missing ⚠️ |
| ...kubectl-datadog/autoscaling/cluster/evict/evict.go | 0.00% | 80 Missing ⚠️ |
| ...d/kubectl-datadog/autoscaling/cluster/evict/pdb.go | 69.00% | 45 Missing and 8 partials ⚠️ |
| ...datadog/autoscaling/cluster/uninstall/uninstall.go | 0.00% | 19 Missing ⚠️ |
| ...tl-datadog/autoscaling/cluster/evict/evict_pods.go | 78.94% | 11 Missing and 5 partials ⚠️ |
| ...ctl-datadog/autoscaling/cluster/evict/preflight.go | 84.84% | 4 Missing and 1 partial ⚠️ |
| .../kubectl-datadog/autoscaling/cluster/evict/plan.go | 95.00% | 2 Missing and 2 partials ⚠️ |
| .../autoscaling/cluster/common/karpenter/fromnodes.go | 0.00% | 3 Missing ⚠️ |
| ...dog/autoscaling/cluster/evict/clusterautoscaler.go | 88.23% | 1 Missing and 1 partial ⚠️ |
| ...bectl-datadog/autoscaling/cluster/evict/eks_mng.go | 95.12% | 2 Missing ⚠️ |

... and 3 more
Additional details and impacted files


@@            Coverage Diff             @@
##             main    #3026      +/-   ##
==========================================
+ Coverage   41.50%   45.11%   +3.60%     
==========================================
  Files         335      389      +54     
  Lines       28714    34165    +5451     
==========================================
+ Hits        11919    15415    +3496     
- Misses      16001    17769    +1768     
- Partials      794      981     +187     
| Flag | Coverage Δ |
|---|---|
| unittests | 45.11% <64.27%> (+3.60%) ⬆️ |


| Files with missing lines | Coverage Δ |
|---|---|
| ...ctl-datadog/autoscaling/cluster/common/aws/node.go | 100.00% <100.00%> (ø) |
| ...autoscaling/cluster/common/clusterinfo/classify.go | 88.46% <100.00%> (-0.07%) ⬇️ |
| ...d/kubectl-datadog/autoscaling/cluster/evict/asg.go | 100.00% <100.00%> (ø) |
| ...ubectl-datadog/autoscaling/cluster/evict/prompt.go | 100.00% <100.00%> (ø) |
| ...tl-datadog/autoscaling/cluster/evict/standalone.go | 100.00% <100.00%> (ø) |
| cmd/kubectl-datadog/autoscaling/cluster/cluster.go | 0.00% <0.00%> (ø) |
| ...ubectl-datadog/autoscaling/cluster/evict/cordon.go | 96.66% <96.66%> (ø) |
| ...dog/autoscaling/cluster/evict/clusterautoscaler.go | 88.23% <88.23%> (ø) |
| ...bectl-datadog/autoscaling/cluster/evict/eks_mng.go | 95.12% <95.12%> (ø) |
| ...atadog/autoscaling/cluster/evict/karpenter_user.go | 75.00% <75.00%> (ø) |

... and 8 more

... and 107 files with indirect coverage changes


Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data


@datadog-prod-us1-3

datadog-prod-us1-3 Bot commented May 18, 2026


Code Coverage


🛑 Gate Violations

🎯 1 Code Coverage issue detected

A Patch coverage percentage gate may be blocking this PR.

Patch coverage: 63.26% (threshold: 80.00%)

ℹ️ Info

🎯 Code Coverage (details)
Patch Coverage: 63.26%
Overall Coverage: 45.24% (+3.42%)

🔗 Commit SHA: 89442cb

Trailing alignment / extra blank line cleaned up by `gofmt -s` and
`golangci-lint --fix` on three of the evict-package test files. No
behavior change; restores `git diff --exit-code` after `make fmt` for
the `check_formatting` CI gate.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@L3n41c L3n41c added the enhancement New feature or request label May 18, 2026
@L3n41c L3n41c added this to the v1.27.0 milestone May 18, 2026
Adds three small test files to lift `evict/` package coverage from 55%
to 66% so the patch-coverage gate passes:

- `preflight_test.go`: covers `warnKarpenterWeightConflicts` for no
  Karpenter targets, no conflict, equal-weight conflict, nil-weight
  defaulting to 0, and an unknown user-NodePool name.
- `prompt_test.go`: covers `printPlan` rendering of each section
  (CA, PDB, per-manager evictions, dry-run skips) and
  `promptConfirmation`'s y/N/yes parsing.
- `pdb_test.go`: covers `uniqueNodes` excluding EKS MNG entries and
  deduplicating shared node names.
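As an illustration of the y/N/yes parsing that `promptConfirmation` is tested for, here is a minimal sketch. The function name `parseConfirmation`, its signature, and the choice to treat anything other than "y" or "yes" (including an empty answer) as a refusal are assumptions; the actual implementation is not shown in this conversation.

```go
package main

import (
	"fmt"
	"strings"
)

// parseConfirmation is a hypothetical helper: it accepts "y" or "yes" in any
// letter case, ignoring surrounding whitespace, and treats every other
// answer, including an empty one, as "no". This matches the usual meaning
// of a [y/N] prompt where No is the default.
func parseConfirmation(answer string) bool {
	switch strings.ToLower(strings.TrimSpace(answer)) {
	case "y", "yes":
		return true
	default:
		return false
	}
}

func main() {
	for _, answer := range []string{"y", "YES\n", "", "n", "maybe"} {
		fmt.Printf("%q -> %v\n", answer, parseConfirmation(answer))
	}
}
```

Defaulting to refusal is the conservative choice for a command that drains nodes and scales groups to zero.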

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@L3n41c

L3n41c commented May 18, 2026

Member Author

@codex review

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 7ee9656a7e

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread cmd/kubectl-datadog/autoscaling/cluster/evict/evict_pods.go Outdated
Comment thread cmd/kubectl-datadog/autoscaling/cluster/evict/pdb.go Outdated
L3n41c and others added 14 commits May 18, 2026 22:14
P1: `shouldSkipPod` collapsed two distinct concerns — "do not call Evict()
on this pod" (drainNode) and "this pod no longer occupies the node"
(waitForNodeEmpty). A pod with DeletionTimestamp set was treated as
absent, so for ASG/standalone targets the orchestrator could terminate
the EC2 instance before the container finished its grace period.
Splits the predicate:

- `shouldSkipEviction`: skip the Eviction call for DS / mirror /
  terminating / completed pods.
- `podOccupiesNode`: a terminating pod STILL occupies the node and
  must keep `waitForNodeEmpty` blocking. DS / mirror / completed pods
  don't.
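The split between the two predicates can be sketched as follows. The `pod` struct and its boolean fields are simplified stand-ins for the real Kubernetes `Pod` object and are assumptions of this example; only the predicate names and their intended semantics come from the commit message.

```go
package main

import "fmt"

// pod is a simplified, hypothetical stand-in for a Kubernetes Pod.
type pod struct {
	OwnedByDaemonSet bool
	IsMirror         bool
	Terminating      bool // DeletionTimestamp is set
	Completed        bool // phase is Succeeded or Failed
}

// shouldSkipEviction reports whether Evict() should not be called:
// DaemonSet, mirror, terminating and completed pods are all skipped.
func shouldSkipEviction(p pod) bool {
	return p.OwnedByDaemonSet || p.IsMirror || p.Terminating || p.Completed
}

// podOccupiesNode reports whether the pod must keep the drain waiting.
// A terminating pod still occupies the node until its grace period ends,
// so Terminating is deliberately absent from this expression.
func podOccupiesNode(p pod) bool {
	return !p.OwnedByDaemonSet && !p.IsMirror && !p.Completed
}

func main() {
	terminating := pod{Terminating: true}
	fmt.Println(shouldSkipEviction(terminating)) // true: no new Eviction call
	fmt.Println(podOccupiesNode(terminating))    // true: node is not empty yet
}
```

With a single predicate, the terminating pod above would have counted as absent, which is the premature-termination hazard described in P1.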

P2: `uniqueNodes` no longer filters out EKS managed node group entries
when discovering controllers for temporary PDBs. The orchestrator now
blocks on `waitEKSNodegroupEmpty` before cleaning up the PDBs, so EKS
observes them throughout its drain. Excluding MNG nodes left a workload
whose every replica lived on a single MNG without any PDB protection;
EKS could then disrupt all replicas at once.

Test coverage:
- New `TestPodOccupiesNode` locks in the "terminating pods still
  occupy the node" semantics.
- `TestShouldSkipPod` renamed to `TestShouldSkipEviction`.
- `TestUniqueNodes_ExcludesEKSMNG` renamed to
  `TestUniqueNodes_IncludesAllManagerTypes` and inverted.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…can dedup

`ExtractEC2InstanceID` and `LabelEKSNodegroup` were introduced in
`common/clusterinfo` so the evict package could reuse them, but two more
copies of the same `aws:///[^/]+/(i-[0-9a-f]+)$` regex remained inline:
in `common/clusterinfo/classify.go:197` and `common/karpenter/fromnodes.go:65`.

`common/clusterinfo` already imports `common/karpenter` (for the Karpenter
detection helpers), so karpenter can't reach back into clusterinfo without
producing an import cycle. Move the helper to a neutral home —
`common/aws/node.go` — that both can import. The unit test moves with the
function.

Net result: one regex definition, four call sites, three packages dedup'd.
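For reference, a minimal sketch of such a shared helper is shown below. The regular expression is the one quoted in the commit message; the function name in lower case, its `(string, bool)` return shape, and the sample provider IDs are assumptions, since the real signature of `ExtractEC2InstanceID` is not shown here.

```go
package main

import (
	"fmt"
	"regexp"
)

// providerIDPattern is the regex quoted in the commit message. It captures
// the EC2 instance ID at the end of an AWS node providerID.
var providerIDPattern = regexp.MustCompile(`aws:///[^/]+/(i-[0-9a-f]+)$`)

// extractEC2InstanceID is a hypothetical version of the shared helper: it
// returns the instance ID and true when the providerID matches, or an empty
// string and false otherwise.
func extractEC2InstanceID(providerID string) (string, bool) {
	match := providerIDPattern.FindStringSubmatch(providerID)
	if match == nil {
		return "", false
	}
	return match[1], true
}

func main() {
	id, ok := extractEC2InstanceID("aws:///us-east-1a/i-0123456789abcdef0")
	fmt.Println(id, ok) // i-0123456789abcdef0 true

	_, ok = extractEC2InstanceID("gce://project/zone/instance")
	fmt.Println(ok) // false
}
```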

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The existing tests under `cmd/kubectl-datadog/autoscaling/cluster/` (e.g.
`install/install_test.go:TestValidate`, `apply/install_mode_test.go`,
`apply/inference_method_test.go`, `apply/create_karpenter_resources_test.go`)
all declare the table inline inside `for _, tc := range []struct{ … }{ … } {`
rather than via an intermediate `tests :=` variable.

Aligns four newly-added tests with that convention:
- `common/aws/node_test.go: TestExtractEC2InstanceID`
- `evict/evict_pods_test.go: TestShouldSkipEviction`, `TestPodOccupiesNode`
- `evict/plan_test.go: TestParseTargetSpec`
- `evict/pdb_test.go: TestTempPDBName`

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
`evictASG` extracted the EC2 instance ID and warned on an unexpected
providerID, but discarded the ID immediately afterwards. The check was
load-bearing in an earlier revision that terminated instances per-instance
via `TerminateInstanceInAutoScalingGroup`; after the round-2 Codex review
replaced that loop with a single `UpdateAutoScalingGroup(0,0,0)`, the ID
is no longer used here and the warning misleadingly suggested otherwise.

The `standalone` and `karpenter/fromnodes` call sites still use the
extracted ID, so the helper stays.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…tASG`

The five `TestEvictASG_*` functions shared the same setup (fake clientset
+ stubAutoscaling), the same call (`evictASG(ctx, client, stub, …)`) and
the same axes of assertion (`err`, `stub.scaledASGs`, optionally
`Spec.Unschedulable`). They only differed on the initial K8s objects,
whether an Eviction reactor was wired, and the drain timeouts.

Collapsing them into one `TestEvictASG` with an inline case slice keeps
the matrix of (objects × reactor × dry-run × wantErr × wantScaledASGs ×
wantUnschedulable) visible at a glance, aligns with the convention used
in `install/install_test.go`, `apply/install_mode_test.go`, etc., and
makes it cheap to add a new scenario.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Same shape as the earlier `TestEvictASG` consolidation, applied to the
rest of the package. Tests that exercised the same target function with
different inputs are now grouped into one `Test<Function>` with an
inline case slice; one test per *function* rather than per *scenario*
collapses ~65 cousin functions down to ~25.

- `clusterautoscaler_test.go`: 5 → 1 (`TestScaleDownClusterAutoscaler`)
- `cordon_test.go`: 4 → 1 (`TestCordonNode`)
- `eks_mng_test.go`: 5 → 1 (`TestEvictEKSManagedNodeGroup`)
- `evict_pods_test.go`: 5 EvictPodWithRetry tests → 1 (`TestEvictPodWithRetry`)
  with a typed `evictionResponder` closure capturing each scenario's reactor
- `karpenter_user_test.go`: 2 → 1 (`TestEvictKarpenterUserNodePool`)
- `standalone_test.go`: 5 → 1 (`TestEvictStandalone`)
- `preflight_test.go`: 5 conflict tests → 1 (`TestWarnKarpenterWeightConflicts`)
- `prompt_test.go`: 4 printPlan + 2 promptConfirmation → 2 (`TestPrintPlan`,
  `TestPromptConfirmation`)
- `run_test.go`: 5 → 1 (`TestEvictAllTargetsParallel`)
- `plan_test.go`: 3 BuildPlan tests + the inline-subtests in
  `TestHasDatadogManagedNodePool` → `TestBuildPlan` + table-driven
  `TestHasDatadogManagedNodePool`
- `pdb_test.go`: per-helper consolidations
  (`TestUniqueNodes`, `TestIsTemporaryPDB`, `TestHasUserPDB`,
  `TestCleanupTempPDBs`, `TestCreateTempPDB`); `TestTempPDBName` was already
  table-driven; `TestDiscoverControllers_FiltersByNodeSet` left as a single
  scenario since the fixtures don't vary.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Replaces the hand-rolled for/return-true loop with the stdlib
`slices.ContainsFunc` (Go 1.21+). Same semantics, half the lines,
no extra dependency — `slices` is already imported elsewhere in the
plugin.
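The shape of that change can be sketched as follows. The `nodePool` type and both function names are hypothetical, since the commit message does not name the function that was rewritten; `slices.ContainsFunc` itself is the standard-library call.

```go
package main

import (
	"fmt"
	"slices"
)

// nodePool is a hypothetical type used only for this illustration.
type nodePool struct {
	Name           string
	DatadogManaged bool
}

// hasManagedLoop is the hand-rolled form: loop and return true on a match.
func hasManagedLoop(pools []nodePool) bool {
	for _, p := range pools {
		if p.DatadogManaged {
			return true
		}
	}
	return false
}

// hasManaged is the same check expressed with slices.ContainsFunc (Go 1.21+).
func hasManaged(pools []nodePool) bool {
	return slices.ContainsFunc(pools, func(p nodePool) bool {
		return p.DatadogManaged
	})
}

func main() {
	pools := []nodePool{{Name: "user-pool"}, {Name: "dd-pool", DatadogManaged: true}}
	fmt.Println(hasManagedLoop(pools), hasManaged(pools)) // true true
}
```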

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…e boolean expression

Both functions were cascades of `if cond { return true/false }` short-circuiting
the same way `||` and `&&` already do. The two predicates collapse to a single
return line each, which makes the asymmetry between the two (`||` includes
`p.DeletionTimestamp != nil`, `&&` doesn't) visible at a glance.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Replaces the hand-rolled accumulator loop in `waitForNodeEmpty` with
`lo.CountBy`. Same semantics, fewer lines, intent visible at the
function name. The stdlib `slices` package has `ContainsFunc` /
`IndexFunc` but no `CountFunc`, so `lo.CountBy` is the right call —
`lo` is already a project dependency.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Replace the if/else cascade in evictPodWithRetry with a tagless switch
and move the retry-sleep into the default branch so the four dispatch
outcomes (success, 404, non-429 error, deadline) sit as peers next to
the retry path. Tightens listPodsOnNode with named returns.
PollInterval is an internal field with a single construction site
(run.go) that never set it, leaving two divergent defensive defaults
(2s in drainNode, 10s in waitEKSNodegroupEmpty) to paper over the gap.
Set it explicitly to 2s in run.go and remove both fallbacks; tests
still inject their own short intervals.

Drop the per-manager-type goroutines, mutex and WaitGroup in favour of
a plain loop over targets. The original rationale for parallelism was
asynchronous EKS draining (UpdateNodegroupConfig returning while EKS
worked in the background), but waitEKSNodegroupEmpty now blocks until
the EKS-managed nodes disappear, so every manager already runs
synchronously. Sequential execution yields linear logs, bounded
apiserver pressure, and easier debugging when a target fails.
Realign inline struct literals after recent field-list changes. Adding
fields whose name is longer than any existing one widens the column
gofmt computes for the whole literal; the previous commits introduced
such fields without re-running gofmt, so check_formatting on the CI
pipeline reported diffs in five test files. Pure whitespace, no
semantic change.
@L3n41c

L3n41c commented May 20, 2026

Member Author

@codex review

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 96d5f50e4f


Comment thread cmd/kubectl-datadog/autoscaling/cluster/evict/eks_mng.go Outdated
Comment thread cmd/kubectl-datadog/autoscaling/cluster/evict/run.go Outdated
L3n41c and others added 2 commits May 20, 2026 11:42
Two fixes triggered by Codex review of 96d5f50:

1. EKS managed node group: the API rejects `maxSize < 1`, so the previous
   `min=max=desired=0` Update would have failed before any drain started.
   DescribeNodegroup the target first, then preserve its current MaxSize
   while still zeroing min/desired (defaulting to 1 if the described
   scaling config is nil). The EKS-side drain behaviour is unchanged —
   desired=0 still triggers it — but the API call now succeeds.

2. Dry-run: both `clusterinfo.Persist` call sites in Run mutated the
   cluster-info ConfigMap unconditionally, before any dry-run gate. A
   preview must not require write RBAC on the Karpenter namespace and
   must not leave behind state; gate both calls on `!opts.DryRun` and
   log the would-be persistence instead.
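The scaling-config computation from the first fix can be sketched as below. The `scalingConfig` struct is a simplified stand-in for the type returned by the EKS API (whose fields are pointers in the AWS SDK), and the function name is hypothetical.

```go
package main

import "fmt"

// scalingConfig is a simplified, hypothetical stand-in for the scaling
// configuration of an EKS managed node group.
type scalingConfig struct {
	MinSize     int32
	MaxSize     int32
	DesiredSize int32
}

// drainScalingConfig builds the configuration to send in the update:
// min and desired go to 0, while the current max is preserved because the
// API rejects maxSize < 1. When no current configuration is described,
// max falls back to 1.
func drainScalingConfig(current *scalingConfig) scalingConfig {
	maxSize := int32(1)
	if current != nil {
		maxSize = current.MaxSize
	}
	return scalingConfig{MinSize: 0, MaxSize: maxSize, DesiredSize: 0}
}

func main() {
	fmt.Println(drainScalingConfig(&scalingConfig{MinSize: 2, MaxSize: 10, DesiredSize: 5})) // {0 10 0}
	fmt.Println(drainScalingConfig(nil))                                                    // {0 1 0}
}
```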

Test updates: extend stubEKS with DescribeNodegroup, add cases for
DescribeNodegroup failure (short-circuits Update) and for a nil
ScalingConfig (falls back to max=1).

Replace context.Background() with t.Context() across the eviction
test files. t.Context() (Go 1.24+) returns a context that is canceled
when the test completes, so any goroutine still using the context is
notified instead of leaking past the test boundary.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@levan-m levan-m modified the milestones: v1.27.0, v1.28.0 Jun 5, 2026
L3n41c added a commit that referenced this pull request Jun 18, 2026
Part of a stack splitting #3026 (too large to review in one piece) into
small pieces that each build and pass tests on their own. The command is
fully functional only once the whole stack lands.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
L3n41c added a commit that referenced this pull request Jun 18, 2026
Part of a stack splitting #3026 (too large to review in one piece) into
small pieces that each build and pass tests on their own. The command is
fully functional only once the whole stack lands.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
L3n41c added a commit that referenced this pull request Jun 18, 2026
Part of a stack splitting #3026 (too large to review in one piece) into
small pieces that each build and pass tests on their own. The command is
fully functional only once the whole stack lands.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
L3n41c added a commit that referenced this pull request Jun 18, 2026
Part of a stack splitting #3026 (too large to review in one piece) into
small pieces that each build and pass tests on their own. The command is
fully functional only once the whole stack lands.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
L3n41c added a commit that referenced this pull request Jun 18, 2026
…-nodes

Part of a stack splitting #3026 (too large to review in one piece) into
small pieces that each build and pass tests on their own. The command is
fully functional only once the whole stack lands.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
L3n41c added a commit that referenced this pull request Jun 19, 2026
Part of a stack splitting #3026 (too large to review in one piece) into
small pieces that each build and pass tests on their own. The command is
fully functional only once the whole stack lands.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
L3n41c added a commit that referenced this pull request Jun 19, 2026
Part of a stack splitting #3026 (too large to review in one piece) into
small pieces that each build and pass tests on their own. The command is
fully functional only once the whole stack lands.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
L3n41c added a commit that referenced this pull request Jun 22, 2026
Part of a stack splitting #3026 (too large to review in one piece) into
small pieces that each build and pass tests on their own. The command is
fully functional only once the whole stack lands.
L3n41c added a commit that referenced this pull request Jun 23, 2026
Part of a stack splitting #3026 (too large to review in one piece) into
small pieces that each build and pass tests on their own. The command is
fully functional only once the whole stack lands.
L3n41c added a commit that referenced this pull request Jun 25, 2026
Part of a stack splitting #3026 (too large to review in one piece) into
small pieces that each build and pass tests on their own. The command is
fully functional only once the whole stack lands.
L3n41c added a commit that referenced this pull request Jun 25, 2026
Part of a stack splitting #3026 (too large to review in one piece) into
small pieces that each build and pass tests on their own. The command is
fully functional only once the whole stack lands.
L3n41c added a commit that referenced this pull request Jun 25, 2026
* [CASCL-1386] Add evict-legacy-nodes command skeleton

Part of a stack splitting #3026 (too large to review in one piece) into
small pieces that each build and pass tests on their own. The command is
fully functional only once the whole stack lands.

* [CASCL-1386] Implement evict-legacy-nodes execution plan building

Part of a stack splitting #3026 (too large to review in one piece) into
small pieces that each build and pass tests on their own. The command is
fully functional only once the whole stack lands.
L3n41c added a commit that referenced this pull request Jun 25, 2026
…#3162)

* [CASCL-1386] Add evict-legacy-nodes command skeleton

Part of a stack splitting #3026 (too large to review in one piece) into
small pieces that each build and pass tests on their own. The command is
fully functional only once the whole stack lands.

* [CASCL-1386] Implement evict-legacy-nodes execution plan building

Part of a stack splitting #3026 (too large to review in one piece) into
small pieces that each build and pass tests on their own. The command is
fully functional only once the whole stack lands.

* [CASCL-1386] Implement evict-legacy-nodes plan display and confirmation

Part of a stack splitting #3026 (too large to review in one piece) into
small pieces that each build and pass tests on their own. The command is
fully functional only once the whole stack lands.
L3n41c added a commit that referenced this pull request Jun 26, 2026
Part of a stack splitting #3026 (too large to review in one piece) into
small pieces that each build and pass tests on their own. The command is
fully functional only once the whole stack lands.
L3n41c added a commit that referenced this pull request Jun 26, 2026
Part of a stack splitting #3026 (too large to review in one piece) into
small pieces that each build and pass tests on their own. The command is
fully functional only once the whole stack lands.
L3n41c added a commit that referenced this pull request Jun 29, 2026
* [CASCL-1386] Add evict-legacy-nodes command skeleton

Part of a stack splitting #3026 (too large to review in one piece) into
small pieces that each build and pass tests on their own. The command is
fully functional only once the whole stack lands.

* [CASCL-1386] Implement evict-legacy-nodes execution plan building

Part of a stack splitting #3026 (too large to review in one piece) into
small pieces that each build and pass tests on their own. The command is
fully functional only once the whole stack lands.

* [CASCL-1386] Implement evict-legacy-nodes plan display and confirmation

Part of a stack splitting #3026 (too large to review in one piece) into
small pieces that each build and pass tests on their own. The command is
fully functional only once the whole stack lands.

* [CASCL-1386] Implement evict-legacy-nodes preflight warnings

Part of a stack splitting #3026 (too large to review in one piece) into
small pieces that each build and pass tests on their own. The command is
fully functional only once the whole stack lands.

* [CASCL-1386] Implement evict-legacy-nodes preflight warnings

Part of a stack splitting #3026 (too large to review in one piece) into
small pieces that each build and pass tests on their own. The command is
fully functional only once the whole stack lands.
L3n41c added a commit that referenced this pull request Jun 29, 2026
…#3164)

* [CASCL-1386] Add evict-legacy-nodes command skeleton

Part of a stack splitting #3026 (too large to review in one piece) into
small pieces that each build and pass tests on their own. The command is
fully functional only once the whole stack lands.

* [CASCL-1386] Implement evict-legacy-nodes execution plan building

Part of a stack splitting #3026 (too large to review in one piece) into
small pieces that each build and pass tests on their own. The command is
fully functional only once the whole stack lands.

* [CASCL-1386] Implement evict-legacy-nodes plan display and confirmation

Part of a stack splitting #3026 (too large to review in one piece) into
small pieces that each build and pass tests on their own. The command is
fully functional only once the whole stack lands.

* [CASCL-1386] Implement evict-legacy-nodes preflight warnings

Part of a stack splitting #3026 (too large to review in one piece) into
small pieces that each build and pass tests on their own. The command is
fully functional only once the whole stack lands.

* [CASCL-1386] Implement cluster-autoscaler scale-down for evict-legacy-nodes

Part of a stack splitting #3026 (too large to review in one piece) into
small pieces that each build and pass tests on their own. The command is
fully functional only once the whole stack lands.
L3n41c added a commit that referenced this pull request Jun 29, 2026
Part of a stack splitting #3026 (too large to review in one piece) into
small pieces that each build and pass tests on their own. The command is
fully functional only once the whole stack lands.
L3n41c added a commit that referenced this pull request Jun 29, 2026
…ws (#3173)

* [CASCL-1386] Add evict-legacy-nodes command skeleton

Part of a stack splitting #3026 (too large to review in one piece) into
small pieces that each build and pass tests on their own. The command is
fully functional only once the whole stack lands.

* [CASCL-1386] Implement evict-legacy-nodes execution plan building

Part of a stack splitting #3026 (too large to review in one piece) into
small pieces that each build and pass tests on their own. The command is
fully functional only once the whole stack lands.

* [CASCL-1386] Implement evict-legacy-nodes plan display and confirmation

Part of a stack splitting #3026 (too large to review in one piece) into
small pieces that each build and pass tests on their own. The command is
fully functional only once the whole stack lands.

* [CASCL-1386] Implement evict-legacy-nodes preflight warnings

Part of a stack splitting #3026 (too large to review in one piece) into
small pieces that each build and pass tests on their own. The command is
fully functional only once the whole stack lands.

* [CASCL-1386] Implement cluster-autoscaler scale-down for evict-legacy-nodes

Part of a stack splitting #3026 (too large to review in one piece) into
small pieces that each build and pass tests on their own. The command is
fully functional only once the whole stack lands.

* [CASCL-1386] Extract shared EC2 instance-ID helper to common/aws

Part of a stack splitting #3026 (too large to review in one piece) into
small pieces that each build and pass tests on their own. The command is
fully functional only once the whole stack lands.
4 participants