Skip to content

Workloads can reference ConfigMaps and Secrets, delivered to every POP cell#129

Draft
scotwells wants to merge 19 commits into
feat/federated-deployment-schedulingfrom
feat/configmap-secret-mounts-federated
Draft

Workloads can reference ConfigMaps and Secrets, delivered to every POP cell#129
scotwells wants to merge 19 commits into
feat/federated-deployment-schedulingfrom
feat/configmap-secret-mounts-federated

Conversation

@scotwells
Copy link
Copy Markdown
Contributor

@scotwells scotwells commented Jun 1, 2026

Value

Workloads can reference ConfigMaps and Secrets, and they are delivered to the Instances that need them in every POP cell. The management plane resolves those references into labeled companion objects and propagates them cross-plane via the federator's PropagationPolicy, gated by a ReferencedData scheduling gate (behind the enableReferencedDataGate flag, default off) so an Instance only starts once its referenced data has landed.

This is the core of referenced ConfigMap/Secret delivery — the bottom of a reviewable stack carved out of the original monolithic #129. It lands the resolution + cross-plane delivery mechanism only; status-reason polish, hub-side GC, and e2e follow as separate stacked PRs so each gets focused review.

What

  • Referenced-data types + resolver (single-cluster path), cell-side gate clearing when data lands.
  • Cross-plane delivery via the federation manager; controller gated to the management plane.
  • Conflict-tolerant finalizer handling, atomic companion ref-counting, source-name companion naming.
  • Hub→cell terminal-error annotation bridge (the channel the blocking-reason PR reads).

Stack

#107#142 (foundation completion) ← this PR (referenced-data core) ← blocking-reason ← hub-side GC ← e2e. Base is the bundled foundation (#142); merges down in order.

Verification

go build / go vet / go test ./internal/... ./api/... / golangci-lint v2.12.2 all green at the branch tip.

🤖 Generated with Claude Code

@scotwells scotwells force-pushed the feat/configmap-secret-mounts-federated branch from e0b8ee7 to 9a064cc Compare June 2, 2026 00:02
scotwells added a commit that referenced this pull request Jun 3, 2026
Drop the pre-merge dependency framing (the #129 'not yet on main' note)
and repoint example references from PR URLs to relative paths
(examples/serverless-js-configmap, examples/config-secret-probe), as if
the referenced-data delivery and example PRs are merged. Keep the
upstream base/base-compat ROM-enablement limitation, which is a Unikraft
runtime matter independent of these PRs.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
scotwells added a commit that referenced this pull request Jun 4, 2026
Surface rolling-update / restart progress in `datumctl compute workloads` by
showing updated/desired replica counts next to ready. UP-TO-DATE counts
instances on the latest template revision (status.updatedReplicas), so a roll
is visible as the count dips below desired and then recovers.

Includes a byte-identical copy of the UpdatedReplicas/ObservedGeneration
WorkloadDeployment status fields in api/v1alpha so the plugin can read them.
These fields are defined identically on the controller branch (PR #129); the
duplicate resolves cleanly once both land on main.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@scotwells scotwells force-pushed the feat/configmap-secret-mounts-federated branch from d981d02 to e420096 Compare June 4, 2026 23:06
@scotwells scotwells changed the base branch from feat/federated-deployment-scheduling to split/federation-core-bundled June 4, 2026 23:06
@scotwells scotwells changed the title feat: Deliver referenced ConfigMaps and Secrets to instances feat: deliver referenced ConfigMaps and Secrets to instances (core) Jun 4, 2026
@scotwells scotwells changed the title feat: deliver referenced ConfigMaps and Secrets to instances (core) Workloads can reference ConfigMaps and Secrets, delivered to every POP cell Jun 4, 2026
@scotwells scotwells force-pushed the split/federation-core-bundled branch from 0920e3e to 741fb15 Compare June 5, 2026 01:42
@scotwells scotwells force-pushed the feat/configmap-secret-mounts-federated branch from e420096 to b11ab65 Compare June 5, 2026 01:42
@scotwells scotwells force-pushed the split/federation-core-bundled branch from 741fb15 to 119ddfd Compare June 5, 2026 15:13
@scotwells scotwells force-pushed the feat/configmap-secret-mounts-federated branch from b11ab65 to e0df8b4 Compare June 5, 2026 15:13
@scotwells scotwells force-pushed the split/federation-core-bundled branch from 119ddfd to 5997231 Compare June 5, 2026 15:25
@scotwells scotwells force-pushed the feat/configmap-secret-mounts-federated branch from e0df8b4 to 5ec3c00 Compare June 5, 2026 15:25
@scotwells scotwells force-pushed the split/federation-core-bundled branch from 5997231 to 1d0fca7 Compare June 5, 2026 16:19
@scotwells scotwells force-pushed the feat/configmap-secret-mounts-federated branch from 5ec3c00 to bc253fd Compare June 5, 2026 16:19
@scotwells scotwells force-pushed the split/federation-core-bundled branch from 1d0fca7 to a67b32c Compare June 5, 2026 16:46
@scotwells scotwells force-pushed the feat/configmap-secret-mounts-federated branch from bc253fd to 643da4d Compare June 5, 2026 16:46
@scotwells scotwells changed the base branch from split/federation-core-bundled to feat/federated-deployment-scheduling June 5, 2026 17:30
An error occurred while trying to automatically change base from split/federation-core-bundled to feat/federated-deployment-scheduling June 5, 2026 17:30
@scotwells scotwells force-pushed the feat/federated-deployment-scheduling branch from a67b32c to b45810f Compare June 5, 2026 17:49
@scotwells scotwells force-pushed the feat/configmap-secret-mounts-federated branch from 643da4d to 4ef5f5c Compare June 5, 2026 17:49
scotwells and others added 19 commits June 5, 2026 13:36
Phase 0 of the cross-plane referenced-data delivery design: the API surface,
validation, and the management-plane building blocks every later phase compiles
against.

- API: EnvFrom on SandboxContainer; ReferencedDataReady condition + reasons;
  ReferencedData scheduling gate; referenced-data label and
  expected-referenced-data/restartedAt annotations.
- Validation: secret-volume validation, ConfigMap/Secret items key->path
  projection, file-mode range and path safety, EnvFrom validation, and an
  admission SubjectAccessReview that the submitter can read each referenced
  ConfigMap/Secret.
- internal/referenceddata seam: reference collector, deterministic companion
  naming, and the scoped ProjectConfigSecretReader (+ single-cluster variant).
- Feature-flagged gate insertion (EnableReferencedDataGate, default off) so
  behavior is unchanged until the delivery and consumption phases are deployed.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
(cherry picked from commit 8f4014b)
Phase 1: the management-plane ReferencedDataController. For each
WorkloadDeployment it collects referenced ConfigMaps/Secrets, reads them with a
scoped reader, enforces per-object/aggregate size limits, materializes one
shared companion per source as a local copy, records the expected companion set
on the deployment, reference-counts across deployments with finalizer cleanup,
and surfaces resolution failures on ReferencedDataReady. Source watches refresh
companions on rotation.

Delivery is abstracted behind a companionWriter seam; the federated writer and
PropagationPolicy extension are deferred to Phase 1b (federation merge). Sizes
are configurable (referencedData.perObjectLimitBytes / aggregateLimitBytes).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
(cherry picked from commit 7f48a97)
Phase 2: the cell-side InstanceReconciler now resolves the referenced-data
scheduling gate. For a gated Instance it reads the expected companion set from
the owning WorkloadDeployment, lists the labeled companions in the namespace,
and clears the gate once they are all present — surfacing ReferencedDataReady
(Resolving/AwaitingPropagation/Ready) with the missing set on the message. A
companion watch re-checks waiting Instances, with a requeue safety net.

Adds Kubernetes Events on transitions and compute_referenced_data_* metrics
(companions present/expected, gate-wait duration, condition transitions), and
counts referenced-data-blocked replicas in the deployment status.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
(cherry picked from commit b3da528)
Fix type mismatches and test setup issues that arose when landing the
referenced-data commits onto the federated-deployment-scheduling base:

- Use multicluster.ClusterName (not string) in Watches handler signatures
  for referenceddata_controller.go and instance_controller.go companion
  watches; update enqueueWDsForSource to accept multicluster.ClusterName
- Cast req.ClusterName to string when passing to resolveAndValidateSources
- Replace New(Options{}) with NewWithOptions(Options{}) in stateful_control_test.go
  after re-introducing the no-arg New() on the combined Options struct
- Initialize r.finalizers in newRefDataReconciler test helper; add
  instanceControllerFinalizer to test instance fixtures so the finalizer
  framework does not short-circuit reconcile on first pass
- Update TestReferencedDataEventEmittedOnClear to expect single-pass
  gate-clearing (federation branch clears both status and gate in same pass)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
(cherry picked from commit 3b00771)
Phase 1b: the resolver now has a federated companion writer. When a federation
client is configured it materializes companions into the project's downstream
namespace (ns-{project-uid}) on the Karmada hub via the same
MappedNamespaceResourceStrategy the federator uses; with no federation client it
falls back to the single-cluster local copy.

The federator's PropagationPolicy now selects the referenced-data-labeled
ConfigMaps and Secrets alongside the WorkloadDeployment, so companions
co-propagate to the same cells, and it forwards the expected-referenced-data
annotation to the downstream deployment so the cell can clear the gate.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
(cherry picked from commit 5f42053)
The ReferencedDataController was registered unconditionally, so it also started
in the cell operator and collided with the cell's WorkloadDeploymentReconciler
(both default their controller name to "workloaddeployment"). Gate it to the
management controller set and give it a distinct name. Found by running the
operators in the multi-cluster e2e environment.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
(cherry picked from commit 0240e90)
The WorkloadDeploymentFederator and ReferencedDataController both call
Update() on the same WorkloadDeployment to add/remove their respective
finalizers, producing optimistic-lock conflicts under concurrent
reconciliation. The companions DO converge, but the errors are noisy.

Wrap all three finalizer mutation sites (add on first reconcile, remove
on deletion, remove on empty-refs) in retry.RetryOnConflict so that a
concurrent federator update causes a transparent re-read + retry rather
than a logged error. Each retry re-GETs the latest object version before
re-applying only the single finalizer change.

Adds two unit tests using interceptor.Funcs to inject a conflict on the
first Update and assert that the finalizer is correctly applied/removed
on retry.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
(cherry picked from commit 87fe6ac)
…cret-mounts

Fixes six issues identified in code review of the federated configmap/secret
mounts feature:

1. [BLOCKER] ValidateUpdate was a no-op stub: an attacker could create a
   Workload with no refs then edit in refs to ConfigMaps/Secrets they cannot
   read. Fixed by implementing ValidateUpdate to run the same full template
   validation (including SAR) as ValidateCreate on the new object.

2. [MAJOR] SAR for referenced-data was missing UserInfo.Extra, unlike the
   sibling Network SAR. Fixed by populating Extra from AdmissionRequest.

3. [MAJOR] Collector still collected both refs from an envFrom entry that had
   both configMapRef and secretRef set, even though validateEnvFrom forbids it.
   Fixed by short-circuiting such entries before collection in CollectFromTemplate.

4. [MAJOR] validateReferencedDataAccess maintained a hand-rolled traversal
   (collectRefsFromSpec) parallel to referenceddata.CollectFromTemplate, with
   drift risk. Fixed by adding CollectFromSpec to collector.go and making the
   validation call it as the single source of truth. No import cycle (validation
   -> referenceddata, not the reverse).

5. [MINOR] names.go: after TrimRight("-.", ...) the truncated segment could be
   empty (e.g. source name is all '-' or '.'), producing "<prefix>.-<hash>"
   which is invalid DNS-1123 (segment starts with '-'). Fixed by emitting
   "<prefix>.<hash>" when truncated is empty.

6. [MINOR] Tests added: both-refs-set envFrom rejection + no SAR for invalid
   entry; SAR InternalError/fail-closed path; ValidateUpdate SAR path;
   names edge cases (all-dashes, all-dots, name ending on dot, valid prefix).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
(cherry picked from commit a7ac4dd)
Fixes 10 issues found in the code review of the configmap/secret mounts
federation feature.

**Blockers:**

Fix 1 (federator status sync clobbers resolver conditions): syncStatusFromDownstream
now merges downstream status into the project WD while preserving the
resolver-owned ReferencedDataReady condition, rather than replacing .Status
wholesale. Wrapped in RetryOnConflict for concurrent-write safety.

Fix 2 (ref-count annotation race): materialiseOne and releaseOneCompanion now
wrap their read-modify-write of the ref-count annotation in
retry.RetryOnConflict loops so that two WDs reconciling concurrently against
the same companion cannot drop each other's ref entries.

**Majors:**

Fix 3 (optional source escape hatch): resolveAndValidateSources accepts the WD
template spec and uses a new isOptionalRef helper to determine per-source
optionality. NotFound and TooLarge errors for optional sources are silently
skipped; the WD proceeds with the remaining required companions.

Fix 4 (unparseable ref-count): decodeRefCount now returns ([]string, error).
An unmarshal failure propagates as a transient error; release paths return
the error rather than computing an empty remaining set that would incorrectly
delete the companion.

Fix 5 (nil-source panic): resolveOneSource nil-guards the reader's return
value and maps a (nil, nil) return to a SourceNotFound condition error.

Fix 6 (non-conflict-tolerant writes): the expected-set annotation Patch and
Status().Update are each wrapped in retry.RetryOnConflict; residual conflicts
after retries return (Result{}, nil) to requeue cleanly.

Fix 7 (annotation/label overwrite): ApplyConfigMap and ApplySecret in both
localCompanionWriter and downstreamCompanionWriter now use mergeLabels /
mergeAnnotations helpers that only upsert controller-owned keys, preserving
third-party entries such as Karmada bookkeeping annotations.

**Minors:**

Fix 8 (unused scheme field): ProjectReader.scheme field and the second
parameter of NewProjectReader removed; client.Options{} is sufficient.

Fix 9 (misleading localCompanionWriter comment): clarified that
localCompanionWriter is reachable only when FederationClient is nil, which
management-plane federation wiring never does on this branch.

Fix 10 (literal "v1" APIVersion): PropagationPolicy ConfigMap and Secret
selectors now use corev1.SchemeGroupVersion.String() instead of the literal
string "v1".

**Regression tests added:**
- Two WDs sharing a source, interleaved reconciles → both ref-count entries preserved
- Optional missing source → skipped, WD not failed
- Unparseable ref-count annotation → companion NOT deleted
- Federator status sync preserves ReferencedDataReady condition

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
(cherry picked from commit 5d5dce1)
…etrics

- [BLOCKER] Restore ObservedGeneration guard on gate clearing: both the
  quota and referenced-data gate removals in reconcileSchedulingGates now
  check cond.ObservedGeneration == instance.Generation, preventing a stale
  True condition from generation N clearing a freshly-stamped gate at N+1.

- [MAJOR] Suppress AwaitingPropagation event flood: the Warning event is
  now emitted only on a reason transition (prev != AwaitingPropagation),
  not on every reconcile where the missing-set message changes.

- [MAJOR] Delete dead removeQuotaSchedulingGate: its generation guard has
  been ported to reconcileSchedulingGates; the standalone function is gone.

- [MAJOR] Drop high-cardinality instance label from CompanionsPresent and
  CompanionsExpected metrics: both gauges now aggregate per namespace only,
  eliminating the unbounded per-instance series set.

- [MINOR] Delete drainEvents test helper (never called).

- [MINOR] Remove instanceByWorkloadDeploymentUIDIndex: the constant, its
  IndexField registration, and the index func are all deleted; the companion
  watch enqueue path already lists the full namespace and never queried it.
  Updated addInstanceIndexers doc-comment accordingly.

- [MINOR] Route checkForNetworkCreationFailure through
  fetchOwnerWorkloadDeployment to avoid a duplicate WD Get; fix the stale
  doc-comment on enqueueInstancesInNamespace.

- [MINOR] Fix no-gate self-heal: the branch that flips a non-True
  ReferencedDataReady condition to True when the gate is absent now
  validates that all expected companions are actually present before
  marking Ready, avoiding a false-Ready status when the gate was stripped
  out-of-band while companions were still missing.

- ADD TEST: TestReferencedDataStaleConditionGuard verifies that a stale
  True condition at ObservedGeneration N does not clear a gate at generation
  N+1; only the second reconcile (after condition is re-evaluated at N+1)
  removes the gate.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
(cherry picked from commit c433ad1)
The companion ref-count (companionRefCountAnnotation) read-modify-write
was non-atomic: materialiseOne fetched the companion (R1), computed the
updated ref-count, built a desired object carrying that ref-count, then
called ApplyConfigMap which issued a second independent GET (R2) before
calling Update. mergeAnnotations then overwrote the referenced-by key
on R2 with the R1-derived value. If a concurrent WD had committed its
ref-count entry between R1 and R2, R2 already contained that entry, but
the merge silently discarded it. Because Apply's own GET+Update targeted
R2, no conflict was raised and RetryOnConflict never fired.

Fix (approach a): remove the second internal GET from ApplyConfigMap /
ApplySecret. Both methods now accept an `existing` argument — the
already-fetched object (or nil when the companion does not yet exist).
When existing is non-nil, the implementation writes directly onto it and
calls Update, targeting the same resourceVersion that the caller used to
compute the ref-count. A concurrent change between GET and Update will
now produce a Conflict that propagates out to the materialiseOne /
releaseOneCompanion RetryOnConflict loop, which re-reads and recomputes.

The same double-GET path existed in releaseOneCompanion and is fixed
with the same pattern: pass the already-fetched companion as the
`existing` argument.

Add TestReferencedData_RefCount_ConflictForcesReread, which injects a
concurrent write of a third WD's key (wd3Key) into the companion on
realCl immediately after the outer GET returns. Under the fixed code the
subsequent Update at R1 conflicts with R2, RetryOnConflict re-reads R2
(with wd3Key), and all three keys appear in the final annotation. Under
the old code Apply's internal GET would return R2 with wd3Key, but
mergeAnnotations would overwrite referenced-by with the R1-derived value
([wd1Key, wd2Key]), silently dropping wd3Key, and the Update at R2 would
succeed — the test would fail with only 2 keys instead of 3.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
(cherry picked from commit 9b2b7e0)
Replace manual membership loop in refCountAdd with slices.Contains.
Replace two manual map-copy loops in mergeLabels/mergeAnnotations with
maps.Copy (semantics preserved: copies wanted INTO existing, never
replaces the map).

Add ReferencedDataLabelValue = "true" to api/v1alpha/labels.go and use
it everywhere in place of the repeated string literal. Use kindSecret
and kindConfigMap constants in indexers.go and the controller instead of
repeated string literals. Extract isOptionalInVolumes and
isOptionalInContainers helpers from isOptionalRef to bring gocyclo from
32 down to within the limit. Define named constants for all repeated test
strings (rdTestWD1, rdTestBlobKey, testKindConfigMap, etc.) introduced
by the configmap/secret change set to clear the goconst and gocyclo lint
issues it brought in.

The instance_controller.go GetEventRecorderFor call retains its existing
//nolint:staticcheck comment: the replacement API (GetEventRecorder)
has an incompatible Eventf signature requiring a migration of all emit
sites, which is deferred as a separate task.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
(cherry picked from commit 8d782e7)
…olve

Companions were named "<lower-kind>.<source-name>" (e.g. configmap.app-config)
but Instances and Pods reference sources by their verbatim source name
(e.g. app-config). The runtime found only the prefixed companion → 404
→ mounts/env vars failed.

Option B fix: companion objects are now named exactly by their source name.
Cross-kind collision is safe (ConfigMap and Secret are distinct resource
types in the same namespace). Pod/Instance volume and envFrom references
resolve naturally without any translation layer.

Kind-disambiguation is preserved in the expected-referenced-data annotation
as "Kind/name" tokens (e.g. ["ConfigMap/app-config","Secret/app-secret"])
so the cell InstanceReconciler can verify each companion by kind+name
without probing both resource types.

Changes:
- internal/referenceddata/names.go: CompanionName drops the kind prefix;
  adds CompanionToken("Kind", "name") → "Kind/name" helper
- internal/referenceddata/names_test.go: update expected values; add
  TestCompanionName_SourceNameContract (contract test that would fail
  against the old prefixed code) and TestCompanionName_SameSourceDifferentKind
- internal/controller/referenceddata_controller.go: annotation stores
  kind-qualified tokens; releaseOneCompanion accepts explicit kind param
- internal/controller/referenceddata_controller_test.go: update assertions
  to kind-qualified tokens; fix stale-RV and same-name source/companion cases
- internal/controller/instance_controller.go: gate-clearing parses kind-
  qualified tokens via listPresentCompanionsByKindName
- internal/controller/instance_referenced_data_test.go: align to new names
  and annotation token format
- test/e2e/referenced-data-mounts/chainsaw-test.yaml: update companion name
  assertions (configmap.app-config → app-config, secret.app-secret → app-secret)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
(cherry picked from commit 7434216)
Note (split): empty against split/federation-core-bundled. The foundation
already guards the nil Instance Status.Controller deref in
reconcileInstanceGates (alongside its updatedReplicas rollout tracking), so
this commit's production change is a no-op here. Retained to preserve the
cherry-pick order; the accompanying nil-guard test lands in 6e7de18.

(cherry picked from commit 58783e84a9810a78eae84b67e887be663008c01d5)
Add a regression test that constructs an Instance with Programmed=True
but Status.Controller==nil and asserts that reconcileInstanceGates neither
panics nor counts it as a current replica. Without the nil guard introduced
in the same area the test panics with a nil-pointer dereference, confirming
the guard is load-bearing.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
(cherry picked from commit 6e7de18)
Spec.Controller is a nilable pointer the infra provider populates
independently of an Instance's networking readiness. reconcileInstanceGates
dereferenced instance.Spec.Controller.SchedulingGates in the network
gate-clearing path without a nil guard, so an Instance that reached
networkReady before its controller spec was populated panic-looped the
WorkloadDeployment reconcile and froze status.

This mirrors the existing Status.Controller guard. Extends the
nil-controller regression test to drive networkReady=true with a nil
Spec.Controller, the case the first guard missed.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
(cherry picked from commit 8cf0945)
…rencedDataReady

Add wdReferencedDataChangedPredicate to the WorkloadDeploymentReconciler's For()
watch so the reconciler re-runs when ReferencedDataController writes
ReferencedDataReady=False/SourceNotFound, without creating a self-trigger loop on
the reconciler's own Available/ReplicasReady status writes.

Root cause: the For() watch had no predicate, so each Status().Update by the WD
reconciler (writing Available and ReplicasReady) re-enqueued itself, creating a
continuous reconcile loop. The resolver's write of ReferencedDataReady was lost in
this noise — the WD reconciler would re-run but process a stale queue entry before
the cache reflected the resolver's update, leaving Available stuck at
InstancesProvisioning instead of being promoted to ReferencedDataNotReady.

Fix approach:
- Add wdReferencedDataChangedPredicate: passes on generation change (spec update) or
  ReferencedDataReady condition status/reason/message change. Drops updates where
  only Available/ReplicasReady changed (own writes).
- Add equality.Semantic.DeepEqual guard before Status().Update as belt-and-suspenders:
  avoids the superfluous API call entirely when status did not change.
- Extract "ReplicasAvailable" inline string to reasonReplicasAvailable constant
  (goconst compliance).

Loop prevention: the WD reconciler writes Available and ReplicasReady but never
writes ReferencedDataReady. So its own status writes fail the predicate's Update
check and do not re-enqueue via For(). Instance changes still trigger via
Owns(&Instance{}), which has no predicate.

Workload level: WorkloadReconciler.Owns(&WorkloadDeployment{}) has no predicate and
correctly fires on Available condition changes. The reconciler already has the
equality.Semantic.DeepEqual guard so it does not loop on its own writes.

Tests: wdRefDataCondChanged unit tests (nil/nil, added, removed, status change,
message change, identical), wdReferencedDataChangedPredicate tests (resolver write
passes, resolver clear passes, generation change passes, Available-only drops,
ReplicasReady-only drops, create/delete/generic always pass).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
(cherry picked from commit 4ba1d46)
…nnotation

The referenced-data resolver sets ReferencedDataReady=False/SourceNotFound on the
hub WorkloadDeployment, but Karmada statusAggregation only flows cell->hub, so the
cell Instance controller (which reads the cell WD copy) never saw the verdict and
the Instance stayed Ready=Unknown/Resolving instead of Ready=False/SourceNotFound.

Bridge it via an annotation (compute.datumapis.com/referenced-data-error), which
DOES propagate hub->cell as part of ObjectMeta. The resolver stamps it on terminal
errors (SourceNotFound/TooLarge/Unauthorized) and clears it on resolve; the cell
instance controller reads it from the owner WD and surfaces the terminal reason+
message, with a condition-based fallback preserved for single-cluster.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
(cherry picked from commit 3751f78)
Extract shared WorkloadDeployment test fixtures (cityCode, wdControllerTestUID)
and the quota-denied message into package-level constants so goconst no longer
flags repeated string literals across the controller test files.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@scotwells scotwells force-pushed the feat/federated-deployment-scheduling branch from b45810f to 73177eb Compare June 5, 2026 18:38
@scotwells scotwells force-pushed the feat/configmap-secret-mounts-federated branch from 4ef5f5c to 646124c Compare June 5, 2026 18:38
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant