EAI 5821 evaluate envoy gateway as unified gateway platform cluster forge #734

Draft
blankdots wants to merge 16 commits into main from EAI-5821-evaluate-envoy-gateway-as-unified-gateway-platform-cluster-forge

@blankdots

Contributor

No description provided.

@blankdots blankdots self-assigned this Jun 4, 2026
blankdots and others added 2 commits June 4, 2026 15:10
  Replace the :6443 k8s-passthrough listener on the shared `https` gateway
  with a dedicated `tls-passthrough` gateway on :443 that owns the external
  MetalLB LoadBalancer and does SNI-based TLS passthrough:
  k8s.<domain> -> kube API service, *.<domain> -> apps gateway. The apps
  gateway moves to ClusterIP behind it.

  The listener and TLSRoutes carry explicit hostnames: Envoy Gateway TLS
  passthrough builds SNI filter chains from hostnames, so an empty hostname
  yields an empty Envoy config that never routes. Listening on :443 instead
  of :6443 avoids hijacking pod->apiserver traffic on clusters where the
  node IP equals the MetalLB pool IP.

  Refs: EAI-5821
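
  The layout described in this commit could look roughly like the manifests
  below. This is a sketch, not the PR's actual files: the namespaces, the
  `gatewayClassName`, the `example.com` domain (standing in for `<domain>`),
  and the `apps-gateway` Service name are all placeholders, and the
  `v1alpha2` TLSRoute version is an assumption that depends on the Gateway
  API CRDs installed in the cluster.

```yaml
# Sketch only: names, namespaces, and the domain are placeholders.
apiVersion: gateway.networking.k8s.io/v1
kind: Gateway
metadata:
  name: tls-passthrough
  namespace: gateway-system          # hypothetical namespace
spec:
  gatewayClassName: envoy-gateway    # hypothetical class name
  listeners:
    - name: tls
      protocol: TLS
      port: 443
      hostname: "*.example.com"      # explicit hostname, stands in for *.<domain>
      tls:
        mode: Passthrough            # forward the TLS stream without terminating it
      allowedRoutes:
        namespaces:
          from: All                  # lets the route in `default` attach
---
apiVersion: gateway.networking.k8s.io/v1alpha2   # assumed TLSRoute version
kind: TLSRoute
metadata:
  name: kube-api
  namespace: default
spec:
  parentRefs:
    - name: tls-passthrough
      namespace: gateway-system
      sectionName: tls
  hostnames:
    - k8s.example.com                # stands in for k8s.<domain>
  rules:
    - backendRefs:
        - name: kubernetes           # in-cluster kube API Service
          port: 443
---
apiVersion: gateway.networking.k8s.io/v1alpha2   # assumed TLSRoute version
kind: TLSRoute
metadata:
  name: apps
  namespace: gateway-system
spec:
  parentRefs:
    - name: tls-passthrough
      sectionName: tls
  hostnames:
    - "*.example.com"                # stands in for *.<domain>
  rules:
    - backendRefs:
        - name: apps-gateway         # hypothetical ClusterIP Service of the apps gateway
          port: 443
```

  Both routes carry explicit hostnames, matching the point above that an
  empty hostname produces no SNI filter chain. The more specific
  `k8s.example.com` match is expected to take precedence over the wildcard.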
@mramdgh mramdgh force-pushed the EAI-5821-evaluate-envoy-gateway-as-unified-gateway-platform-cluster-forge branch from 87ebc9d to 26e90ed Compare June 5, 2026 06:51
mramdgh and others added 11 commits June 5, 2026 10:20
  Set extensionManager listener.includeAll=false so the AI Gateway xDS
  translation hook only receives listeners generated for its own resources
  (AIGatewayRoute/AIServiceBackend/InferencePool).

  With includeAll=true the hook also received the L4 tls-passthrough
  listener and tried to insert its request-header-metadata HTTP filter into
  a TCP filter chain that has no HTTPConnectionManager. That failed xDS
  translation for the entire GatewayClass, so the passthrough data plane
  got an empty snapshot and never left initialization.

  Revert the debug inversion: the tls-passthrough gateway owns the external
  MetalLB LoadBalancer on :443 (SNI passthrough) and the apps gateway drops
  back to ClusterIP behind it. The inversion was a workaround for the
  passthrough data plane not starting, which is now fixed.
- Bump cluster-auth to 0.6.0-rc2, which injects x-api-key-id and
  x-auth-username on every authenticated request and supports
  SecurityPolicy contextExtensions for per-IS group enforcement
  (required by the ai-gateway-discovery controller)
- Add api_key_id and aim_service_id to access log fields so every
  AI gateway request is attributed to the originating API key and
  AIM service in structured logs
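
  As a rough illustration of the access log change, a JSON access log format
  on an `EnvoyProxy` resource could add the two fields as shown below. The
  resource name, namespace, and sink are placeholders. Reading `api_key_id`
  from the `x-api-key-id` request header is an assumption based on the
  cluster-auth behavior described above. The commit does not say where
  `aim_service_id` comes from, so the `x-aim-service-id` header here is
  purely hypothetical.

```yaml
# Sketch only: resource names and the aim_service_id source are hypothetical.
apiVersion: gateway.envoyproxy.io/v1alpha1
kind: EnvoyProxy
metadata:
  name: ai-gateway-proxy             # hypothetical name
  namespace: gateway-system          # hypothetical namespace
spec:
  telemetry:
    accessLog:
      settings:
        - format:
            type: JSON
            json:
              start_time: "%START_TIME%"
              response_code: "%RESPONSE_CODE%"
              # Assumed source: header injected by cluster-auth.
              api_key_id: "%REQ(x-api-key-id)%"
              # Hypothetical source: the actual header or metadata key is not stated.
              aim_service_id: "%REQ(x-aim-service-id)%"
          sinks:
            - type: File
              file:
                path: /dev/stdout
```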
  Set listener.includeAll=true so the AI extension injects the EPP
  ext_proc filter into the shared https :443 listener. InferencePool
  routes were returning 503 (no healthy upstream) because nothing set
  x-gateway-destination-endpoint on that Gateway-owned listener.

  Add failOpen=true so the extension erroring on the tls-passthrough
  L4 listener (an HTTP filter can't splice into a TCP chain) no longer
  fails that proxy's xDS translation and leaves it stuck in init.
  mergeGateways is off, so each gateway is a separate translation pass:
  the https proxy gets the filter, the passthrough proxy keeps its
  original xDS.
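
  The resulting extension manager settings might look like the fragment
  below, placed in the EnvoyGateway configuration (for example under
  `config.envoyGateway` in the Helm values). It is a sketch under
  assumptions: the extension service hostname and port follow the upstream
  AI Gateway install defaults and may differ in this repository, and the
  `post` hook list is abbreviated.

```yaml
# Sketch of the EnvoyGateway config fragment; service address is assumed.
extensionManager:
  failOpen: true                     # extension hook errors no longer fail xDS translation
  hooks:
    xdsTranslator:
      translation:
        listener:
          includeAll: true           # hook receives all listeners, including the shared https :443
      post:
        - Translation
  service:
    fqdn:
      hostname: ai-gateway-controller.envoy-ai-gateway-system.svc.cluster.local  # assumed
      port: 1063                     # assumed
```

  With `includeAll: false` (the earlier commit in this PR), the hook would
  only see listeners generated for the AI Gateway's own resources, which is
  why the Gateway-owned listener never received the ext_proc filter.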
… to v1.0.8 and cluster-auth to 0.6.0-rc4"

This reverts commit 28a58c4.
…ster-auth-rc4

EAI-5821: Add ext-proc metrics scraping and bump cluster-auth to rc4

3 participants