Skip to content

OLS-3311 Bundle embedding models for hermetic build#2961

Merged
openshift-merge-bot[bot] merged 1 commit into
openshift:mainfrom
blublinsky:embedding-hermetic-build
Jun 24, 2026
Merged

OLS-3311 Bundle embedding models for hermetic build#2961
openshift-merge-bot[bot] merged 1 commit into
openshift:mainfrom
blublinsky:embedding-hermetic-build

Conversation

@blublinsky

@blublinsky blublinsky commented Jun 23, 2026

Copy link
Copy Markdown
Contributor

Description

Summary
Decouples the OLS service image from the lightspeed-rag-content image by bundling
embedding model configs directly in this repo and fetching model binaries via Cachi2
generic prefetch for hermetic builds.

  • Removes the lightspeed-rag-content multi-stage dependency from Containerfile
  • Adds sentence-transformers/all-mpnet-base-v2 (BYOK/FAISS) and
    ibm-granite/granite-embedding-30m-english (OKP/Solr) model configs and tokenizers
  • Pre-populates the HuggingFace cache at build time so models can be loaded by ID
    with TRANSFORMERS_OFFLINE=1
  • Adds artifacts.lock.yaml for Cachi2 generic artifact prefetch of model binaries
  • Updates Tekton pipelines to include generic prefetch stage
  • Adds make get-embeddings target for local (non-hermetic) development

Build Modes

  • Hermetic (Konflux/Cachi2): Binaries prefetched to /cachi2/output/deps/generic/,
    copied into place during image build
  • Local (non-hermetic): Run make get-embeddings before docker build; build
    fails fast with clear error if binaries are missing

Test plan

  • Unit tests pass (1058 passed)
  • Ruff lint clean
  • MyPy type check clean
  • Verify Cachi2 prefetch works in Konflux PR pipeline
  • Verify container image loads both models offline (TRANSFORMERS_OFFLINE=1)

Type of change

  • [ x] Refactor
  • New feature
  • Bug fix
  • CVE fix
  • Optimization
  • Documentation Update
  • Configuration Update
  • Bump-up dependent library
  • Bump-up library or tool used for development (does not change the final image)
  • CI configuration change
  • Konflux configuration change

Related Tickets & Documents

Checklist before requesting a review

  • I have performed a self-review of my code.
  • PR has passed all pre-merge test jobs.
  • If it is a core feature, I have added thorough tests.

Testing

  • Please provide detailed steps to perform tests related to this code change.
  • How were the fix/results from this change verified? Please provide relevant screenshots or results.

Summary by CodeRabbit

  • Chores
    • Bundle embedding model assets directly in the container for more reliable offline startup.
    • Add a locked manifest for embedding weight downloads.
    • Replace the old retrieval flow with dedicated embedding download targets and remove the temporary copy approach.
  • CI/CD
    • Adjust dependency prefetch inputs by adding a generic entry before existing entries.
  • Configuration
    • Update the example to point to the specific embedding model subdirectory.
    • Refine ignore rules for embedding model artifacts (including versioned folders).

@coderabbitai

coderabbitai Bot commented Jun 23, 2026

Copy link
Copy Markdown

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

  • ▶️ Resume reviews
  • 🔍 Trigger review
📝 Walkthrough

Walkthrough

Adds tracked embedding model configs, pins weight artifacts, updates build and pipeline inputs, and switches runtime and Makefile flows to provision embedding files from the build context and HuggingFace downloads.

Changes

Embedding Model Provisioning Overhaul

Layer / File(s) Summary
Model config files and ignores
embeddings_model/all-mpnet-base-v2/*, embeddings_model/granite-embedding-30m-english/*, .gitignore, .woke.yaml
Adds model config JSON for both embedding models and narrows ignore rules so metadata stays tracked while weight files remain excluded from tool processing.
Artifact lock and prefetch inputs
artifacts.lock.yaml, .tekton/lightspeed-service-pull-request.yaml, .tekton/lightspeed-service-push.yaml
Adds pinned artifact metadata for two model weight files and inserts a generic prefetch input into both Tekton pipeline definitions.
Runtime image weight handling
Containerfile
Sets HuggingFace cache location, copies embedding model files from the build context, overlays prefetched weights when available, downloads missing weights, and primes the local HuggingFace cache layout for both models.
Makefile targets and example path
Makefile, examples/olsconfig.yaml
Replaces the old image-copy target with embedding download targets and updates the example configuration to point to the all-mpnet-base-v2 model directory.

Sequence Diagram

sequenceDiagram
  participant TektonPipeline as Tekton Pipeline
  participant Cachi2 as Generic Prefetch
  participant ContainerBuild as Container Build
  participant HuggingFace as HuggingFace Hub
  participant RuntimeImage as Runtime Image

  TektonPipeline->>Cachi2: prefetch inputs from artifacts.lock.yaml
  Cachi2-->>TektonPipeline: cached model weights
  TektonPipeline->>ContainerBuild: build image with cache volume
  ContainerBuild->>ContainerBuild: copy embeddings_model/
  alt prefetched weights available
    ContainerBuild->>ContainerBuild: overlay .safetensors files
  else missing weights
    ContainerBuild->>HuggingFace: curl resolve URLs
    HuggingFace-->>ContainerBuild: model.safetensors
  end
  ContainerBuild->>RuntimeImage: create HuggingFace cache refs
  ContainerBuild->>RuntimeImage: symlink local model snapshots
Loading

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title clearly summarizes the main change: bundling embedding models to support a hermetic build.
Docstring Coverage ✅ Passed No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands.

@openshift-ci openshift-ci Bot requested review from onmete and xrajesh June 23, 2026 09:18

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🧹 Nitpick comments (1)
Containerfile (1)

78-79: 🩺 Stability & Availability | 🔵 Trivial

TRANSFORMERS_OFFLINE is already set in application code; consider making it container-level for unconditional offline mode.

The application sets TRANSFORMERS_OFFLINE=1 at runtime (line 59 of runner.py), but only when embeddings_model_path is configured. Moving this to the container's ENV block would ensure offline mode is enforced regardless of configuration state, improving reliability in disconnected environments. Note: HF_HUB_OFFLINE is not used in this codebase.

Proposed Containerfile update (conditional approach)
     LANG=en_US.UTF-8 \
     LLAMA_INDEX_CACHE_DIR=/tmp/llama_index \
     TIKTOKEN_CACHE_DIR=/app-root/.tiktoken_cache \
-    HF_HOME=/app-root/.cache/huggingface
+    HF_HOME=/app-root/.cache/huggingface \
+    TRANSFORMERS_OFFLINE=1
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@Containerfile` around lines 78 - 79, Add TRANSFORMERS_OFFLINE=1 to the
Containerfile's ENV block alongside the existing environment variables like
TIKTOKEN_CACHE_DIR and HF_HOME. This ensures offline mode is enforced at the
container level unconditionally, rather than being conditionally set in
runner.py based on configuration state. This improves reliability in
disconnected environments by guaranteeing the offline mode is always active
regardless of how the application is configured.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@Makefile`:
- Around line 149-155: The curl commands in the get-embeddings-byok and
get-embeddings-okp targets lack error handling safeguards, allowing HTTP errors
(404/403) to be written to the model.safetensors files while appearing
successful. Add the --fail flag to both curl commands to make them fail on HTTP
errors, and optionally add --retry and --connect-timeout flags to handle
transient failures and make the downloads fail fast and predictably instead of
silently writing corrupt files.
- Around line 147-155: The targets get-embeddings, get-embeddings-byok, and
get-embeddings-okp are phony targets that should be declared in the .PHONY
directive to prevent Make from treating them as file targets. Add these three
target names to the .PHONY declaration in the Makefile (typically near the
beginning of the file where other phony targets are declared) so that Make
correctly handles them as non-file targets regardless of whether files with
those names exist.

---

Nitpick comments:
In `@Containerfile`:
- Around line 78-79: Add TRANSFORMERS_OFFLINE=1 to the Containerfile's ENV block
alongside the existing environment variables like TIKTOKEN_CACHE_DIR and
HF_HOME. This ensures offline mode is enforced at the container level
unconditionally, rather than being conditionally set in runner.py based on
configuration state. This improves reliability in disconnected environments by
guaranteeing the offline mode is always active regardless of how the application
is configured.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 1947a343-4686-4ebf-92a0-9e4b82c866fa

📥 Commits

Reviewing files that changed from the base of the PR and between 724097c and 2a64543.

📒 Files selected for processing (22)
  • .gitignore
  • .tekton/lightspeed-service-pull-request.yaml
  • .tekton/lightspeed-service-push.yaml
  • Containerfile
  • Makefile
  • artifacts.lock.yaml
  • embeddings_model/all-mpnet-base-v2/1_Pooling/config.json
  • embeddings_model/all-mpnet-base-v2/config.json
  • embeddings_model/all-mpnet-base-v2/config_sentence_transformers.json
  • embeddings_model/all-mpnet-base-v2/modules.json
  • embeddings_model/all-mpnet-base-v2/sentence_bert_config.json
  • embeddings_model/all-mpnet-base-v2/special_tokens_map.json
  • embeddings_model/all-mpnet-base-v2/tokenizer.json
  • embeddings_model/all-mpnet-base-v2/tokenizer_config.json
  • embeddings_model/granite-embedding-30m-english/1_Pooling/config.json
  • embeddings_model/granite-embedding-30m-english/config.json
  • embeddings_model/granite-embedding-30m-english/modules.json
  • embeddings_model/granite-embedding-30m-english/sentence_bert_config.json
  • embeddings_model/granite-embedding-30m-english/special_tokens_map.json
  • embeddings_model/granite-embedding-30m-english/tokenizer.json
  • embeddings_model/granite-embedding-30m-english/tokenizer_config.json
  • examples/olsconfig.yaml

Comment thread Makefile
Comment thread Makefile
@blublinsky blublinsky force-pushed the embedding-hermetic-build branch 3 times, most recently from 9ee8229 to bcf19fb Compare June 23, 2026 10:28
@blublinsky

Copy link
Copy Markdown
Contributor Author

/retest ci/prow/ols-evaluation

@vimalk78

Copy link
Copy Markdown
Contributor

are we totally removing rag content from ols service?

@blublinsky

Copy link
Copy Markdown
Contributor Author

are we totally removing rag content from ols service?

Yes, FAISS based OCP one is being replaced by OKP
The remaining one is BYOK

@blublinsky blublinsky force-pushed the embedding-hermetic-build branch from bcf19fb to 792e44b Compare June 23, 2026 17:53

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@Containerfile`:
- Around line 94-96: The urllib.request.urlretrieve call in the python3 -c block
lacks socket timeout and retry logic, making the build vulnerable to network
issues. Add a socket timeout parameter to the urlretrieve call and implement
retry logic with exponential backoff around the download attempt to match the
resilience provided by the curl-based approach in the Makefile get-embeddings
targets. This will prevent transient network hiccups from failing the entire
build.
- Around line 86-96: The Containerfile does not validate SHA-256 checksums for
the safetensors files despite them being pinned in artifacts.lock.yaml, and the
urllib.request.urlretrieve fallback contradicts hermetic-build guarantees by
allowing unverified network downloads. Extract the pinned SHA-256 hashes from
artifacts.lock.yaml for both all-mpnet-base-v2-model.safetensors and
granite-embedding-30m-english-model.safetensors, then modify the RUN block to
compute and verify the SHA-256 hash of each file after the cachi2 copy or
download using sha256sum, comparing against the pinned values and failing the
build if they do not match. Remove the urllib.request.urlretrieve fallback
entirely and replace it with an explicit error that exits the build if the
expected file is not found in the cachi2 output directory.
- Around line 99-108: The cache-priming RUN block that loops through model_dir
variables (all-mpnet-base-v2 and granite-embedding-30m-english) creates a
non-standard Hugging Face cache layout. The refs/main file contains the literal
string "local" instead of a commit hash, and the symlinks are absolute instead
of relative paths pointing to a blobs directory. Either verify that the fallback
case returning "local:sentence-transformers/all-mpnet-base-v2" is never used in
production and remove the cache-priming block, or correct the cache structure by
replacing the literal "local" string with a proper commit hash in refs/main and
changing the symlinks to relative paths that point to a standard blobs directory
layout, aligning with the official Hugging Face Hub cache specification.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 2b6c3dca-84b8-489d-8c83-318e05ac92c1

📥 Commits

Reviewing files that changed from the base of the PR and between bcf19fb and 792e44b.

📒 Files selected for processing (23)
  • .gitignore
  • .tekton/lightspeed-service-pull-request.yaml
  • .tekton/lightspeed-service-push.yaml
  • .woke.yaml
  • Containerfile
  • Makefile
  • artifacts.lock.yaml
  • embeddings_model/all-mpnet-base-v2/1_Pooling/config.json
  • embeddings_model/all-mpnet-base-v2/config.json
  • embeddings_model/all-mpnet-base-v2/config_sentence_transformers.json
  • embeddings_model/all-mpnet-base-v2/modules.json
  • embeddings_model/all-mpnet-base-v2/sentence_bert_config.json
  • embeddings_model/all-mpnet-base-v2/special_tokens_map.json
  • embeddings_model/all-mpnet-base-v2/tokenizer.json
  • embeddings_model/all-mpnet-base-v2/tokenizer_config.json
  • embeddings_model/granite-embedding-30m-english/1_Pooling/config.json
  • embeddings_model/granite-embedding-30m-english/config.json
  • embeddings_model/granite-embedding-30m-english/modules.json
  • embeddings_model/granite-embedding-30m-english/sentence_bert_config.json
  • embeddings_model/granite-embedding-30m-english/special_tokens_map.json
  • embeddings_model/granite-embedding-30m-english/tokenizer.json
  • embeddings_model/granite-embedding-30m-english/tokenizer_config.json
  • examples/olsconfig.yaml
✅ Files skipped from review due to trivial changes (13)
  • embeddings_model/granite-embedding-30m-english/sentence_bert_config.json
  • examples/olsconfig.yaml
  • embeddings_model/all-mpnet-base-v2/sentence_bert_config.json
  • embeddings_model/granite-embedding-30m-english/1_Pooling/config.json
  • embeddings_model/all-mpnet-base-v2/config_sentence_transformers.json
  • embeddings_model/all-mpnet-base-v2/tokenizer_config.json
  • embeddings_model/all-mpnet-base-v2/modules.json
  • embeddings_model/granite-embedding-30m-english/modules.json
  • embeddings_model/all-mpnet-base-v2/special_tokens_map.json
  • embeddings_model/all-mpnet-base-v2/config.json
  • artifacts.lock.yaml
  • embeddings_model/granite-embedding-30m-english/config.json
  • .gitignore
🚧 Files skipped from review as they are similar to previous changes (7)
  • .tekton/lightspeed-service-push.yaml
  • embeddings_model/granite-embedding-30m-english/special_tokens_map.json
  • .woke.yaml
  • embeddings_model/granite-embedding-30m-english/tokenizer_config.json
  • embeddings_model/all-mpnet-base-v2/1_Pooling/config.json
  • Makefile
  • .tekton/lightspeed-service-pull-request.yaml

Comment thread Containerfile
Comment on lines +86 to +96
RUN if [ -d /cachi2/output/deps/generic ]; then \
cp /cachi2/output/deps/generic/all-mpnet-base-v2-model.safetensors embeddings_model/all-mpnet-base-v2/model.safetensors && \
cp /cachi2/output/deps/generic/granite-embedding-30m-english-model.safetensors embeddings_model/granite-embedding-30m-english/model.safetensors ; \
fi && \
for f in \
"embeddings_model/all-mpnet-base-v2/model.safetensors|https://huggingface.co/sentence-transformers/all-mpnet-base-v2/resolve/main/model.safetensors" \
"embeddings_model/granite-embedding-30m-english/model.safetensors|https://huggingface.co/ibm-granite/granite-embedding-30m-english/resolve/main/model.safetensors" \
; do \
path="${f%%|*}" && url="${f##*|}" && \
[ -f "$path" ] || python3 -c "import urllib.request,sys; urllib.request.urlretrieve(sys.argv[1], sys.argv[2])" "$url" "$path" ; \
done

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔒 Security & Privacy | 🟠 Major | ⚡ Quick win

Pinned SHA-256 checksums are never verified, and the HTTP fallback undermines the hermetic-build guarantee.

Two concerns on this block:

  1. No checksum validation. artifacts.lock.yaml pins sha256 for both safetensors, but neither the cachi2 copy (Lines 87-88) nor the HTTP download (Line 95) validates the bytes against those hashes. The PR summary states the Containerfile "validates checksums," but it does not — this leaves a supply-chain integrity gap for the bundled weights.
  2. Network fallback contradicts the stated build modes. In a hermetic Konflux/Cachi2 build, egress is blocked, so the urllib fallback cannot run; in local builds the PR objective says the build should "fail fast with a clear error if binaries are missing," yet this code instead silently downloads unverified weights from HuggingFace. The fallback both defeats hermeticity and diverges from the documented behavior.

Recommend validating against the pinned hashes and making the missing-binary case an explicit failure rather than a network fetch.

🛡️ Sketch: verify checksums and fail fast instead of fetching
 RUN if [ -d /cachi2/output/deps/generic ]; then \
     cp /cachi2/output/deps/generic/all-mpnet-base-v2-model.safetensors embeddings_model/all-mpnet-base-v2/model.safetensors && \
     cp /cachi2/output/deps/generic/granite-embedding-30m-english-model.safetensors embeddings_model/granite-embedding-30m-english/model.safetensors ; \
     fi && \
     for f in \
-      "embeddings_model/all-mpnet-base-v2/model.safetensors|https://huggingface.co/sentence-transformers/all-mpnet-base-v2/resolve/main/model.safetensors" \
-      "embeddings_model/granite-embedding-30m-english/model.safetensors|https://huggingface.co/ibm-granite/granite-embedding-30m-english/resolve/main/model.safetensors" \
+      "embeddings_model/all-mpnet-base-v2/model.safetensors|78c0197b6159d92658e319bc1d72e4c73a9a03dd03815e70e555c5ef05615658" \
+      "embeddings_model/granite-embedding-30m-english/model.safetensors|3b1fcdc9c5eb954f603bc386474e321505ff29c6c67f21e3aa8db3d2d1a533cf" \
     ; do \
-      path="${f%%|*}" && url="${f##*|}" && \
-      [ -f "$path" ] || python3 -c "import urllib.request,sys; urllib.request.urlretrieve(sys.argv[1], sys.argv[2])" "$url" "$path" ; \
+      path="${f%%|*}" && want="${f##*|}" && \
+      { [ -f "$path" ] || { echo "Missing $path; run 'make get-embeddings' or provide cachi2 artifacts" >&2; exit 1; } ; } && \
+      echo "${want}  ${path}" | sha256sum -c - ; \
     done
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
RUN if [ -d /cachi2/output/deps/generic ]; then \
cp /cachi2/output/deps/generic/all-mpnet-base-v2-model.safetensors embeddings_model/all-mpnet-base-v2/model.safetensors && \
cp /cachi2/output/deps/generic/granite-embedding-30m-english-model.safetensors embeddings_model/granite-embedding-30m-english/model.safetensors ; \
fi && \
for f in \
"embeddings_model/all-mpnet-base-v2/model.safetensors|https://huggingface.co/sentence-transformers/all-mpnet-base-v2/resolve/main/model.safetensors" \
"embeddings_model/granite-embedding-30m-english/model.safetensors|https://huggingface.co/ibm-granite/granite-embedding-30m-english/resolve/main/model.safetensors" \
; do \
path="${f%%|*}" && url="${f##*|}" && \
[ -f "$path" ] || python3 -c "import urllib.request,sys; urllib.request.urlretrieve(sys.argv[1], sys.argv[2])" "$url" "$path" ; \
done
RUN if [ -d /cachi2/output/deps/generic ]; then \
cp /cachi2/output/deps/generic/all-mpnet-base-v2-model.safetensors embeddings_model/all-mpnet-base-v2/model.safetensors && \
cp /cachi2/output/deps/generic/granite-embedding-30m-english-model.safetensors embeddings_model/granite-embedding-30m-english/model.safetensors ; \
fi && \
for f in \
"embeddings_model/all-mpnet-base-v2/model.safetensors|78c0197b6159d92658e319bc1d72e4c73a9a03dd03815e70e555c5ef05615658" \
"embeddings_model/granite-embedding-30m-english/model.safetensors|3b1fcdc9c5eb954f603bc386474e321505ff29c6c67f21e3aa8db3d2d1a533cf" \
; do \
path="${f%%|*}" && want="${f##*|}" && \
{ [ -f "$path" ] || { echo "Missing $path; run 'make get-embeddings' or provide cachi2 artifacts" >&2; exit 1; } ; } && \
echo "${want} ${path}" | sha256sum -c - ; \
done
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@Containerfile` around lines 86 - 96, The Containerfile does not validate
SHA-256 checksums for the safetensors files despite them being pinned in
artifacts.lock.yaml, and the urllib.request.urlretrieve fallback contradicts
hermetic-build guarantees by allowing unverified network downloads. Extract the
pinned SHA-256 hashes from artifacts.lock.yaml for both
all-mpnet-base-v2-model.safetensors and
granite-embedding-30m-english-model.safetensors, then modify the RUN block to
compute and verify the SHA-256 hash of each file after the cachi2 copy or
download using sha256sum, comparing against the pinned values and failing the
build if they do not match. Remove the urllib.request.urlretrieve fallback
entirely and replace it with an explicit error that exits the build if the
expected file is not found in the cachi2 output directory.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

checksum is done by konflux prefetch task, no need here.

Comment thread Containerfile
Comment thread Containerfile
@blublinsky blublinsky force-pushed the embedding-hermetic-build branch from 792e44b to 1b18b4e Compare June 23, 2026 21:07

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@Containerfile`:
- Around line 101-111: Add an integration test that validates offline model
loading functionality works with the pre-populated HuggingFace cache. The test
should verify that models can be successfully loaded with the
TRANSFORMERS_OFFLINE=1 environment variable set, and specifically confirm that
the fallback model ID (local:sentence-transformers/all-mpnet-base-v2) loads
correctly using the cache structure created by the Containerfile cache priming
block shown in the diff. This test should confirm the end-to-end contract
between the cache structure created by the RUN command (which sets up the model
directories, snapshots, and refs/main pointers) and the actual model loading
behavior.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: ef95f73b-a667-499c-82e9-f79cc958adb5

📥 Commits

Reviewing files that changed from the base of the PR and between 792e44b and 1b18b4e.

📒 Files selected for processing (23)
  • .gitignore
  • .tekton/lightspeed-service-pull-request.yaml
  • .tekton/lightspeed-service-push.yaml
  • .woke.yaml
  • Containerfile
  • Makefile
  • artifacts.lock.yaml
  • embeddings_model/all-mpnet-base-v2/1_Pooling/config.json
  • embeddings_model/all-mpnet-base-v2/config.json
  • embeddings_model/all-mpnet-base-v2/config_sentence_transformers.json
  • embeddings_model/all-mpnet-base-v2/modules.json
  • embeddings_model/all-mpnet-base-v2/sentence_bert_config.json
  • embeddings_model/all-mpnet-base-v2/special_tokens_map.json
  • embeddings_model/all-mpnet-base-v2/tokenizer.json
  • embeddings_model/all-mpnet-base-v2/tokenizer_config.json
  • embeddings_model/granite-embedding-30m-english/1_Pooling/config.json
  • embeddings_model/granite-embedding-30m-english/config.json
  • embeddings_model/granite-embedding-30m-english/modules.json
  • embeddings_model/granite-embedding-30m-english/sentence_bert_config.json
  • embeddings_model/granite-embedding-30m-english/special_tokens_map.json
  • embeddings_model/granite-embedding-30m-english/tokenizer.json
  • embeddings_model/granite-embedding-30m-english/tokenizer_config.json
  • examples/olsconfig.yaml
✅ Files skipped from review due to trivial changes (14)
  • embeddings_model/all-mpnet-base-v2/sentence_bert_config.json
  • examples/olsconfig.yaml
  • embeddings_model/granite-embedding-30m-english/modules.json
  • embeddings_model/all-mpnet-base-v2/modules.json
  • embeddings_model/granite-embedding-30m-english/sentence_bert_config.json
  • embeddings_model/granite-embedding-30m-english/config.json
  • embeddings_model/all-mpnet-base-v2/config.json
  • embeddings_model/all-mpnet-base-v2/1_Pooling/config.json
  • embeddings_model/all-mpnet-base-v2/tokenizer_config.json
  • embeddings_model/all-mpnet-base-v2/special_tokens_map.json
  • .woke.yaml
  • embeddings_model/granite-embedding-30m-english/tokenizer_config.json
  • .gitignore
  • embeddings_model/all-mpnet-base-v2/config_sentence_transformers.json
🚧 Files skipped from review as they are similar to previous changes (6)
  • .tekton/lightspeed-service-pull-request.yaml
  • embeddings_model/granite-embedding-30m-english/1_Pooling/config.json
  • embeddings_model/granite-embedding-30m-english/special_tokens_map.json
  • artifacts.lock.yaml
  • Makefile
  • .tekton/lightspeed-service-push.yaml

Comment thread Containerfile
Comment on lines +101 to +111
# Pre-populate HuggingFace cache so models can be loaded by ID with TRANSFORMERS_OFFLINE=1
RUN for model_dir in all-mpnet-base-v2 granite-embedding-30m-english; do \
case "$model_dir" in \
all-mpnet-base-v2) hf_id="sentence-transformers--all-mpnet-base-v2" ;; \
granite-embedding-30m-english) hf_id="ibm-granite--granite-embedding-30m-english" ;; \
esac && \
repo_dir="/app-root/.cache/huggingface/hub/models--${hf_id}" && \
mkdir -p "$repo_dir/snapshots/local" "$repo_dir/refs" && \
echo "local" > "$repo_dir/refs/main" && \
ln -sf "/app-root/embeddings_model/${model_dir}"/* "$repo_dir/snapshots/local/" ; \
done

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🩺 Stability & Availability | 🟡 Minor

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Read-only verification script: confirm fallback path and cache priming expectations align.

set -euo pipefail

# 1) Confirm fallback model ID used when embeddings_model_path is unset
rg -n 'local:sentence-transformers/all-mpnet-base-v2|_get_embed_model|embeddings_model_path' \
  ols/src/rag_index/index_loader.py tests/unit/rag_index/test_index_loader.py -C2

# 2) Confirm runtime env wiring relevant to offline behavior
rg -n 'HF_HOME|TRANSFORMERS_OFFLINE|TRANSFORMERS_CACHE|configure_hugging_face_envs' \
  Containerfile ols/utils/environments.py -C2

# 3) Confirm cache-priming block exists exactly as expected in image build recipe
sed -n '96,116p' Containerfile

Repository: openshift/lightspeed-service

Length of output: 3917


Add an explicit integration test validating offline model loading with the pre-populated cache.

The fallback model ID local:sentence-transformers/all-mpnet-base-v2 and environment configuration for TRANSFORMERS_OFFLINE=1 are correctly implemented, and the cache priming block in the Containerfile properly synthesizes the expected HuggingFace hub structure. However, there is no integration test confirming that models can actually be loaded offline using this pre-populated cache. Since the PR marks offline model loading as pending verification, add a test that validates the end-to-end contract: the fallback model loads successfully with TRANSFORMERS_OFFLINE=1 and the cache structure created by the Containerfile.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@Containerfile` around lines 101 - 111, Add an integration test that validates
offline model loading functionality works with the pre-populated HuggingFace
cache. The test should verify that models can be successfully loaded with the
TRANSFORMERS_OFFLINE=1 environment variable set, and specifically confirm that
the fallback model ID (local:sentence-transformers/all-mpnet-base-v2) loads
correctly using the cache structure created by the Containerfile cache priming
block shown in the diff. This test should confirm the end-to-end contract
between the cache structure created by the RUN command (which sets up the model
directories, snapshots, and refs/main pointers) and the actual model loading
behavior.

@blublinsky

Copy link
Copy Markdown
Contributor Author

/retest

@blublinsky blublinsky force-pushed the embedding-hermetic-build branch from 1b18b4e to fee7518 Compare June 24, 2026 08:01
Comment thread examples/olsconfig.yaml
- product_docs_index_path: "./vector_db/user_application_docs/version_1"
product_docs_index_id: user-application-docs-version_1
embeddings_model_path: "./embeddings_model"
embeddings_model_path: "./embeddings_model/all-mpnet-base-v2"

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

a reminder for corresponding change in operator.

@raptorsun

Copy link
Copy Markdown
Contributor

/lgtm
looks good to me.
we may need update the operator to reflect the change of embedding model's path to make sure the bundle work with new embedding model locations.

@openshift-ci openshift-ci Bot added the lgtm Indicates that a PR is ready to be merged. label Jun 24, 2026
@blublinsky blublinsky force-pushed the embedding-hermetic-build branch from fee7518 to 4cb74db Compare June 24, 2026 10:23
@openshift-ci openshift-ci Bot removed the lgtm Indicates that a PR is ready to be merged. label Jun 24, 2026
@blublinsky blublinsky force-pushed the embedding-hermetic-build branch from 4cb74db to 25a520f Compare June 24, 2026 12:23
@raptorsun

Copy link
Copy Markdown
Contributor

/lgtm

@openshift-ci openshift-ci Bot added the lgtm Indicates that a PR is ready to be merged. label Jun 24, 2026
@blublinsky

Copy link
Copy Markdown
Contributor Author

/approve

@openshift-ci

openshift-ci Bot commented Jun 24, 2026

Copy link
Copy Markdown

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: blublinsky

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci openshift-ci Bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Jun 24, 2026
@openshift-ci

openshift-ci Bot commented Jun 24, 2026

Copy link
Copy Markdown

@blublinsky: all tests passed!

Full PR test history. Your PR dashboard.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

@openshift-merge-bot openshift-merge-bot Bot merged commit 8605212 into openshift:main Jun 24, 2026
12 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

approved Indicates a PR has been approved by an approver from all required OWNERS files. lgtm Indicates that a PR is ready to be merged.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants