Skip to content

feat(Ingest): Replace ingest validation with chow - BED-7790#2733

Open
wes-mil wants to merge 6 commits into
mainfrom
BED-7790
Open

feat(Ingest): Replace ingest validation with chow - BED-7790#2733
wes-mil wants to merge 6 commits into
mainfrom
BED-7790

Conversation

@wes-mil
Copy link
Copy Markdown
Contributor

@wes-mil wes-mil commented May 4, 2026

Description

Replace ingest validation with chow

Motivation and Context

Resolves BED-7790

Why is this change required? What problem does it solve?

This creates a single source of truth for validation, so that users can do ingest validation on the command line. It also improves error reporting and testability.

How Has This Been Tested?

Tested manually with payloads, also chow is heavily unit tested itself.

Screenshots (optional):

Types of changes

  • New feature (non-breaking change which adds functionality)

Checklist:

Summary by CodeRabbit

  • Refactor

    • Ingest validation reworked to use the new payload schema: invalid or malformed ingest files now return 400 Bad Request with clearer schema-validation details.
    • Internal ingest processing simplified for more consistent error reporting.
  • Chores

    • Switched to the centralized payload validation library and updated startup/integration flows to use it.

@wes-mil wes-mil self-assigned this May 4, 2026
@wes-mil wes-mil added enhancement New feature or request api A pull request containing changes affecting the API code. labels May 4, 2026
@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented May 4, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Repository YAML (base), Organization UI (inherited)

Review profile: CHILL

Plan: Pro

Run ID: f89a0e91-f036-4b64-8e4a-8af5196f0445

📥 Commits

Reviewing files that changed from the base of the PR and between b29d40e and b6035ae.

⛔ Files ignored due to path filters (1)
  • go.sum is excluded by !**/*.sum
📒 Files selected for processing (1)
  • go.mod
🚧 Files skipped from review as they are similar to previous changes (1)
  • go.mod

📝 Walkthrough

Walkthrough

Migrates ingest validation from an internal upload schema/validator to payload.Schema from github.com/specterops/chow, removing local JSON schema assets and validators, updating function/struct signatures, delegating validation to the external payload package, and adapting error handling and tests to the new ValidationReport shape.

Changes

Schema Migration: upload.IngestSchema → payload.Schema

Layer / File(s) Summary
Data Shape
cmd/api/src/api/registration/registration.go, cmd/api/src/api/v2/model.go, cmd/api/src/daemons/datapipe/pipeline.go, cmd/api/src/services/graphify/service.go, cmd/api/src/services/graphify/ingest.go
Replaced upload.IngestSchema types with payload.Schema across structs and function parameters; updated NewResources, NewPipeline, NewGraphifyService, and RegisterFossRoutes signatures.
Validation Core
cmd/api/src/services/upload/upload.go, cmd/api/src/services/graphify/ingest.go, cmd/api/src/services/graphify/tasks.go
SaveIngestFile now accepts payload.Schema and returns payload.ValidationReport; JSON/ZIP write-and-validate split into WriteAndValidateJSON/WriteAndValidateZip; ReadFileForIngest/processSingleFile/ProcessIngestFile updated to return/consume payload.ValidationReport.
Removal of Local Schema & Helpers
cmd/api/src/services/upload/schema.go, cmd/api/src/services/upload/file_validator.go, cmd/api/src/services/upload/jsonschema/*, cmd/api/src/services/upload/jsonutils.go, cmd/api/src/services/upload/streamdecoder.go
Deleted embedded JSON schema files, local IngestSchema type and loader, file validator abstractions, JSON stream helpers, and JSON parsing/validation code (streamdecoder retains only zip utilities).
Wiring & Error Handling
cmd/api/src/api/v2/fileingest.go, cmd/api/src/services/graphify/ingest.go, cmd/api/src/services/graphify/tasks.go
Updated handler flows to use payload.ValidationReport for constructing 400 Bad Request responses when validation errors exist; IngestWrapper dispatch now accepts payload.ParsedData.
Schema Loading Sites
cmd/api/src/services/entrypoint.go, cmd/api/src/daemons/datapipe/datapipe_integration_test.go, cmd/api/src/services/graphify/graphify_integration_test.go, packages/go/graphify/graph/graph.go
Replaced upload.LoadIngestSchema() with payload.LoadSchema() and updated imports.
Tests & Module deps
cmd/api/src/api/v2/fileingest_test.go, cmd/api/src/services/upload/upload_test.go, cmd/api/src/services/graphify/ingest_integration_test.go, cmd/api/src/services/upload/streamdecoder_test.go, go.mod
Updated tests to expect 400 on schema validation failures and to drive new helpers; removed streamdecoder tests; go.mod adds github.com/specterops/chow and moves jsonschema/go-cmp to indirect.

Sequence Diagram(s)

sequenceDiagram
    participant Client
    participant API as API Server
    participant Storage as Temp Storage
    participant Payload as payload.Validator
    participant Graphify as GraphifyService

    Client->>API: POST /ingest (file)
    API->>Storage: WriteAndValidateJSON/Zip (tee to validator)
    Storage->>Payload: ParseAndValidate / ValidateZip
    Payload-->>Storage: ValidationReport (errors?) 
    alt validation errors
        Storage-->>API: return report + ErrInvalidJSON/type
        API-->>Client: 400 Bad Request (schema validation details)
    else valid
        Storage-->>API: saved file path + report
        API->>Graphify: enqueue/process file (ReadFileForIngest)
        Graphify->>Payload: Parse metadata / ParsedData
        Graphify-->>API: processing result / error
        API-->>Client: 202 Accepted
    end
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 25.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Title check ✅ Passed The title clearly and specifically describes the main change: replacing ingest validation with the chow library, matching the comprehensive scope of changes across multiple files.
Description check ✅ Passed The PR description addresses most required sections: clear description of changes, motivation/context with ticket reference, testing approach, and completed checklist. However, it lacks detail on how documentation updates were accomplished.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch BED-7790

Tip

💬 Introducing Slack Agent: The best way for teams to turn conversations into code.

Slack Agent is built on CodeRabbit's deep understanding of your code, so your team can collaborate across the entire SDLC without losing context.

  • Generate code and open pull requests
  • Plan features and break down work
  • Investigate incidents and troubleshoot customer tickets together
  • Automate recurring tasks and respond to alerts with triggers
  • Summarize progress and report instantly

Built for teams:

  • Shared memory across your entire org—no repeating context
  • Per-thread sandboxes to safely plan and execute work
  • Governance built-in—scoped access, auditability, and budget controls

One agent for your entire SDLC. Right inside Slack.

👉 Get started


Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick comments (2)
cmd/api/src/services/graphify/ingest_integration_test.go (1)

101-102: 💤 Low value

Consider asserting on the validation report in the failure path.

The test discards the ValidationReport returned by ReadFileForIngest. According to tasks.go (lines 190-196), report.CriticalErrors and report.ValidationErrors are used to populate error messages. Asserting on these fields would strengthen the test coverage for the new validation flow.

💡 Suggested enhancement
-			_, err := graphify.ReadFileForIngest(ingestContext, invalidReader, readOptions)
-			require.NotNil(t, err)
+			report, err := graphify.ReadFileForIngest(ingestContext, invalidReader, readOptions)
+			require.NotNil(t, err)
+			require.True(t, len(report.CriticalErrors) > 0 || len(report.ValidationErrors) > 0, "expected validation errors in report")
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@cmd/api/src/services/graphify/ingest_integration_test.go` around lines 101 -
102, The test currently only checks that ReadFileForIngest(ingestContext,
invalidReader, readOptions) returns a non-nil error; update the failure path to
also capture and assert on the returned ValidationReport (e.g., the report value
returned by ReadFileForIngest) and assert that report.CriticalErrors and/or
report.ValidationErrors contain the expected entries for the invalid input;
locate the call to ReadFileForInest and add assertions against the returned
report's CriticalErrors and ValidationErrors fields to validate the new
validation flow.
go.mod (1)

47-47: 💤 Low value

Use the latest stable release version (v0.1.1) instead of a pseudo-version.

The chow dependency uses a pseudo-version (v0.1.2-0.20260501161234-9ffc316fa3ec) pointing to a pre-release commit. The latest stable release is v0.1.1 (published 2026-04-14). Consider upgrading to v0.1.1 for production deployments to improve reproducibility and use a vetted release.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@go.mod` at line 47, Replace the pseudo-version entry for the module
github.com/specterops/chow in go.mod with the latest stable release version
v0.1.1; locate the line containing "github.com/specterops/chow
v0.1.2-0.20260501161234-9ffc316fa3ec" and change it to
"github.com/specterops/chow v0.1.1", then run `go mod tidy` to update the
dependency graph and ensure the module downloads the tagged release.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Nitpick comments:
In `@cmd/api/src/services/graphify/ingest_integration_test.go`:
- Around line 101-102: The test currently only checks that
ReadFileForIngest(ingestContext, invalidReader, readOptions) returns a non-nil
error; update the failure path to also capture and assert on the returned
ValidationReport (e.g., the report value returned by ReadFileForIngest) and
assert that report.CriticalErrors and/or report.ValidationErrors contain the
expected entries for the invalid input; locate the call to ReadFileForInest and
add assertions against the returned report's CriticalErrors and ValidationErrors
fields to validate the new validation flow.

In `@go.mod`:
- Line 47: Replace the pseudo-version entry for the module
github.com/specterops/chow in go.mod with the latest stable release version
v0.1.1; locate the line containing "github.com/specterops/chow
v0.1.2-0.20260501161234-9ffc316fa3ec" and change it to
"github.com/specterops/chow v0.1.1", then run `go mod tidy` to update the
dependency graph and ensure the module downloads the tagged release.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository YAML (base), Organization UI (inherited)

Review profile: CHILL

Plan: Pro

Run ID: 19cc8f7d-78a9-45da-8e1e-45b8d26c82d7

📥 Commits

Reviewing files that changed from the base of the PR and between d293729 and b29d40e.

⛔ Files ignored due to path filters (1)
  • go.sum is excluded by !**/*.sum
📒 Files selected for processing (24)
  • cmd/api/src/api/registration/registration.go
  • cmd/api/src/api/v2/fileingest.go
  • cmd/api/src/api/v2/fileingest_test.go
  • cmd/api/src/api/v2/model.go
  • cmd/api/src/daemons/datapipe/datapipe_integration_test.go
  • cmd/api/src/daemons/datapipe/pipeline.go
  • cmd/api/src/services/entrypoint.go
  • cmd/api/src/services/graphify/graphify_integration_test.go
  • cmd/api/src/services/graphify/ingest.go
  • cmd/api/src/services/graphify/ingest_integration_test.go
  • cmd/api/src/services/graphify/service.go
  • cmd/api/src/services/graphify/tasks.go
  • cmd/api/src/services/upload/file_validator.go
  • cmd/api/src/services/upload/jsonschema/edge.json
  • cmd/api/src/services/upload/jsonschema/metadata.json
  • cmd/api/src/services/upload/jsonschema/node.json
  • cmd/api/src/services/upload/jsonutils.go
  • cmd/api/src/services/upload/schema.go
  • cmd/api/src/services/upload/streamdecoder.go
  • cmd/api/src/services/upload/streamdecoder_test.go
  • cmd/api/src/services/upload/upload.go
  • cmd/api/src/services/upload/upload_test.go
  • go.mod
  • packages/go/graphify/graph/graph.go
💤 Files with no reviewable changes (8)
  • cmd/api/src/services/upload/jsonschema/node.json
  • cmd/api/src/services/upload/jsonschema/metadata.json
  • cmd/api/src/services/upload/jsonschema/edge.json
  • cmd/api/src/services/upload/jsonutils.go
  • cmd/api/src/services/upload/streamdecoder_test.go
  • cmd/api/src/services/upload/schema.go
  • cmd/api/src/services/upload/file_validator.go
  • cmd/api/src/services/upload/streamdecoder.go

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

api A pull request containing changes affecting the API code. enhancement New feature or request

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant