feat: push --dry-run preview mode by dhruva-reddy · Pull Request #16 · VapiAI/gitops

dhruva-reddy · 2026-05-01T20:00:59Z

ELI5

Problem. npm run push -- <env> immediately starts hitting the live
dashboard. There was no way to ask "what would this push do?" before
firing it. So a fat-fingered command — wrong org, missing file path,
wide-scope push when you meant scoped — hit production immediately,
and recovery meant pull + manual revert. The only existing dry-run
concept gated deletions, not creates or updates.

What this fix does. Adds a --dry-run flag to push. Instead of
firing POST/PATCH/DELETE, the engine counts the intent and prints
[dry-run] would <METHOD> <endpoint> <body-preview> per resource.
The state file is never written (so synthetic IDs don't pollute it),
and the end-of-run summary shows Would create N, would update M, would delete K. GETs still run because drift detection (Stack G) and
operator preview both need to see current platform state.
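The gating described above can be sketched roughly as follows. This is a minimal illustration, not the code in src/api.ts: `interceptIfDryRun`, `DryRunCounts`, `previewBody`, `summarize`, and the `dry-run-N` synthetic id format are all hypothetical names chosen for the example. Only the log line shape, the 120-char preview, the summary wording, and the "GETs still run" rule come from the description.

```typescript
// Hypothetical sketch of a dry-run gate; names are illustrative, not the repo's code.
type Method = "GET" | "POST" | "PATCH" | "DELETE";

interface DryRunCounts {
  create: number;
  update: number;
  delete: number;
}

const PREVIEW_LIMIT = 120; // the PR mentions a 120-char body preview

function previewBody(body: unknown): string {
  if (body === undefined) return "";
  const text = JSON.stringify(body);
  return text.length > PREVIEW_LIMIT ? text.slice(0, PREVIEW_LIMIT) + "..." : text;
}

// Returns a synthetic response when the call is intercepted, or null when the
// caller should go ahead and perform the real HTTP request.
function interceptIfDryRun(
  dryRun: boolean,
  counts: DryRunCounts,
  method: Method,
  endpoint: string,
  body?: unknown,
  log: (line: string) => void = console.log,
): { id: string } | null {
  // Reads always pass through so current platform state stays visible.
  if (!dryRun || method === "GET") return null;

  if (method === "POST") counts.create += 1;
  else if (method === "PATCH") counts.update += 1;
  else counts.delete += 1;

  log(`[dry-run] would ${method} ${endpoint} ${previewBody(body)}`.trimEnd());

  // Assumed id format: any placeholder works as long as it is never persisted.
  const total = counts.create + counts.update + counts.delete;
  return { id: `dry-run-${total}` };
}

function summarize(counts: DryRunCounts): string {
  return `Would create ${counts.create}, would update ${counts.update}, would delete ${counts.delete}`;
}
```

Returning a placeholder id rather than throwing lets downstream code that links resources by id keep running, which is why the state file must not be written afterwards.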

Outcome you'll notice. Run npm run push -- <env> --dry-run to
preview any push. Especially useful for "did I scope this right?" and
"is the pre-push lint reporting drift I should address first?" before
the real push. Cheapest individual operator-safety win in the stack —
no schema changes, no engine architecture moves.

Operators today can't validate "is this push doing what I think it's
doing" before it lands on prod. push.ts has a dry-run concept only for
deletions; updates and creates fire immediately. Cheapest individual
operator-safety win (improvements.md #5).

- src/config.ts: parseFlags now accepts --dry-run alongside --force / --bootstrap. Exports DRY_RUN.
- src/api.ts: vapiRequest gates POST/PATCH on DRY_RUN: it counts the intent, prints [dry-run] would <METHOD> <endpoint> with a 120-char body preview, and returns a synthetic id so caller code threads through. vapiDelete gets the same treatment. GETs always run (drift preview needs them).
- src/push.ts: banner ("🧪 DRY-RUN") at start, summary at end ("Would create N, would update M, would delete K"), saveState entirely skipped in dry-run so synthetic ids never leak into the state file.
- AGENTS.md: document --dry-run in Available Commands.
- tests/push-dry-run.test.ts: --dry-run is parse-accepted, banner prints, state file is NEVER created (verified end-to-end via spawn).
- improvements.md: #5 → RESOLVED.
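To make the config and state-file items above concrete, here is a hedged sketch of flag parsing plus the "skip saveState in dry-run" rule. The `Flags` shape, the `rest` field, and `finishPush` are assumptions for illustration. The real parseFlags in src/config.ts may handle more flags and exports DRY_RUN rather than returning an object.

```typescript
// Hypothetical sketch: flag parsing plus end-of-run state handling for dry-run.
interface Flags {
  force: boolean;
  bootstrap: boolean;
  dryRun: boolean;
  rest: string[]; // everything not recognized here (env name, file paths, other flags)
}

function parseFlags(argv: string[]): Flags {
  const flags: Flags = { force: false, bootstrap: false, dryRun: false, rest: [] };
  for (const arg of argv) {
    if (arg === "--force") flags.force = true;
    else if (arg === "--bootstrap") flags.bootstrap = true;
    else if (arg === "--dry-run") flags.dryRun = true;
    else flags.rest.push(arg);
  }
  return flags;
}

// Persist state only on a real push, so placeholder ids from a dry run
// never reach the state file.
function finishPush(
  flags: Flags,
  saveState: () => void,
  log: (line: string) => void = console.log,
): void {
  if (flags.dryRun) {
    log("dry-run: state file left untouched");
    return;
  }
  saveState();
}
```

Passing `saveState` in as a callback keeps the sketch free of file-system access. In the actual engine the check would presumably sit next to the existing save call.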

Closes improvements.md #5.

🤖 Generated with Claude Code

dhruva-reddy · 2026-05-01T20:01:11Z

## ELI5

**Problem.** The engine could *create* simulation suites and track them in state, and AGENTS.md described `simulations/suites/` as a first-class resource type. But there was no `npm run` command to actually *execute* a suite. `npm run eval` exists but runs the *legacy* `/evals` endpoint (a different thing), and the naming overlap actively misled engineers into running the wrong command. To fire a simulation suite from the CLI you had to write raw curl or go to the dashboard UI (losing reproducibility).

**What this fix does.** Adds `npm run sim`. Two shapes:

```
npm run sim -- <org> --suite <name> --target <assistant-or-squad>
npm run sim -- <org> --simulations <n1>,<n2> --target <assistant>
```

Resolves local resource names → state-file UUIDs the same way `npm run call` does, POSTs `/eval/simulation/run`, polls the run status, prints a summary table (pass/fail per simulation, mean run time, structured-output evals).

**Outcome you'll notice.** Simulation suites become a normal part of the gitops workflow: author the suite as YAML, push it via `npm run push`, run it via `npm run sim`. No more dashboard clicking. Note the AGENTS.md call-out clarifying the difference between `npm run sim` (unified `/eval/simulation/*`) and `npm run eval` (legacy `/evals`); renaming `eval` to disambiguate is a separate, backwards-incompatible follow-up.

---

Engine fully tracks simulation suites in state and AGENTS.md describes simulations/suites/ as a first-class resource type, but there's no npm run command to actually execute one. npm run eval runs the legacy /evals endpoint, not the unified simulation runner. Customers go to the dashboard UI to trigger runs (losing reproducibility) or write per-customer shell wrappers.

- src/sim.ts (NEW): runSimulationSuite + runSimulationsByName helpers. Resolves local-name → UUID via state file; POSTs /eval/simulation/run; polls /eval/simulation/run/:id until completion; prints pass/fail summary per simulation with mean run time + structured-output evals. Reuses src/api.ts:vapiRequest for HTTP and the local-name → UUID resolution pattern from src/eval.ts.
- src/sim-cmd.ts (NEW): CLI entry. Args:
    npm run sim -- <org> --suite <name> --target <assistant-or-squad>
    npm run sim -- <org> --simulations <n1>,<n2> --target <assistant>
    npm run sim -- <org> --suite <name> --watch
- package.json: sim script.
- AGENTS.md: document npm run sim alongside npm run eval (call out the legacy /evals vs unified /eval/simulation/* distinction).
- tests/sim.test.ts: arg parsing, UUID resolution, status polling, summary table formatting.

Note: renaming npm run eval to disambiguate is a follow-up: that's a backwards-incompatible script-name change. For now the AGENTS.md note calls out the distinction.

Closes improvements.md #16.

🤖 Generated with [Claude Code](https://claude.com/claude-code)
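The "poll until completion" step mentioned above could look something like the sketch below. It is an assumption-heavy illustration: the status values (`queued`, `running`, `completed`, `failed`), the `RunSnapshot` shape, and the `pollUntilDone` helper are hypothetical and are not taken from the API or from src/sim.ts. The fetch function is injected, so the example does not depend on any particular HTTP client.

```typescript
// Hypothetical sketch of a poll-until-terminal loop; the status values and
// response shape are assumptions, not the documented API.
type RunStatus = "queued" | "running" | "completed" | "failed";

interface RunSnapshot {
  status: RunStatus;
}

interface PollOptions {
  intervalMs: number;
  maxAttempts: number;
  sleep?: (ms: number) => Promise<void>;
}

const defaultSleep = (ms: number): Promise<void> =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

async function pollUntilDone(
  fetchRun: () => Promise<RunSnapshot>,
  { intervalMs, maxAttempts, sleep = defaultSleep }: PollOptions,
): Promise<RunSnapshot> {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const snapshot = await fetchRun();
    // Stop as soon as the run reaches a terminal state.
    if (snapshot.status === "completed" || snapshot.status === "failed") {
      return snapshot;
    }
    // Wait between polls, but not after the final attempt.
    if (attempt < maxAttempts) await sleep(intervalMs);
  }
  throw new Error(`Run did not finish after ${maxAttempts} polls`);
}
```

Bounding the loop with `maxAttempts` means a stuck run surfaces as an error instead of hanging the CLI. Whether the real runner applies a timeout is not stated in the description.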


dhruva-reddy · 2026-05-02T01:31:05Z

Merge activity

May 2, 1:31 AM UTC: @dhruva-reddy merged this pull request with Graphite.

This was referenced May 1, 2026

feat: simulation suite runner (npm run sim) #18

Open

docs: adopt upstream improvements.md log + voice-providers cheat-sheet #14

Merged

dhruva-reddy force-pushed the dhruva-reddy/feat/push-dry-run branch from 392855d to d9d9477 on May 1, 2026 22:56

dhruva-reddy force-pushed the dhruva-reddy/refactor/state-file-key-order branch from 898200a to 0f35c9e on May 1, 2026 22:56

adhamvapi approved these changes May 1, 2026

dhruva-reddy force-pushed the dhruva-reddy/refactor/state-file-key-order branch 2 times, most recently from 6430703 to 2fc1864 on May 2, 2026 01:21

dhruva-reddy force-pushed the dhruva-reddy/feat/push-dry-run branch from d9d9477 to 714523f on May 2, 2026 01:21

dhruva-reddy changed the base branch from dhruva-reddy/refactor/state-file-key-order to graphite-base/16 on May 2, 2026 01:26

dhruva-reddy force-pushed the dhruva-reddy/feat/push-dry-run branch from 714523f to bf5161c on May 2, 2026 01:26

dhruva-reddy force-pushed the graphite-base/16 branch from 2fc1864 to c3c1c8a on May 2, 2026 01:26

graphite-app Bot changed the base branch from graphite-base/16 to main on May 2, 2026 01:26

dhruva-reddy force-pushed the dhruva-reddy/feat/push-dry-run branch from bf5161c to 87fb394 on May 2, 2026 01:27

dhruva-reddy merged commit 2630d0c into main May 2, 2026
1 check passed

dhruva-reddy merged 1 commit into main from dhruva-reddy/feat/push-dry-run

2 participants
