Agents by ScriptSmith · Pull Request #34 · ScriptSmith/hadrian

ScriptSmith · 2026-05-23T14:23:18Z

Adds:

Responses server-side tool execution loop, background mode
Containers and a server-side shell tool
Server-side MCP

greptile-apps · 2026-05-23T14:32:21Z

Greptile Summary

This PR adds a server-side agent execution loop to Hadrian, including: a background responses worker with retry semantics, a container/sandbox abstraction backed by microsandbox/opensandbox runtimes, a server-side shell tool, and a hadrian_hosted MCP client loop with human-in-the-loop approval gating.

Background executor (background_executor.rs, background_responses.rs): queued background=true responses are claimed via SELECT FOR UPDATE SKIP LOCKED, run through the same streaming pipeline as foreground requests, and retried on transient failures; stream errors deliberately avoid writing terminal state so retries see a clean row.
Containers API (routes/api/containers.rs, services/containers.rs, services/container_session.rs): full CRUD for /v1/containers, file upload/download with path-traversal guard (.. rejected in both the explicit path field and the filename fallback), and session replay from the DB on container re-attach.
MCP hadrian-hosted mode (services/mcp/): per-request tools/list rewrite into function tools, single-flight cache, SSRF-guarded HTTP client with DNS-pinning, approval parking via mcp_pending_approvals, and exactly-once resume via DELETE … RETURNING.
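The path-traversal guard described for the Containers API can be illustrated with a minimal sketch. This is not the PR's `normalize_mnt_path`; the function name `normalize_upload_path`, the `MNT_ROOT` constant, and the error strings are assumptions made for illustration. The only point it shows is that the explicit path field and the filename fallback go through the same component check.

```rust
use std::path::{Component, Path, PathBuf};

/// Hypothetical mount root, used only in this sketch.
const MNT_ROOT: &str = "/mnt/data";

/// Resolve an upload target under `MNT_ROOT`, preferring the explicit path
/// field and falling back to the uploaded filename. Both inputs pass through
/// the same component check, so `..` is rejected on either branch.
pub fn normalize_upload_path(
    path_field: Option<&str>,
    filename: Option<&str>,
) -> Result<PathBuf, String> {
    let raw = path_field
        .or(filename)
        .ok_or_else(|| "missing path and filename".to_string())?;

    let mut out = PathBuf::from(MNT_ROOT);
    let mut pushed = false;
    for component in Path::new(raw).components() {
        match component {
            // Plain segments are appended as-is.
            Component::Normal(segment) => {
                out.push(segment);
                pushed = true;
            }
            // A leading `/` or a `.` is ignored so the result stays under the root.
            Component::RootDir | Component::CurDir => {}
            // `..` (and Windows drive prefixes) are refused outright.
            Component::ParentDir | Component::Prefix(_) => {
                return Err(format!("path traversal rejected: {raw}"));
            }
        }
    }
    if !pushed {
        return Err("empty path".to_string());
    }
    Ok(out)
}
```

Rejecting `..` outright, rather than resolving it, keeps the check independent of the filesystem state inside the container.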

Confidence Score: 5/5

The core retry, persistence, and security paths are well-implemented and explicitly tested; only a documentation discrepancy in the MCP executor was found.

The retry logic correctly avoids writing terminal state on transient stream errors, the path-traversal guard covers both upload branches (with a test for each), and the MCP approval gate fails closed rather than bypassing when persistence is unavailable. The sole finding is a stale "warn-and-run" doc comment that contradicts tested fail-closed behavior in the MCP executor.
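The retry behavior summarized above can be sketched as a small synchronous loop. This is a simplified model, not the PR's `run_with_retry`: the `Row`, `Status`, and `AttemptError` types and the closure-based signature are assumptions. It shows only the ordering the review describes: a transient failure leaves the row in progress for the next attempt, and the failed terminal state is written once, after retries are exhausted.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Status {
    Queued,
    InProgress,
    Completed,
    Failed,
}

#[derive(Debug)]
pub enum AttemptError {
    /// Worth retrying (for example, a dropped provider stream).
    Transient,
    /// Not worth retrying.
    Permanent,
}

/// Hypothetical stand-in for a persisted response row.
pub struct Row {
    pub status: Status,
    pub attempts: u32,
}

/// Run `attempt` up to `max_attempts` times. A transient error does not
/// touch the terminal state; `Failed` is written only after the loop ends.
pub fn run_with_retry<F>(row: &mut Row, max_attempts: u32, mut attempt: F)
where
    F: FnMut(u32) -> Result<(), AttemptError>,
{
    for n in 1..=max_attempts {
        // Each attempt starts from a clean in-progress row.
        row.status = Status::InProgress;
        row.attempts = n;
        match attempt(n) {
            Ok(()) => {
                row.status = Status::Completed;
                return;
            }
            // Transient failure with attempts remaining: retry without a terminal write.
            Err(AttemptError::Transient) if n < max_attempts => continue,
            // Permanent failure, or the last transient one: stop retrying.
            Err(_) => break,
        }
    }
    // Single owner of the failed terminal state.
    row.status = Status::Failed;
}
```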

src/services/mcp/executor.rs — the field and constructor doc comments describing the no-persistence approval behavior should be updated to reflect the actual fail-closed semantics.

Important Files Changed

Filename	Overview
src/services/background_executor.rs	Background response executor: reconstructs request, runs it through the shared streaming pipeline, and drains the body. Stream errors do NOT write a terminal state (retry safety); terminal writes are owned by mark_background_failure after retries are exhausted. Logic is sound.
src/jobs/background_responses.rs	Worker loop with bounded concurrency: claims queued rows, runs them via run_with_retry. Retry correctly resets status to InProgress and bumps started_at before re-executing. Row re-fetch after failure picks up updated last_sequence_number.
src/services/response_persister.rs	Streaming persistence wrapper: on body-stream error it sends the error to the drain and returns without writing any terminal status (keeping the row in InProgress for retry). On clean stream end it writes the terminal state based on what SSE events it observed.
src/routes/api/containers.rs	Container REST endpoints. normalize_mnt_path applies the .. traversal guard uniformly to both the explicit path_field and the filename fallback, with a test covering both branches.
src/services/mcp/executor.rs	MCP executor with approval gating. park_for_approval correctly fails closed when persistence prerequisites are missing, but the field and constructor doc comments still say "warn-and-run".
src/services/mcp/service.rs	Long-lived MCP service: connection pool keyed by (server_url, SHA-256 of auth+headers), single-flight tools/list cache, SSRF validation with IP pinning.
src/services/mcp/resume.rs	Exactly-once approval resume: DELETE-RETURNING to claim before calling prevents double-execution. Server origin mismatch check prevents approved calls being redirected to a different host.
src/services/responses_pipeline.rs	Shared streaming pipeline factory. Shell environment validation, skill resolution, staged-file injection, and tool-loop registration all look correct.
src/services/server_tools/runner.rs	Tool loop runner: per-turn DONE sentinel swallowed; single terminal DONE emitted after loop. Continuation payload appends assistant items each turn for coherent Anthropic-compatible multi-turn history.
src/services/shell_tool.rs	Shell tool interception and environment resolution. Egress host wildcard matching uses byte-level dot check to prevent subdomain-bypass. Memory and TTL caps enforced at admission.
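
The byte-level dot check mentioned for src/services/shell_tool.rs can be sketched as follows. The function name `host_matches` and the `*.suffix` pattern format are assumptions, and the real matcher may differ (for example in how it treats the bare apex domain). The sketch shows why a plain suffix comparison is not enough: the byte immediately before the suffix must be a dot.

```rust
/// Returns true when `host` is allowed by `pattern`.
/// `pattern` is either an exact hostname or a `*.suffix` wildcard
/// (an assumed format for this sketch).
pub fn host_matches(pattern: &str, host: &str) -> bool {
    let pattern = pattern.to_ascii_lowercase();
    let host = host.to_ascii_lowercase();

    match pattern.strip_prefix("*.") {
        // No wildcard: require an exact match.
        None => pattern == host,
        Some(suffix) => {
            let (h, s) = (host.as_bytes(), suffix.as_bytes());
            // The suffix must be non-empty and the host strictly longer than ".suffix".
            !s.is_empty()
                && h.len() > s.len() + 1
                // The host must end with the suffix...
                && h.ends_with(s)
                // ...and the byte just before it must be a dot, so
                // "evilexample.com" cannot match "*.example.com".
                && h[h.len() - s.len() - 1] == b'.'
        }
    }
}
```

In this sketch `*.example.com` does not match the apex `example.com`; whether the PR treats the apex as allowed is not stated in the review.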

Sequence Diagram

sequenceDiagram
    participant Client
    participant Route as /v1/responses
    participant DB
    participant Worker as Background Worker
    participant Executor as Background Executor
    participant Persister
    participant Provider as LLM Provider
    participant Shell as Shell Runtime
    participant MCP as MCP Server

    Client->>Route: "POST /v1/responses (background=true)"
    Route->>DB: "INSERT row (status=queued)"
    Route-->>Client: "202 {id, status: queued}"

    Worker->>DB: claim_queued (SELECT FOR UPDATE SKIP LOCKED)
    DB-->>Worker: ResponseRecord
    Worker->>Executor: execute_persisted_response(record)
    Executor->>Provider: stream request
    Provider-->>Executor: SSE stream

    Executor->>Persister: wrap_streaming_with_persistence
    Persister->>DB: write events (sequence++)

    loop Tool loop (max_iterations)
        Provider-->>Executor: function_call (shell/mcp)
        alt shell tool
            Executor->>Shell: exec(commands)
            Shell-->>Executor: stdout/stderr
        else MCP tool (requires_approval)
            Executor->>DB: INSERT mcp_pending_approvals
            Executor-->>Client: mcp_approval_request event
            Client->>Route: POST /v1/responses (mcp_approval_response)
            Route->>MCP: tools/call
            MCP-->>Route: result
        else MCP tool (no approval)
            Executor->>MCP: tools/call
            MCP-->>Executor: result
        end
        Executor->>Provider: continuation request
    end

    Provider-->>Persister: response.completed event
    Persister->>DB: "UPDATE status=completed"
    Client->>Route: "GET /v1/responses/{id}?stream=true"
    Route-->>Client: replay event log

Prompt To Fix All With AI

Fix the following code review issue, proposing a concise fix.

---

### Issue 1 of 1
src/services/mcp/executor.rs:177-222
**Approval gate doc says "warn-and-run" but code fails closed**

Two doc comments (on the `response_id` field and the `with_persistence` constructor) state the approval gate "degrades to a warn-and-run" when persistence prerequisites are missing. The actual code calls `synthesize_failed_call`, which emits a `response.mcp_call.failed` item and returns an error — it does *not* run the tool. The behavior is correct and even tested by `park_for_approval_fails_closed_without_persistence`, but the stale "warn-and-run" wording could lead an operator to believe tool calls proceed without approval in no-DB or `store=false` deployments, discouraging them from investigating why calls are silently failing.
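
The fail-closed behavior that the doc comments should describe can be sketched as below. This is a simplified model rather than the code in src/services/mcp/executor.rs: the `Persistence` struct, the `GateOutcome` enum, and the function name `gate_tool_call` are assumptions. The doc comment wording is one possible replacement for the stale "warn-and-run" text.

```rust
/// Persistence prerequisites assumed for this sketch.
pub struct Persistence {
    pub response_id: Option<String>,
    pub store_enabled: bool,
}

#[derive(Debug, PartialEq)]
pub enum GateOutcome {
    /// The call is parked; a pending-approval row would be written for this response.
    Parked { response_id: String },
    /// The call is refused; the tool is never invoked.
    Failed { reason: &'static str },
}

/// Gate a tool call that requires approval.
///
/// When persistence prerequisites are missing, the gate fails closed:
/// the call is reported as failed and the tool does NOT run.
pub fn gate_tool_call(p: &Persistence) -> GateOutcome {
    match (&p.response_id, p.store_enabled) {
        (Some(id), true) => GateOutcome::Parked {
            response_id: id.clone(),
        },
        (_, false) => GateOutcome::Failed {
            reason: "store=false: approval cannot be persisted",
        },
        (None, true) => GateOutcome::Failed {
            reason: "no response id: approval cannot be persisted",
        },
    }
}
```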

Reviews (6): Last reviewed commit: "Review fixes"

ScriptSmith · 2026-05-23T15:52:10Z

@greptile-apps

ScriptSmith · 2026-05-23T16:12:00Z

@greptile-apps

greptile-apps · 2026-05-23T16:21:04Z

+fn serialize_payload_for_storage(
+    payload: &crate::api_types::CreateResponsesPayload,
+) -> serde_json::Value {
+    let mut value = serde_json::to_value(payload).unwrap_or(serde_json::Value::Null);
+    strip_input_file_data(&mut value);
+    strip_mcp_credentials(&mut value);
+    value
+}


Inline domain secret values stored and echoed in plaintext

serialize_payload_for_storage strips MCP authorization/headers credentials but does not strip ShellDomainSecretInline.value. A caller who sends a shell tool request with network_policy.domain_secrets[].type = "inline" (carrying a raw secret value like an API key) will have that value persisted verbatim in responses.request_payload. The field is then echoed back via GET /v1/responses/{id} because "tools" is in ECHO_FIELDS at responses_lookup.rs:93. The same MCP-credential-stripping pattern should apply here: redact (or omit) inline.value from the stored payload, mirroring how authorization is removed from mcp entries.
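
One possible shape for the suggested redaction is sketched below. The PR's code operates on a `serde_json::Value`, and its actual types are not shown in this review, so the `DomainSecret` enum, its `Reference` variant, the `REDACTED` placeholder, and the function name `strip_inline_domain_secrets` are all assumptions. The sketch uses plain structs to stay dependency-free and shows only the rule: inline secret values are replaced before the payload is stored, and everything else is left intact.

```rust
#[derive(Debug, Clone, PartialEq)]
pub enum DomainSecret {
    /// Secret carried directly in the request body.
    Inline { domain: String, value: String },
    /// Secret referenced by name and resolved server-side (hypothetical variant).
    Reference { domain: String, name: String },
}

/// Placeholder written in place of an inline secret (assumed wording).
pub const REDACTED: &str = "[redacted]";

/// Replace every inline secret value before the payload is persisted.
/// Domains and non-inline entries are left unchanged, so the stored
/// payload still shows which domains had secrets attached.
pub fn strip_inline_domain_secrets(secrets: &mut [DomainSecret]) {
    for secret in secrets.iter_mut() {
        if let DomainSecret::Inline { value, .. } = secret {
            *value = REDACTED.to_string();
        }
    }
}
```

Redacting rather than omitting keeps the echoed `tools` field structurally the same as the request, which may matter to clients that round-trip it; omitting the entry is the stricter alternative.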

Prompt To Fix With AI

This is a comment left during a code review.
Path: src/routes/api/chat.rs
Line: 309-316

Comment:
**Inline domain secret values stored and echoed in plaintext**

`serialize_payload_for_storage` strips MCP `authorization`/`headers` credentials but does not strip `ShellDomainSecretInline.value`. A caller who sends a shell tool request with `network_policy.domain_secrets[].type = "inline"` (carrying a raw secret value like an API key) will have that value persisted verbatim in `responses.request_payload`. The field is then echoed back via `GET /v1/responses/{id}` because `"tools"` is in `ECHO_FIELDS` at `responses_lookup.rs:93`. The same MCP-credential-stripping pattern should apply here: redact (or omit) `inline.value` from the stored payload, mirroring how `authorization` is removed from `mcp` entries.

How can I resolve this? If you propose a fix, please make it concise.

ScriptSmith added 26 commits May 17, 2026 11:25

First pass adding agents

2ba3869

Review fixes

546b50a

Review fixes

e82c340

Add ownership to responses

4bc4ba6

Add container re-use

93971c8

Review fixes

a2c4774

Additional features and docs

d707108

Review fixes

f98de5c

Accumulate assistant responses correctly

d7a714a

Handle non-streaming output

be62f36

Match openai spec

6c45b79

Match openai spec

66077ce

Match openai spec

f61f6ac

Match openai spec

24dc8d7

Match openai spec

878b32d

Add server-side MCP support

ab06502

Support other file backends for containers

401652c

UI snapshot

7a9cbcb

Review fixes

ed11138

Review fixes

02ce38a

Review fixes

5adebdc

Container file artifacts

1598c0a

Fix container TTL

522c48a

Container cleanup job

b50f1b3

Review fixes

5468dca

Review fixes

f322176

greptile-apps Bot reviewed May 23, 2026

Comment thread src/services/mcp/client.rs

Comment thread src/routes/api/containers.rs Outdated

ScriptSmith added 2 commits May 24, 2026 00:44

Docs changes

9259537

Fix build

a0319a5

ScriptSmith added 2 commits May 24, 2026 01:51

Fix build

0a345b1

Review fixes

1a67e70

greptile-apps Bot reviewed May 23, 2026

Comment thread src/services/background_executor.rs

Review fixes

cd6d6f2

greptile-apps Bot reviewed May 23, 2026

ScriptSmith wants to merge 31 commits into main from agents

1 participant
