Agents #34

Open
ScriptSmith wants to merge 31 commits into main from agents

@ScriptSmith
Owner

Adds:

  • Responses server-side tool execution loop, background mode
  • Containers and a server-side shell tool
  • Server-side MCP

@greptile-apps
Contributor

greptile-apps Bot commented May 23, 2026

Greptile Summary

This PR adds a server-side agent execution loop to Hadrian, including: a background responses worker with retry semantics, a container/sandbox abstraction backed by microsandbox/opensandbox runtimes, a server-side shell tool, and a hadrian_hosted MCP client loop with human-in-the-loop approval gating.

  • Background executor (background_executor.rs, background_responses.rs): queued background=true responses are claimed via SELECT FOR UPDATE SKIP LOCKED, run through the same streaming pipeline as foreground requests, and retried on transient failures; stream errors deliberately avoid writing terminal state so retries see a clean row.
  • Containers API (routes/api/containers.rs, services/containers.rs, services/container_session.rs): full CRUD for /v1/containers, file upload/download with path-traversal guard (.. rejected in both the explicit path field and the filename fallback), and session replay from the DB on container re-attach.
  • MCP hadrian-hosted mode (services/mcp/): per-request tools/list rewrite into function tools, single-flight cache, SSRF-guarded HTTP client with DNS-pinning, approval parking via mcp_pending_approvals, and exactly-once resume via DELETE … RETURNING.
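
The exactly-once resume described in the last bullet can be illustrated with an in-memory analogue. This is a minimal sketch, not the PR's code: `ApprovalStore`, `PendingApproval`, `park`, `claim`, and `resume` are hypothetical names, and a mutex-guarded map stands in for the `mcp_pending_approvals` table. The point it shows is the ordering: the row is removed and returned in one atomic step before the tool is called, so a duplicate approval response finds nothing to run.

```rust
use std::collections::HashMap;
use std::sync::Mutex;

/// Hypothetical stand-in for a row in `mcp_pending_approvals`.
#[derive(Debug, Clone, PartialEq)]
pub struct PendingApproval {
    pub tool_name: String,
    pub arguments: String,
}

/// In-memory stand-in for the table (assumption: the real store is a database).
#[derive(Default)]
pub struct ApprovalStore {
    rows: Mutex<HashMap<String, PendingApproval>>,
}

impl ApprovalStore {
    /// Parks a tool call until the client approves it.
    pub fn park(&self, approval_id: &str, approval: PendingApproval) {
        self.rows
            .lock()
            .unwrap()
            .insert(approval_id.to_string(), approval);
    }

    /// Analogue of `DELETE ... RETURNING`: removal and read happen under one
    /// lock, so at most one caller ever receives `Some` for a given id.
    pub fn claim(&self, approval_id: &str) -> Option<PendingApproval> {
        self.rows.lock().unwrap().remove(approval_id)
    }
}

/// Claim first, call second. A repeated approval response gets an error
/// instead of triggering a second tool execution.
pub fn resume<F>(store: &ApprovalStore, approval_id: &str, call_tool: F) -> Result<String, String>
where
    F: FnOnce(&PendingApproval) -> String,
{
    match store.claim(approval_id) {
        Some(approval) => Ok(call_tool(&approval)),
        None => Err(format!(
            "approval {} not found or already consumed",
            approval_id
        )),
    }
}
```

One consequence of claiming before calling is that a crash between the two steps loses the call rather than repeating it; the sketch assumes that trade-off is the intended one.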

Confidence Score: 5/5

The core retry, persistence, and security paths are well-implemented and explicitly tested; only a documentation discrepancy in the MCP executor was found.

The retry logic correctly avoids writing terminal state on transient stream errors, the path-traversal guard covers both upload branches (with a test for each), and the MCP approval gate fails closed rather than bypassing when persistence is unavailable. The sole finding is a stale "warn-and-run" doc comment that contradicts tested fail-closed behavior in the MCP executor.
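
The retry behavior described above can be sketched as follows. This is an illustrative model under stated assumptions, not the PR's implementation: `Row`, `Status`, and the closure-based `run_with_retry` signature are hypothetical, and only the `InProgress` status and the `mark_background_failure` name come from the review text. The sketch shows that a transient error leaves the row non-terminal, and that the terminal failure is written only once the attempts are exhausted.

```rust
/// Hypothetical status values; only `InProgress` is named in the review.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Status {
    Queued,
    InProgress,
    Completed,
    Failed,
}

/// Hypothetical, simplified view of a persisted response row.
pub struct Row {
    pub status: Status,
    pub attempts: u32,
}

/// Runs `attempt` up to `max_attempts` times. An `Err` models a transient
/// stream error: nothing terminal is written, so the next attempt starts
/// from a clean `InProgress` row.
pub fn run_with_retry<F>(row: &mut Row, max_attempts: u32, mut attempt: F) -> Status
where
    F: FnMut(u32) -> Result<(), String>,
{
    while row.attempts < max_attempts {
        // Reset to InProgress before every (re-)execution.
        row.status = Status::InProgress;
        row.attempts += 1;
        if attempt(row.attempts).is_ok() {
            row.status = Status::Completed;
            return row.status;
        }
        // Transient failure: deliberately no terminal write here.
    }
    // Retries exhausted: the single place a terminal failure is written
    // (the role the review assigns to `mark_background_failure`).
    row.status = Status::Failed;
    row.status
}
```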

src/services/mcp/executor.rs — the field and constructor doc comments describing the no-persistence approval behavior should be updated to reflect the actual fail-closed semantics.

Important Files Changed

| Filename | Overview |
|---|---|
| `src/services/background_executor.rs` | Background response executor: reconstructs request, runs it through the shared streaming pipeline, and drains the body. Stream errors do NOT write a terminal state (retry safety); terminal writes are owned by mark_background_failure after retries are exhausted. Logic is sound. |
| `src/jobs/background_responses.rs` | Worker loop with bounded concurrency: claims queued rows, runs them via run_with_retry. Retry correctly resets status to InProgress and bumps started_at before re-executing. Row re-fetch after failure picks up updated last_sequence_number. |
| `src/services/response_persister.rs` | Streaming persistence wrapper: on body-stream error it sends the error to the drain and returns without writing any terminal status (keeping the row in InProgress for retry). On clean stream end it writes the terminal state based on what SSE events it observed. |
| `src/routes/api/containers.rs` | Container REST endpoints. normalize_mnt_path applies the .. traversal guard uniformly to both the explicit path_field and the filename fallback, with a test covering both branches. |
| `src/services/mcp/executor.rs` | MCP executor with approval gating. park_for_approval correctly fails closed when persistence prerequisites are missing, but the field and constructor doc comments still say "warn-and-run". |
| `src/services/mcp/service.rs` | Long-lived MCP service: connection pool keyed by (server_url, SHA-256 of auth+headers), single-flight tools/list cache, SSRF validation with IP pinning. |
| `src/services/mcp/resume.rs` | Exactly-once approval resume: DELETE-RETURNING to claim before calling prevents double-execution. Server origin mismatch check prevents approved calls being redirected to a different host. |
| `src/services/responses_pipeline.rs` | Shared streaming pipeline factory. Shell environment validation, skill resolution, staged-file injection, and tool-loop registration all look correct. |
| `src/services/server_tools/runner.rs` | Tool loop runner: per-turn DONE sentinel swallowed; single terminal DONE emitted after loop. Continuation payload appends assistant items each turn for coherent Anthropic-compatible multi-turn history. |
| `src/services/shell_tool.rs` | Shell tool interception and environment resolution. Egress host wildcard matching uses byte-level dot check to prevent subdomain-bypass. Memory and TTL caps enforced at admission. |
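
The traversal guard noted for `src/routes/api/containers.rs` might look roughly like the sketch below. Only the function name `normalize_mnt_path`, the `path_field`/filename-fallback split, and the rejection of `..` come from the review; the signature, the `MNT_ROOT` value, the error type, and the choice to treat absolute paths as relative to the mount are assumptions made for illustration. The key property is that both inputs flow through the same component check.

```rust
use std::path::{Component, Path};

/// Hypothetical mount root; the real value is not given in the review.
const MNT_ROOT: &str = "/mnt/data";

/// Picks the explicit path if present, otherwise the filename, and applies
/// one shared `..` check so neither branch can escape the mount.
pub fn normalize_mnt_path(
    path_field: Option<&str>,
    filename: Option<&str>,
) -> Result<String, String> {
    let raw = path_field
        .or(filename)
        .ok_or_else(|| "no path or filename supplied".to_string())?;

    let mut parts: Vec<&str> = Vec::new();
    for component in Path::new(raw).components() {
        match component {
            Component::Normal(part) => {
                parts.push(part.to_str().ok_or("non-UTF-8 path component")?)
            }
            // Leading `/` and `.` are dropped (assumption: paths are always
            // interpreted relative to the mount root).
            Component::RootDir | Component::CurDir => {}
            Component::ParentDir => {
                return Err(format!("path traversal rejected: {}", raw));
            }
            Component::Prefix(_) => {
                return Err(format!("unsupported path prefix: {}", raw));
            }
        }
    }
    if parts.is_empty() {
        return Err("empty path".to_string());
    }
    Ok(format!("{}/{}", MNT_ROOT, parts.join("/")))
}
```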

Sequence Diagram

sequenceDiagram
    participant Client
    participant Route as /v1/responses
    participant DB
    participant Worker as Background Worker
    participant Executor as Background Executor
    participant Persister
    participant Provider as LLM Provider
    participant Shell as Shell Runtime
    participant MCP as MCP Server

    Client->>Route: "POST /v1/responses (background=true)"
    Route->>DB: "INSERT row (status=queued)"
    Route-->>Client: "202 {id, status: queued}"

    Worker->>DB: claim_queued (SELECT FOR UPDATE SKIP LOCKED)
    DB-->>Worker: ResponseRecord
    Worker->>Executor: execute_persisted_response(record)
    Executor->>Provider: stream request
    Provider-->>Executor: SSE stream

    Executor->>Persister: wrap_streaming_with_persistence
    Persister->>DB: write events (sequence++)

    loop Tool loop (max_iterations)
        Provider-->>Executor: function_call (shell/mcp)
        alt shell tool
            Executor->>Shell: exec(commands)
            Shell-->>Executor: stdout/stderr
        else MCP tool (requires_approval)
            Executor->>DB: INSERT mcp_pending_approvals
            Executor-->>Client: mcp_approval_request event
            Client->>Route: POST /v1/responses (mcp_approval_response)
            Route->>MCP: tools/call
            MCP-->>Route: result
        else MCP tool (no approval)
            Executor->>MCP: tools/call
            MCP-->>Executor: result
        end
        Executor->>Provider: continuation request
    end

    Provider-->>Persister: response.completed event
    Persister->>DB: "UPDATE status=completed"
    Client->>Route: "GET /v1/responses/{id}?stream=true"
    Route-->>Client: replay event log
Prompt To Fix All With AI
Fix the following 1 code review issue. Work through them one at a time, proposing concise fixes.

---

### Issue 1 of 1
src/services/mcp/executor.rs:177-222
**Approval gate doc says "warn-and-run" but code fails closed**

Two doc comments (on the `response_id` field and the `with_persistence` constructor) state the approval gate "degrades to a warn-and-run" when persistence prerequisites are missing. The actual code calls `synthesize_failed_call`, which emits a `response.mcp_call.failed` item and returns an error — it does *not* run the tool. The behavior is correct and even tested by `park_for_approval_fails_closed_without_persistence`, but the stale "warn-and-run" wording could lead an operator to believe tool calls proceed without approval in no-DB or `store=false` deployments, discouraging them from investigating why calls are silently failing.
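
For illustration, a doc comment that matches the fail-closed behavior could read like the one in the sketch below. This is a simplified model, not the code in `src/services/mcp/executor.rs`: `Persistence`, `GateOutcome`, and the signature of `park_for_approval` are hypothetical, and the failed outcome stands in for what the review attributes to `synthesize_failed_call`.

```rust
/// Hypothetical bundle of the persistence prerequisites the gate needs.
pub struct Persistence {
    pub response_id: String,
}

/// Hypothetical outcome type. Neither variant runs the tool.
#[derive(Debug, PartialEq)]
pub enum GateOutcome {
    /// The call was recorded and awaits an approval response from the client.
    Parked { response_id: String, tool_name: String },
    /// The call was rejected because it could not be parked.
    FailedClosed { reason: String },
}

/// Parks a tool call that requires approval.
///
/// When persistence prerequisites are missing (for example no database, or
/// `store=false`), the gate fails closed: the call is reported as failed and
/// the tool is NOT executed. It never degrades to running without approval.
pub fn park_for_approval(persistence: Option<&Persistence>, tool_name: &str) -> GateOutcome {
    match persistence {
        Some(p) => GateOutcome::Parked {
            response_id: p.response_id.clone(),
            tool_name: tool_name.to_string(),
        },
        None => GateOutcome::FailedClosed {
            reason: format!(
                "cannot park approval for `{}`: persistence unavailable",
                tool_name
            ),
        },
    }
}
```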

Reviews (6): Last reviewed commit: "Review fixes"

Comment thread src/services/mcp/client.rs
Comment thread src/routes/api/containers.rs Outdated
@ScriptSmith
Owner Author

@greptile-apps

Comment thread src/services/background_executor.rs
@ScriptSmith
Owner Author

@greptile-apps

Comment thread src/routes/api/chat.rs
Comment on lines +309 to +316
    fn serialize_payload_for_storage(
        payload: &crate::api_types::CreateResponsesPayload,
    ) -> serde_json::Value {
        let mut value = serde_json::to_value(payload).unwrap_or(serde_json::Value::Null);
        strip_input_file_data(&mut value);
        strip_mcp_credentials(&mut value);
        value
    }
Contributor

P1 security: **Inline domain secret values stored and echoed in plaintext**

`serialize_payload_for_storage` strips MCP `authorization`/`headers` credentials but does not strip `ShellDomainSecretInline.value`. A caller who sends a shell tool request with `network_policy.domain_secrets[].type = "inline"` (carrying a raw secret value like an API key) will have that value persisted verbatim in `responses.request_payload`. The field is then echoed back via `GET /v1/responses/{id}` because `"tools"` is in `ECHO_FIELDS` at `responses_lookup.rs:93`. The same MCP-credential-stripping pattern should apply here: redact (or omit) `inline.value` from the stored payload, mirroring how `authorization` is removed from `mcp` entries.
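
One possible shape for the missing redaction step is sketched below. It is an assumption-laden illustration rather than a proposed patch: `strip_inline_domain_secrets` is a hypothetical helper name, the `Json` enum is a minimal stand-in for `serde_json::Value` so the sketch needs no crates, and the nesting `tools[].network_policy.domain_secrets[]` is a guess at the payload layout based on the field path quoted in the comment.

```rust
use std::collections::BTreeMap;

/// Minimal stand-in for `serde_json::Value` (assumption, to stay crate-free).
#[derive(Debug, Clone, PartialEq)]
pub enum Json {
    Null,
    Str(String),
    Array(Vec<Json>),
    Object(BTreeMap<String, Json>),
}

/// Replaces the `value` of every `domain_secrets` entry whose `type` is
/// "inline" with a placeholder, leaving all other fields untouched.
/// Assumed layout: `tools[].network_policy.domain_secrets[]`.
pub fn strip_inline_domain_secrets(payload: &mut Json) {
    let Json::Object(root) = payload else { return };
    let Some(Json::Array(tools)) = root.get_mut("tools") else { return };
    for tool in tools {
        let Json::Object(tool) = tool else { continue };
        let Some(Json::Object(policy)) = tool.get_mut("network_policy") else {
            continue;
        };
        let Some(Json::Array(secrets)) = policy.get_mut("domain_secrets") else {
            continue;
        };
        for secret in secrets {
            let Json::Object(secret) = secret else { continue };
            let is_inline =
                matches!(secret.get("type"), Some(Json::Str(t)) if t.as_str() == "inline");
            if is_inline && secret.contains_key("value") {
                secret.insert("value".to_string(), Json::Str("[redacted]".to_string()));
            }
        }
    }
}
```

In the real function this step would presumably run alongside `strip_input_file_data` and `strip_mcp_credentials`, before the value is stored. Whether to redact or drop the field entirely is a design choice the comment leaves open.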
