Plan to add S2 storage to browser events #229
Conversation
Firetiger deploy monitoring skipped. This PR didn't match the auto-monitor filter configured on your GitHub connection.
Reason: the PR title and empty body provide insufficient information to determine whether this changes kernel API endpoints or Temporal workflows; please clarify the scope of changes or opt in manually. To monitor this PR anyway, reply with
> ### 1. Stream name = capture session ID
>
> Each capture session maps to a dedicated stream named by the session UUID. Streams are created automatically on first write (S2 does this via the basin's create-stream-on-append feature). This means:
>
> - Replaying a session = reading one stream from seq 0
> - Concurrent sessions write to separate streams with no coordination
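The stream-per-session model can be sketched with an in-memory stand-in. The `memBasin` type below is illustrative only, not the real S2 SDK; it just mimics the create-stream-on-append behavior the quoted section relies on:

```go
package main

import "fmt"

// Record is a single captured browser event with its sequence number.
type Record struct {
	Seq  int
	Body string
}

// memBasin fakes S2's create-stream-on-append behavior: appending to
// an unknown stream implicitly creates it.
type memBasin struct {
	streams map[string][]Record
}

func newMemBasin() *memBasin {
	return &memBasin{streams: make(map[string][]Record)}
}

// Append writes one record to the stream named by the capture session
// UUID, creating the stream on first write.
func (b *memBasin) Append(sessionID, body string) {
	s := b.streams[sessionID]
	b.streams[sessionID] = append(s, Record{Seq: len(s), Body: body})
}

// Replay reads a whole session back from seq 0.
func (b *memBasin) Replay(sessionID string) []Record {
	return b.streams[sessionID]
}

func main() {
	basin := newMemBasin()
	// Concurrent sessions write to separate streams with no coordination.
	basin.Append("sess-a", "navigate")
	basin.Append("sess-a", "click")
	basin.Append("sess-b", "navigate")

	for _, r := range basin.Replay("sess-a") {
		fmt.Printf("seq=%d body=%s\n", r.Seq, r.Body)
	}
}
```

Replaying a session is then just iterating one stream from seq 0, with no cross-stream coordination.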
Do we persist the capture session anywhere? Mainly trying to understand how we'll do "reads" (e.g. after a browser session is destroyed or something).
Yeah, good point. I still have to make a pass over the kernel API to add the other endpoints; as part of that I'll add field.String("s2_stream") to the DB schema to capture this.
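A minimal sketch of what that ent schema field could look like, assuming the session lives in an ent entity (the entity name `BrowserCaptureSession` is hypothetical; only the `field.String("s2_stream")` call is from the comment above):

```go
package schema

import (
	"entgo.io/ent"
	"entgo.io/ent/schema/field"
)

// BrowserCaptureSession is a hypothetical ent entity for a capture session.
type BrowserCaptureSession struct {
	ent.Schema
}

// Fields defines the persisted columns, including the S2 stream name
// so a session can still be replayed after the browser is destroyed.
func (BrowserCaptureSession) Fields() []ent.Field {
	return []ent.Field{
		field.String("s2_stream").
			Optional().
			Comment("S2 stream name (the capture session UUID); empty if S2 delivery was disabled"),
	}
}
```

Marking the field `Optional()` keeps existing rows valid and covers sessions created with S2 delivery turned off.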
> ### 3. Batching: 100ms linger / 50 records (S2 backend)
>
> The S2 SDK batcher coalesces records before flushing to the network. Configuration:
>
> ```
> Linger: 100ms
> MaxRecords: 50
> ```
>
> These are independent of the ring buffer read loop: the writer appends one record per ring Read, and the batcher decides when to flush.
May want to have these be env vars / configurable so we can control them externally.
> ---
>
> ## System Context
I think this flow makes sense overall. It'd be helpful to have more clarity on:
- how enabling the S2 delivery ties into the existing APIs
- credentials for S2 within the VM
> 1. ctx cancelled (SIGINT/SIGTERM)
> 2. EventsStorageWriter.Run returns (reader unblocks from cancelled ctx)
> 3. storageDone channel closes
> 4. storageWriter.Close() drains in-flight S2 writes
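The four shutdown steps above can be wired up as a small sketch. `storageWriter` here is a stand-in for `EventsStorageWriter` (the real S2 writer is not shown), but the ctx/done-channel/Close ordering matches the plan:

```go
package main

import (
	"context"
	"fmt"
	"sync"
)

// storageWriter is a stand-in for EventsStorageWriter.
type storageWriter struct {
	mu       sync.Mutex
	inFlight []string
	flushed  []string
}

func (w *storageWriter) record(e string) {
	w.mu.Lock()
	w.inFlight = append(w.inFlight, e)
	w.mu.Unlock()
}

// Run consumes events until ctx is cancelled, then drains anything
// already buffered before returning.
func (w *storageWriter) Run(ctx context.Context, events <-chan string) {
	for {
		select {
		case <-ctx.Done():
			for {
				select {
				case e := <-events:
					w.record(e)
				default:
					return
				}
			}
		case e := <-events:
			w.record(e)
		}
	}
}

// Close drains in-flight writes (step 4 in the plan).
func (w *storageWriter) Close() {
	w.mu.Lock()
	defer w.mu.Unlock()
	w.flushed = append(w.flushed, w.inFlight...)
	w.inFlight = nil
}

// runPipeline wires the shutdown sequence and returns how many
// records survived shutdown.
func runPipeline(events []string) int {
	ctx, cancel := context.WithCancel(context.Background())
	ch := make(chan string, len(events))
	w := &storageWriter{}

	storageDone := make(chan struct{})
	go func() {
		defer close(storageDone) // 3. storageDone channel closes
		w.Run(ctx, ch)           // 2. Run returns once ctx is cancelled
	}()

	for _, e := range events {
		ch <- e
	}
	cancel()      // 1. ctx cancelled (SIGINT/SIGTERM)
	<-storageDone // wait for the writer goroutine to exit
	w.Close()     // 4. drain in-flight S2 writes
	return len(w.flushed)
}

func main() {
	fmt.Println("flushed:", runPipeline([]string{"click", "navigate"})) // flushed: 2
}
```

The `defer close(storageDone)` ties step 3 to step 2 so `Close` can never run while `Run` is still appending.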
I think it's also helpful to walk through the actual browser teardown pathway and confirm this API server has the right interface to best-effort flush the capture stream to S2. I'd expect a number of our users to do something like try -> {start capture session, run automation} -> catch (log error) -> finally {delete browser}, so ensuring we can get the data out before the VM is gone is valuable here ^^
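That try/catch/finally shape maps to `defer` in Go. The sketch below is hypothetical end to end (`StartCapture`, `DeleteBrowser`, and the flush-on-delete behavior are assumptions about the eventual API, not the current one); it only illustrates the ordering the comment asks for, flush before teardown even on the error path:

```go
package main

import (
	"errors"
	"fmt"
)

// captureSession and browserClient are hypothetical stand-ins for the
// real API surface.
type captureSession struct {
	events  []string
	flushed bool
}

type browserClient struct {
	session *captureSession
	deleted bool
}

func (c *browserClient) StartCapture() *captureSession {
	c.session = &captureSession{}
	return c.session
}

// DeleteBrowser best-effort flushes the capture stream to S2 before
// tearing down the VM, so data survives even when automation fails.
func (c *browserClient) DeleteBrowser() error {
	if c.session != nil {
		c.session.flushed = true // best-effort flush to S2
	}
	c.deleted = true
	return nil
}

func runAutomation(s *captureSession) error {
	s.events = append(s.events, "navigate")
	return errors.New("automation failed") // simulate the error path
}

// run mirrors try {start capture, run automation} catch {log} finally {delete browser}.
func run(c *browserClient) {
	s := c.StartCapture()
	defer c.DeleteBrowser() // finally: delete browser (flushes first)
	if err := runAutomation(s); err != nil {
		fmt.Println("log error:", err) // catch: log error
	}
}

func main() {
	c := &browserClient{}
	run(c)
	fmt.Println("flushed:", c.session.flushed, "deleted:", c.deleted)
}
```

The key property is that the flush lives inside the delete path itself, so callers who only ever call delete still get their capture data out.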
Note: Low Risk
Documentation-only change adding a design plan; no runtime code, APIs, or schemas are modified in this PR.
Overview
Adds a new design document, plans/s2-storage.md, outlining a proposed durable storage sink for browser events using S2 (including proposed components, configuration/env vars, shutdown sequencing, and planned endpoint/schema touchpoints).

Reviewed by Cursor Bugbot for commit 676dbd7. Bugbot is set up for automated code reviews on this repo.