60 commits
76d837f
feat: add CDP pipeline, cdpmonitor, and wire into API service
archandatta Apr 2, 2026
5c0f7ae
refactor: rename Pipeline to CaptureSession and delete pipeline.go
archandatta Apr 2, 2026
ad2dd63
review: fix request context leak in StartCapture, add missing categor…
archandatta Apr 2, 2026
365787c
review: clean up syntax
archandatta Apr 6, 2026
d086571
review: move CategoryFor to cdpmonitor package
archandatta Apr 7, 2026
3339cac
review: internalize ring buffer and file writer in CaptureSession con…
archandatta Apr 7, 2026
64e67b1
review: write logs under /var/log/kernel and ensure dir exists
archandatta Apr 7, 2026
048f9c6
review: add capture config to /events/start and OpenAPI spec
archandatta Apr 7, 2026
e57e58a
review: fix lifecycle context, stop-before-reset ordering, seq reset,…
archandatta Apr 7, 2026
886b893
fix: oapi version
archandatta Apr 7, 2026
06d2470
fix: Shutdown cancels context outside lock, racing with StartCapture
archandatta Apr 7, 2026
6bb5402
review: validate DetailLevel with generated Valid
archandatta Apr 7, 2026
cb45a55
fix: reset ring buffer on session restart to unstrand existing readers
archandatta Apr 7, 2026
02ee74e
chore: remove dead categoryFor function
archandatta Apr 7, 2026
3523994
review: guard zero-capacity ring buffer and fix reader reset after bu…
archandatta Apr 7, 2026
cb1a1a7
review: use t.TempDir in test helper, map-based ValidCategory, avoid …
archandatta Apr 7, 2026
c9e78a3
review: add captureConfigFrom and StartCapture/StopCapture handler tests
archandatta Apr 7, 2026
1cddf53
feat: refactor events API to resource-style capture sessions
archandatta Apr 8, 2026
c6dd362
review: update file writer to be internal to the package
archandatta Apr 9, 2026
ff8bddf
review: tighten to `Write(filename string, data []byte) error`
archandatta Apr 9, 2026
a8bdeaf
review: update panic -> error
archandatta Apr 9, 2026
8f88ed0
review: update oapi and remove detail level
archandatta Apr 9, 2026
3eaacb3
review: remove url
archandatta Apr 9, 2026
214858a
chore: restore server/api on branch
archandatta Apr 9, 2026
b06132a
review: harden captureConfigFromOAPI and clarify stop comment
archandatta Apr 9, 2026
8ebb5e3
review: unexport ringBuffer and drop AllCategories wrapper
archandatta Apr 9, 2026
4bcba48
review: replace event producers with cdp monitor in stop description
archandatta Apr 9, 2026
1295232
remove test line
archandatta Apr 9, 2026
a4fd0d6
review: update uuid to cuid2
archandatta Apr 10, 2026
d6b348b
Merge branch 'main' into archand/kernel-1116/cdp-pipeline
archandatta Apr 10, 2026
3da65c3
Merge branch 'main' into archand/kernel-1116/cdp-pipeline
archandatta Apr 10, 2026
a7b2e54
fix naming
archandatta Apr 10, 2026
85d570a
feat: add cdpmonitor foundation — types, util, computed state machine…
archandatta Apr 13, 2026
bed53f8
self review
archandatta Apr 13, 2026
d73793c
review: cursor feedback
archandatta Apr 13, 2026
348243a
[kernel-1116] CDP monitor core (#214)
archandatta Apr 13, 2026
a62c403
review: update types and sensitive interaction data
archandatta Apr 14, 2026
0605227
feat: add two-layer CDP decode, protocol-faithful types, then monitor…
archandatta Apr 14, 2026
33c07d3
Merge branch 'main' into archand/kernel-1116/cdp-pipeline
archandatta Apr 21, 2026
9c6e066
review: clean up monitor health and types
archandatta Apr 22, 2026
5dd9273
review: add chromium version
archandatta Apr 22, 2026
7550bc1
review: remove dead code
archandatta Apr 22, 2026
f3d3166
Merge branch 'archand/kernel-1116/cdp-pipeline' into archand/kernel-1…
archandatta Apr 22, 2026
1cfbc5e
fix injection script
archandatta Apr 22, 2026
2e0c4a0
review: remove sensitive data from inject
archandatta Apr 22, 2026
bf4b04c
review
archandatta Apr 22, 2026
90a3ae1
review: sensitive data audit interaction.js
archandatta Apr 22, 2026
4feef7e
review: reconnect failure leaks goroutines and deadlocks Stop
archandatta Apr 22, 2026
8e94162
Merge branch 'main' into archand/kernel-1116/cdp-foundation
archandatta Apr 22, 2026
5465e59
review: create cdpMonitorController
archandatta Apr 22, 2026
7c4c654
review: add ctx to monitor and update comment
archandatta Apr 23, 2026
bca495b
review: lift lifeMu to dispatch level to make ctx handling explicit
archandatta Apr 23, 2026
fd4d4d3
review: add readme for cdp monitor
archandatta Apr 23, 2026
83a164b
review: update monitor to capture ids from cdp to group request data
archandatta Apr 29, 2026
0139432
review: release sessionsMu before cs.stop() to honour lock ordering
archandatta Apr 29, 2026
d47480d
review: update cdp monitor fields
archandatta Apr 30, 2026
62c7cf1
review: update source
archandatta May 1, 2026
fc7ab87
review: update naming convention
archandatta May 1, 2026
916f275
review: update readme
archandatta May 1, 2026
029174c
review: fix lcp capture
archandatta May 1, 2026
13 changes: 11 additions & 2 deletions server/cmd/api/api/api.go
@@ -4,6 +4,7 @@ import (
"context"
"errors"
"fmt"
"log/slog"
"os"
"os/exec"
"sync"
@@ -20,6 +21,14 @@ import (
"github.com/kernel/kernel-images/server/lib/scaletozero"
)

type cdpMonitorController interface {
Start(ctx context.Context) error
Stop()
IsRunning() bool
}

var _ cdpMonitorController = (*cdpmonitor.Monitor)(nil)

type ApiService struct {
// defaultRecorderID is used whenever the caller doesn't specify an explicit ID.
defaultRecorderID string
@@ -73,7 +82,7 @@ type ApiService struct {

// CDP event pipeline and cdpMonitor.
captureSession *events.CaptureSession
- cdpMonitor *cdpmonitor.Monitor
+ cdpMonitor cdpMonitorController
monitorMu sync.Mutex
lifecycleCtx context.Context
lifecycleCancel context.CancelFunc
@@ -103,7 +112,7 @@ func New(
return nil, fmt.Errorf("captureSession cannot be nil")
}

- mon := cdpmonitor.New(upstreamMgr, captureSession.Publish, displayNum)
+ mon := cdpmonitor.New(upstreamMgr, captureSession.Publish, displayNum, slog.Default())
ctx, cancel := context.WithCancel(context.Background())

return &ApiService{
7 changes: 7 additions & 0 deletions server/cmd/api/api/capture_session_test.go
@@ -248,5 +248,12 @@ func newTestService(t *testing.T, mgr recorder.RecordManager) *ApiService {
t.Helper()
svc, err := New(mgr, newMockFactory(), newTestUpstreamManager(), scaletozero.NewNoopController(), newMockNekoClient(t), newCaptureSession(t), 0)
require.NoError(t, err)
svc.cdpMonitor = &stubCdpMonitor{}
return svc
}

type stubCdpMonitor struct{}

func (s *stubCdpMonitor) Start(_ context.Context) error { return nil }
func (s *stubCdpMonitor) Stop() {}
func (s *stubCdpMonitor) IsRunning() bool { return false }
221 changes: 221 additions & 0 deletions server/lib/cdpmonitor/README.md
@@ -0,0 +1,221 @@
# CDP Monitor

The monitor is the browser-facing layer of the kernel browser logging pipeline. It connects to Chrome's DevTools endpoint, tracks all page sessions via CDP's `Target.setAutoAttach`, and converts raw CDP notifications into typed `events.Event` values for downstream consumers.

## Overview

`cdpmonitor` manages a Chrome DevTools Protocol (CDP) WebSocket connection to a running Chrome browser. It subscribes to CDP events across all attached tabs, translates them into structured `events.Event` values, and publishes them via a caller-supplied `PublishFunc`. It also derives synthetic events from sequences of CDP events and takes screenshots on significant page activity.

Chrome can restart independently of the monitor. When that happens, `UpstreamProvider` pushes a new DevTools URL and the monitor reconnects automatically, emitting lifecycle events so consumers can track continuity.

## Event taxonomy

**CDP-derived** (1-to-1 with a CDP notification): `console_log`, `console_error`, `network_request`, `network_response`, `network_loading_failed`, `page_tab_opened`, `page_navigation`, `page_dom_content_loaded`, `page_load`, `page_layout_shift`

**Computed** (inferred from sequences of CDP events): `network_idle` (fires when in-flight requests drop to zero), `page_layout_settled` (1 s after `page_load` with no intervening layout shifts), `page_navigation_settled` (fires once `page_dom_content_loaded`, `network_idle`, and `page_layout_settled` have all fired for the same navigation).

**Interaction** (fired by `interaction.js` via `Runtime.bindingCalled`): `interaction_click`, `interaction_key`, `interaction_scroll_settled`

**Monitor lifecycle** (emitted by the monitor itself, not by Chrome): `monitor_screenshot`, `monitor_disconnected`, `monitor_reconnected`, `monitor_reconnect_failed`, `monitor_init_failed`
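
As an illustration of the `network_idle` rule in the taxonomy above, the following is a minimal sketch of an in-flight counter that fires a callback on the transition to zero. The `idleTracker` type and its method names are hypothetical and are not taken from `computed.go`; the real state machine may debounce or reset on navigation in ways this sketch does not model.

```go
package main

import (
	"fmt"
	"sync"
)

// idleTracker is a hypothetical stand-in for the network_idle state machine.
// It only counts in-flight requests; it does not debounce or reset on navigation.
type idleTracker struct {
	mu       sync.Mutex
	inFlight int
	onIdle   func() // called each time the count drops from one to zero
}

// requestStarted records a new in-flight request.
func (t *idleTracker) requestStarted() {
	t.mu.Lock()
	t.inFlight++
	t.mu.Unlock()
}

// requestFinished records a completed or failed request and fires onIdle
// only on the transition to zero in-flight requests.
func (t *idleTracker) requestFinished() {
	t.mu.Lock()
	fire := false
	if t.inFlight > 0 {
		t.inFlight--
		fire = t.inFlight == 0
	}
	t.mu.Unlock()

	// Invoke the callback outside the lock so it can safely take other locks.
	if fire {
		t.onIdle()
	}
}

func main() {
	t := &idleTracker{onIdle: func() { fmt.Println("network_idle") }}
	t.requestStarted()
	t.requestStarted()
	t.requestFinished() // one request still in flight, nothing fires
	t.requestFinished() // count reaches zero, onIdle fires
}
```

Calling the callback after releasing the mutex is a design choice of this sketch; it avoids holding the counter lock while downstream code runs.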

## Responsibilities

| Concern | Where |
| --- | --- |
| WebSocket lifecycle (connect, read, reconnect) | `monitor.go` |
| CDP domain setup per session | `domains.go` |
| Event translation (CDP params to `events.Event`) | `handlers.go` |
| Synthetic event state machines | `computed.go` |
| Screenshot capture via ffmpeg | `screenshot.go` |
| CDP protocol types | `cdp_proto.go`, `types.go` |
| Interaction tracking injected into the page | `interaction.js` |
| Body/MIME capture sizing and text truncation helpers | `util.go` |

## Internals

### Reconnect model

`subscribeToUpstream` listens to `UpstreamProvider.Subscribe()` for new DevTools URLs. On each URL change (indicating Chrome restarted), `handleUpstreamRestart` tears down the existing connection, dials the new URL with capped-exponential backoff (250 ms → 500 ms → 1 s → 2 s, up to 10 attempts), then restarts `readLoop` and re-initializes all CDP sessions. `restartMu` serializes concurrent restart signals so rapid Chrome restarts do not produce overlapping reconnects.

### Goroutines

| Goroutine | Lifetime | Tracked by |
| --- | --- | --- |
| `readLoop` | one per WebSocket connection | `done` channel |
| `subscribeToUpstream` | same as `lifecycleCtx` | `asyncWg` |
| `sweepPendingRequests` | same as `lifecycleCtx` | `asyncWg` |
| `initSession` | short-lived, one per connect or reconnect | `asyncWg` |
| `attachExistingTargets` wrapper | short-lived, one per existing target on reconnect | `asyncWg` |
| `enableDomains` + `injectScript` | short-lived, one per target attach | `asyncWg` |
| `fetchResponseBody` | one per completed network request | `asyncWg` |
| `captureScreenshot` | one per screenshot trigger | `asyncWg` |

`Stop()` cancels `lifecycleCtx`, waits for `readLoop` via `done`, then waits for all other goroutines via `asyncWg` before closing the connection.
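
The shutdown ordering described for `Stop()` can be sketched with a context, a `done` channel, and a `sync.WaitGroup`. The `miniMonitor` type below is hypothetical and far simpler than the real `Monitor`; it only shows the order of the four steps.

```go
package main

import (
	"context"
	"fmt"
	"sync"
)

// miniMonitor is a hypothetical, stripped-down model of the shutdown path.
type miniMonitor struct {
	cancel  context.CancelFunc
	done    chan struct{}  // closed when the read loop exits
	asyncWg sync.WaitGroup // tracks every other goroutine
	steps   []string       // written only by stop, for illustration
}

func (m *miniMonitor) start() {
	ctx, cancel := context.WithCancel(context.Background())
	m.cancel = cancel
	m.done = make(chan struct{})

	go func() { // stands in for readLoop
		defer close(m.done)
		<-ctx.Done()
	}()

	m.asyncWg.Add(1)
	go func() { // stands in for any asyncWg-tracked goroutine
		defer m.asyncWg.Done()
		<-ctx.Done()
	}()
}

// stop follows the documented order: cancel, wait for the read loop,
// wait for the remaining goroutines, then close the connection.
func (m *miniMonitor) stop() {
	m.cancel()
	m.steps = append(m.steps, "context cancelled")
	<-m.done
	m.steps = append(m.steps, "readLoop exited")
	m.asyncWg.Wait()
	m.steps = append(m.steps, "async goroutines exited")
	m.steps = append(m.steps, "connection closed") // a real monitor would close the socket here
}

func main() {
	m := &miniMonitor{}
	m.start()
	m.stop()
	for _, s := range m.steps {
		fmt.Println(s)
	}
}
```

Closing the connection last means no tracked goroutine can observe a closed socket while it is still running.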

### Lock ordering

Locks must be acquired left to right. Never acquire a lock on the left while holding one further right.

```
restartMu -> lifeMu -> pendReqMu -> computed.mu -> pendMu
restartMu -> lifeMu -> sessionsMu
```

`computed.mu` and `sessionsMu` are never held simultaneously; `cs.stop()` and `cs.resetOnNavigation()` are called only after the relevant `sessionsMu` critical section is complete.

`bindingRateMu` is independent of this ordering and is always acquired alone.

| Lock | Protects |
| --- | --- |
| `restartMu` | Serializes `handleUpstreamRestart` to prevent overlapping reconnects from rapid Chrome restarts |
| `lifeMu` | `conn`, `lifecycleCtx`, `cancel`, `done`, `readReady` -- all fields that change during Start / Stop / reconnect |
| `pendReqMu` | `pendingRequests` (requestId -> `networkReqState`): in-flight network requests accumulating request/response metadata until `loadingFinished` |
| `computed.mu` | All `computedState` fields: counters and timers for the `network_idle`, `page_layout_settled`, and `page_navigation_settled` state machines |
| `pendMu` | `pending` (id -> reply channel): in-flight CDP commands waiting for a response from Chrome |
| `sessionsMu` | `sessions` (sessionID -> `targetInfo`): the set of currently attached CDP targets (tabs, iframes, workers) |
| `bindingRateMu` | `bindingLastSeen` (sessionID:eventType -> time): rate-limit state for `__kernelEvent` binding calls |

Fields that need no mutex use `sync/atomic`: `nextID`, `mainSessionID`, `running`, `lastScreenshotAt`, `screenshotInFlight`.
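
The rule that `cs.stop()` runs only after the `sessionsMu` critical section is complete can be sketched as follows. The `registry` type and `detach` method are hypothetical, and the map is simplified to hold the computed state directly rather than a `targetInfo`.

```go
package main

import (
	"fmt"
	"sync"
)

// computedState is a simplified stand-in guarded by its own mutex.
type computedState struct {
	mu      sync.Mutex
	stopped bool
}

func (cs *computedState) stop() {
	cs.mu.Lock()
	cs.stopped = true
	cs.mu.Unlock()
}

// registry is a hypothetical holder for the attached sessions.
type registry struct {
	sessionsMu sync.Mutex
	sessions   map[string]*computedState
}

// detach removes a session under sessionsMu, releases the lock, and only
// then calls cs.stop(), so sessionsMu and computed.mu are never held together.
func (r *registry) detach(id string) *computedState {
	r.sessionsMu.Lock()
	cs := r.sessions[id]
	delete(r.sessions, id)
	r.sessionsMu.Unlock()

	if cs != nil {
		cs.stop()
	}
	return cs
}

func main() {
	r := &registry{sessions: map[string]*computedState{"s1": {}}}
	cs := r.detach("s1")
	fmt.Println("stopped:", cs.stopped, "remaining sessions:", len(r.sessions))
}
```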

### WebSocket concurrency

`coder/websocket` allows a connection's methods to be called concurrently, with the exception of `Read` and `Reader`, which must have a single caller at a time. `readLoop` is the sole reader. All writes go through `send`, which calls `conn.Write` directly; the library serializes writes internally, so no external write mutex is needed.

## Event data model

### Envelope and top-level fields

Every event arrives as an `Envelope`:

```json
{
"capture_session_id": "cs_abc123",
"seq": 42,
"event": {
"ts": 1746123456789000,
"type": "network_request",
"category": "network",
"source": { ... },
"data": { ... },
"truncated": false
}
}
```

| Field | Type | Description |
| --- | --- | --- |
| `capture_session_id` | string | Pipeline-assigned ID for the capture session (not a CDP concept). |
| `seq` | uint64 | Monotonically increasing per-capture-session sequence number. |
| `event.ts` | int64 | Wall-clock time the monitor emitted the event, as **Unix microseconds** (µs since epoch). |
| `event.type` | string | See [Event taxonomy](#event-taxonomy). |
| `event.category` | string | One of: `console`, `network`, `page`, `interaction`, `system`. |
| `event.truncated` | bool | `true` if `data` was nulled to fit the 1 MB pipeline limit. |
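
For consumers written in Go, the envelope above could be decoded roughly as follows. This is a sketch under stated assumptions: the `Envelope` and `Event` structs below are consumer-side mirrors written for this example, not the types from the `events` package, and `source` and `data` are left as raw JSON.

```go
package main

import (
	"encoding/json"
	"fmt"
	"time"
)

// Event mirrors the documented event fields (hypothetical consumer-side type).
type Event struct {
	TS        int64           `json:"ts"` // Unix microseconds
	Type      string          `json:"type"`
	Category  string          `json:"category"`
	Source    json.RawMessage `json:"source"`
	Data      json.RawMessage `json:"data"`
	Truncated bool            `json:"truncated"`
}

// Envelope mirrors the documented top-level fields (hypothetical consumer-side type).
type Envelope struct {
	CaptureSessionID string `json:"capture_session_id"`
	Seq              uint64 `json:"seq"`
	Event            Event  `json:"event"`
}

// decodeEnvelope parses one envelope and converts event.ts to a time.Time.
func decodeEnvelope(raw []byte) (Envelope, time.Time, error) {
	var env Envelope
	if err := json.Unmarshal(raw, &env); err != nil {
		return Envelope{}, time.Time{}, err
	}
	return env, time.UnixMicro(env.Event.TS).UTC(), nil
}

func main() {
	raw := []byte(`{"capture_session_id":"cs_abc123","seq":42,"event":{"ts":1746123456789000,"type":"network_request","category":"network","source":{},"data":{},"truncated":false}}`)
	env, ts, err := decodeEnvelope(raw)
	if err != nil {
		fmt.Println("decode failed:", err)
		return
	}
	fmt.Println(env.Seq, env.Event.Type, ts.Format(time.RFC3339Nano))
}
```

Because `event.ts` is in microseconds, `time.UnixMicro` is the matching constructor; passing the value to `time.Unix` as seconds would give a wildly wrong date.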

### Source object

```json
"source": {
"kind": "cdp",
"event": "Network.requestWillBeSent",
"metadata": {
"cdp_session_id": "...",
"target_id": "...",
"target_type": "page"
}
}
```

| Field | Description |
| --- | --- |
| `event` | The raw CDP method that triggered the event (e.g. `Network.requestWillBeSent`). Empty for computed events. |
| `metadata.cdp_session_id` | The CDP WebSocket session multiplexer ID for this target. Changes if Chrome restarts. |
| `metadata.target_id` | Stable identifier for the browser target (tab/window). Survives navigations within the same tab. |
| `metadata.target_type` | Target type as reported by Chrome: `page`, `iframe`, `worker`, etc. |

### CDP identity primer

Five IDs appear across events. Understanding how they nest prevents confusion:

```
target_id <- one per tab/window; stable across navigations
└── cdp_session_id <- WebSocket multiplexer channel to that target; resets on Chrome restart
└── frame_id <- one per frame (top-level or iframe); changes on navigation
└── loader_id <- one per document load; links a navigation to its network requests
└── request_id <- one per request (stable across redirects in a chain)
```

| ID | Where it appears | What it identifies |
| --- | --- | --- |
| `target_id` | `source.metadata`, most `data` objects | The browser tab. Use this to group all events from one tab session. |
| `cdp_session_id` | `source.metadata` | The WebSocket sub-channel. Not stable across reconnects. |
| `frame_id` | `page_navigation`, `network_request`, `network_response`, `network_loading_failed` | The frame the request or navigation belongs to. Top-level frame has no `parent_frame_id`. |
| `source_frame_id` | `page_layout_shift` | The frame where the layout shift occurred. Distinct from the nav context `frame_id`, which is always the top-level navigated frame. |
| `loader_id` | `page_navigation`, `network_request`, `network_response` | The document load that owns a request. Join `network_request.loader_id` to `page_navigation.loader_id` to correlate requests with the navigation that triggered them. |
| `request_id` | `network_request`, `network_response`, `network_loading_failed` | A single request chain (including redirects). Links request to its eventual response or failure. |
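
The `loader_id` join described in the table can be sketched as below. The `navigation` and `request` structs are hypothetical, trimmed-down views of the `page_navigation` and `network_request` data objects, holding only the fields needed for the join.

```go
package main

import "fmt"

// navigation holds the page_navigation fields used for the join (hypothetical type).
type navigation struct {
	LoaderID string
	URL      string
}

// request holds the network_request fields used for the join (hypothetical type).
type request struct {
	RequestID string
	LoaderID  string
	URL       string
}

// groupByNavigation returns, per loader_id seen in a navigation, the requests
// that carry the same loader_id. Requests with an unknown loader_id are skipped.
func groupByNavigation(navs []navigation, reqs []request) map[string][]request {
	known := make(map[string]bool, len(navs))
	for _, n := range navs {
		known[n.LoaderID] = true
	}
	out := make(map[string][]request)
	for _, r := range reqs {
		if known[r.LoaderID] {
			out[r.LoaderID] = append(out[r.LoaderID], r)
		}
	}
	return out
}

func main() {
	navs := []navigation{{LoaderID: "L1", URL: "https://example.com/"}}
	reqs := []request{
		{RequestID: "R1", LoaderID: "L1", URL: "https://example.com/app.js"},
		{RequestID: "R2", LoaderID: "L9", URL: "https://example.com/other"},
	}
	for loader, rs := range groupByNavigation(navs, reqs) {
		fmt.Println(loader, len(rs))
	}
}
```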

### Navigation context fields

Most event `data` objects include a nav context block stamped at the last `page_navigation`. These fields reflect the top-level frame most recently navigated in the session:

| Field | Description |
| --- | --- |
| `session_id` | Same as `source.metadata.cdp_session_id`. Repeated for data-only consumers. |
| `frame_id` | Frame ID of the navigated top-level frame. |
| `loader_id` | Loader ID of the current document. |
| `url` | URL of the current page at the time of the last navigation. |
| `nav_seq` | Monotonically increasing counter, incremented on each `page_navigation`. Use it to detect that the page has navigated between two events in the same session. |

### Per-event data fields

Fields below are the unique additions per event type. Unless otherwise noted, events also include the nav context fields described above. Network events are the exception: they carry their own `loader_id` and `frame_id` directly and do not include nav context.

#### Console events

| Event | Unique fields |
| --- | --- |
| `console_log` | `level` (CDP type string), `text` (first arg), `args` (all args as strings), `stack_trace` |
| `console_error` | Same as `console_log` when `source.event` is `Runtime.consoleAPICalled`. When `source.event` is `Runtime.exceptionThrown`: `text`, `line`, `column`, `source_url` (script file URL, not page URL), `stack_trace`. |

#### Network events

| Event | Fields |
| --- | --- |
| `network_request` | `request_id`, `loader_id`, `frame_id`, `document_url`, `method`, `url`, `headers`, `initiator_type`. Optional: `post_data`, `resource_type`, `is_redirect` + `redirect_url`. |
| `network_response` | `request_id`, `loader_id`, `frame_id`, `method`, `url`, `status`, `headers`. Optional: `status_text`, `mime_type`, `resource_type`, `body` (truncated text body for textual MIME types). |
| `network_loading_failed` | `request_id`, `error_text`, `canceled`. Optional (absent when the request record was not found): `url`, `loader_id`, `frame_id`, `resource_type`. |

#### Page events

| Event | Unique fields |
| --- | --- |
| `page_tab_opened` | `target_id`, `target_type`, `url`, `opener_id`, `title`. Emitted before the first navigation; no nav context. |
| `page_navigation` | `session_id`, `target_id`, `target_type`, `url`, `frame_id`, `parent_frame_id` (absent for top-level frames), `loader_id`. This event establishes the nav context stamped on all subsequent events for the session. |
| `page_dom_content_loaded` | Nav context + `cdp_timestamp` (CDP monotonic seconds; not a wall-clock timestamp -- use `event.ts` for ordering). |
| `page_load` | Nav context + `cdp_timestamp` (CDP monotonic seconds). |
| `page_layout_shift` | Nav context + `source_frame_id`, `time`, `duration`. Optional `layout_shift_details` object: `score`, `had_recent_input`. Optional `lcp_details` object: `render_time`, `load_time`, `size`, `element_id`, `url`, `node_id`. Chrome multiplexes LCP candidate data through the same `PerformanceTimeline.timelineEventAdded` notification, so both may appear on a single event. |

#### Computed events

`network_idle`, `page_layout_settled`, and `page_navigation_settled` carry nav context fields only.

#### Interaction events

All interaction events include nav context plus the fields below.

| Event | Unique fields |
| --- | --- |
| `interaction_click` | `x`, `y` (viewport coords), `selector` (CSS selector of clicked element), `tag`, `text` (element text; empty for sensitive inputs). |
| `interaction_key` | `key` (key name), `selector`, `tag`. Not emitted for sensitive input fields. |
| `interaction_scroll_settled` | `from_x`, `from_y`, `to_x`, `to_y` (scroll positions in px), `target_selector`. |

#### Monitor lifecycle events

Lifecycle events use `source.kind = "local_process"` and carry no nav context, except `monitor_screenshot` which includes nav context alongside the image payload.

| Event | Fields |
| --- | --- |
| `monitor_screenshot` | Nav context + `png` (base64-encoded PNG). |
| `monitor_disconnected` | `reason: "chrome_restarted"`. |
| `monitor_reconnected` | `reconnect_duration_ms`. |
| `monitor_reconnect_failed` | `reason: "reconnect_exhausted"`. |
| `monitor_init_failed` | `step` (name of the init step that failed, e.g. `"Target.setAutoAttach"`). |