[DX-1211] Address currently silent failures around missing modes #2236
[DX-1211] Address currently silent failures around missing modes #2236umair-ably wants to merge 5 commits into
modes #2236Conversation
WalkthroughAdds a ChangesStrict Mode for Silent Failures
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes Possibly related PRs
Suggested reviewers
Poem
🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Warning Review ran into problems🔥 ProblemsGit: Failed to clone repository. Please run the Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
4f35b50 to
4746cfb
Compare
4746cfb to
36fc647
Compare
Declaration only — no runtime use yet. Documented silent-failure paths will read this option in subsequent commits to gate a hint-carrying throw; the default stays `false` in v2.x. Per DXRFC-022 work item B5 the default flips to `true` in v3 with no per-call opt-out. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
modes
36fc647 to
daa4edc
Compare
|
@coderabbitai review |
✅ Action performedReview finished.
|
There was a problem hiding this comment.
Pull request overview
This PR surfaces two previously silent realtime “missing channel modes” failures by emitting a default-level warning log (and, when clientOptions.strictMode: true, throwing an ErrorInfo with a remediation hint) for:
presence.get()withoutpresence_subscribechannel.subscribe()withoutsubscribe
Changes:
- Add runtime detection for missing
presence_subscribe/subscribemodes and emit hintful warnings or throw based onstrictMode. - Introduce
Logger.silentFailureLogSuffix()to explain the “warn now, throw in next major” behavior in log output. - Add tests covering strict vs default behavior and add
strictModetoClientOptionstypings.
Reviewed changes
Copilot reviewed 6 out of 7 changed files in this pull request and generated 5 comments.
Show a summary per file
| File | Description |
|---|---|
| test/realtime/presence.test.js | Adds coverage for presence.get() with/without presence_subscribe under strict/default modes. |
| test/realtime/channel.test.js | Adds coverage for channel.subscribe() with/without subscribe under strict/default modes. |
| src/common/lib/util/logger.ts | Adds a reusable suffix for strictMode-off silent-failure warnings. |
| src/common/lib/client/realtimepresence.ts | Warn/throw on presence.get() when presence_subscribe is not granted. |
| src/common/lib/client/realtimechannel.ts | Warn/throw on subscribe() when subscribe mode is not granted; adds one-time warning gate. |
| src/common/lib/client/realtimeannotations.ts | Fixes mode-check precedence and adds a log before throwing for missing annotation_subscribe. |
| ably.d.ts | Documents ClientOptions.strictMode. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
There was a problem hiding this comment.
Actionable comments posted: 2
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
src/common/lib/client/realtimechannel.ts (1)
503-530:⚠️ Potential issue | 🟠 Major | ⚡ Quick winAvoid registering the listener before the strict-mode validation can fail.
Lines 503-507 add the subscription before the attach/mode check runs. When
strictMode === true,subscribe()can reject on Line 529, but the listener remains registered. Retrying the call stacks duplicates, and a later reattach can unexpectedly activate callbacks from a failedsubscribe()attempt.Suggested fix
- // Filtered - if (event && typeof event === 'object' && !Array.isArray(event)) { - this.client._FilteredSubscriptions.subscribeFilter(this, event, listener); - } else { - this.subscriptions.on(event, listener); - } + const registerListener = () => { + if (event && typeof event === 'object' && !Array.isArray(event)) { + this.client._FilteredSubscriptions.subscribeFilter(this, event, listener); + } else { + this.subscriptions.on(event, listener); + } + }; // (RTL7g) + const strictMode = this.client.options.strictMode === true; + if (!strictMode) { + registerListener(); + } + if (this.channelOptions.attachOnSubscribe !== false) { const stateChange = await this.attach(); if (this.state === 'attached' && (this._mode & flags.SUBSCRIBE) === 0) { @@ - if (this.client.options.strictMode === true) throw err; + if (strictMode) throw err; } + if (strictMode) { + registerListener(); + } return stateChange; } else { + registerListener(); return null; }🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/common/lib/client/realtimechannel.ts` around lines 503 - 530, The listener is being registered before the attach/mode strict-mode check, causing orphaned listeners if attach/mode validation fails; change subscribe() so it first performs the attach() call and the mode check (the block that constructs the ErrorInfo and checks this.client.options.strictMode and this._silentSubscribeWarned) and only after that, if appropriate, register the listener via this.client._FilteredSubscriptions.subscribeFilter(this, event, listener) or this.subscriptions.on(event, listener); ensure that if attach/mode validation throws (strictMode === true) the listener is not registered, and if you intend to allow registration when strictMode === false, register only after the validation path completes successfully or explicitly permits it.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@src/common/lib/client/realtimechannel.ts`:
- Around line 509-533: The missing-subscribe validation is currently inside the
attachOnSubscribe branch so it is skipped when channelOptions.attachOnSubscribe
=== false; move the check that constructs ErrorInfo and warns/throws (the block
using this.state === 'attached', (this._mode & flags.SUBSCRIBE) === 0,
Logger.logActionNoStrip, this._silentSubscribeWarned and
client.options.strictMode) out of that attachOnSubscribe conditional so it runs
whenever the channel ends up attached (after calling attach() or if already
attached), regardless of attachOnSubscribe; keep using attach() to get
stateChange and still return stateChange (or null when not calling attach), but
ensure the subscribe-mode validation executes unconditionally after determining
the attached state.
In `@src/common/lib/util/logger.ts`:
- Around line 120-127: Update the static method silentFailureLogSuffix in the
logger (silentFailureLogSuffix) to return a generic suffix that doesn't assume
the behavior is a silent return; replace the phrase "returns silently" with
wording that covers any silent/no-op behavior (e.g., "currently has no
observable effect or fails silently") and keep the rest of the guidance about
clientOptions.strictMode and future throwing intact so the message is accurate
for subscribe/listener and other silent-failure paths.
---
Outside diff comments:
In `@src/common/lib/client/realtimechannel.ts`:
- Around line 503-530: The listener is being registered before the attach/mode
strict-mode check, causing orphaned listeners if attach/mode validation fails;
change subscribe() so it first performs the attach() call and the mode check
(the block that constructs the ErrorInfo and checks
this.client.options.strictMode and this._silentSubscribeWarned) and only after
that, if appropriate, register the listener via
this.client._FilteredSubscriptions.subscribeFilter(this, event, listener) or
this.subscriptions.on(event, listener); ensure that if attach/mode validation
throws (strictMode === true) the listener is not registered, and if you intend
to allow registration when strictMode === false, register only after the
validation path completes successfully or explicitly permits it.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 4cd8d76d-2fcc-471e-97ea-91f7f7785049
📒 Files selected for processing (7)
ably.d.tssrc/common/lib/client/realtimeannotations.tssrc/common/lib/client/realtimechannel.tssrc/common/lib/client/realtimepresence.tssrc/common/lib/util/logger.tstest/realtime/channel.test.jstest/realtime/presence.test.js
The guard at line 73 parenthesised `(state === 'attached' && _mode & flag) === 0` so the `=== 0` compared against `(boolean && number)`. It happened to behave correctly — `false === 0` is `false` so the throw skipped when not attached, and the bitwise result === 0 was correct when attached — but the comparison was structural luck rather than the obvious reading of the predicate. Re-parenthesise to `state === 'attached' && (_mode & flag) === 0`. Also emit an always-on warning log adjacent to the throw so the diagnostic fires in the SDK output even when the caller swallows the throw. No silentFailureLogSuffix here because this throw is unconditional (pre-DXRFC-022) and not strictMode-gated. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
A short suffix appended to silent-failure warning logs emitted when clientOptions.strictMode is off, so the reader knows the same path will throw in a future major version. Co-locating on Logger keeps the import surface tight; callers do `Logger.silentFailureLogSuffix()` next to `Logger.logActionNoStrip(...)`. Log-only by design — the suffix is not put into ErrorInfo.hint, because the hint is also shown when the throw fires (strictMode on), where the suffix would be misleading. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
When a channel was attached without the presence_subscribe mode, the
server never delivers presence members, so presence.get() resolves to
[] regardless of who is actually present. Today there is no signal
distinguishing "no one is present" from "this client cannot see anyone".
This commit detects the case after ensureAttached() populates the
server-granted mode set, then:
- emits an always-on warning log carrying the hint + a suffix telling
the reader that strictMode will throw in a future major version.
- throws ErrorInfo with err.hint when clientOptions.strictMode === true.
Code 93002 sits next to 93001 (annotation_subscribe missing) in the
SDK-internal precondition class. It would also be defensible to use
40160 (server-side capability denied), but the hint-coverage rubric
already pins 40160 to the "no auth options" hint shape, so a second
40160 throw site would either weaken that pin or need a rubric refactor.
Suspended-state and {waitForSync: false} paths return earlier and are
unaffected.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
When a channel was attached without the subscribe mode, the server never delivers messages to the listener, so channel.subscribe() appears to succeed but no callback ever fires. This commit closes the gap symmetrically to presence.get() in the previous commit: - After the implicit attach completes (RTL7g), if subscribe mode was not granted by the server, emit an always-on warning log carrying the hint + the future-throw suffix. - One-shot per channel: the warning fires once per attach cycle to keep noise down on long-lived listeners. Reset _silentSubscribeWarned on the ATTACHED message so a channels.release() + re-attach with corrected modes restores signalling. - Throw ErrorInfo (code 93003) when clientOptions.strictMode === true. attachOnSubscribe: false is out of scope — the check requires an attach to have populated _mode. Document this caveat separately if needed. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
daa4edc to
8af10ac
Compare
|
@coderabbitai review |
✅ Actions performedReview triggered.
|
There was a problem hiding this comment.
🧹 Nitpick comments (2)
test/realtime/presence.test.js (1)
2403-2431: ⚡ Quick winAlign the non-strict test claim with what it actually verifies.
The test name says it verifies warning logging, but assertions only check resolution to
[]. Either assert the warning output explicitly or rename the test to avoid over-claiming behavior.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@test/realtime/presence.test.js` around lines 2403 - 2431, The test "with strictMode disabled (default), logs a warning and resolves to []" currently only asserts that channel.presence.get() resolves to []; update it to either (A) explicitly assert the warning is emitted by capturing/logging the realtime or helper logger during the test and checking that a warning containing "strictMode" or similar text was logged when presence.get() ran (reference the test's realtime instance, channel.presence.get(), and helper logger/monitoring utilities), or (B) rename the test string to something accurate like "with strictMode disabled (default), resolves to []" so it no longer claims to check warning logging; ensure you change only the test description or add the log assertion around the existing whenPromiseSettles callback where presence.get() is validated.test/realtime/channel.test.js (1)
2118-2153: ⚡ Quick winAdd coverage for
attachOnSubscribe: falsein this DX-1211 test block.These tests cover the default attach-on-subscribe flow, but not the manual-attach flow (
attachOnSubscribe: false). Add a case where the channel is explicitly attached withoutsubscribemode, thensubscribe()is called, to lock in strict/non-strict behavior on that path too.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@test/realtime/channel.test.js` around lines 2118 - 2153, Add tests for the manual-attach flow by creating channels with attachOnSubscribe: false (via channels.get(..., { modes: ['publish'], attachOnSubscribe: false })) then explicitly call channel.attach() before calling channel.subscribe() to exercise the path where subscribe does not auto-attach; add two cases mirroring the existing ones: one with realtime = helper.AblyRealtime({ strictMode: true }) asserting channel.subscribe() rejects with code 93003 and an appropriate hint, and one with realtime = helper.AblyRealtime() (default strictMode false) asserting channel.subscribe() resolves (returns null or a ChannelStateChange-like object) and channel.state === 'attached'; reuse the same naming pattern and helper.monitorConnectionThenCloseAndFinishAsync wrapper as the other tests.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Nitpick comments:
In `@test/realtime/channel.test.js`:
- Around line 2118-2153: Add tests for the manual-attach flow by creating
channels with attachOnSubscribe: false (via channels.get(..., { modes:
['publish'], attachOnSubscribe: false })) then explicitly call channel.attach()
before calling channel.subscribe() to exercise the path where subscribe does not
auto-attach; add two cases mirroring the existing ones: one with realtime =
helper.AblyRealtime({ strictMode: true }) asserting channel.subscribe() rejects
with code 93003 and an appropriate hint, and one with realtime =
helper.AblyRealtime() (default strictMode false) asserting channel.subscribe()
resolves (returns null or a ChannelStateChange-like object) and channel.state
=== 'attached'; reuse the same naming pattern and
helper.monitorConnectionThenCloseAndFinishAsync wrapper as the other tests.
In `@test/realtime/presence.test.js`:
- Around line 2403-2431: The test "with strictMode disabled (default), logs a
warning and resolves to []" currently only asserts that channel.presence.get()
resolves to []; update it to either (A) explicitly assert the warning is emitted
by capturing/logging the realtime or helper logger during the test and checking
that a warning containing "strictMode" or similar text was logged when
presence.get() ran (reference the test's realtime instance,
channel.presence.get(), and helper logger/monitoring utilities), or (B) rename
the test string to something accurate like "with strictMode disabled (default),
resolves to []" so it no longer claims to check warning logging; ensure you
change only the test description or add the log assertion around the existing
whenPromiseSettles callback where presence.get() is validated.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: f61a171a-8956-4253-a6a4-8dc13ae0f727
📒 Files selected for processing (6)
src/common/lib/client/realtimeannotations.tssrc/common/lib/client/realtimechannel.tssrc/common/lib/client/realtimepresence.tssrc/common/lib/util/logger.tstest/realtime/channel.test.jstest/realtime/presence.test.js
Targets #2233 - review that first please
Author's Note
I'd advise reading the Claude generated sections below as they outline the experiments and exact numbers, but if you want a summary of my thoughts...
This PR aims to address 2 silent failures we have today:
presence.get()on a channel withoutpresence_subscribereturns an empty list.channel.subscribe()on a channel withoutsubscriberegisters a listener that never fires.The fixes themselves were easy in that we just throw an error and hint pairing when running into these issues. We also introduce a
strictModewhich allows us to keep this as an additive change (log instead of throw) before promoting it to a breaking change in a later release (always throw). We want this strictMode in case people don't wish to upgrade to a breaking change release, but they can still benefit from the improvements we've made here.The experiments themselves were the tricky and time-consuming part.
The full experiment runs and details are at https://github.com/ably-labs/llm-eval/pull/10
What this PR does
Two SDK calls fail silently today:
presence.get()on a channel withoutpresence_subscribereturns an empty list.channel.subscribe()on a channel withoutsubscriberegisters a listener that never fires.Both with no error and no warning. A developer or AI agent gets back "nothing", assumes it works, and
ships broken code.
This PR makes those calls speak up: a warning log by default, and — if you set
clientOptions.strictMode— a thrown error with a hint instead of the silent result. This is the v2.xstep of the A5/B5 rollout in DXRFC-022 (warn now;
strictModebecomes the default in v3).How we tested it
We had AI agents do small Ably coding tasks, with no web access, treating the SDK as a black box. Each
task ran against three versions of the SDK:
strictMode: true: the call throws instead of returning silently.(A fourth, B-verbose, is the warning shown only if the developer raised the log level themselves.)
About 370 runs, 0 invalid, on Opus 4.8 and Sonnet 4.6, plus Haiku 4.5 on the last experiment. The
agents can't cheat by swapping in a better API key — the harness blocks it. Every result is compared
against the currently-published
ably@2.21.0.We labelled each run by what the agent did:
The main result
On the strong models, the surfacing barely changed anything — because Opus and Sonnet usually work out
the cause on their own. At baseline they shipped broken code only about 10% of the time, so there was
little left to fix.
But that's the easy case: a capable model, full attention, a tiny task. In real use a strong model
often has a full or long context window, or is working in a large unfamiliar codebase, and can't give
each empty result that much attention. We can't easily simulate "Opus with a full context window", so
we used Haiku as a stand-in for a strong model that can't fully focus — same task, less reasoning
brought to bear.
That is where the surfacing matters. On the keystone task:
strictMode)So the change removes most silent failures exactly when the model is stretched thin — the realistic
case, not the exception.
This is a stand-in, not a direct measurement: we did not run Opus with a deliberately full context.
Haiku is the proxy. Measuring a frontier model under real load is the obvious next test.
The seven experiments
works — there's no test." The key can't subscribe, so nothing arrives. Opus/Sonnet shipped broken
2/20 at baseline, 1/20 with B-error, 0/10 with C. More telling: at baseline, 0 of 20 runs reached
the diagnosis from the SDK (they guessed, or tripped a loud server error by probing); with B-error,
20 of 20 did.
models nearly always noticed (the task says someone is present, and a wrong guess triggers a loud
server error anyway), so all arms scored about the same.
(B-error) against the throw (C). Result: B-error ≈ C — the default warning does the job, and
strictModeis the backstop for the rest.channel options. Every model and every arm fixed it cleanly in about 6 steps. When the bug is
reported and the cause is in plain sight, the warning adds nothing.
naming
strictMode. 100% of runs foundstrictModein the type definitions and switched it on.Agents read the JSDoc.
modesreplaces thedefaults, so adding one mode silently drops presence. We buried this in a multi-file app and in a
from-scratch build. Two-thirds of agents made the mistake — but on the strong models they all caught
it themselves, so nothing shipped broken. The "wasted hours debugging" failures we'd heard about did
not show up on the strong models at this scale.
fixes a reported bug. The no-oracle task (1) is where the 50% → 10% → 0% result above came from.
Where it doesn't help
Good next step: clearer JSDoc
Experiment 5 showed agents reliably read the SDK's type definitions — 100% found
strictModethere.That makes the JSDoc a cheap, high-value place to stop these failures before they happen.
The clearest example is the bug in experiment 6. The JSDoc for
ChannelOptions.modesjust says "Anarray of ChannelMode objects." It does not say that setting
modesreplaces the defaults instead ofadding to them — which is the exact cause of that silent failure. One added line ("setting
modesreplaces the default set — list every mode you need") would stop agents making the mistake at all, in
the place they actually look.
This PR makes failures speak up at runtime. Clearer JSDoc on
modes(and related options) would preventsome of them up front. The two together are the natural follow-up.
Summary by CodeRabbit
New Features
strictModeoption (default: false) that turns documented silent-failure cases into thrown errors with descriptive hints.Bug Fixes
Tests