Skip to content

RATIS-2497. Pass server to the dummy watch request in OrderedAsync#1438

Merged
szetszwo merged 2 commits intoapache:masterfrom
szetszwo:RATIS-2497
Apr 23, 2026
Merged

RATIS-2497. Pass server to the dummy watch request in OrderedAsync#1438
szetszwo merged 2 commits intoapache:masterfrom
szetszwo:RATIS-2497

Conversation

@szetszwo
Copy link
Copy Markdown
Contributor

Copy link
Copy Markdown
Contributor

@ivandika3 ivandika3 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @szetszwo for the patch. Left one comment.

Comment on lines 42 to +43
return TraceClient.asyncSend(
() -> client.getOrderedAsync().send(type, message, server), type, server);
() -> client.getOrderedAsync(server).send(type, message, server), type, server);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Currently, if the server is not a leader, it will throw NotLeaderException since watchAsync calls checkLeaderState. This might cause AsyncStreamObservers#completeReplyExceptionally which will trigger client to invalidate all the pending replies.

One idea is to allow follower to accept the watch request if the RaftClientRequest serverId matches, but currently WatchRequests is stored in the LeaderStateImpl. So we might want to make WATCH(0) to be a special "bootstrap" request that can be handled by follower, otherwise the watch should throw NotLeaderException to preserve the watch semantics.

Also, could you help to add some tests to test to behavior similar to #1433? Set the dummy request to true for LinearizableReadTests should be enough.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

... watchAsync calls checkLeaderState ...

You are right -- we should use staleRead instead.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

... Set the dummy request to true for LinearizableReadTests should be enough.

Sure. Let me include the change.

@szetszwo
Copy link
Copy Markdown
Contributor Author

szetszwo commented Apr 22, 2026

Currently, if the server is not a leader, it will throw NotLeaderException ...

... we should use staleRead instead.

@ivandika3 , Tried to set a special minIndex (e.g. Long.MAX_VALUE/MIN_VALUE) to staleRead but it became an incompatible change since users might also set it.

Then, found that watch request passed a null message and users could not set it. So, we may send a watch request with a "DUMMY" message. The server can detect it and return immediately without leader check.

LinearizableReadTests can pass.

@ivandika3
Copy link
Copy Markdown
Contributor

@szetszwo Thanks for the patch. LGTM +1.

Tried to set a special minIndex (e.g. Long.MAX_VALUE/MIN_VALUE) to staleRead but it became an incompatible change since users might also set it.

Yes, we should not use staleRead since the behavior might be coupled with StateMachine#queryStale which can cause unpredictability.

Then, found that watch request passed a null message and users could not set it. So, we may send a watch request with a "DUMMY" message. The server can detect it and return immediately without leader check.

This is a good workaround.

LinearizableReadTests can pass.

Thanks, do you think we should include tests like runTestFollowerLinearizableReadStaysOnFollower in #1433 to be more explicit? I'm OK if you choose to leave it be.

@ivandika3
Copy link
Copy Markdown
Contributor

@OneSizeFitsQuorum Would you like to take a look?

@OneSizeFitsQuorum
Copy link
Copy Markdown
Contributor

LGTM~
The DUMMY request only establishes the RPC connection and bypasses leader state checks, while real read requests still follow the full linearizability path.
I think this is a good workaround!

@szetszwo
Copy link
Copy Markdown
Contributor Author

@ivandika3 , @OneSizeFitsQuorum , thanks for reviewing this!

  • Checked the test that the read-from-follower client only talked to a that follower and there were no failovers.
  • This is a compatible change:
    • Old client -> New server: the old client sends a normal watch request to the new server. It will processed normally.
    • New client -> Old server: the new client sends a DUMMY watch request to the old server. The DUMMY message will be ignored and the watch request will be processed as a normal watch request.

@szetszwo szetszwo merged commit 3d9fd18 into apache:master Apr 23, 2026
18 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants