fix: set UTF-8 charset before reading request body in servlet transports #930

Open
suryateja-g13 wants to merge 1 commit into modelcontextprotocol:main from suryateja-g13:fix/880-servlet-request-charset-encoding

Summary

Fixes #880

All three servlet-based server transports called request.getReader() without first calling request.setCharacterEncoding("UTF-8"). Per the Jakarta Servlet spec, getReader() falls back to ISO-8859-1 when the Content-Type header carries no explicit charset parameter and no default request encoding has been configured for the web application or container. Content-Type: application/json without a charset is valid per RFC 8259, which also requires JSON exchanged between systems to be encoded as UTF-8, so clients legitimately send UTF-8 bodies this way. As a result, non-ASCII characters in tool names, argument values, and notification data were silently corrupted.
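The corruption described above can be reproduced without a servlet container. The following is a minimal sketch, not code from this PR: the class and method names (CharsetMismatchDemo, decode) are made up for illustration, and it only models what happens when UTF-8 bytes are decoded by a reader that assumes ISO-8859-1.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Illustrative only: models a client that sends UTF-8 bytes and a server
// reader that decodes them with some (possibly different) charset.
final class CharsetMismatchDemo {

    // Encodes the text as UTF-8 (what the client puts on the wire) and
    // decodes it with readerCharset (what the server-side reader assumes).
    static String decode(String original, Charset readerCharset) {
        byte[] wire = original.getBytes(StandardCharsets.UTF_8);
        return new String(wire, readerCharset);
    }

    public static void main(String[] args) {
        // "caf\u00e9" is "cafe" with an acute accent on the final letter.
        String sent = "{\"name\":\"caf\u00e9\"}";

        // Wrong charset: the two UTF-8 bytes of U+00E9 become two separate characters.
        System.out.println(decode(sent, StandardCharsets.ISO_8859_1));

        // Matching charset: the text round-trips unchanged.
        System.out.println(decode(sent, StandardCharsets.UTF_8));
    }
}
```

No exception is thrown in the mismatched case, which is why the failure is silent: the JSON still parses, but the string values contain the wrong characters.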

The fix adds request.setCharacterEncoding("UTF-8") immediately before each request.getReader() call in the three affected transports:

  • HttpServletStreamableServerTransportProvider
  • HttpServletSseServerTransportProvider
  • HttpServletStatelessServerTransport

This mirrors the fix already applied to StdioServerTransportProvider in #826 and the response path in #881.
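The ordering that the fix relies on can be sketched as follows. This is not the code from the three transports: StubRequest is a hypothetical stand-in that mimics only the two HttpServletRequest methods involved, so that the example runs without a servlet container, and readBody is an illustrative helper name. In the real API, setCharacterEncoding has to be called before getReader() or any parameter access, otherwise it has no effect.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical stand-in for HttpServletRequest: it starts with the
// ISO-8859-1 fallback that applies when no charset is declared.
final class StubRequest {
    private final byte[] body;
    private Charset encoding = StandardCharsets.ISO_8859_1;

    StubRequest(byte[] body) {
        this.body = body;
    }

    void setCharacterEncoding(String name) {
        this.encoding = Charset.forName(name);
    }

    BufferedReader getReader() {
        return new BufferedReader(
                new InputStreamReader(new ByteArrayInputStream(body), encoding));
    }
}

final class BodyReading {

    // Sets the encoding first, then reads the whole body into a string.
    static String readBody(StubRequest request) {
        request.setCharacterEncoding("UTF-8"); // must precede getReader()
        StringBuilder sb = new StringBuilder();
        try (BufferedReader reader = request.getReader()) {
            String line;
            while ((line = reader.readLine()) != null) {
                sb.append(line);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        byte[] wire = "{\"text\":\"caf\u00e9\"}".getBytes(StandardCharsets.UTF_8);
        System.out.println(readBody(new StubRequest(wire)));
    }
}
```

One design point worth noting: calling setCharacterEncoding unconditionally also overrides a charset that a client declared explicitly. Since JSON bodies are expected to be UTF-8, that is presumably acceptable here, but it is an assumption of this sketch rather than something the PR states.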

Changes

  • HttpServletStreamableServerTransportProvider.java — set UTF-8 encoding before getReader() in POST handler
  • HttpServletSseServerTransportProvider.java — set UTF-8 encoding before getReader() in message handler
  • HttpServletStatelessServerTransport.java — set UTF-8 encoding before getReader() in request handler

Testing

  • All existing servlet integration tests pass: HttpServletStreamableIntegrationTests (32 tests), HttpServletSseIntegrationTests, HttpServletStatelessIntegrationTests (10 tests) — 74 total, 0 failures.
  • The fix can be manually verified by sending a tool call with non-ASCII arguments (e.g. Japanese, Chinese, emoji) to any servlet-based MCP server without charset=utf-8 in the Content-Type header.
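For the manual check, a request of the following shape could be used. This is a sketch under several assumptions: the endpoint URL, the tool name "echo", and its "text" argument are hypothetical, and a real session would also need whatever initialization and extra headers the chosen transport expects, which are omitted here. The point is only that the body is UTF-8 while the Content-Type header carries no charset parameter.

```java
import java.net.URI;
import java.net.http.HttpRequest;
import java.nio.charset.StandardCharsets;

final class NonAsciiProbe {

    // tools/call payload with a Japanese argument value, written with
    // Unicode escapes so the source file stays ASCII. The tool name
    // "echo" and the "text" argument are assumptions.
    static final String BODY =
            "{\"jsonrpc\":\"2.0\",\"id\":1,\"method\":\"tools/call\","
            + "\"params\":{\"name\":\"echo\",\"arguments\":"
            + "{\"text\":\"\u3053\u3093\u306b\u3061\u306f\"}}}";

    // Builds a POST whose Content-Type deliberately omits charset=utf-8.
    static HttpRequest build(String endpoint) {
        return HttpRequest.newBuilder(URI.create(endpoint))
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(BODY, StandardCharsets.UTF_8))
                .build();
    }

    public static void main(String[] args) {
        // Hypothetical local endpoint; adjust to the server under test.
        HttpRequest request = build("http://localhost:8080/mcp");
        System.out.println(request.method() + " " + request.uri());
        System.out.println(request.headers().map());
    }
}
```

The request can then be sent with java.net.http.HttpClient, for example HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString()), and the argument text the server received or echoed back compared with what was sent.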

All three servlet-based server transports called request.getReader()
without first setting the character encoding. Per the Jakarta Servlet
spec, getReader() defaults to ISO-8859-1 when the Content-Type header
has no explicit charset parameter. Since application/json without a
charset is valid per RFC 8259, non-ASCII characters in tool names,
argument values, and notification data were silently corrupted.

The analogous server-side fix was applied to StdioServerTransportProvider
in modelcontextprotocol#826 and to the HTTP response path in modelcontextprotocol#881. This commit completes the
fix by applying request.setCharacterEncoding("UTF-8") before getReader()
in HttpServletStreamableServerTransportProvider,
HttpServletSseServerTransportProvider, and
HttpServletStatelessServerTransport.

Fixes modelcontextprotocol#880

Successfully merging this pull request may close these issues.

Servlet-based server transports read request body with wrong charset encoding
