High-performance JSON streaming parser for Node.js with multiple implementations optimized for different use cases.
```shell
npm install '@oresoftware/json-stream-parser'
```

Automatic Native Addon Compilation: The package automatically attempts to compile the native addon for your local machine architecture during installation via the postinstall script. The build system tries modern tools first, then falls back to traditional tools:
- cmake-js (modern CMake-based build) - requires CMake
- node-gyp-build (modern wrapper) - drop-in replacement for node-gyp
- node-gyp (traditional fallback) - usually bundled with npm

If compilation fails (e.g., missing build tools), the package will still work, but the native parser will be unavailable. You can:
- Use the pure JavaScript `JSONParser` instead (works without the native addon)
- Install build tools and run `npm run build:native` manually
- On most systems, `node-gyp` will automatically install the required dependencies

Build Requirements (for native addon):
- Option 1 (Modern): CMake 3.10+ and `cmake-js` (via `npm install -g cmake-js`)
- Option 2 (Modern): `node-gyp-build` (via `npm install -g node-gyp-build`)
- Option 3 (Traditional): Python 3.x, a C++ compiler (g++ on Linux/macOS, Visual Studio on Windows), and `node-gyp` (usually bundled with npm)
```typescript
import {JSONParser} from '@oresoftware/json-stream-parser';
import * as net from 'net';

// TCP connection example
const ws = net.createConnection(6970, 'localhost');
ws.setEncoding('utf8')
  .pipe(new JSONParser()) // TCP connection is bidirectional/full-duplex
  .on('data', (obj) => {
    // Receive parsed JSON objects from the TCP server
    console.log('Received:', obj);
  });

// Send JSON data to the server
ws.write(JSON.stringify({some: 'data'}) + '\n', 'utf8');
```

```typescript
import * as fs from 'node:fs';
import {createJsonParserNativeFromFd} from '@oresoftware/json-stream-parser';

const fd = fs.openSync('/path/to/file.jsonl', 'r');

// Optimized mode is the default (passRawBuffers: true)
const s = createJsonParserNativeFromFd(fd, {
  delimiter: '\n',
  batchSize: 2048
});

s.on('data', (obj) => {
  // obj is a fully parsed POJO (nested objects/arrays supported)
  console.log('Parsed:', obj);
});
```

| Use Case | Parser | Why |
|---|---|---|
| TCP/WebSocket connections | JSONParser | Streams don't have file descriptors |
| stdin/stdout | JSONParser | Process streams |
| Child process pipes | JSONParser | Stream-based |
| File descriptors | createJsonParserNativeFromFd | Background I/O thread, better performance |
| Large files | createJsonParserNativeFromFd | Optimized for throughput |
| Main thread busy | createJsonParserNativeFromFd | Background I/O + zero-copy buffers |
```typescript
import * as net from 'net';
import {JSONParser} from '@oresoftware/json-stream-parser';

const [port, host] = [6970, 'localhost'];
const ws = net.createConnection(port, host);
ws.setEncoding('utf8')
  .pipe(new JSONParser()) // TCP connection is bidirectional/full-duplex
  .on('data', (obj) => {
    // Receive parsed JSON objects from the TCP server
    console.log('Received:', obj);
  });

// Send JSON data to the server
ws.write(JSON.stringify({some: 'data'}) + '\n', 'utf8', (err) => {
  if (err) console.error('Write error:', err);
});
```

```typescript
import * as net from 'net';
import {JSONParser} from '@oresoftware/json-stream-parser';

const server = net.createServer((socket) => {
  console.log('Client connected');
  const parser = new JSONParser();
  socket.setEncoding('utf8').pipe(parser);
  parser.on('data', (obj) => {
    console.log('Received from client:', obj);
    // Echo back with a response
    socket.write(JSON.stringify({
      echo: obj,
      timestamp: Date.now()
    }) + '\n');
  });
  socket.on('end', () => {
    console.log('Client disconnected');
  });
});

server.listen(6970, () => {
  console.log('TCP JSON server listening on port 6970');
});
```

```typescript
import {JSONParser} from '@oresoftware/json-stream-parser';

process.stdin.resume()
  .pipe(new JSONParser())
  .on('data', (obj) => {
    // Process parsed JSON objects
    console.log('Parsed:', obj);
  });
```

```typescript
import {spawn} from 'node:child_process';
import {JSONParser} from '@oresoftware/json-stream-parser';

const child = spawn('bash', ['-c', 'echo \'{"foo":"bar"}\n\'']);
child.stdout
  .pipe(new JSONParser())
  .on('data', (obj) => {
    console.log('Received:', obj); // => {foo: 'bar'}
  });
```

`delayEvery: integer`
Every N chunks, the parser uses `setImmediate()` to delay processing; useful to avoid blocking the event loop for too long.

`emitNonJSON: boolean`
If a line of input cannot be parsed as JSON, it is emitted as a `'string'` event, but it is not pushed to the output.
The native parser captures the file descriptor directly in C++, bypassing Node.js streams. Data flows: Kernel → C++ background thread → Zero-copy → JS (no Node.js stream layer).
```typescript
import * as fs from 'node:fs';
import * as net from 'node:net';
import {
  createJsonParserNativeFromFd,
  createJsonParserNativeFromStdin,
  createJsonParserNativeFromPath,
  createJsonParserNativeFromSocket,
  RawStringSymbol,
  RawJSONBytesSymbol
} from '@oresoftware/json-stream-parser';

// 1) From a file path (auto-opens + auto-closes the FD)
const s1 = createJsonParserNativeFromPath('/path/to/file.jsonl', { delimiter: '\n' });

// 2) From stdin (fd=0)
// IMPORTANT: do not also do `process.stdin.pipe(...)` at the same time.
const s2 = createJsonParserNativeFromStdin({ delimiter: '\n' });

// 3) From any existing FD you already have
const fd = fs.openSync('/path/to/file.jsonl', 'r'); // you own this FD
const s3 = createJsonParserNativeFromFd(fd, { delimiter: '\n', closeFdOnEnd: false, passRawBuffers: true });

// 4) From a TCP socket (net.Socket)
// IMPORTANT: do not attach 'data' listeners / pipe() this socket in JS at the same time.
const sock = net.createConnection(6970, 'localhost');
const s4 = createJsonParserNativeFromSocket(sock, { delimiter: '\n' });

// 5) From a unix domain socket path (net.Socket)
const usock = net.createConnection({ path: '/tmp/my.sock' });
const s5 = createJsonParserNativeFromSocket(usock, { delimiter: '\n' });

const s = s1; // pick one of the above
s.on('data', (obj) => {
  // obj is a fully parsed POJO/array/value (nested OK)
  // metadata (if enabled) uses the same symbols as the TS parser:
  // obj[RawStringSymbol], obj[RawJSONBytesSymbol]
  console.log('Parsed:', obj);
});

// Optional metadata + behavior flags:
const sWithMeta = createJsonParserNativeFromPath('/path/to/file.jsonl', {
  delimiter: '\n',
  batchSize: 64,
  includeRawString: true,
  includeByteCount: true,
  emitNonJSON: true
});

sWithMeta.on('string', (line) => {
  // only when emitNonJSON: true
  console.log('Non-JSON line:', line);
});

sWithMeta.on('stats', (stats) => {
  // { bytesRead, bytesWritten, linesOk, linesFailed, ended }
  console.log('Stats:', stats);
});
```

When you use `createJsonParserNativeFromStdin()` or `createJsonParserNativeFromSocket()`, native code reads the FD directly. Do not also consume that same stream in JS-land (no `.pipe()`, no `'data'` listeners), or you'll race for bytes.
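The race exists because two consumers of one FD share a single file offset: whichever reads first takes those bytes, and the other never sees them. A small illustration with plain `fs` calls (the temp-file path is hypothetical):

```typescript
import * as fs from 'node:fs';
import * as os from 'node:os';
import * as path from 'node:path';

// Two reads against the same FD share one file offset, so each byte is
// delivered to exactly one reader. This is why a JS 'data' listener and the
// native reader must never consume the same FD concurrently.
const file = path.join(os.tmpdir(), 'fd-race-demo.jsonl'); // hypothetical temp file
fs.writeFileSync(file, '{"a":1}\n{"b":2}\n');
const fd = fs.openSync(file, 'r');

const first = Buffer.alloc(8);
const second = Buffer.alloc(8);
fs.readSync(fd, first, 0, 8, null);  // position null → use (and advance) the shared offset
fs.readSync(fd, second, 0, 8, null); // continues from where the first read stopped
console.log(first.toString());  // {"a":1}
console.log(second.toString()); // {"b":2}
fs.closeSync(fd);
```

With concurrent readers, the interleaving is nondeterministic, so delimiter framing breaks.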
stdin is file descriptor 0 - pass it directly:

```typescript
import {createJsonParserNativeFromFd} from '@oresoftware/json-stream-parser';

// stdin (fd 0) → C++ background thread → JS (bypasses Node.js streams)
const parser = createJsonParserNativeFromFd(0, {
  delimiter: '\n',
  batchSize: 2048
});

parser.on('data', (obj) => {
  console.log('Parsed:', obj);
});
```

You can extract the file descriptor from a socket and use the native parser:
```typescript
import * as net from 'net';
import {createJsonParserNativeFromFd, JSONParser} from '@oresoftware/json-stream-parser';

const server = net.createServer((socket) => {
  // Get the underlying fd (internal API - may change between Node.js versions)
  const fd = (socket as any)._handle?.fd;
  if (fd !== undefined && fd >= 0) {
    // Pass the fd directly to C++ - bypasses the Node.js stream layer!
    const parser = createJsonParserNativeFromFd(fd, {
      delimiter: '\n',
      batchSize: 2048
    });
    parser.on('data', (obj) => {
      console.log('Received:', obj);
      // socket.write() still works for sending data
      socket.write(JSON.stringify({echo: obj}) + '\n');
    });
  } else {
    // Fall back to the stream parser if the fd is not available
    socket.pipe(new JSONParser()).on('data', (obj) => {
      console.log('Received:', obj);
    });
  }
});

server.listen(6970);
```

Note: Socket fd access uses internal Node.js APIs. See Direct File Descriptor Access for details.
```typescript
const parser = new JSONParser({
  delimiter: '\n',    // Separator between JSON objects (default: '\n')
  emitNonJSON: false, // Emit non-JSON lines as 'string' events (default: false)
  delayEvery: 0       // Yield to the event loop every N chunks (default: 0 = no yielding)
});
```

```typescript
import {JSONParser, RawStringSymbol, RawJSONBytesSymbol} from '@oresoftware/json-stream-parser';

const parser = new JSONParser({
  includeRawString: true, // Attach the original JSON string
  includeByteCount: true  // Attach the byte count
});

parser.on('data', (obj) => {
  const rawJson = obj[RawStringSymbol];      // Original JSON string
  const byteCount = obj[RawJSONBytesSymbol]; // Byte count
  // Your parsed object
  console.log('Parsed:', obj);
});
```

If your JSON is separated by something other than newlines:

```typescript
const parser = new JSONParser({
  delimiter: '∆∆∆' // Use a custom delimiter to separate JSON chunks
});
stream.pipe(parser);
```

The native parser uses a C++ background thread (std::thread) for I/O and provides better performance, especially when the main thread is busy.
- ✅ Background I/O: File reading happens on a separate thread
- ✅ Zero-copy buffers: Efficient data transfer (default mode)
- ✅ V8 optimization: Uses V8's highly optimized `JSON.parse()`
- ✅ Nested objects: Supports objects and arrays of any depth
- ✅ Resilient under load: Only 1.5x slower at 90% CPU load

Automatic: The native addon is automatically compiled during npm install via the postinstall script.

Manual: If you need to rebuild it manually:

```shell
npm run build:native
# or
node-gyp rebuild
```

The native parser supports two implementation modes, selectable via `passRawBuffers`:
Mode 1: Optimized (`passRawBuffers: true`, default)
- Native thread: I/O + splitting
- Main thread: `JSON.parse()` on raw buffers
- Best for: when the main thread is busy (I/O offloaded)
- Performance: ~13ms for 10K objects

Mode 2: C++ Parsing (`passRawBuffers: false`)
- Native thread: I/O + splitting + C++ JSON parsing + N-API object construction
- Main thread: receives pre-parsed POJOs
- Best for: when the main thread is idle (direct object construction)
- Performance: ~12.6ms for 10K objects

Both modes are fully supported and tested. The performance difference is minimal (~2%).
```typescript
const s = createJsonParserNativeFromFd(fd, {
  delimiter: '\n',         // Separator between JSON objects
  batchSize: 2048,         // Batch size (larger = better throughput)
  passRawBuffers: true,    // Default: true (optimized mode)
  includeRawString: true,  // Attach the original JSON string
  includeByteCount: true,  // Attach the byte count
  emitNonJSON: true,       // Emit non-JSON lines as 'string' events
  trackBytesRead: true,    // Track bytes read
  trackBytesWritten: true, // Track bytes written
  yieldEvery: 0            // Yield to the event loop every N items
});
```

For 5,000 nested JSON objects:
| CPU Load | Time (ms) | Throughput (obj/sec) | Slowdown |
|---|---|---|---|
| 0% (idle) | 21.27 | 235,089 | 1.00x |
| 50% | 29.04 | 172,195 | 1.37x |
| 90% | 32.05 | 155,984 | 1.51x |
Even at 90% CPU load, the native parser maintains excellent performance!
For a pure JavaScript implementation using Node.js worker threads:

```typescript
import * as fs from 'node:fs';
import {createJsonParserWorkerFromFd} from '@oresoftware/json-stream-parser';

const fd = fs.openSync('/path/to/file.jsonl', 'r');
const s = createJsonParserWorkerFromFd(fd, {
  delimiter: '\n',
  batchSize: 512
});

s.on('data', (obj) => {
  // obj is a fully parsed POJO (parsed in a worker thread, passed via structured cloning)
  // Works with nested objects and arrays
  console.log('Parsed:', obj);
});
```

Note: The native parser is typically faster than the worker parser because:
- Native uses zero-copy buffers (just a pointer transfer)
- Worker uses structured cloning (full serialization/deserialization of the object graph)
- For nested objects, structured-cloning overhead can be significant
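The cloning overhead is easy to see in isolation: `structuredClone` (the same algorithm used for worker `postMessage`) rebuilds the entire object graph, so every nested object is re-allocated on arrival:

```typescript
// structuredClone — the algorithm behind worker postMessage — deep-copies
// the whole object graph rather than sharing references.
const nested = {items: [1, 2, {deep: 'value'}]};
const cloned = structuredClone(nested);

console.log(JSON.stringify(cloned));              // structurally equal...
console.log(cloned.items[2] === nested.items[2]); // ...but false: nothing is shared
```

For deeply nested payloads, that per-object allocation is where the worker parser loses ground to the native parser's buffer hand-off.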
```typescript
const parser = new JSONParser();

parser.on('error', (err) => {
  console.error('Parse error:', err);
  // The parser continues processing other chunks
});

parser.on('data', (obj) => {
  try {
    // Process the object
  } catch (err) {
    console.error('Processing error:', err);
  }
});
```

When the main thread is idle:
- JSONParser (TS): Fastest (~16ms for 5K objects)
- Native-optimized: Slightly slower (~21ms) due to thread overhead
- Worker: Slowest (~31ms) due to structured cloning

When the main thread is busy:
- Native-optimized: Best performance (only 1.37x slower at 50% load)
- Worker: Good performance (parsing offloaded to a worker thread)
- JSONParser: Degrades significantly (parsing blocks the main thread)
```typescript
class JSONParser<T = any> extends stream.Transform {
  constructor(opts?: JSONParserOpts);
  // Events
  on(event: 'data', listener: (obj: T) => void): this;
  on(event: 'string', listener: (line: string) => void): this;
  on(event: 'error', listener: (err: Error) => void): this;
}

interface JSONParserOpts {
  delimiter?: string;
  emitNonJSON?: boolean;
  includeRawString?: boolean;
  includeByteCount?: boolean;
  delayEvery?: number;
  wrapMetadata?: boolean;
}
```

```typescript
function createJsonParserNativeFromFd(
  fd: number,
  opts?: JsonParserNativeOpts
): stream.Readable;

interface JsonParserNativeOpts {
  delimiter?: string;
  batchSize?: number;
  passRawBuffers?: boolean; // Default: true (optimized mode)
  includeRawString?: boolean;
  includeByteCount?: boolean;
  emitNonJSON?: boolean;
  trackBytesRead?: boolean;
  trackBytesWritten?: boolean;
  yieldEvery?: number;
  wrapMetadata?: boolean;
}
```

```typescript
function createJsonParserWorkerFromFd(
  fd: number,
  opts?: JsonParserWorkerOpts
): stream.Readable;

interface JsonParserWorkerOpts {
  delimiter?: string;
  batchSize?: number;
  emitNonJSON?: boolean;
  trackBytesRead?: boolean;
  trackBytesWritten?: boolean;
  yieldEvery?: number;
}
```

The native parser captures file descriptors directly in C++ so data doesn't flow through Node.js before reaching the native runtime:
- Bypasses Node.js streams - Data goes directly from the kernel to a C++ background thread
- Zero-copy transfer - Raw buffers are passed to JS without intermediate copies
- Background I/O - Reading happens on a separate C++ thread using the `read()` syscall
- Better performance - No JavaScript stream-processing overhead

See Direct File Descriptor Access for details on using stdin and socket file descriptors.
Data Flow:
Kernel → C++ Background Thread (read syscall) → Zero-copy Buffer → JS
No Node.js stream layer involved! See Direct File Descriptor Access for details on using stdin and socket file descriptors.
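The "zero-copy" step can be illustrated with plain Buffers — a view hands over the same underlying memory rather than copying bytes (illustration only; the actual hand-off happens inside the addon):

```typescript
// A Buffer view aliases the same backing memory — no bytes are copied.
// This is the kind of hand-off the native parser uses to pass raw line
// buffers to JS (illustrative only, not the addon's actual code).
const raw = Buffer.from('{"a":1}\n{"b":2}\n');
const firstLine = raw.subarray(0, 7); // view over bytes 0..6, zero bytes copied

console.log(firstLine.buffer === raw.buffer);  // true — shared backing store
console.log(JSON.parse(firstLine.toString())); // parse directly from the view
```

Contrast this with a Node.js stream pipeline, where each chunk typically passes through several intermediate buffers and callbacks before user code sees it.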
| CPU Load | Native-Opt (ms) | Pure JS (ms) | Winner | Speedup |
|---|---|---|---|---|
| 0% (idle) | 20.56 | 18.93 | Pure JS | 1.09x faster |
| 25% (low) | 24.45 | 21.98 | Pure JS | 1.11x faster |
| 50% (medium) | 26.77 | 26.45 | Pure JS | 1.01x faster (tied) |
| 75% (high) | 25.67 | 42.20 | Native-Opt | 1.64x faster |
| 90% (very high) | 35.84 | 59.60 | Native-Opt | 1.66x faster |
At low load (0-25% CPU):
- Pure JS: 20.46ms avg
- Native-Opt: 22.51ms avg
- Winner: Pure JS (1.10x faster)
- Why: Thread overhead outweighs the benefits when the main thread is idle

At medium load (50% CPU):
- Pure JS: 26.45ms
- Native-Opt: 26.77ms
- Winner: Pure JS (1.01x faster - essentially tied)
- Why: Thread overhead is still present, but the benefits start to show

At high load (75-90% CPU):
- Pure JS: 50.90ms avg
- Native-Opt: 30.75ms avg
- Winner: Native-Optimized (1.66x faster)
- Why: The background I/O thread prevents blocking a busy main thread
```
0% CPU Load:
  Native-Opt: █████████████████████ 20.56ms
  Pure JS:    ███████████████████ 18.93ms ← JS wins

25% CPU Load:
  Native-Opt: █████████████████████████ 24.45ms
  Pure JS:    ██████████████████████ 21.98ms ← JS wins

50% CPU Load:
  Native-Opt: ███████████████████████████ 26.77ms
  Pure JS:    ███████████████████████████ 26.45ms ← Tied

75% CPU Load:
  Native-Opt: ██████████████████████████ 25.67ms
  Pure JS:    ██████████████████████████████████████████ 42.20ms ← Native wins!

90% CPU Load:
  Native-Opt: ████████████████████████████████████ 35.84ms
  Pure JS:    ████████████████████████████████████████████████████████████ 59.60ms ← Native wins!
```
- Low/Medium Load: Pure JS is faster or tied
  - Thread overhead makes native slower when the main thread is idle
  - Pure JS has no thread overhead
- High Load: Native-Optimized is significantly faster
  - At 75% load: 1.64x faster
  - At 90% load: 1.66x faster
  - The background I/O thread prevents blocking
- Crossover Point: Around 50% CPU load
  - Below 50%: Pure JS wins
  - Above 50%: Native-Optimized wins

- Use Pure JS (`JSONParser`) when:
  - The main thread is idle or under low load (<50% CPU)
  - You want simplicity (no native addon)
  - You're using streams (TCP, stdin, etc.)
- Use Native-Optimized when:
  - The main thread is busy (>50% CPU)
  - You have file descriptors
  - You need consistent performance under load
  - You're processing large files
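These recommendations boil down to a simple decision rule. A hypothetical helper that encodes it (the names here are illustrative, not part of the package API):

```typescript
// Hypothetical helper encoding the decision table above — illustrative only,
// not part of the @oresoftware/json-stream-parser API.
type Source = {kind: 'fd'; fd: number} | {kind: 'stream'};

function pickParser(src: Source, cpuLoad: number): 'JSONParser' | 'native-optimized' {
  if (src.kind === 'stream') return 'JSONParser'; // streams have no FD to hand to C++
  // Crossover point is around 50% CPU load: below it pure JS wins,
  // above it the background I/O thread pays for itself.
  return cpuLoad > 0.5 ? 'native-optimized' : 'JSONParser';
}

console.log(pickParser({kind: 'stream'}, 0.9));     // JSONParser
console.log(pickParser({kind: 'fd', fd: 0}, 0.1));  // JSONParser
console.log(pickParser({kind: 'fd', fd: 0}, 0.75)); // native-optimized
```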
- Architecture - System design and data flow diagrams
- JSON Parsing Location - Where JSON parsing happens (C++ vs JS thread)
- Direct File Descriptor Access - How native parser captures FDs in C++
- TCP Usage Guide - Detailed TCP connection examples
- CPU Load Performance - Performance under different CPU loads
- Native vs Worker - Comparison of native and worker parsers
SEE LICENSE IN license.md