Vectorize Utf8JsonReader.SkipWhiteSpace #129701
Apply the transferable idea from the simdjson paper (arXiv:1902.08318) -- vectorized scanning to the next interesting byte -- to the reader's whitespace-skipping hot path. SkipWhiteSpace now uses a hybrid strategy: the existing scalar loop handles the first MaxScalarWhiteSpaceScanLength (16) whitespace bytes (the common case, at no added cost since the threshold check lives inside the whitespace branch), then hands longer runs to a vectorized IndexOfAnyExcept(SearchValues) scan, reproducing the exact _lineNumber/_bytePositionInLine bookkeeping via the existing CountNewLines helper. All changes are internal and gated on #if NET (netstandard keeps the pure scalar loop), so there is no public API or ref-struct layout change -- source- and binary-compatible. End-to-end this is neutral on minified/shallow-pretty documents and ~20% faster on deeply-nested pretty JSON; the isolated whitespace scan is 2-7x faster on long runs. Adds targeted tests covering the scalar-to-vector boundary including embedded newlines.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
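The hybrid strategy in this commit message can be sketched roughly as follows. This is a minimal illustration, not the PR's code: `ScanState`, `HybridWhiteSpaceSketch`, and `SkipLongRun` are hypothetical names, and the newline counting uses public span APIs as a stand-in for the internal CountNewLines helper. It assumes .NET 8 or later (for `SearchValues` and `MemoryExtensions.Count`).

```csharp
using System;
using System.Buffers;

// Hypothetical stand-in for the reader's position fields (not the real Utf8JsonReader layout).
public struct ScanState
{
    public int Consumed;            // plays the role of _consumed
    public long LineNumber;         // plays the role of _lineNumber
    public long BytePositionInLine; // plays the role of _bytePositionInLine
}

public static class HybridWhiteSpaceSketch
{
    // 16 is the threshold value quoted in the commit message.
    private const int MaxScalarWhiteSpaceScanLength = 16;

    // The four JSON whitespace bytes: space, tab, carriage return, line feed.
    private static readonly SearchValues<byte> s_whiteSpace = SearchValues.Create(" \t\r\n"u8);

    public static void SkipWhiteSpace(ReadOnlySpan<byte> buffer, ref ScanState state)
    {
        int scalarCount = 0;
        while (state.Consumed < buffer.Length)
        {
            byte b = buffer[state.Consumed];
            if (b is not (0x20 or 0x09 or 0x0D or 0x0A))
            {
                return; // non-whitespace byte: the threshold check below is never reached
            }

            // The threshold check lives inside the whitespace branch.
            if (scalarCount == MaxScalarWhiteSpaceScanLength)
            {
                SkipLongRun(buffer, ref state);
                return;
            }

            scalarCount++;
            if (b == (byte)'\n')
            {
                state.LineNumber++;
                state.BytePositionInLine = 0;
            }
            else
            {
                state.BytePositionInLine++;
            }
            state.Consumed++;
        }
    }

    // Vectorized tail: jump to the first non-whitespace byte, then fix up line/column.
    private static void SkipLongRun(ReadOnlySpan<byte> buffer, ref ScanState state)
    {
        ReadOnlySpan<byte> rest = buffer.Slice(state.Consumed);
        int length = rest.IndexOfAnyExcept(s_whiteSpace);
        if (length < 0)
        {
            length = rest.Length; // the remainder is all whitespace
        }

        ReadOnlySpan<byte> run = rest.Slice(0, length);
        int lastLineFeed = run.LastIndexOf((byte)'\n');
        if (lastLineFeed >= 0)
        {
            state.LineNumber += run.Count((byte)'\n');
            state.BytePositionInLine = length - lastLineFeed - 1;
        }
        else
        {
            state.BytePositionInLine += length;
        }
        state.Consumed += length;
    }
}
```

For an input of 20 spaces, a line feed, three tabs, and a digit, the scalar loop consumes the first 16 bytes and the vectorized tail handles the remaining 8, so the sketch should end with `Consumed` at 24, one line advanced, and a byte position of 3.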
Tagging subscribers to this area: @dotnet/area-system-text-json
Pull request overview
This PR optimizes Utf8JsonReader.SkipWhiteSpace by adding a .NET-only hybrid scalar→vectorized whitespace scan for long whitespace runs, aiming to improve throughput on whitespace-heavy JSON while keeping common cases effectively unchanged.
Changes:
- Adds a #if NET long-run fallback in Utf8JsonReader.SkipWhiteSpace that uses a vectorized “find first non-whitespace” search plus newline/byte-position bookkeeping via JsonReaderHelper.CountNewLines.
- Introduces a SearchValues<byte>-backed IndexOfFirstNonWhiteSpace helper for .NETCoreApp builds.
- Adds targeted reader tests for long whitespace runs and for correct LineNumber/BytePositionInLine reporting after long whitespace before an invalid token.
Reviewed changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| src/libraries/System.Text.Json/src/System/Text/Json/Reader/Utf8JsonReader.cs | Adds a .NET-only scalar threshold and then vectorized scan for long whitespace runs, preserving line/byte bookkeeping. |
| src/libraries/System.Text.Json/src/System/Text/Json/Reader/JsonReaderHelper.net8.cs | Adds SearchValues-based whitespace classification and IndexOfFirstNonWhiteSpace helper for vectorized scanning. |
| src/libraries/System.Text.Json/src/System/Text/Json/JsonConstants.cs | Introduces MaxScalarWhiteSpaceScanLength constant to control the scalar→vector threshold. |
| src/libraries/System.Text.Json/tests/System.Text.Json.Tests/Utf8JsonReaderTests.cs | Adds tests covering long whitespace runs and exception location reporting after long whitespace. |
Assert.True(json.Read());
Assert.Equal(JsonTokenType.Number, json.TokenType);
Assert.Equal(38, json.ValueSpan.Length);
@EgorBot -amd -intel -arm64 --filter "System.Text.Json.Read"

Runs the existing dotnet/performance

Note
This comment was generated with the assistance of GitHub Copilot.
@EgorBot -amd -intel -arm64 --filter "System.Text.Json.DeserializeFromString"
Benchmarks showed the scalar-prefix threshold (the hybrid gate) regressed the common shallow/medium pretty-printed shapes (~1.03x/1.07x slower than baseline) because of per-byte counter overhead that is rarely amortized, and it only vectorized the run after the first 16 bytes. Always handing the run to the SearchValues-based IndexOfAnyExcept is both simpler and faster on every whitespace-bearing shape (0.91x shallow, 0.68x medium, 0.42x deep pretty) and neutral on minified. Removes the MaxScalarWhiteSpaceScanLength constant and the hybrid bookkeeping.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@EgorBot -amd -intel -arm64 --filter "System.Text.Json.Read"

Re-running after simplifying

Note
This comment was created with the assistance of GitHub Copilot.
@EgorBot -amd -intel -arm64 --filter "System.Text.Json.DeserializeFromString"
Bespoke benchmark focused on
@EgorBot -amd -intel -arm64

using System;
using System.Buffers;
using System.Text;
using System.Text.Json;
using BenchmarkDotNet.Attributes;
using BenchmarkDotNet.Running;

BenchmarkSwitcher.FromAssembly(typeof(JsonWhitespaceReadBench).Assembly).Run(args);

[MemoryDiagnoser]
public class JsonWhitespaceReadBench
{
    private byte[] _payload = default!;

    [Params("Minified", "Pretty", "PrettyDeepNested", "WhitespaceHeavy")]
    public string Payload = "Pretty";

    [GlobalSetup]
    public void Setup()
    {
        _payload = Payload switch
        {
            "Minified" => BuildDoc(depth: 4, breadth: 4, indented: false),
            "Pretty" => BuildDoc(depth: 4, breadth: 4, indented: true),
            "PrettyDeepNested" => BuildDoc(depth: 8, breadth: 2, indented: true),
            "WhitespaceHeavy" => BuildWhitespaceHeavy(count: 2000),
            _ => throw new ArgumentOutOfRangeException(nameof(Payload)),
        };
    }

    [Benchmark]
    public long Read()
    {
        var reader = new Utf8JsonReader(_payload, isFinalBlock: true, state: default);
        long tokens = 0;
        while (reader.Read())
        {
            tokens++;
        }
        return tokens;
    }

    private static byte[] BuildDoc(int depth, int breadth, bool indented)
    {
        var buffer = new ArrayBufferWriter<byte>();
        using (var writer = new Utf8JsonWriter(buffer, new JsonWriterOptions { Indented = indented }))
        {
            WriteObject(writer, depth, breadth);
        }
        return buffer.WrittenSpan.ToArray();
    }

    private static void WriteObject(Utf8JsonWriter w, int depth, int breadth)
    {
        w.WriteStartObject();
        w.WriteString("name", "some descriptive name value");
        w.WriteNumber("id", 1234567);
        w.WriteBoolean("enabled", true);
        w.WriteNull("optional");
        w.WriteString("timestamp", "2024-01-01T12:00:00Z");
        w.WriteStartArray("tags");
        for (int i = 0; i < breadth; i++)
        {
            w.WriteStringValue("tag-value-" + i);
        }
        w.WriteEndArray();
        if (depth > 0)
        {
            w.WriteStartArray("children");
            for (int i = 0; i < breadth; i++)
            {
                WriteObject(w, depth - 1, breadth);
            }
            w.WriteEndArray();
        }
        w.WriteEndObject();
    }

    private static byte[] BuildWhitespaceHeavy(int count)
    {
        // Valid JSON array with large mixed-whitespace runs (newline + tabs) inserted
        // at every legal position to stress the whitespace-skipping path.
        const string Ws = "\n\t\t\t\t\t\t\t\t";
        var sb = new StringBuilder();
        sb.Append('[');
        for (int i = 0; i < count; i++)
        {
            if (i > 0)
            {
                sb.Append(',');
            }
            sb.Append(Ws).Append('{');
            sb.Append(Ws).Append("\"id\":").Append(Ws).Append(i);
            sb.Append(Ws).Append(",\"name\":").Append(Ws).Append("\"item").Append(i).Append('"');
            sb.Append(Ws).Append(",\"active\":").Append(Ws).Append(i % 2 == 0 ? "true" : "false");
            sb.Append(Ws).Append('}');
        }
        sb.Append(Ws).Append(']');
        return Encoding.UTF8.GetBytes(sb.ToString());
    }
}

Note
This comment was created with the assistance of GitHub Copilot.
The change does not regress existing benchmarks and shows significant performance gains in a custom whitespace-processing benchmark: EgorBot/Benchmarks#264 (comment)
Summary
Applies the one transferable idea from the simdjson paper (arXiv:1902.08318), vectorized scanning to the next "interesting" byte, to Utf8JsonReader's whitespace-skipping hot path, without any source or binary breaking changes.

simdjson's signature techniques (whole-buffer two-pass structural indexing, clmul quote masking, vpshufb classification, bulk UTF-8 pre-validation) are fundamentally incompatible with Utf8JsonReader's streaming, single-pass, zero-allocation, forward-only contract. But the spirit of its stage-1 classification, "advance to the next non-whitespace byte", maps cleanly onto the runtime's portable SearchValues/IndexOfAnyExcept, which the reader already uses for its string scan.

Approach
On .NET, SkipWhiteSpace scans straight to the first non-whitespace byte with a vectorized IndexOfAnyExcept(SearchValues), reproducing the exact _lineNumber/_bytePositionInLine bookkeeping via the existing JsonReaderHelper.CountNewLines helper. SearchValues already handles short and long inputs efficiently, so there is no scalar pre-scan or threshold: the no-whitespace case (e.g. minified JSON) is handled by IndexOfAnyExcept itself at no measurable cost.

All changes are internal and gated on #if NET (netstandard keeps the pure scalar loop). There is no public API change, no new instance field, and no ref struct layout change, so the change is source- and binary-compatible. The multi-segment reader benefits for free, since SkipWhiteSpaceMultiSegment delegates to SkipWhiteSpace.

A digit-scan vectorization (ConsumeIntegerDigits) was also prototyped but dropped: real JSON numbers are short (≤ 17–19 digits), so the scalar loop already wins and a hybrid regressed the mid-range.

Performance
Measured with BenchmarkDotNet (in-process, net11.0 host) using a faithful driver that marks each SkipWhiteSpace variant [MethodImpl(NoInlining)] (matching the real reader, which is not inlined) and walks realistic serialized JSON token-by-token. A whole-document checksum (_consumed + _lineNumber + _bytePositionInLine) is byte-identical across all variants.

An earlier revision of this PR gated the vectorized scan behind a 16-byte scalar prefix. Benchmarks showed that gate regressed the common shallow/medium pretty cases (1.03×/1.07× slower than baseline) because of per-byte counter overhead that is rarely amortized, so it was removed in favor of the unconditional scan above. Validation on real hardware via @EgorBot / the perf lab is recommended before merge, since the win is concentrated in whitespace-heavy / deeply-indented documents.

Correctness & tests
- The System.Text.Json.Tests suite passes on both target frameworks (net11.0 and net481), 0 failures (52,633 / 52,395).
- New tests cover long whitespace runs, \r\n runs, leading tabs, and all-whitespace tails, verifying LineNumber/BytePositionInLine.
- A whole-document checksum (_consumed + _lineNumber + _bytePositionInLine) is byte-identical between the scalar baseline and the vectorized path across minified, pretty, and deeply-nested documents.

Notes
This is intentionally a small, focused, non-breaking change rather than an attempt to restructure the reader toward simdjson's architecture (which would require a whole-buffer, indexing parser and break streaming/positional contracts).
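To illustrate the unconditional scan and the checksum comparison described in this summary, here is a minimal sketch under stated assumptions. `WhiteSpaceScanSketch`, `SkipScalar`, `SkipVectorized`, and `Checksum` are hypothetical names; the toy driver treats every non-whitespace byte as one "token", which is far simpler than the real reader; and public span APIs stand in for the internal CountNewLines helper. It assumes .NET 8 or later.

```csharp
using System;
using System.Buffers;

public static class WhiteSpaceScanSketch
{
    // The four JSON whitespace bytes: space, tab, carriage return, line feed.
    private static readonly SearchValues<byte> s_whiteSpace = SearchValues.Create(" \t\r\n"u8);

    // Scalar baseline: classify one byte at a time and update line/column as we go.
    public static void SkipScalar(ReadOnlySpan<byte> buffer, ref int consumed, ref long line, ref long bytePos)
    {
        for (; consumed < buffer.Length; consumed++)
        {
            byte b = buffer[consumed];
            if (b is not (0x20 or 0x09 or 0x0D or 0x0A))
            {
                break;
            }
            if (b == (byte)'\n')
            {
                line++;
                bytePos = 0;
            }
            else
            {
                bytePos++;
            }
        }
    }

    // Vectorized variant: find the first non-whitespace byte, then derive line/column
    // from the skipped run (newline count and index of the last line feed).
    public static void SkipVectorized(ReadOnlySpan<byte> buffer, ref int consumed, ref long line, ref long bytePos)
    {
        ReadOnlySpan<byte> rest = buffer.Slice(consumed);
        int length = rest.IndexOfAnyExcept(s_whiteSpace);
        if (length < 0)
        {
            length = rest.Length; // the remainder is all whitespace
        }

        ReadOnlySpan<byte> run = rest.Slice(0, length);
        int lastLineFeed = run.LastIndexOf((byte)'\n');
        if (lastLineFeed >= 0)
        {
            line += run.Count((byte)'\n');
            bytePos = length - lastLineFeed - 1;
        }
        else
        {
            bytePos += length;
        }
        consumed += length;
    }

    // Toy driver: skip whitespace, fold the position into a checksum, step over one byte.
    public static long Checksum(ReadOnlySpan<byte> doc, bool vectorized)
    {
        int consumed = 0;
        long line = 0, bytePos = 0, checksum = 0;
        while (consumed < doc.Length)
        {
            if (vectorized)
            {
                SkipVectorized(doc, ref consumed, ref line, ref bytePos);
            }
            else
            {
                SkipScalar(doc, ref consumed, ref line, ref bytePos);
            }

            checksum += consumed + line + bytePos;
            if (consumed < doc.Length)
            {
                consumed++; // pretend the non-whitespace byte is a one-byte token
                bytePos++;
            }
        }
        return checksum;
    }
}
```

Comparing `WhiteSpaceScanSketch.Checksum(doc, vectorized: false)` with `WhiteSpaceScanSketch.Checksum(doc, vectorized: true)` over the same buffer is the kind of equivalence check the summary describes: both variants should report identical positions after every whitespace run.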
Note
This pull request was created with the assistance of GitHub Copilot.