
Vectorize Utf8JsonReader.SkipWhiteSpace#129701

Open: eiriktsarpalis wants to merge 3 commits into dotnet:main from eiriktsarpalis:eiriktsarpalis-simd-json-parsing

@eiriktsarpalis eiriktsarpalis commented Jun 22, 2026

Member

Summary

Applies the one transferable idea from the simdjson paper (arXiv:1902.08318), vectorized scanning to the next "interesting" byte, to Utf8JsonReader's whitespace-skipping hot path, without any source or binary breaking changes.

simdjson's signature techniques (whole-buffer two-pass structural indexing, clmul quote masking, vpshufb classification, bulk UTF-8 pre-validation) are fundamentally incompatible with Utf8JsonReader's streaming, single-pass, zero-allocation, forward-only contract. But the spirit of its stage-1 classification — "advance to the next non-whitespace byte" — maps cleanly onto the runtime's portable SearchValues/IndexOfAnyExcept, which the reader already uses for its string scan.

Approach

On .NET, SkipWhiteSpace scans straight to the first non-whitespace byte with a vectorized IndexOfAnyExcept(SearchValues), reproducing the exact _lineNumber / _bytePositionInLine bookkeeping via the existing JsonReaderHelper.CountNewLines helper. SearchValues already handles short and long inputs efficiently, so there is no scalar pre-scan or threshold — the no-whitespace case (e.g. minified JSON) is handled by IndexOfAnyExcept itself at no measurable cost.
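The scan-then-bookkeep shape described above can be sketched in isolation. This is a minimal illustration, not the PR's code: the class name, method signature, and `ref` counters are hypothetical stand-ins for the reader's private `_lineNumber` / `_bytePositionInLine` fields, and `Count` / `LastIndexOf` stand in for the internal `JsonReaderHelper.CountNewLines` helper. It assumes .NET 8 or later and the rule that only `\n` starts a new line.

```csharp
using System;
using System.Buffers;

static class WhiteSpaceSketch
{
    // JSON insignificant whitespace: space, tab, line feed, carriage return.
    private static readonly SearchValues<byte> s_whiteSpace = SearchValues.Create(" \t\n\r"u8);

    // Returns the number of leading whitespace bytes in 'buffer' and updates the
    // caller's line/position counters as a byte-at-a-time loop would have.
    public static int SkipWhiteSpace(ReadOnlySpan<byte> buffer, ref long lineNumber, ref long bytePositionInLine)
    {
        // Vectorized search for the first byte that is NOT whitespace.
        int idx = buffer.IndexOfAnyExcept(s_whiteSpace);
        int skipped = idx < 0 ? buffer.Length : idx; // -1 means the whole buffer is whitespace

        if (skipped == 0)
        {
            return 0; // e.g. minified JSON: nothing to skip, counters untouched
        }

        ReadOnlySpan<byte> run = buffer.Slice(0, skipped);
        int lastNewLine = run.LastIndexOf((byte)'\n');

        if (lastNewLine < 0)
        {
            // No newline in the run: the position simply advances.
            bytePositionInLine += skipped;
        }
        else
        {
            // Each '\n' starts a new line; the position restarts after the last one.
            lineNumber += run.Count((byte)'\n');
            bytePositionInLine = skipped - lastNewLine - 1;
        }

        return skipped;
    }
}
```

The point of the design is that there is no per-byte branch on the hot path: the newline accounting is done with two further span operations over the skipped run, and only when the run is non-empty.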

All changes are internal and gated on #if NET (netstandard keeps the pure scalar loop). No public API, no new instance fields, no ref struct layout change ⇒ source- and binary-compatible. The multi-segment reader benefits for free, since SkipWhiteSpaceMultiSegment delegates to SkipWhiteSpace.

A digit-scan vectorization (ConsumeIntegerDigits) was also prototyped but dropped: real JSON numbers are short (≤ 17–19 digits), so the scalar loop already wins and a hybrid regressed the mid-range.

Performance

Measured with BenchmarkDotNet (in-process, net11.0 host) using a faithful driver that marks each SkipWhiteSpace variant [MethodImpl(NoInlining)] (matching the real reader, which is not inlined) and walks realistic serialized JSON token-by-token. A whole-document checksum (_consumed + _lineNumber + _bytePositionInLine) is byte-identical across all variants.

| Document | Baseline | This PR | Ratio |
|---|---|---|---|
| minified | 45.4 µs | 45.5 µs | 1.00× (neutral) |
| pretty, shallow (common) | 1,137 µs | 1,037 µs | 0.91× (~10% faster) |
| pretty, medium (common) | 849 µs | 574 µs | 0.68× (~1.5× faster) |
| pretty, deeply nested | 1,248 µs | 519 µs | 0.42× (~2.4× faster) |

An earlier revision of this PR gated the vectorized scan behind a 16-byte scalar prefix. Benchmarks showed that gate regressed the common shallow/medium pretty cases (1.03×/1.07× slower than baseline) because of per-byte counter overhead that is rarely amortized, so it was removed in favor of the unconditional scan above. Validation on real hardware via @EgorBot / the perf lab is recommended before merge, since the win is concentrated in whitespace-heavy / deeply-indented documents.

Correctness & tests

  • Full System.Text.Json.Tests suite passes on both target frameworks (net11.0 and net481), 0 failures (52,633 / 52,395).
  • Adds two targeted tests covering long whitespace runs with embedded newlines, \r\n runs, leading tabs, and all-whitespace tails, verifying LineNumber / BytePositionInLine.
  • An independent whole-document checksum (final _consumed + _lineNumber + _bytePositionInLine) is byte-identical between the scalar baseline and the vectorized path across minified, pretty, and deeply-nested documents.
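The bookkeeping that any vectorized path has to reproduce can be stated as a per-byte rule. Below is a minimal scalar reference for a single leading whitespace run, offered as a sketch rather than the reader's actual loop: the class and method names are hypothetical, and it assumes that only `\n` increments the line number and resets the byte position, with every other whitespace byte (including `\r`) advancing the position by one.

```csharp
using System;

static class ScalarReference
{
    // Walks the leading whitespace of 'data' one byte at a time and returns how many
    // bytes were consumed, plus the resulting line number and byte position in line.
    public static (long Consumed, long LineNumber, long BytePositionInLine) Skip(ReadOnlySpan<byte> data)
    {
        long consumed = 0, lineNumber = 0, bytePositionInLine = 0;

        foreach (byte b in data)
        {
            if (b != ' ' && b != '\t' && b != '\n' && b != '\r')
            {
                break; // first non-whitespace byte ends the run
            }

            consumed++;

            if (b == '\n')
            {
                lineNumber++;
                bytePositionInLine = 0;
            }
            else
            {
                bytePositionInLine++;
            }
        }

        return (consumed, lineNumber, bytePositionInLine);
    }

    // Sum of the three counters, in the spirit of the checksum described above.
    public static long Checksum(ReadOnlySpan<byte> data)
    {
        var (consumed, lineNumber, bytePositionInLine) = Skip(data);
        return consumed + lineNumber + bytePositionInLine;
    }
}
```

Comparing such a reference against an optimized implementation over inputs with `\r\n` runs, leading tabs, and all-whitespace tails is one way to check the edge cases the targeted tests are described as covering.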

Notes

This is intentionally a small, focused, non-breaking change rather than an attempt to restructure the reader toward simdjson's architecture (which would require a whole-buffer, indexing parser and break streaming/positional contracts).

Note

This pull request was created with the assistance of GitHub Copilot.

Apply the transferable idea from the simdjson paper (arXiv:1902.08318) -- vectorized scanning to the next interesting byte -- to the reader's whitespace-skipping hot path.

SkipWhiteSpace now uses a hybrid strategy: the existing scalar loop handles the first MaxScalarWhiteSpaceScanLength (16) whitespace bytes (the common case, at no added cost since the threshold check lives inside the whitespace branch), then hands longer runs to a vectorized IndexOfAnyExcept(SearchValues) scan, reproducing the exact _lineNumber/_bytePositionInLine bookkeeping via the existing CountNewLines helper. All changes are internal and gated on #if NET (netstandard keeps the pure scalar loop), so there is no public API or ref-struct layout change -- source- and binary-compatible.

End-to-end this is neutral on minified/shallow-pretty documents and ~20% faster on deeply-nested pretty JSON; the isolated whitespace scan is 2-7x faster on long runs. Adds targeted tests covering the scalar-to-vector boundary including embedded newlines.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@dotnet-policy-service

Contributor

Tagging subscribers to this area: @dotnet/area-system-text-json
See info in area-owners.md if you want to be subscribed.

Copilot AI left a comment

Contributor

Pull request overview

This PR optimizes Utf8JsonReader.SkipWhiteSpace by adding a .NET-only hybrid scalar→vectorized whitespace scan for long whitespace runs, aiming to improve throughput on whitespace-heavy JSON while keeping common cases effectively unchanged.

Changes:

  • Adds a #if NET long-run fallback in Utf8JsonReader.SkipWhiteSpace that uses a vectorized “find first non-whitespace” search plus newline/byte-position bookkeeping via JsonReaderHelper.CountNewLines.
  • Introduces a SearchValues<byte>-backed IndexOfFirstNonWhiteSpace helper for .NETCoreApp builds.
  • Adds targeted reader tests for long whitespace runs and for correct LineNumber / BytePositionInLine reporting after long whitespace before an invalid token.

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 1 comment.

| File | Description |
|---|---|
| src/libraries/System.Text.Json/src/System/Text/Json/Reader/Utf8JsonReader.cs | Adds a .NET-only scalar threshold and then vectorized scan for long whitespace runs, preserving line/byte bookkeeping. |
| src/libraries/System.Text.Json/src/System/Text/Json/Reader/JsonReaderHelper.net8.cs | Adds SearchValues-based whitespace classification and IndexOfFirstNonWhiteSpace helper for vectorized scanning. |
| src/libraries/System.Text.Json/src/System/Text/Json/JsonConstants.cs | Introduces MaxScalarWhiteSpaceScanLength constant to control the scalar→vector threshold. |
| src/libraries/System.Text.Json/tests/System.Text.Json.Tests/Utf8JsonReaderTests.cs | Adds tests covering long whitespace runs and exception location reporting after long whitespace. |

Comment on lines +2186 to +2189
Assert.True(json.Read());
Assert.Equal(JsonTokenType.Number, json.TokenType);
Assert.Equal(38, json.ValueSpan.Length);

@eiriktsarpalis

Member Author

@EgorBot -amd -intel -arm64 --filter "System.Text.Json.Read"

Runs the existing dotnet/performance System.Text.Json reader benchmarks (Perf_Reader.*, ReadJson<T>.*, etc.) comparing this PR branch against main on AMD x64, Intel x64, and Apple Silicon arm64, to validate the SkipWhiteSpace vectorization.

Note

This comment was generated with the assistance of GitHub Copilot.

@eiriktsarpalis

Member Author

@EgorBot -amd -intel -arm64 --filter "System.Text.Json.DeserializeFromString"

Benchmarks showed the scalar-prefix threshold (the hybrid gate) regressed the
common shallow/medium pretty-printed shapes (~1.03x/1.07x slower than baseline)
because of per-byte counter overhead that is rarely amortized, and it only
vectorized the run after the first 16 bytes. Always handing the run to the
SearchValues-based IndexOfAnyExcept is both simpler and faster on every
whitespace-bearing shape (0.91x shallow, 0.68x medium, 0.42x deep pretty) and
neutral on minified. Removes the MaxScalarWhiteSpaceScanLength constant and the
hybrid bookkeeping.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@eiriktsarpalis eiriktsarpalis changed the title from "Vectorize Utf8JsonReader.SkipWhiteSpace for long whitespace runs" to "Vectorize Utf8JsonReader.SkipWhiteSpace" on Jun 23, 2026
@eiriktsarpalis

Member Author

@EgorBot -amd -intel -arm64 --filter "System.Text.Json.Read"

Re-running after simplifying SkipWhiteSpace to an unconditional vectorized IndexOfAnyExcept(SearchValues) scan (the earlier 16-byte scalar gate was removed because it regressed common pretty-printed documents). This supersedes the previous benchmark run, which exercised the now-replaced gated implementation.

Note

This comment was created with the assistance of GitHub Copilot.

Copilot AI review requested due to automatic review settings June 23, 2026 11:38
@eiriktsarpalis

Member Author

@EgorBot -amd -intel -arm64 --filter "System.Text.Json.DeserializeFromString"

Copilot AI left a comment

Contributor

Pull request overview

Copilot reviewed 3 out of 3 changed files in this pull request and generated no new comments.

@eiriktsarpalis

Member Author

Bespoke benchmark focused on Utf8JsonReader parsing of whitespace-heavy payloads (the workloads where SkipWhiteSpace dominates the cost). It walks the reader token-by-token via Read() over four documents spanning the whitespace spectrum:

  • Minified — control, exercises the no-whitespace fast path (should stay neutral).
  • Pretty — standard pretty-printed object graph (2-space indent).
  • PrettyDeepNested — deeply nested pretty-printed graph → long indentation runs per token.
  • WhitespaceHeavy — synthetic doc with large mixed \n+tab whitespace runs inserted at every legal position (also stresses the newline-counting bookkeeping).

@EgorBot -amd -intel -arm64

using System;
using System.Buffers;
using System.Text;
using System.Text.Json;
using BenchmarkDotNet.Attributes;
using BenchmarkDotNet.Running;

BenchmarkSwitcher.FromAssembly(typeof(JsonWhitespaceReadBench).Assembly).Run(args);

[MemoryDiagnoser]
public class JsonWhitespaceReadBench
{
    private byte[] _payload = default!;

    [Params("Minified", "Pretty", "PrettyDeepNested", "WhitespaceHeavy")]
    public string Payload = "Pretty";

    [GlobalSetup]
    public void Setup()
    {
        _payload = Payload switch
        {
            "Minified"         => BuildDoc(depth: 4, breadth: 4, indented: false),
            "Pretty"           => BuildDoc(depth: 4, breadth: 4, indented: true),
            "PrettyDeepNested" => BuildDoc(depth: 8, breadth: 2, indented: true),
            "WhitespaceHeavy"  => BuildWhitespaceHeavy(count: 2000),
            _ => throw new ArgumentOutOfRangeException(nameof(Payload)),
        };
    }

    [Benchmark]
    public long Read()
    {
        var reader = new Utf8JsonReader(_payload, isFinalBlock: true, state: default);
        long tokens = 0;
        while (reader.Read())
        {
            tokens++;
        }
        return tokens;
    }

    private static byte[] BuildDoc(int depth, int breadth, bool indented)
    {
        var buffer = new ArrayBufferWriter<byte>();
        using (var writer = new Utf8JsonWriter(buffer, new JsonWriterOptions { Indented = indented }))
        {
            WriteObject(writer, depth, breadth);
        }
        return buffer.WrittenSpan.ToArray();
    }

    private static void WriteObject(Utf8JsonWriter w, int depth, int breadth)
    {
        w.WriteStartObject();
        w.WriteString("name", "some descriptive name value");
        w.WriteNumber("id", 1234567);
        w.WriteBoolean("enabled", true);
        w.WriteNull("optional");
        w.WriteString("timestamp", "2024-01-01T12:00:00Z");

        w.WriteStartArray("tags");
        for (int i = 0; i < breadth; i++)
        {
            w.WriteStringValue("tag-value-" + i);
        }
        w.WriteEndArray();

        if (depth > 0)
        {
            w.WriteStartArray("children");
            for (int i = 0; i < breadth; i++)
            {
                WriteObject(w, depth - 1, breadth);
            }
            w.WriteEndArray();
        }

        w.WriteEndObject();
    }

    private static byte[] BuildWhitespaceHeavy(int count)
    {
        // Valid JSON array with large mixed-whitespace runs (newline + tabs) inserted
        // at every legal position to stress the whitespace-skipping path.
        const string Ws = "\n\t\t\t\t\t\t\t\t";
        var sb = new StringBuilder();
        sb.Append('[');
        for (int i = 0; i < count; i++)
        {
            if (i > 0)
            {
                sb.Append(',');
            }
            sb.Append(Ws).Append('{');
            sb.Append(Ws).Append("\"id\":").Append(Ws).Append(i);
            sb.Append(Ws).Append(",\"name\":").Append(Ws).Append("\"item").Append(i).Append('"');
            sb.Append(Ws).Append(",\"active\":").Append(Ws).Append(i % 2 == 0 ? "true" : "false");
            sb.Append(Ws).Append('}');
        }
        sb.Append(Ws).Append(']');
        return Encoding.UTF8.GetBytes(sb.ToString());
    }
}

Note

This comment was created with the assistance of GitHub Copilot.

@eiriktsarpalis

Member Author

Change does not regress existing benchmarks while showing significant performance gains in a custom benchmark processing whitespace: EgorBot/Benchmarks#264 (comment)

@eiriktsarpalis eiriktsarpalis requested a review from MihaZupan June 23, 2026 14:10
@eiriktsarpalis eiriktsarpalis added the tenet-performance Performance related issue label Jun 23, 2026
@eiriktsarpalis eiriktsarpalis added this to the 11.0.0 milestone Jun 23, 2026