ci(conformance): add 2026-07-28 carried-forward leg + bump harness to 0.2.0-alpha.4 #2911
… 0.2.0-alpha.4
Adds a third server step and second client step that run `--suite all --spec-version 2026-07-28` against a separate baseline file. The existing `--suite draft` step only runs scenarios introduced in 2026-07-28; this new leg re-runs every applicable scenario at the 2026 wire so regressions in 2025-era behaviour under the stateless protocol are caught.
The 2026 leg needs its own expected-failures.2026-07-28.yml because baseline entries are keyed by scenario name only: a scenario that passes at its default version but fails when forced to 2026 cannot share a file with the 2025 legs (the passing leg flags it stale).
Baseline derived empirically from a local alpha.4 build. The alpha.3 -> alpha.4 bump required no entry changes to the existing expected-failures.yml.
client.py now reads MCP_CONFORMANCE_PROTOCOL_VERSION (set by the harness on every invocation) so handlers can branch on it once the stateless client path lands; today it is logged only.
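The read-and-log behaviour described for client.py might look roughly like the sketch below. This is an illustration, not the code from the PR: the logger name, the `read_protocol_version` helper, and the `<unset>` placeholder are assumptions made for this example. Only the environment variable name and the module-level `PROTOCOL_VERSION` come from the description.

```python
import logging
import os
from collections.abc import Mapping

# Hypothetical logger name, chosen for this sketch only.
logger = logging.getLogger("conformance-client")

# Hypothetical placeholder; the PR states the harness always sets the variable,
# so this fallback only matters when the client is run by hand.
UNSET = "<unset>"


def read_protocol_version(environ: Mapping[str, str] = os.environ) -> str:
    """Return the protocol version requested by the harness, or a placeholder."""
    return environ.get("MCP_CONFORMANCE_PROTOCOL_VERSION", UNSET)


# Read once at import time and log it; no handler branches on the value yet.
PROTOCOL_VERSION = read_protocol_version()
logger.info("conformance protocol version: %s", PROTOCOL_VERSION)
```

Taking the environment as a parameter keeps the helper testable without mutating `os.environ`.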
LGTM — CI-only change adding a 2026-07-28 conformance leg to the existing jobs plus a harness pin bump; no production code touched.
Extended reasoning...
Overview
This PR is confined to conformance CI infrastructure: it adds one extra step to each of the existing server/client conformance jobs running --suite all --spec-version 2026-07-28, introduces a dedicated baseline file (expected-failures.2026-07-28.yml) for that leg, bumps the pinned harness from 0.2.0-alpha.3 to 0.2.0-alpha.4, and adds a read-and-log of MCP_CONFORMANCE_PROTOCOL_VERSION in the conformance test client. No SDK/runtime code is modified.
Security risks
Minimal. The workflow keeps permissions: contents: read and persist-credentials: false, adds no secrets, and the only third-party execution change is a deliberate bump of an already-pinned npm package version within the existing npx invocation pattern. The new env-var read in client.py only affects the CI test harness client and is logged, not acted upon.
Level of scrutiny
Low. This is CI configuration following the established pattern of the existing draft-suite leg; run-server.sh already forwards extra arguments, so --spec-version flows through without script changes. The bulk of the diff is the new expected-failures baseline, which is data rather than logic — an incorrect entry surfaces directly in CI (the runner fails on both stale entries and unexpected failures), so mistakes are self-revealing rather than silently masking regressions long-term.
Other factors
The author reports running all five workflow legs locally against the new harness version with exit 0, and the baseline entries are thoroughly commented with the SEP/gap that unblocks each group. The bug-hunting pass found no issues, and there are no outstanding reviewer comments. The main judgment call (separate baseline file keyed by leg rather than a shared one) is well-justified in the PR description and matches the constraint that entries are keyed by scenario name only.
Adds a `--spec-version 2026-07-28` carried-forward leg to the conformance workflow (one extra step in each of the existing server/client jobs) with its own baseline file, and bumps the harness pin to `0.2.0-alpha.4`. No new jobs, no second everything-server, no second client: same fixtures, same `run-server.sh`.
Motivation and Context
The existing `--suite draft` step only runs scenarios introduced in 2026-07-28. It doesn't re-run the 2025-era scenarios (`tools-list`, `prompts-get-*`, `resources-*`, …) at the 2026 wire, so a regression where a previously-working feature breaks under the stateless protocol wouldn't be caught. `--suite all --spec-version 2026-07-28` runs every applicable scenario at the 2026 wire and is the same shape typescript-sdk is adopting on its `v2-2026-07-28` branch.
The 2026 leg needs its own `expected-failures.2026-07-28.yml` because baseline entries are keyed by scenario name only: a scenario that passes at its default version in the 2025 legs but fails when forced to 2026 (almost all of them today, since `main` has no stateless path) can't share a file with the 2025 legs, because the passing leg would flag the entry stale.
This lands the CI infra ahead of the stateless implementation work so those PRs can rebase onto it and burn the 2026 baseline down per milestone, instead of each one re-deriving the workflow shape.
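The baseline constraint above can be illustrated with a small model of the check. This is a sketch under assumptions, not the harness's actual implementation: `check_leg` and the result dictionaries are hypothetical, and the only behaviour taken from the PR is that entries are keyed by scenario name and that a leg fails on both unexpected failures and stale entries.

```python
def check_leg(results: dict[str, bool], baseline: set[str]) -> tuple[set[str], set[str]]:
    """Model one leg: return (unexpected_failures, stale_entries).

    `results` maps scenario name -> whether it passed in this leg.
    `baseline` is the set of scenario names listed as expected failures.
    """
    failed = {name for name, ok in results.items() if not ok}
    passed = set(results) - failed
    unexpected = failed - baseline   # failed but not listed
    stale = passed & baseline        # listed but actually passed
    return unexpected, stale


# The same scenario passes at its default version and fails when forced to 2026.
leg_2025 = {"tools-list": True}
leg_2026 = {"tools-list": False}

# Shared file that lists the scenario: the 2025 leg reports a stale entry.
print(check_leg(leg_2025, {"tools-list"}))
# Shared file that omits the scenario: the 2026 leg reports an unexpected failure.
print(check_leg(leg_2026, set()))
# Separate files: both legs come back clean.
print(check_leg(leg_2025, set()), check_leg(leg_2026, {"tools-list"}))
```

Because the key carries no spec version, no single set of entries satisfies both legs, which is why each leg gets its own file.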
How Has This Been Tested?
Ran all five workflow steps locally against a build of `@modelcontextprotocol/conformance@0.2.0-alpha.4` (built from the `v0.2.0-alpha.4` release commit `7a620cb`):
- `server --suite active --expected-failures expected-failures.yml`
- `server --suite draft --expected-failures expected-failures.yml`
- `server --suite all --spec-version 2026-07-28 --expected-failures expected-failures.2026-07-28.yml`
- `client --suite all --expected-failures expected-failures.yml`
- `client --suite all --spec-version 2026-07-28 --expected-failures expected-failures.2026-07-28.yml`
The alpha.3 → alpha.4 bump required no entry changes to the existing `expected-failures.yml` (only the version string in the header comment).
Breaking Changes
None — CI-only.
Additional context
- `client.py` now reads `MCP_CONFORMANCE_PROTOCOL_VERSION` (always set by the harness; defaults to its `LATEST_SPEC_VERSION` when `--spec-version` is omitted) into a module-level `PROTOCOL_VERSION` and logs it. No handler branches on it yet: the 2026 client leg is fully baselined until the stateless client path lands and adds the branch it actually needs.
- `run-server.sh` already forwards `"$@"` to `npx … server`, so it needed no change for `--spec-version`.
- The upstream action (`action.yml`) doesn't expose a `spec-version` input; we invoke via `npx` so that doesn't affect us, but it's worth a follow-up upstream.
AI Disclaimer