ci(conformance): add 2026-07-28 carried-forward leg + bump harness to 0.2.0-alpha.4 #2911

Merged: maxisbey merged 1 commit into main from conformance-2026-carried-forward on Jun 19, 2026

@maxisbey

Contributor

Adds a --spec-version 2026-07-28 carried-forward leg to the conformance workflow (one extra step in each of the existing server/client jobs) with its own baseline file, and bumps the harness pin to 0.2.0-alpha.4. No new jobs, no second everything-server, no second client — same fixtures, same run-server.sh.

Motivation and Context

The existing --suite draft step only runs scenarios introduced in 2026-07-28. It doesn't re-run the 2025-era scenarios (tools-list, prompts-get-*, resources-*, …) at the 2026 wire, so a regression where a previously-working feature breaks under the stateless protocol wouldn't be caught. --suite all --spec-version 2026-07-28 runs every applicable scenario at the 2026 wire and is the same shape typescript-sdk is adopting on its v2-2026-07-28 branch.

The 2026 leg needs its own expected-failures.2026-07-28.yml because baseline entries are keyed by scenario name only: a scenario that passes at its default version in the 2025 legs but fails when forced to 2026 (almost all of them today, since main has no stateless path) can't share a file with the 2025 legs — the passing leg would flag the entry stale.
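
The staleness constraint above can be sketched as follows. This is a minimal model, not the harness's actual implementation: `check_leg`, `LegOutcome`, and the sample results are hypothetical names and data, and it assumes an entry counts as stale only when its scenario ran and passed in that leg.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LegOutcome:
    unexpected_failures: set[str]  # failed, but not listed in the baseline
    stale_entries: set[str]        # listed in the baseline, but passed

    @property
    def ok(self) -> bool:
        # A leg is green only when both sets are empty.
        return not self.unexpected_failures and not self.stale_entries


def check_leg(results: dict[str, bool], baseline: set[str]) -> LegOutcome:
    """results maps scenario name -> did it pass; baseline holds scenario names only."""
    failed = {name for name, did_pass in results.items() if not did_pass}
    passing = {name for name, did_pass in results.items() if did_pass}
    return LegOutcome(
        unexpected_failures=failed - baseline,
        stale_entries=baseline & passing,
    )


# Hypothetical data: a scenario that passes at its default version
# but fails when forced to the 2026-07-28 wire.
leg_2025 = {"tools-list": True}
leg_2026 = {"tools-list": False}

# One shared file cannot satisfy both legs:
print(check_leg(leg_2025, {"tools-list"}).stale_entries)   # entry present -> stale in the 2025 leg
print(check_leg(leg_2026, set()).unexpected_failures)      # entry absent -> unexpected in the 2026 leg

# Separate files satisfy both:
print(check_leg(leg_2025, set()).ok, check_leg(leg_2026, {"tools-list"}).ok)
```

Because the key carries no version, the only way to express "fails at 2026, passes at 2025" in this model is to give each leg its own baseline set.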

This lands the CI infra ahead of the stateless implementation work so those PRs can rebase onto it and burn the 2026 baseline down per milestone, instead of each one re-deriving the workflow shape.

How Has This Been Tested?

Ran all five workflow steps locally against a build of @modelcontextprotocol/conformance@0.2.0-alpha.4 (built from the v0.2.0-alpha.4 release commit 7a620cb):

| Leg | Result |
| --- | --- |
| `server --suite active --expected-failures expected-failures.yml` | exit 0, 42/42 |
| `server --suite draft --expected-failures expected-failures.yml` | exit 0, 17 expected failures |
| `server --suite all --spec-version 2026-07-28 --expected-failures expected-failures.2026-07-28.yml` | exit 0, 38 expected failures |
| `client --suite all --expected-failures expected-failures.yml` | exit 0, 16 expected failures |
| `client --suite all --spec-version 2026-07-28 --expected-failures expected-failures.2026-07-28.yml` | exit 0, 27 expected failures |
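
For readers who want the five legs in one place, here is a rough sketch that models them as data and rebuilds the argument tail of each invocation. `Leg`, `LEGS`, and `leg_args` are hypothetical names; the flags and file names are the ones listed above, and the real workflow steps (which go through `run-server.sh` and `npx`) may pass further arguments not shown here.

```python
from typing import NamedTuple, Optional


class Leg(NamedTuple):
    target: str                  # "server" or "client"
    suite: str                   # value passed to --suite
    spec_version: Optional[str]  # value passed to --spec-version, if any
    baseline: str                # value passed to --expected-failures


LEGS = [
    Leg("server", "active", None, "expected-failures.yml"),
    Leg("server", "draft", None, "expected-failures.yml"),
    Leg("server", "all", "2026-07-28", "expected-failures.2026-07-28.yml"),
    Leg("client", "all", None, "expected-failures.yml"),
    Leg("client", "all", "2026-07-28", "expected-failures.2026-07-28.yml"),
]


def leg_args(leg: Leg) -> list[str]:
    """Build only the argument tail; the command prefix is deliberately left out."""
    args = [leg.target, "--suite", leg.suite]
    if leg.spec_version is not None:
        args += ["--spec-version", leg.spec_version]
    args += ["--expected-failures", leg.baseline]
    return args


for leg in LEGS:
    print(" ".join(leg_args(leg)))
```

Laid out this way, the rule is easy to see: every leg that forces `--spec-version 2026-07-28` points at the dedicated baseline, and every other leg keeps the existing one.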

The alpha.3 → alpha.4 bump required no entry changes to the existing expected-failures.yml (only the version string in the header comment).

Breaking Changes

None — CI-only.

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to change)
  • Documentation update

Checklist

  • I have read the MCP Documentation
  • My code follows the repository's style guidelines
  • New and existing tests pass locally
  • I have added appropriate error handling
  • I have added or updated documentation as needed

Additional context

  • client.py now reads MCP_CONFORMANCE_PROTOCOL_VERSION (always set by the harness; defaults to its LATEST_SPEC_VERSION when --spec-version is omitted) into a module-level PROTOCOL_VERSION and logs it. No handler branches on it yet — the 2026 client leg is fully baselined until the stateless client path lands and adds the branch it actually needs.
  • run-server.sh already forwards "$@" to npx … server, so it needed no change for --spec-version.
  • The conformance composite action (action.yml) doesn't expose a spec-version input; we invoke via npx so that doesn't affect us, but it's worth a follow-up upstream.
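
The `client.py` change described in the first bullet could look roughly like this. It is a sketch rather than the code in the diff: `read_protocol_version` and the `"unset"` fallback are assumptions made so the snippet runs outside the harness, which per the description always sets the variable.

```python
import logging
import os
from collections.abc import Mapping

logger = logging.getLogger(__name__)

ENV_VAR = "MCP_CONFORMANCE_PROTOCOL_VERSION"


def read_protocol_version(environ: Mapping[str, str] = os.environ) -> str:
    """Return the protocol version the harness asked for.

    Assumption: the "unset" fallback exists only for running this sketch
    standalone; it is not taken from the PR.
    """
    return environ.get(ENV_VAR, "unset")


# Module-level value, read once at import time and logged.
PROTOCOL_VERSION = read_protocol_version()
logger.info("conformance protocol version: %s", PROTOCOL_VERSION)

# No handler branches on PROTOCOL_VERSION yet; a later change could add
# a version check where the stateless client path needs it.
```

Taking the environment as a parameter keeps the read testable without mutating `os.environ`.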

AI Disclaimer

… 0.2.0-alpha.4

Adds a third server step and second client step that run
`--suite all --spec-version 2026-07-28` against a separate baseline file.
The existing `--suite draft` step only runs scenarios introduced in
2026-07-28; this new leg re-runs every applicable scenario at the 2026
wire so regressions in 2025-era behaviour under the stateless protocol
are caught.

The 2026 leg needs its own expected-failures.2026-07-28.yml because
baseline entries are keyed by scenario name only — a scenario that
passes at its default version but fails when forced to 2026 cannot
share a file with the 2025 legs (the passing leg flags it stale).

Baseline derived empirically from a local alpha.4 build. The
alpha.3 -> alpha.4 bump required no entry changes to the existing
expected-failures.yml.

client.py now reads MCP_CONFORMANCE_PROTOCOL_VERSION (set by the
harness on every invocation) so handlers can branch on it once the
stateless client path lands; today it is logged only.
@maxisbey marked this pull request as ready for review June 19, 2026 13:47
@maxisbey enabled auto-merge (squash) June 19, 2026 13:48
@maxisbey merged commit 9e6d003 into main Jun 19, 2026
31 checks passed
@maxisbey deleted the conformance-2026-carried-forward branch June 19, 2026 13:48

@claude (bot) left a comment

LGTM — CI-only change adding a 2026-07-28 conformance leg to the existing jobs plus a harness pin bump; no production code touched.

Extended reasoning...

Overview

This PR is confined to conformance CI infrastructure: it adds one extra step to each of the existing server/client conformance jobs running --suite all --spec-version 2026-07-28, introduces a dedicated baseline file (expected-failures.2026-07-28.yml) for that leg, bumps the pinned harness from 0.2.0-alpha.3 to 0.2.0-alpha.4, and adds a read-and-log of MCP_CONFORMANCE_PROTOCOL_VERSION in the conformance test client. No SDK/runtime code is modified.

Security risks

Minimal. The workflow keeps permissions: contents: read and persist-credentials: false, adds no secrets, and the only third-party execution change is a deliberate bump of an already-pinned npm package version within the existing npx invocation pattern. The new env-var read in client.py only affects the CI test harness client and is logged, not acted upon.

Level of scrutiny

Low. This is CI configuration following the established pattern of the existing draft-suite leg; run-server.sh already forwards extra arguments, so --spec-version flows through without script changes. The bulk of the diff is the new expected-failures baseline, which is data rather than logic — an incorrect entry surfaces directly in CI (the runner fails on both stale entries and unexpected failures), so mistakes are self-revealing rather than silently masking regressions long-term.

Other factors

The author reports running all five workflow legs locally against the new harness version with exit 0, and the baseline entries are thoroughly commented with the SEP/gap that unblocks each group. The bug-hunting pass found no issues, and there are no outstanding reviewer comments. The main judgment call (separate baseline file keyed by leg rather than a shared one) is well-justified in the PR description and matches the constraint that entries are keyed by scenario name only.
