Spec-version pinning and upgrade-breakage reports

The MCP spec ships in dated revisions: v2024-11-05, v2025-03-26, v2025-06-18, and a forward-looking draft. A compliance suite is always written against one of them. When a newer revision changes the expected behavior of a rule you already test, your suite can start failing without a single line of your config changing. Spec-version pinning records which revision a suite targets so that mismatch is visible, and the cross-version report tells you exactly which rules moved.

The pin is parsed in the mcptest-config crate; the cross-version diff lives in mcptest_core::compliance::version_diff. The spec_version_check: block wires the two together into a gate that fails the compliance run on an upgrade breakage.

Run this example: examples/spec-version-pinning.yml pins a compliance suite to an MCP spec revision with spec_version: and gates an upgrade to a target revision with spec_version_check:.

mcptest compliance run --from-suite examples/spec-version-pinning.yml

The spec_version pin

Declare the revision a compliance suite was authored against inside the compliance: block:

server:
  command: ./my-mcp-server

compliance:
  spec_version: v2025-06-18
  tests:
    - name: PROTO-002
      # ...

The pin is optional. A suite with no spec_version parses exactly as it did before, and the bare-array form of compliance: (a list of tests with no surrounding object) has no place for a pin, so it is never required there.

Accepted values match the directory names under compliance/:

Value         Meaning
v2024-11-05   First published revision.
v2025-03-26   Second published revision.
v2025-06-18   Third published revision.
draft         Forward-looking, not yet a published spec.

A typo (for example v2025-6-18) fails the load with unknown compliance spec_version: ... rather than being silently ignored, so a stale or misspelled pin cannot rot in place.
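
The strict parse described above can be sketched as a small enum with a FromStr impl. This is an illustration only, not the mcptest-config source: the SpecVersion type, its variant names, and the parse_pin helper are hypothetical, and the text after the colon in the error message is an assumption (this page elides it). Only the four accepted strings and the error prefix come from this page.

```rust
use std::str::FromStr;

/// Hypothetical stand-in for the pin type; the real type may differ.
#[allow(non_camel_case_types)]
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum SpecVersion {
    V2024_11_05,
    V2025_03_26,
    V2025_06_18,
    Draft,
}

impl FromStr for SpecVersion {
    type Err = String;

    fn from_str(s: &str) -> Result<Self, Self::Err> {
        match s {
            "v2024-11-05" => Ok(SpecVersion::V2024_11_05),
            "v2025-03-26" => Ok(SpecVersion::V2025_03_26),
            "v2025-06-18" => Ok(SpecVersion::V2025_06_18),
            "draft" => Ok(SpecVersion::Draft),
            // Anything else is rejected instead of being silently ignored.
            // Echoing the bad value after the colon is an assumption.
            other => Err(format!("unknown compliance spec_version: {}", other)),
        }
    }
}

/// The pin is optional: an absent pin stays absent, a present pin must parse.
pub fn parse_pin(raw: Option<&str>) -> Result<Option<SpecVersion>, String> {
    raw.map(str::parse::<SpecVersion>).transpose()
}
```

Modelling the pin as a closed enum rather than a free-form string is what turns a misspelled value into a load-time error.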

Why pin at all

The corpus under compliance/<version>/ is the source of truth for what each rule expects. Two revisions can carry the same rule with different expectations. For example, PROTO-002 ("initialize returns negotiated capabilities") gains an optional transport enum in v2025-06-18 that the v2024-11-05 corpus does not assert. A suite pinned to the older revision is testing the older expectation. Without a pin, mcptest cannot tell you that a newer corpus would assert something different.

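Recording the pin lets mcptest make that mismatch visible, report which rules changed between the pinned revision and a newer one, and gate an upgrade on the result.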

The cross-version breakage report

mcptest_core::compliance::version_diff::diff_versions takes two loaded corpora and compares each shared rule's expect assertions, keyed by rule_id:

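use mcptest_core::compliance::{VersionCorpus, diff_versions};

// `dir` is the corpus root that holds one directory per revision.
let from = VersionCorpus::load_dir("v2024-11-05", &dir.join("v2024-11-05"))?;
let to = VersionCorpus::load_dir("v2025-06-18", &dir.join("v2025-06-18"))?;
// Compare each shared rule's expect assertions, keyed by rule_id.
let report = diff_versions(&from, &to);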

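Each rule lands in one of four buckets: unchanged (in both corpora with the same expect), changed (in both corpora with a different expect), added (only in the to corpus), or removed (only in the from corpus).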

Comparison is structural, not textual: the expect blocks are parsed YAML, so reordering keys does not count as a change. Only a real difference in the expected behavior is flagged.
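
To make the structural comparison concrete, here is a minimal sketch of bucketing rules by rule_id. It is not the version_diff implementation: the Value enum is a hand-rolled stand-in for a parsed YAML value, and Bucket, bucket_rules, and mapping are hypothetical names. The bucket definitions (unchanged and changed for shared rules, added for rules only in the to corpus, removed for rules only in the from corpus) are inferred from how this page uses those words.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Minimal stand-in for a parsed YAML value. Mappings live in a BTreeMap,
/// so two mappings with the same entries compare equal regardless of the
/// order the keys were written in.
#[derive(Debug, Clone, PartialEq)]
pub enum Value {
    Bool(bool),
    Int(i64),
    Str(String),
    Seq(Vec<Value>),
    Map(BTreeMap<String, Value>),
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Bucket {
    Unchanged,
    Changed,
    Added,
    Removed,
}

/// Classify every rule_id seen in either corpus; the result is ordered by rule_id.
/// Each map goes from rule_id to that rule's parsed expect block.
pub fn bucket_rules(
    from: &BTreeMap<String, Value>,
    to: &BTreeMap<String, Value>,
) -> BTreeMap<String, Bucket> {
    let ids: BTreeSet<&String> = from.keys().chain(to.keys()).collect();
    ids.into_iter()
        .map(|id| {
            let bucket = match (from.get(id), to.get(id)) {
                (Some(a), Some(b)) if a == b => Bucket::Unchanged,
                (Some(_), Some(_)) => Bucket::Changed,
                (None, Some(_)) => Bucket::Added,
                (Some(_), None) => Bucket::Removed,
                (None, None) => unreachable!("every id came from one of the maps"),
            };
            (id.clone(), bucket)
        })
        .collect()
}

/// Build a mapping value from (key, value) pairs, in any order.
pub fn mapping(pairs: Vec<(&str, Value)>) -> Value {
    Value::Map(pairs.into_iter().map(|(k, v)| (k.to_string(), v)).collect())
}
```

Because equality is defined on the parsed values, a rule whose expect block only had its keys reordered lands in Unchanged, while a rule with a different value for any key lands in Changed.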

The report mirrors the model-compatibility baseline diff (mcptest_core::model_compat::diff::BaselineDiff): a flat entry list ordered by rule_id plus a pre-computed summary. VersionDiff::is_clean() is true when nothing in the changed bucket fired, and VersionDiff::changed_rules() iterates just the breakage entries. render_pretty(color) prints a human-readable summary.

Worked example

Diffing v2024-11-05 into v2025-06-18 reports PROTO-002 as changed (the added transport enum), several v2025-06-18-only rules such as ELICIT-001, SAMPLE-001, ROOTS-001, and RESTPL-001 as added, and the many v2024-11-05 rules that the smaller delta corpus omits as removed. Diffing v2024-11-05 into draft reports PROTO-008 as unchanged, because the draft keeps the same transport_observation expectation and only moves the registry severity.

Gate on it: the spec_version_check: block

The diff above is a Rust API. To gate a CI run on it without writing code, add a spec_version_check: block to the object form of compliance:. It diffs the suite's pinned spec_version (the from) into a target revision you name with against: (the to), and exposes the result as assertable targets:

compliance:
  spec_version: v2024-11-05        # the revision the suite is authored against
  spec_version_check:
    against: v2025-06-18           # the revision you are considering upgrading to
    expect:                        # optional; the defaults below apply if omitted
      - target: spec_clean
        matcher: { exact: true }
      - target: spec_breaking_changes
        matcher: { schema: { maxItems: 0 } }
  tests:
    - name: PROTO-002
      # ...

Diffing v2024-11-05 into v2025-06-18 is the breaking case: the one rule the two revisions share, PROTO-002, changed its expectation, so spec_clean is false and the default gate fails. The shipped examples/spec-version-pinning.yml diffs into draft instead, which is clean (the shared rule there, PROTO-008, is unchanged), so its gate passes. Point against: at the revision you are actually upgrading toward.

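The targets: spec_clean is a boolean that is true when no shared rule changed its expectation, and spec_breaking_changes is the list of changed rules, which is empty on a clean diff.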

Omit expect: and the engine applies the defaults: spec_clean == true and spec_breaking_changes empty. The block runs under mcptest compliance run --from-suite <file> and folds one PASS/FAIL line into the exit code, exactly like the grade_delta: block.

The corpus it diffs against ships with mcptest under compliance/<version>/, discovered next to the rule registry (the --registry parent, default compliance/). Two bootstrap cases pass with a note rather than failing: a block with no against:, and a target revision with no shipped corpus. So adding the gate never breaks the build on the day it lands. A block present without a compliance.spec_version pin is an error, since there is no from revision to diff.
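
The decision order above can be sketched as one function. This is a hedged illustration, not the engine's code: GateOutcome, evaluate_default_gate, and the message strings are hypothetical, and checking for the missing pin before the bootstrap cases is an assumption, since this page does not say which check wins when both apply.

```rust
/// Hypothetical outcome of evaluating the gate; names are illustrative.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum GateOutcome {
    Pass,
    PassWithNote(String),
    Fail(String),
    Error(String),
}

/// Evaluate the default gate (spec_clean == true, spec_breaking_changes empty).
///
/// `pin` is compliance.spec_version, `against` is the target revision, and
/// `breaking_changes` is the changed-rule list, or None when no corpus ships
/// for the target revision.
pub fn evaluate_default_gate(
    pin: Option<&str>,
    against: Option<&str>,
    breaking_changes: Option<&[String]>,
) -> GateOutcome {
    // Without a pin there is no "from" revision to diff. (Ordering is an assumption.)
    let from = match pin {
        Some(p) => p,
        None => {
            return GateOutcome::Error(
                "spec_version_check requires compliance.spec_version".to_string(),
            )
        }
    };
    // Bootstrap case 1: the block names no target revision.
    let to = match against {
        Some(t) => t,
        None => return GateOutcome::PassWithNote("no against: revision named".to_string()),
    };
    // Bootstrap case 2: no corpus ships for the target revision.
    let changed = match breaking_changes {
        Some(c) => c,
        None => return GateOutcome::PassWithNote(format!("no shipped corpus for {}", to)),
    };
    if changed.is_empty() {
        GateOutcome::Pass
    } else {
        GateOutcome::Fail(format!(
            "{} -> {}: changed rules: {}",
            from,
            to,
            changed.join(", ")
        ))
    }
}
```

Keeping the two bootstrap cases as a distinct PassWithNote outcome means the run can report why nothing was checked without turning the build red.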

Reading the result

A clean upgrade has an empty changed bucket: added and removed rules are expected when revisions diverge, but a changed expectation means an existing assertion in your suite now targets stale behavior. Walk the changed rules, update the affected tests, then bump compliance.spec_version to the new revision. When the spec_version_check: gate is in place, that walk is forced: the run stays red until the changed rules are addressed or the gate's expect: is loosened deliberately.

Missing-resource error code: -32002 becomes -32602

One specific behavior change in the 2026-07-28 release candidate is the JSON-RPC error code a server must return for a resources/read that names a resource that does not exist. Before the RC, MCP specified a custom code, -32002, for an unknown resource. The 2026-07-28 RC drops that custom code and uses the standard JSON-RPC -32602 (Invalid Params) instead, because a missing resource is a bad parameter, not a transport fault. A server that still answers -32002 on the new target is out of conformance.

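Concretely, for a resources/read of a uri the server does not have, a server conforming to the earlier revisions returns error code -32002, and a server conforming to the 2026-07-28 RC returns -32602.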

This is a distinct change from any iss / RFC 9207 authorization-server requirement; the two are unrelated, and a server can be out of conformance on one without the other.

The deterministic half of the check lives in mcptest_core::migration: the MigrationCorpus::check_resource_error_code helper takes the code a server returned for a missing resource and reports Fail on -32002 (naming -32602 as the replacement), Pass on -32602, and Skipped on any unrelated code. No model is consulted; it is an integer comparison against the target's required code. The helper is library-only in this release: the mcptest doctor CLI does not yet probe a live server's missing-resource code and run this check end to end. The runnable companion is examples/migration-doctor/.
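
The three-way outcome is small enough to sketch in full. The code below is a standalone illustration of the comparison, not the MigrationCorpus::check_resource_error_code helper itself, whose signature and return type are not shown on this page; CheckResult, the constant names, and check_missing_resource_code are hypothetical.

```rust
/// Custom code MCP specified for an unknown resource before the 2026-07-28 RC.
pub const LEGACY_RESOURCE_NOT_FOUND: i64 = -32002;
/// Standard JSON-RPC "Invalid params" code, required by the 2026-07-28 RC.
pub const INVALID_PARAMS: i64 = -32602;

/// Hypothetical result type; the real helper's type may differ.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum CheckResult {
    /// The server returned the code the target requires.
    Pass,
    /// The server returned the retired code; `replacement` names the required one.
    Fail { replacement: i64 },
    /// The server returned an unrelated code, so this check does not apply.
    Skipped,
}

/// Compare the code a server returned for a missing resource against the
/// target's required code. Pure integer comparison; nothing else is consulted.
pub fn check_missing_resource_code(returned: i64) -> CheckResult {
    match returned {
        INVALID_PARAMS => CheckResult::Pass,
        LEGACY_RESOURCE_NOT_FOUND => CheckResult::Fail {
            replacement: INVALID_PARAMS,
        },
        _ => CheckResult::Skipped,
    }
}
```

Treating unrelated codes as Skipped rather than Fail keeps the check narrow: it flags only the one retired code and leaves other error conditions to other rules.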