SDK-tier scoring (SEP-2484)
The 2026-07-28 spec ties the publicly visible SDK tier to the conformance suite's MUST / SHOULD pass counts. mcptest aggregates a run's results into a single tier so a CI badge can summarize the run with a short tier code (T1, T2, T3, or F) and a percentage.
Tiers
| Tier | Badge | Rule |
|---|---|---|
| Tier 1 (gold) | T1 | Every MUST check passes and at least 95 % of SHOULD checks pass. |
| Tier 2 (silver) | T2 | Every MUST check passes and at least 70 % of SHOULD checks pass. |
| Tier 3 (bronze) | T3 | Every MUST check passes, but fewer than 70 % of SHOULD checks pass. |
| Fail | F | At least one MUST check failed. No tier awarded. |
The thresholds match the spec table exactly. MAY checks affect neither the tier nor the percentage; they are reported for the operator and never gate the badge.
Library surface
mcptest_core::conformance::tier:
| Item | Purpose |
|---|---|
| TierInput { must_passed, must_total, should_passed, should_total } | Aggregated counts from a run. |
| Tier { Tier1, Tier2, Tier3, Fail } | Verdict. |
| score_tier(input) -> Tier | Pure function implementing the SEP-2484 rule. |
| Tier::badge() | Short badge form (T1 / T2 / T3 / F) for CI badges. |
| TIER1_SHOULD_THRESHOLD, TIER2_SHOULD_THRESHOLD | The two pass-rate boundaries as f64 constants. |
Edge cases
- A corpus with zero MUST checks lifts the run to Tier 1 (there is nothing to fail, so the floor is the highest passing tier).
- A corpus with zero SHOULD checks treats the SHOULD pass rate as 1.0 (vacuously satisfied), so the run lifts to Tier 1 when MUST is clean.
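
The tier rules and edge cases above can be sketched as one pure function. This is an illustration reconstructed from the tables in this document, not the crate's actual source. The usize field types, the derives, and the order of the checks are assumptions, and the zero-MUST branch follows the first edge case literally (Tier 1 regardless of SHOULD results).

```rust
/// Minimum SHOULD pass rate for Tier 1 (95 %).
pub const TIER1_SHOULD_THRESHOLD: f64 = 0.95;
/// Minimum SHOULD pass rate for Tier 2 (70 %).
pub const TIER2_SHOULD_THRESHOLD: f64 = 0.70;

/// Aggregated counts from a run. The `usize` field types are an assumption.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct TierInput {
    pub must_passed: usize,
    pub must_total: usize,
    pub should_passed: usize,
    pub should_total: usize,
}

/// Verdict for a run.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Tier {
    Tier1,
    Tier2,
    Tier3,
    Fail,
}

impl Tier {
    /// Short form used on CI badges.
    pub fn badge(self) -> &'static str {
        match self {
            Tier::Tier1 => "T1",
            Tier::Tier2 => "T2",
            Tier::Tier3 => "T3",
            Tier::Fail => "F",
        }
    }
}

/// Maps aggregated counts to a tier. MAY checks are not part of the input,
/// so they cannot influence the verdict.
pub fn score_tier(input: TierInput) -> Tier {
    // Edge case, read literally from the text: a corpus with no MUST checks
    // lifts the run to Tier 1. (Assumption: SHOULD results are ignored here.)
    if input.must_total == 0 {
        return Tier::Tier1;
    }
    // Any failed MUST check means no tier is awarded.
    if input.must_passed < input.must_total {
        return Tier::Fail;
    }
    // Edge case: with no SHOULD checks the pass rate is vacuously 1.0.
    let should_rate = if input.should_total == 0 {
        1.0
    } else {
        input.should_passed as f64 / input.should_total as f64
    };
    if should_rate >= TIER1_SHOULD_THRESHOLD {
        Tier::Tier1
    } else if should_rate >= TIER2_SHOULD_THRESHOLD {
        Tier::Tier2
    } else {
        Tier::Tier3
    }
}
```

Under these assumptions, a run with 40 of 40 MUST checks and 17 of 20 SHOULD checks passing (85 %) scores Tier 2, so badge() returns T2. Keeping the function free of I/O means the threshold boundaries can be unit-tested directly.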
Vendored corpus
The MCP conformance suite scenarios live under conformance-corpus/ in this repo, not as a submodule. The README in that directory documents the refresh procedure: locate the upstream working-group repository, copy the scenarios for the target spec revision, update the upstream tag in upstream.txt, and open a pull request that lists added, removed, and changed scenarios so a reviewer can decide whether the change crosses an SDK-tier boundary.
Once the upstream repository is identified, a weekly cron job will open the refresh PR automatically. Until then, the directory is intentionally empty.
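
The refresh pull request lists added, removed, and changed scenarios. One way to compute that list is to compare two snapshots of the corpus by scenario name. The sketch below is hypothetical: CorpusDiff and diff_corpus are illustrative names that are not part of mcptest, and each snapshot is assumed to map a scenario file name to its contents (or a content hash).

```rust
use std::collections::BTreeMap;

/// Scenario names grouped by how they differ between two corpus snapshots.
/// (Hypothetical type, for illustration only.)
#[derive(Debug, Default, PartialEq, Eq)]
pub struct CorpusDiff {
    pub added: Vec<String>,
    pub removed: Vec<String>,
    pub changed: Vec<String>,
}

/// Compares the currently vendored snapshot (`old`) with a freshly copied
/// one (`new`). Keys are scenario names; values are contents or hashes.
pub fn diff_corpus(
    old: &BTreeMap<String, String>,
    new: &BTreeMap<String, String>,
) -> CorpusDiff {
    let mut diff = CorpusDiff::default();
    // BTreeMap iterates in key order, so each list comes out sorted by name.
    for (name, new_body) in new {
        match old.get(name) {
            None => diff.added.push(name.clone()),
            Some(old_body) if old_body != new_body => diff.changed.push(name.clone()),
            Some(_) => {} // identical scenario, nothing to report
        }
    }
    for name in old.keys() {
        if !new.contains_key(name) {
            diff.removed.push(name.clone());
        }
    }
    diff
}
```

A reviewer could then read the three lists in the PR description and judge whether any added or changed scenario introduces a MUST or SHOULD check that would move an SDK across a tier boundary.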
Planned follow-up
- The mcptest conformance run and mcptest conformance refresh subcommands. The full contract (flags, corpus resolution order, fetch transport, exit codes) lives in conformance-cli.md.
- The refresh GitHub Action that opens a weekly PR against conformance-corpus/.
- A documented mapping table between mcptest's conformance packs (mcptest-core::compliance rules) and the upstream SEP check ids so a failing requirement points the operator at the right pack.