Capability lanes
Compliance, security, conformance, scenarios, model compatibility, coverage, and performance are not separate tools. Each is a block you add to your suite, and the mcptest run command executes it as a lane alongside the tool and agent tests. One suite, one command, one report, one exit code. A CI pipeline runs the whole audit with a single binary invocation instead of one per concern.

The YAML
A suite that declares capability blocks runs each as a lane. examples/full-audit/full-audit.yml declares seven concerns against the built-in mock, all offline and key-free:

    # yaml-language-server: $schema=https://mcptest.sh/schema/v1.json
    servers:
      mock:
        command: ["mcptest", "mock", "--tools-from", "${CONFIG_DIR}/echo.yaml"]

    # Behavioral tests: does the tool work.
    tools:
      - name: echo round-trips
        server: mock
        tool: echo
        expect:
          - target: result.content[0].text
            matcher: { exact: "hello" }

    # Compliance: protocol-level checks scored against the rubric.
    compliance:
      - name: PROTO-002 initialize returns capabilities
        server: mock
        check: initialize
        expect:
          - target: result.capabilities
            matcher:
              schema: { type: object }

    # Security: the deterministic scan of the tool surface.
    security:
      scan: true

    # Conformance: the SEP protocol probes against the live server.
    conformance: {}

    # Model compatibility: diff a recorded baseline against a candidate.
    model_compat:
      baseline: baseline.json
      candidate: candidate.json

    # Coverage: gate on how much of the tool surface the suite exercised.
    coverage:
      tools: 100

    # Performance: a soft p95 latency budget over the run's per-test durations.
    performance:
      p95_latency_ms: 60000

The run
One command runs every block. The tool and agent tests run first; each declared lane then runs once that drive finishes and folds its rows into the same report:

    mcptest run examples/full-audit/full-audit.yml

| Block | Lane | Gates on |
|---|---|---|
| tools: | behavioral tests | a failing assertion |
| compliance: | the compliance rubric checks | a failed check |
| security: { scan: true } | the deterministic tool-surface scan | a finding at or above High |
| conformance: | the SEP protocol probes | a failed MUST |
| model_compat: | a baseline-versus-candidate diff | a breaking change (drift is advisory) |
| coverage: | tool-surface coverage instrumentation | a declared dimension below its minimum (exit 6) |
| performance: | a p95 latency budget over per-test durations | nothing (advisory: a p95 over budget is highlighted, not gated) |

A breaking result in any lane fails the run; non-breaking model drift is reported but does not gate. The single exit code reconciles every lane, so a green run means the whole audit passed. A lane fires only when the suite declares its block, so a suite that declares none behaves exactly like a plain tool run.
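
Because the run reports through a single exit code, a CI step can branch on that one value. The following is a minimal sketch, not part of mcptest: the run_audit function and its messages are hypothetical, it assumes that exit 0 means a green run, and the only specific code it names is the coverage gate's exit 6 from the table above. Every other non-zero code is treated as a generic failure.

```shell
#!/bin/sh
# Hypothetical wrapper (not part of mcptest): run the given command,
# print a one-line summary of its exit code, and pass that code through.
run_audit() {
  status=0
  # "|| status=$?" captures a failure without tripping "set -e".
  "$@" || status=$?
  case "$status" in
    0) echo "audit passed" ;;
    # Exit 6 is the coverage gate named in the table above.
    6) echo "coverage below a declared minimum (exit 6)" ;;
    # Assumption: any other non-zero code means some lane gated the run.
    *) echo "audit failed (exit $status)" ;;
  esac
  return "$status"
}

# Example call, commented out so the sketch has no side effects:
#   run_audit mcptest run examples/full-audit/full-audit.yml
```

The wrapper returns the original exit code unchanged, so the CI job still fails exactly when the run does.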
Or run one lane standalone
The standalone subcommands still exist for a focused run of one concern, without a suite:

    mcptest compliance run --from-suite mcptest.yaml
    mcptest security tools.json
    mcptest conformance run --server-command "..."
    mcptest model-compat diff --baseline a.json --candidate b.json
    mcptest coverage --tools-from tools.json --threshold 80

Each returns its own report and its own exit code. Fold them into the suite when you want one run to cover everything; reach for the standalone command when you only care about one lane.
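
Since each standalone command returns its own exit code, a script that chains several of them has to reconcile those codes itself, which is the bookkeeping the suite run does for you. The sketch below shows one way that might look. The lane helper and the failed variable are hypothetical names introduced for illustration, and the sketch assumes only that a non-zero exit code means the lane failed.

```shell
#!/bin/sh
# Hypothetical helper (not part of mcptest): run one lane command, report
# its status, and record a failure without aborting the remaining lanes.
failed=0

lane() {
  name=$1
  shift
  if "$@"; then
    echo "ok   $name"
  else
    # In the else branch, $? still holds the exit code of the lane command.
    echo "FAIL $name (exit $?)"
    failed=1
  fi
}

# Example calls, using commands from the list above (commented out so the
# sketch has no side effects):
#   lane compliance mcptest compliance run --from-suite mcptest.yaml
#   lane security   mcptest security tools.json
#   exit "$failed"
```

Each lane still produces its own report here; only the pass or fail result is merged.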