Capability lanes

Compliance, security, conformance, scenarios, model compatibility, coverage, and performance are not separate tools. Each is a block you add to your suite, and mcptest run runs it as a lane alongside the tool and agent tests. One suite, one command, one report, one exit code. A CI pipeline runs the whole audit with one binary invocation instead of one per concern.

One mcptest run over the compliance, security, conformance, model-compat, and performance lanes, finishing with a single summary and exit code.

The YAML

A suite that declares capability blocks runs each as a lane. examples/full-audit/full-audit.yml declares seven concerns against the built-in mock, all offline and key-free:

# yaml-language-server: $schema=https://mcptest.sh/schema/v1.json

servers:
  mock:
    command: ["mcptest", "mock", "--tools-from", "${CONFIG_DIR}/echo.yaml"]

# Behavioral tests: does the tool work.
tools:
  - name: echo round-trips
    server: mock
    tool: echo
    expect:
      - target: result.content[0].text
        matcher: { exact: "hello" }

# Compliance: protocol-level checks scored against the rubric.
compliance:
  - name: PROTO-002 initialize returns capabilities
    server: mock
    check: initialize
    expect:
      - target: result.capabilities
        matcher:
          schema: { type: object }

# Security: the deterministic scan of the tool surface.
security:
  scan: true

# Conformance: the SEP protocol probes against the live server.
conformance: {}

# Model compatibility: diff a recorded baseline against a candidate.
model_compat:
  baseline: baseline.json
  candidate: candidate.json

# Coverage: gate on how much of the tool surface the suite exercised.
coverage:
  tools: 100

# Performance: a soft p95 latency budget over the run's per-test durations.
performance:
  p95_latency_ms: 60000

The run

One command runs every block. The tool and agent tests run first; each declared lane then runs and folds its rows into the same report:

mcptest run examples/full-audit/full-audit.yml

Block	Lane	Gates on
`tools:`	behavioral tests	a failing assertion
`compliance:`	the compliance rubric checks	a failed check
`security: { scan: true }`	the deterministic tool-surface scan	a finding at or above High
`conformance:`	the SEP protocol probes	a failed MUST
`model_compat:`	a baseline-versus-candidate diff	a breaking change (drift is advisory)
`coverage:`	tool-surface coverage instrumentation	a declared dimension below its minimum (exit 6)
`performance:`	a p95 latency budget over per-test durations	nothing (advisory: a p95 over budget is highlighted, not gated)

A breaking result in any lane fails the run; non-breaking model drift is reported but does not gate. The single exit code reconciles every lane, so a green run means the whole audit passed. A lane fires only when the suite declares its block, so a suite that declares none behaves exactly like a plain tool run.
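Because the run collapses to one exit code, a CI wrapper only has to inspect that code. The following is a minimal shell sketch, not part of mcptest: the `explain_exit` and `run_audit` helpers and their messages are hypothetical. Only exit code 6 (a declared coverage dimension below its minimum) comes from the table above; treating 0 as green and every other nonzero code as a generic failure is an assumption.

```shell
# Hypothetical helper: turn the single exit code into a CI log line.
# Only code 6 (coverage below its declared minimum) is documented above;
# every other nonzero code is reported generically (an assumption).
explain_exit() {
  case "$1" in
    0) echo "audit passed" ;;
    6) echo "coverage below declared minimum" ;;
    *) echo "audit failed (exit $1)" ;;
  esac
}

# Hypothetical wrapper: run one suite, print the summary line,
# and hand the original exit code back to the caller.
run_audit() {
  mcptest run "$1"
  status=$?
  explain_exit "$status"
  return "$status"
}
```

A CI step could then call `run_audit examples/full-audit/full-audit.yml` and exit with its return status, so the pipeline still fails exactly when the run does.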

Or run one lane standalone

The standalone subcommands still exist for a focused run of one concern, without a suite:

mcptest compliance run --from-suite mcptest.yaml
mcptest security tools.json
mcptest conformance run --server-command "..."
mcptest model-compat diff --baseline a.json --candidate b.json
mcptest coverage --tools-from tools.json --threshold 80

Each returns its own report and its own exit code. Fold them into the suite when you want one run to cover everything; reach for the standalone command when you only care about one lane.
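For comparison, here is roughly what reconciling separate exit codes by hand looks like when the standalone commands run in sequence. This is a sketch under stated assumptions: the `run_each` helper is hypothetical, and keeping the first nonzero code is one simple policy, not a description of how mcptest reconciles lanes internally.

```shell
# Hypothetical helper: run each argument as its own command line,
# log its exit code, and return the first nonzero code seen (0 if none).
run_each() {
  overall=0
  for cmd in "$@"; do
    sh -c "$cmd"
    rc=$?
    echo "exit $rc: $cmd"
    if [ "$rc" -ne 0 ] && [ "$overall" -eq 0 ]; then
      overall=$rc
    fi
  done
  return "$overall"
}
```

It could be called as `run_each "mcptest security tools.json" "mcptest model-compat diff --baseline a.json --candidate b.json"`. Every lane still runs after an earlier one fails, but the reports stay separate, which is the bookkeeping a single suite run removes.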
