
Conformance invariants

A conformance invariant is a named property of a whole captured MCP session, not a shape match on one request/response pair. The compliance corpus expect DSL checks a single message at a time: it answers "does this tools/list response have the right shape." Some spec requirements span the session: the handshake has to come first, a server that answers tools/list must have advertised the tools capability, and a JSON-RPC error has to carry an integer code. mcptest lifts those into typed invariants, each a pure function over the captured exchange. The same capture always produces the same result, so the checks fit a CI gate.
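To illustrate what "a pure function over the captured exchange" can look like, here is a minimal Python sketch of the handshake-ordering requirement. It is not mcptest's implementation. The function name `handshake_comes_first` is hypothetical, the exchange shape follows the capture format described on this page, and two choices are assumptions: a message counts as a request only if it has an `id`, and a session with no requests passes.

```python
from typing import Any

Exchange = dict[str, Any]


def handshake_comes_first(exchanges: list[Exchange]) -> bool:
    """Session-level property: the first client request is `initialize`.

    Pure function: it reads only its argument, so the same capture
    always yields the same answer.
    """
    for exchange in exchanges:
        request = exchange.get("request") or {}
        # Assumption: notifications have no "id", so they are skipped
        # when looking for the first request.
        if "id" in request:
            return request.get("method") == "initialize"
    # Assumption: a session with no requests passes vacuously.
    return True
```

A per-message shape check cannot express this property, because the answer depends on where the message sits in the session rather than on what it contains.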

Run this example: examples/official-conformance.yml runs the local conformance checks that these invariants back.

mcptest run --config examples/official-conformance.yml

Running them

The invariants run against a JSON capture file, so the command is deterministic and contacts no server:

mcptest compliance invariants --capture session.json
mcptest compliance invariants --capture session.json --format json

The command exits 0 when every invariant passes and no composition hazard is found, and 1 otherwise, so it gates CI directly.
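Because the exit code is the whole contract, a CI step only needs to run the command and propagate the result. The following Python sketch shows one way to wrap that. The helper names `invariants_command` and `gate` are hypothetical, and the only mcptest arguments used are the ones shown above.

```python
import subprocess


def invariants_command(capture_path: str) -> list[str]:
    """Build the argument list for the invariants command shown above."""
    return ["mcptest", "compliance", "invariants", "--capture", capture_path]


def gate(command: list[str]) -> bool:
    """Run a command and report whether it exited 0.

    check=False keeps a non-zero exit from raising, so the caller
    decides how to fail the build.
    """
    completed = subprocess.run(command, check=False)
    return completed.returncode == 0
```

A CI script could then call `gate(invariants_command("session.json"))` and exit non-zero when it returns False. In most CI systems, running the command directly as a step has the same effect.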

Capture format

A capture is a session object, or an array of session objects for the multi-server composition mode. Each session carries the server label, the negotiated capability block, and the ordered client exchanges:

{
  "server_label": "stdio://my-server",
  "server_capabilities": { "tools": {} },
  "exchanges": [
    {
      "request":  { "jsonrpc": "2.0", "id": 1, "method": "initialize" },
      "response": { "jsonrpc": "2.0", "id": 1, "result": {} }
    },
    {
      "request":  { "jsonrpc": "2.0", "id": 2, "method": "tools/list" },
      "response": { "jsonrpc": "2.0", "id": 2, "result": { "tools": [] } }
    }
  ]
}

A notification omits response. A session that never completed a handshake omits server_capabilities, which defaults to null. Recording the capture off a live server is the runner's job; this command is the scoring half.
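The format rules above (one object or an array, an optional response, an optional server_capabilities) can be sketched as a small loader. This is a Python illustration under stated assumptions, not mcptest's own parser. `load_sessions` is a hypothetical name, and representing a missing value as `None` is this sketch's reading of "defaults to null".

```python
import json
from typing import Any

Session = dict[str, Any]


def load_sessions(text: str) -> list[Session]:
    """Parse capture JSON into a list of sessions with defaults filled in."""
    data = json.loads(text)
    # A capture is either one session object or an array of them.
    raw_sessions = data if isinstance(data, list) else [data]

    sessions: list[Session] = []
    for raw in raw_sessions:
        sessions.append({
            "server_label": raw.get("server_label"),
            # Absent when the handshake never completed; defaults to None.
            "server_capabilities": raw.get("server_capabilities"),
            "exchanges": [
                # A notification has no response, so it becomes None here.
                {"request": ex["request"], "response": ex.get("response")}
                for ex in raw.get("exchanges", [])
            ],
        })
    return sessions
```

Normalizing both shapes to a list lets later checks treat the single-server and multi-server cases the same way.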

The INV-NNN family

Invariants carry IDs in a dedicated INV-NNN family. The family is documented here, following the standardized rule-ID scheme. Like the SCHEMA-006, SEC, and DESC checks, invariants run in code rather than as compliance corpus rows, because each one reads the whole captured exchange, which the corpus's single request/response assertions cannot express. Encoding them as Rust properties also keeps them out of the rule registry, so the rubric stats counts do not shift.

| ID | Category | Property |
| --- | --- | --- |
| INV-001 | lifecycle | initialize is the first request the client sends. |
| INV-002 | lifecycle | The notifications/initialized notification follows the initialize response. |
| INV-003 | capability | A server that answers tools/list, resources/list, or prompts/list advertised the matching capability at initialize. |
| INV-004 | capability | A server that advertised a capability does not error on every call to it. |
| INV-005 | result-shape | A successful tools/call result carries a content array or a structuredContent object, and a boolean isError when present. |
| INV-006 | error-envelope | Every JSON-RPC error envelope carries an integer code and a string message. |
| INV-007 | error-envelope | A method-not-found error uses JSON-RPC code -32601. |
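As a rough illustration of how two of these rows read as code, the Python sketch below approximates INV-003 and INV-006 over a single session dictionary in the capture format. mcptest encodes its invariants as Rust properties, so this is not its implementation. The function names are hypothetical, and treating "answers" as "the response has a result" is an assumption.

```python
from typing import Any

Session = dict[str, Any]

# List methods and the capability each one requires (from the INV-003 row).
LIST_METHOD_CAPABILITY = {
    "tools/list": "tools",
    "resources/list": "resources",
    "prompts/list": "prompts",
}


def unadvertised_list_methods(session: Session) -> list[str]:
    """INV-003 style: list methods answered without the matching capability."""
    capabilities = session.get("server_capabilities") or {}
    violations = []
    for exchange in session.get("exchanges", []):
        method = (exchange.get("request") or {}).get("method")
        response = exchange.get("response") or {}
        capability = LIST_METHOD_CAPABILITY.get(method)
        # Assumption: "answers" means the response carries a result.
        if capability and "result" in response and capability not in capabilities:
            violations.append(method)
    return violations


def error_envelopes_well_formed(session: Session) -> bool:
    """INV-006 style: every error has an integer code and a string message."""
    for exchange in session.get("exchanges", []):
        response = exchange.get("response") or {}
        if "error" not in response:
            continue
        error = response["error"]
        if not isinstance(error, dict):
            return False
        code = error.get("code")
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(code, bool) or not isinstance(code, int):
            return False
        if not isinstance(error.get("message"), str):
            return False
    return True
```

Returning the offending methods from the first check, rather than a bare boolean, makes it easier to report which exchange broke the property.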

Multi-server composition safety

When the capture holds two or more sessions, the command runs each server's invariants individually and then re-asserts the properties that can fail only when servers coexist behind one client. These hazards are not visible in either capture alone:

The composition mode stays per-run and single-developer. Fleet aggregation, governance, and dashboards are out of scope here.