mcptest docs GitHub

mcptest run and exit codes

mcptest run is the command that executes your suite. It loads one mcptest.yaml, runs every block it declares, and returns one report and one exit code. This is the heart of the YAML-first model: the suite is the contract, and a single run reconciles every capability that rides on it. The capability lanes page covers how the blocks fold into one run; this page covers the run command itself and the exit codes it returns.

What a bare mcptest run resolves

Run it with no arguments from a directory that holds a suite:

mcptest run

With no path, mcptest run looks for mcptest.yaml, then mcptest.yml, in the current directory and runs the first one it finds. Pass an explicit path to run a different file:

mcptest run tests/smoke.yaml

mcptest run takes a suite file, not a directory. Point it at the exact file the run should execute.

What one run produces

A run produces two things: a report on the configured reporter, and a process exit code. The exit code is the machine signal CI branches on. The report is the human and machine record of what happened.

mcptest run's pass/fail outcome folds in every gate that rides on the run: failing tests, tool-selection floors, matrix assertion floors, timing-baseline breaches, and (when the suite opts in via run_options.inconclusive.treat_as: fail) inconclusive verdicts. A green run means every block that gates passed.

Reporter selection at a glance

--reporter <FORMAT> picks the output format and --output <PATH> names the sink (a file, or stdout when omitted):

mcptest run --reporter junit --output target/mcptest-junit.xml

Capturing json once lets a later mcptest report step re-render any format without re-running the suite:

mcptest run --reporter json --output run.json
mcptest report run.json --format sarif --output mcptest.sarif

The full format table and per-format notes are on the reporters page.

Exit codes

mcptest's process exit codes are a contract: CI pipelines branch on them, so the values only change with a breaking release.

CodeMeaningExample trigger
0Success. The command completed and everything it checked passed, or there was nothing to check.mcptest run against a suite where every test passes, or with no config file present.
1Failure. The command completed but at least one test, check, gate, or probe failed, or a run-level error prevented a passing result. Also the catch-all for internal errors at the CLI boundary.mcptest run with one failing assertion; mcptest doctor --url against a server whose probe fails.
2Invalid invocation. Conflicting or malformed flags, an invalid config, or a suite CI refuses to run. Nothing was executed.mcptest run --filter 're:(' (bad regex); a config that fails schema validation; --watch with a machine reporter such as --reporter json; a focused suite under CI=true; --deny-lints with lint findings.
3Not ready. --wait-for-ready exhausted its budget before every URL server accepted connections. Distinct from 1 so CI can retry a slow-starting deployment instead of failing the build.mcptest run --wait-for-ready 30s while the server never starts listening.
5Refused by policy. The invocation is well-formed, but a policy gate said no.mcptest run --update-snapshots under CI=true without --allow-update-in-ci; mcptest eval --max-cost 0.10 when the judge spend trips the budget.
6Quality gate not met. The command ran cleanly but a measured floor or drift gate tripped.mcptest coverage --tools-from tools.json --threshold 80 on a suite covering 60% of the catalog; mcptest model-compat reporting DRIFT; mcptest generate stubs --check finding generated stubs that differ from the checked-in files.
7No tests selected. The run discovered zero tests (an empty suite, or a --filter, --shard, or --last-failed selection that matched nothing) and --pass-with-no-tests was not passed.mcptest run --filter nomatch; a CI shard that drew an empty slice without --pass-with-no-tests.
130Interrupted. The run received SIGINT (Ctrl-C); in-flight work was cancelled cleanly, partial results were written, and the process exited. The value follows the shell 128 + signal convention (SIGINT is 2). Distinct from 1 so a deliberate Ctrl-C is never read as a test regression.mcptest run against a slow suite, then Ctrl-C before it finishes.

Code 4 is reserved and currently unused.

Notes: