What is mcptest
mcptest is an open-source CLI for testing Model Context Protocol (MCP) servers. It runs declarative YAML test files against any MCP server over stdio, streamable HTTP, or legacy SSE, and checks the whole surface: tool, resource, and prompt calls; the agent loop a real model drives; schema drift; spec compliance; and a deterministic security scan of your tool definitions. It runs on every commit, fails the build on a regression, and reports in seven formats. Single static binary, Apache-2.0 licensed, by Soap Bucket LLC.
This page is the canonical short definition that the rest of the site, the README, the GitHub repo description, package metadata, and the og:description meta tag all derive from. If the wording here changes, update those surfaces in the same commit.
In one sentence
mcptest tests Model Context Protocol servers in CI by running declarative YAML against any MCP server and checking the responses, the agent loop a real model drives, schema drift, spec compliance, and the security of your tool definitions, each behind a stable exit code.
In one paragraph
mcptest is the CLI for testing Model Context Protocol (MCP) servers. You write YAML test files that name a server, a tool, the arguments, and the responses you expect. mcptest connects over stdio, streamable HTTP, or legacy SSE, performs the MCP initialize handshake, and runs each test. The same tool drives a real model through the agent loop and asserts on the trace, diffs the tool catalog against a baseline to catch breaking drift, grades the server against a compliance corpus, and scans your tool definitions for prompt-injection and other deterministic security findings. It captures performance numbers and emits a report in any of seven formats (pretty, JSON, JUnit, Markdown, HTML, SARIF, GitLab Code Quality). Single binary, Apache-2.0, by Soap Bucket LLC.
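To make the handshake step concrete, here is a minimal Python sketch of the messages any MCP client exchanges at startup. It is not mcptest's own code: the function names are hypothetical, the protocol revision string is an assumption you would match to your server, and the framing shown is the newline-delimited JSON used by the stdio transport.

```python
import json

# Assumption: the protocol revision your server targets; adjust as needed.
PROTOCOL_VERSION = "2025-03-26"


def initialize_request(request_id: int, client_name: str, client_version: str) -> dict:
    """Build the JSON-RPC 2.0 `initialize` request an MCP client sends first."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "initialize",
        "params": {
            "protocolVersion": PROTOCOL_VERSION,
            "capabilities": {},
            "clientInfo": {"name": client_name, "version": client_version},
        },
    }


def initialized_notification() -> dict:
    """Build the notification sent once the server has answered `initialize`."""
    # Notifications carry no "id" because no response is expected.
    return {"jsonrpc": "2.0", "method": "notifications/initialized"}


def to_stdio_frame(message: dict) -> bytes:
    """Encode one message for the stdio transport: one JSON object per line."""
    return (json.dumps(message, separators=(",", ":")) + "\n").encode("utf-8")
```

A hand-rolled harness would write `to_stdio_frame(initialize_request(...))` to the server's stdin, read one line back, then send the initialized notification before issuing any tool, resource, or prompt call.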
What mcptest is
mcptest is:
- A spec runner. Tests live in YAML, validated at load time by a published JSON Schema at https://mcptest.sh/schema/v1.json. Write once, run anywhere mcptest runs.
- CI-grade. Exit codes are stable, reporters cover every format CI already understands, and a single run produces machine-readable artifacts CI can store. First-class integrations for GitHub Actions, GitLab CI, and CircleCI.
- Deterministic. Cassette record and replay capture real protocol exchanges to JSON; the replay path uses the same normalization pass as the record path so a recording from today and one from next week diff cleanly.
- Single-developer scope. Everything a single developer needs on their own machine ships in the open-source binary.
- Apache-2.0. No commercial restrictions.
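The "Deterministic" point above depends on normalizing recordings before they are compared. The Python sketch below shows one way that idea can work; it is an illustration under stated assumptions, not mcptest's cassette format. The volatile field names and all function names are hypothetical.

```python
import hashlib
import json

# Hypothetical fields that vary between runs; not mcptest's actual list.
VOLATILE_KEYS = {"timestamp", "duration_ms", "request_id"}


def normalize(value):
    """Recursively drop volatile keys so repeat recordings compare equal."""
    if isinstance(value, dict):
        return {k: normalize(v) for k, v in value.items() if k not in VOLATILE_KEYS}
    if isinstance(value, list):
        return [normalize(item) for item in value]
    return value


def canonical_json(value) -> str:
    """Serialize with sorted keys and fixed indentation for stable line diffs."""
    return json.dumps(normalize(value), sort_keys=True, indent=2, ensure_ascii=False)


def content_address(value) -> str:
    """Hash the canonical form; equal exchanges map to the same cache key."""
    return hashlib.sha256(canonical_json(value).encode("utf-8")).hexdigest()
```

Because record and replay would both go through `canonical_json`, two recordings that differ only in key order or in a volatile field produce identical text and the same hash.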
What mcptest is not
mcptest is not:
- A debugger. Use your editor's MCP tooling for interactive inspection.
- A fuzzer. Use a property-testing library for randomized input generation.
- A load tester. Use a dedicated load tool (`k6`, `wrk`, `vegeta`) for throughput testing.
- An MCP server. mcptest talks to servers; it does not implement the server side of the protocol (except a small fixture server bundled for smoke tests).
- A linter for MCP server source code. mcptest validates runtime behavior over the wire, not the implementation language.
What mcptest replaces
Before mcptest, teams testing MCP servers wrote bespoke harnesses: shell scripts that piped JSON-RPC frames into a subprocess, ad-hoc HTTP clients that compared response bodies with jq and grep, and one-off Node or Python integration suites that no two teams shared. mcptest replaces all of that with one YAML schema, one binary, and seven reporters.
Key capabilities
- Transports: stdio, streamable HTTP, legacy SSE.
- Matchers: `exact`, `contains`, `regex`, `subset`, JSON `schema`, `snapshot`, the string matchers (`contains-all`, `contains-any`, `icontains`, `starts-with`, `levenshtein`), `is-json`, `is-valid-tools-call`, the LLM `llm-judge` and `llm-jury` matchers, and the `not` combinator. Latency and token budgets are separate per-test fields, not matchers.
- Cassettes: deterministic record/replay, content-addressed cache, diff-friendly JSON.
- Auth: OAuth 2.1 + PKCE, bearer tokens, custom headers, environment redaction on every reporter.
- Agent loop: drive a real model across one or more servers and assert on the trace (tool choice, arguments, tokens, latency), with a matrix across models and an `llm-judge` or `llm-jury` on the final answer.
- Drift: `mcptest diff` compares the tool catalog and response shapes against a recorded baseline and classifies each change as breaking or not.
- Compliance corpus: PROTO, SCHEMA, SEQ, TOOL, RES, EDGE categories, graded A+ through F.
- Security: `mcptest security` runs deterministic checks over your tool, prompt, and resource definitions (description injection, hidden unicode, secrets, toxic-flow pairings, integrity drift against a baseline) and reports as SARIF.
- Reporters: pretty, JSON, JUnit, Markdown, HTML, SARIF, GitLab Code Quality.
- Baselines: expected-failures file lets adoption proceed without blocking PRs on known-flaky tests.
- Auto-discovery: detects servers configured in Claude Desktop, Cursor, and Claude Code, then scaffolds a starter suite.
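As an illustration of the drift capability listed above, the following Python sketch compares two tool catalogs and labels each change. The classification rules here are assumptions chosen for the example (removals, type changes, and newly required arguments count as breaking; additions of tools or optional arguments do not), and `classify_drift` is a hypothetical name. mcptest's actual rules may differ. The only structure taken from MCP itself is that each tool has a `name` and an `inputSchema` in JSON Schema form.

```python
def classify_drift(baseline: list[dict], current: list[dict]) -> list[tuple[str, str]]:
    """Compare two MCP tool catalogs; label each change breaking or non-breaking."""
    old = {tool["name"]: tool for tool in baseline}
    new = {tool["name"]: tool for tool in current}
    changes = []

    # Whole-tool changes: callers of a removed tool fail; new tools affect no one.
    for name in sorted(old.keys() - new.keys()):
        changes.append(("breaking", f"tool removed: {name}"))
    for name in sorted(new.keys() - old.keys()):
        changes.append(("non-breaking", f"tool added: {name}"))

    # Argument-level changes for tools present in both catalogs.
    for name in sorted(old.keys() & new.keys()):
        old_schema = old[name].get("inputSchema", {})
        new_schema = new[name].get("inputSchema", {})
        old_props = old_schema.get("properties", {})
        new_props = new_schema.get("properties", {})
        old_required = set(old_schema.get("required", []))
        new_required = set(new_schema.get("required", []))

        for prop in sorted(old_props.keys() - new_props.keys()):
            changes.append(("breaking", f"{name}: argument removed: {prop}"))
        for prop in sorted(new_props.keys() - old_props.keys()):
            # A new argument only breaks existing callers if it is required.
            kind = "breaking" if prop in new_required else "non-breaking"
            changes.append((kind, f"{name}: argument added: {prop}"))
        for prop in sorted(old_props.keys() & new_props.keys()):
            if old_props[prop].get("type") != new_props[prop].get("type"):
                changes.append(("breaking", f"{name}: argument type changed: {prop}"))
            if prop in new_required and prop not in old_required:
                changes.append(("breaking", f"{name}: argument now required: {prop}"))
    return changes
```

A CI gate built on this sketch would fail the build whenever any returned entry is labeled `"breaking"` and let the rest through as informational.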
Who builds mcptest
Soap Bucket LLC (two words, capital S and B) at soapbucket.com. Legal contact: legal@soapbucket.com. The open-source project is Apache-2.0 licensed and accepts contributions through GitHub at github.com/soapbucket/mcptest.
License
Apache-2.0. See LICENSE and NOTICE at the repo root.
Distribution
- Homebrew: `brew install soapbucket/tap/mcptest`.
- Install script: `curl -sSL https://download.mcptest.sh/install.sh | sh`.
- Docker: `soapbucket/mcptest:latest` on Docker Hub.
- Cargo: `cargo install mcptest`.
- GitHub Actions: install with the script in a workflow step (a `soapbucket/mcptest-action` is staged as an example but not yet published).
- Releases: signed tarballs at github.com/soapbucket/mcptest/releases.
See also
- Getting started for the five-minute install and first-test loop.
- Concepts for the mental model.
- YAML reference for every field.
- CLI reference for every flag.
- FAQ for short factual answers.