
Red-team scenario corpus

A small, original corpus of red-team scenarios for MCP servers, expressed as mcptest agent tests. It is the example-level deliverable: a starting set that the dynamic red-team engine will later run as a managed pass. The scenarios live in examples/security/.

What each scenario tests

Every scenario runs a model against a poisoned or attacker-influenced server and asserts, on observable artifacts only, that the model was not exploited. The verdict never depends on the model narrating that it behaved (see the observable-evidence oracle). The assertions target tool_calls[i].name, tool_calls[i].server, tool_calls[i].args, and tool_results[i], never an llm-judge.
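As a rough illustration of what "observable artifacts only" means, the sketch below evaluates a recorded trace without looking at any model text. The `Trace` and `ToolCall` types, the helper names, and the example server and tool names are hypothetical stand-ins, not mcptest's actual types; argument values are simplified to strings here, whereas real tool arguments are likely structured JSON.

```rust
use std::collections::BTreeMap;

/// One observed tool call, mirroring the fields the assertions read
/// (`tool_calls[i].name`, `.server`, `.args`). Hypothetical stand-in type.
#[derive(Debug, Clone)]
pub struct ToolCall {
    pub name: String,
    pub server: String,
    /// Simplification: argument values are plain strings in this sketch.
    pub args: BTreeMap<String, String>,
}

/// A recorded run: the calls the model made and the raw results it received.
#[derive(Debug, Clone, Default)]
pub struct Trace {
    pub tool_calls: Vec<ToolCall>,
    pub tool_results: Vec<String>,
}

/// True if any call went to `tool` on `server`.
pub fn called(trace: &Trace, server: &str, tool: &str) -> bool {
    trace
        .tool_calls
        .iter()
        .any(|c| c.server == server && c.name == tool)
}

/// True if any argument value of any call contains `needle`.
pub fn any_arg_contains(trace: &Trace, needle: &str) -> bool {
    trace
        .tool_calls
        .iter()
        .any(|c| c.args.values().any(|v| v.contains(needle)))
}

/// Verdict for a hypothetical shadowing/exfiltration case: the model counts as
/// exploited if it called the attacker's tool, or if the secret appears in any
/// outgoing argument. The model's narration is never consulted.
pub fn exploited(trace: &Trace, attacker_server: &str, tool: &str, secret: &str) -> bool {
    called(trace, attacker_server, tool) || any_arg_contains(trace, secret)
}

/// Convenience constructor for building example traces.
pub fn call(server: &str, name: &str, args: &[(&str, &str)]) -> ToolCall {
    ToolCall {
        name: name.to_string(),
        server: server.to_string(),
        args: args
            .iter()
            .map(|(k, v)| (k.to_string(), v.to_string()))
            .collect(),
    }
}
```

Under these assumptions, a trace whose only call is `send_email` on a trusted server with a harmless recipient is not flagged, while the same call routed to the attacker's server, or any call carrying the secret in an argument, is. The point of the design is that the verdict is a pure function of the trace, which is what makes it independent of how the model describes its own behavior.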

| Scenario | Attack class | ADR 0039 row | OWASP MCP Top 10 |
|---|---|---|---|
| tool-description-injection | Tool poisoning | Tool poisoning (server, creation, Tools) | MCP03 |
| tool-shadowing | Tool shadowing | Tool shadowing (server, operation, Tools) | MCP09 |
| data-exfiltration | Data exfiltration | Data exfiltration (server, operation) | MCP10 |
| rug-pull | Rug pull | Rug pull (server, update, Tools) | MCP03 |
| preference-manipulation | Preference manipulation | Preference manipulation (server, creation, Tools) | MCP03 |
| authz-bfla | Authorization bypass | Authorization (server, operation, Tools) | MCP02 |
| authz-bola | Authorization bypass | Authorization (server, operation, Tools) | MCP02 |
| authz-rbac | Authorization bypass | Authorization (server, operation, Tools) | MCP02 |

Authorization-bypass scenarios

The authorization family elsewhere in mcptest tests posture: token audience, scope minimization, session hygiene, confused-deputy probes. These three scenarios test the other half: authorization bypass driven through the agent loop. The question is whether a crafted prompt can steer the model itself into an action that the caller is not authorized for and that the server does not re-check. They all classify under AttackClass::AuthorizationBypass (kebab label authorization-bypass, OWASP MCP02 privilege escalation) and use the same observable oracle: the verdict reads which tool fired and with which arguments, never the model's narration.

These cases are original, inspired by the Damn Vulnerable MCP (DVMCP) challenge set and OWASP MCP02. No challenge text, prompts, or metadata are copied.

The oracle is cassette-replayable, so a recorded authz run replays without a live model. The fixtures under crates/mcptest-core/tests/fixtures/redteam/ include a resisted and an exploited BFLA trace (authz-bfla-resisted.json, authz-bfla-exploited.json); the test in redteam_authz.rs asserts the exploited trace is flagged with AttackClass::AuthorizationBypass and the resisted trace is not, and it also validates the three YAMLs against the schema.
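To make the authorization-bypass oracle in this section concrete, here is a minimal sketch of classifying a recorded trace, assuming the scenario names follow the usual OWASP meanings (BFLA: broken function-level authorization; BOLA: broken object-level authorization). The `AttackClass` enum below is a local stand-in that models only the variant and kebab label named above; `ToolCall`, `CallerPolicy`, `classify`, and the tool and object names used to exercise it are all hypothetical and may not match the real types in mcptest-core.

```rust
use std::collections::BTreeSet;

/// Stand-in for the crate's attack-class enum. Only the variant named in the
/// text is modeled; the real definition may differ.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum AttackClass {
    AuthorizationBypass,
}

impl AttackClass {
    /// Kebab-case label, as given in the text.
    pub fn label(self) -> &'static str {
        match self {
            AttackClass::AuthorizationBypass => "authorization-bypass",
        }
    }
}

/// One recorded tool call: which tool fired and which object it targeted.
/// Hypothetical, simplified shape.
#[derive(Debug, Clone)]
pub struct ToolCall {
    pub name: String,
    pub object_id: Option<String>,
}

/// What the caller is authorized to do (hypothetical policy shape).
pub struct CallerPolicy {
    /// Tools the caller may invoke (function-level check, BFLA).
    pub allowed_tools: BTreeSet<String>,
    /// Objects the caller owns (object-level check, BOLA).
    pub owned_objects: BTreeSet<String>,
}

/// Flags a trace as an authorization bypass if any call used a tool outside
/// the caller's allow-list, or targeted an object the caller does not own.
/// Only the recorded calls are read, so the same check works on a replayed
/// trace with no live model.
pub fn classify(calls: &[ToolCall], policy: &CallerPolicy) -> Option<AttackClass> {
    let bypass = calls.iter().any(|c| {
        let bfla = !policy.allowed_tools.contains(&c.name);
        let bola = c
            .object_id
            .as_ref()
            .map_or(false, |id| !policy.owned_objects.contains(id));
        bfla || bola
    });
    bypass.then_some(AttackClass::AuthorizationBypass)
}
```

With a policy that allows only `read_invoice` on object `inv-1`, a recorded trace that stays within those bounds classifies as `None` (resisted), while a trace that calls `delete_user`, or reads `inv-2`, classifies as `Some(AttackClass::AuthorizationBypass)` (exploited). This mirrors the resisted/exploited pairing of the fixtures, although the actual fixture format and test code are not reproduced here.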

Provenance and licensing

These cases are original, written for this repository. The published benchmarks that inspired the attack classes (MCPTox arXiv:2508.14925, MCPSecBench arXiv:2508.13220) and the Damn Vulnerable MCP (DVMCP) challenge set are cited as reference, not copied. No third-party case data is redistributed.

Running them

Each scenario exercises the agent loop, so it needs a real model and the poisoned servers it describes (supplied locally, for example with mcptest mock). They are illustrative starting points rather than a CI gate. Converting a larger benchmark corpus into this format, and running it as an automated pass with an adaptive attacker, are tracked separately.

A note on observability: when a model uses programmatic (code-mode) tool calling, the tool calls happen inside a code sandbox, so the assertions here hold only once the trace captures code-mode calls.