
Scenario 9: scan a poisoned catalog for attacks

A malicious MCP server does not need to exploit a memory bug. It can attack the model directly through the text it is allowed to publish: a tool description that injects instructions, a directive that steers the model toward another tool, an exfiltration sink, a name with hidden unicode, a credential left in a description, or a destructive tool that declares no annotation. None of these show up when you run the tools. They live in the catalog itself.

mcptest security scans that catalog. It reads a tools/list-style JSON snapshot and runs a set of deterministic checks over the tool, prompt, and resource definitions. No model is in the verdict path: every finding is a regex or structural predicate with a stable SEC-NNN rule ID, so a server cannot talk its way out of a finding by narrating safety.

The hosted test server ships a deliberately poisoned catalog for exactly this walkthrough. This scenario captures it and scans it.

Capture the catalog

The endpoint POST https://test.mcptest.sh/mcp?scenario=insecure serves the poisoned catalog. Ask it for tools/list and save the response to a file:

curl -s -X POST 'https://test.mcptest.sh/mcp?scenario=insecure' \
  -d '{"jsonrpc":"2.0","id":1,"method":"tools/list"}' > insecure.json

insecure.json now holds a JSON-RPC response with a result.tools array. The scanner reads the tools, prompts, and resources arrays from a snapshot, so this file is a valid input as captured.
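Before scanning, it can help to look at what was captured. The following is a minimal Python sketch, not part of mcptest: the helper names catalog_arrays, visible_names, and load_snapshot are hypothetical, and accepting a bare top-level snapshot alongside the JSON-RPC envelope is an assumption made for illustration.

```python
import json

def catalog_arrays(snapshot):
    """Return the tools, prompts, and resources arrays from a snapshot.

    Accepts a raw JSON-RPC response ({"result": {"tools": [...]}}) or,
    as an assumption, a bare object with the arrays at the top level.
    """
    body = snapshot.get("result", snapshot)
    return {key: body.get(key, []) for key in ("tools", "prompts", "resources")}

def visible_names(tools):
    # ascii() escapes non-ASCII characters, so a zero-width space in a
    # tool name is printed as \u200b instead of staying invisible.
    return [ascii(tool.get("name", "")) for tool in tools]

def load_snapshot(path="insecure.json"):
    # Hypothetical convenience wrapper around the captured file.
    with open(path, encoding="utf-8") as handle:
        return catalog_arrays(json.load(handle))
```

Calling visible_names(load_snapshot()["tools"]) after the capture step lists every tool name with any hidden characters made explicit.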

For contrast, the conformant endpoint POST https://test.mcptest.sh/mcp (no scenario) serves a clean catalog. Capture it the same way and scan it to confirm a clean run reports no findings.

Scan it

mcptest security insecure.json --fail-on high

What is happening here:

Each poisoned definition maps to a rule. The injecting description fires SEC-001 (description-injection). The cross-tool directive fires SEC-002 (cross-tool-directive). The exfiltration sink fires SEC-003 (exfiltration-directive). The hidden unicode in the tool name fires SEC-005 (hidden-unicode). The credential in a description fires SEC-008 (secret-in-definition). The destructive tool with no destructiveHint fires SEC-009 (unannotated-destructive-tool). And because an untrusted-content source and an exfiltration or destructive sink coexist in the same catalog, the toxic-flow lane fires SEC-035 (toxic-flow-pairing): a prompt injection delivered through the source tool can drive the sink tool.
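To make "regex or structural predicate" concrete, here is a minimal Python sketch of two such checks. It is an illustration only, not mcptest's implementation: the character set, the verb list, and the function names are assumptions. Only the rule IDs, severities, and the destructiveHint annotation come from the text above.

```python
import re

# Zero-width and bidirectional control characters.
# An illustrative set, not the list mcptest actually uses.
HIDDEN = re.compile("[\u200b-\u200f\u202a-\u202e\u2060-\u2064\u2066-\u2069\ufeff]")

# Name fragments that suggest a destructive action (illustrative only).
DESTRUCTIVE = re.compile(r"(?:^|_)(delete|drop|remove|destroy|wipe)(?:_|$)", re.IGNORECASE)

def has_hidden_unicode(tool):
    # Regex check: any invisible or bidi character in the tool name.
    return bool(HIDDEN.search(tool.get("name", "")))

def is_unannotated_destructive(tool):
    # Structural predicate: destructive-sounding name with no destructiveHint.
    annotations = tool.get("annotations") or {}
    looks_destructive = bool(DESTRUCTIVE.search(tool.get("name", "")))
    return looks_destructive and "destructiveHint" not in annotations

def scan(tools):
    """Return (rule_id, severity, tool_name) tuples in catalog order."""
    findings = []
    for tool in tools:
        name = tool.get("name", "")
        if has_hidden_unicode(tool):
            findings.append(("SEC-005", "high", name))
        if is_unannotated_destructive(tool):
            findings.append(("SEC-009", "medium", name))
    return findings
```

Both checks depend only on the definition text and structure, so the same catalog always produces the same findings, which is the property the scanner relies on.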

Expected output

mcptest security insecure.json --fail-on high

  SEC-001  high    description-injection         fetch_notes
           description carries an imperative aimed at the model
           ("ignore all previous instructions and ...")
  SEC-002  high    cross-tool-directive          fetch_notes
           description steers the model to call send_message
  SEC-003  high    exfiltration-directive        read_file
           description points the model at an external sink
  SEC-005  high    hidden-unicode                send_messa<U+200B>ge
           tool name contains an invisible / bidirectional character
  SEC-008  high    secret-in-definition          sync_account
           description appears to contain a credential
  SEC-009  medium  unannotated-destructive-tool  delete_record
           implies a destructive action but declares no destructiveHint
  SEC-035  high    toxic-flow-pairing            read_file + send_message
           an untrusted-content source and an exfil sink coexist

  7 findings: 6 high, 1 medium
  posture coverage map printed above the verdict

  FAIL  6 findings at or above high (--fail-on high)

exit code: 1

The six high findings are at or above the --fail-on high floor, so the run exits 1. The medium SEC-009 finding is reported but does not gate at this floor; lower the floor to --fail-on medium to gate on it too. The posture coverage map prints alongside the findings as a signal report and never changes the verdict.
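The gating rule can be sketched in a few lines of Python. This is an illustration under stated assumptions, not mcptest's code: the three-level severity scale and the exit code 0 for a run with no gating findings are assumptions, and gate is a hypothetical helper. The findings are copied from the expected output.

```python
# Findings as (rule_id, severity) pairs, copied from the expected output.
FINDINGS = [
    ("SEC-001", "high"),
    ("SEC-002", "high"),
    ("SEC-003", "high"),
    ("SEC-005", "high"),
    ("SEC-008", "high"),
    ("SEC-009", "medium"),
    ("SEC-035", "high"),
]

# Assumed severity scale, lowest to highest; the real tool may define more levels.
SEVERITY_ORDER = ["low", "medium", "high"]

def gate(findings, fail_on):
    """Return (exit_code, gating_findings) for a given --fail-on floor."""
    floor = SEVERITY_ORDER.index(fail_on)
    gating = [f for f in findings if SEVERITY_ORDER.index(f[1]) >= floor]
    # Assumption: exit 0 when nothing gates; the source only shows exit 1.
    return (1 if gating else 0), gating
```

With these inputs, gate(FINDINGS, "high") gates on six findings and gate(FINDINGS, "medium") gates on all seven.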

SARIF for code scanning

The findings render as SARIF 2.1.0 so they drop into GitHub or GitLab code scanning the same way other scanner output does. Each rule carries its SEC-NNN ID, a level mapped from severity, and a help URI:

mcptest security insecure.json --format sarif > security.sarif

The --format flag also accepts json for the scorecard and CI, and pretty (the default) for the console summary. SARIF is verdict-only: it carries the deterministic findings, which is what code scanning consumes.
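To sanity-check the SARIF file before uploading it, a short Python sketch like the one below can tally results per rule and level. It relies only on the standard SARIF 2.1.0 layout (runs[].results[] with ruleId and level). The helper name summarize_sarif is hypothetical, and how mcptest maps severities to SARIF levels is not stated here, so the sketch reports whatever level values the file contains.

```python
import json
from collections import Counter

def summarize_sarif(sarif):
    """Count results per (ruleId, level) across all runs in a SARIF log."""
    counts = Counter()
    for run in sarif.get("runs", []):
        for result in run.get("results", []):
            # Fall back to "warning" when a result carries no explicit level.
            key = (result.get("ruleId"), result.get("level", "warning"))
            counts[key] += 1
    return counts

def summarize_file(path="security.sarif"):
    # Hypothetical convenience wrapper around the file written above.
    with open(path, encoding="utf-8") as handle:
        return summarize_sarif(json.load(handle))
```

Running summarize_file() after the command above should show one entry per SEC-NNN rule that fired.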

See also