mcptest docs GitHub

Scenario 14: rate limiting and backoff

You want to know what your test run does when the server pushes back with an HTTP 429. A real MCP server fronted by a gateway, an API quota, or a load shedder will eventually answer "too many requests" with a Retry-After header, and you would rather see that behavior in a controlled run than discover it for the first time in CI at 2am.

The hosted test server makes this easy to exercise. Point a URL target at https://test.mcptest.sh/mcp?scenario=ratelimit and every request comes back as HTTP 429 with a Retry-After: 1 header, before any JSON-RPC is handled. There is no happy path to fall through to; the endpoint exists only to push back. That gives mcptest's transport-level backoff something real to chew on and lets you watch how the failure finally surfaces.

What mcptest actually does here is worth being precise about. The Streamable HTTP transport retries a 429 (and 503 and other 5xx) a small, fixed number of times, honoring Retry-After when the server sends a whole-number-of-seconds value. There is no --retry-style knob for this: the backoff is built into the transport. The per-test --retry flag is a different thing (it re-runs a failing test, for flaky third-party services), and it does not come into play here because the 429 lands during connect, before any test step runs. So this scenario is about observing the built-in backoff and the clean transport error it ends with, not about configuring a retry policy.

The YAML

Save this as tests/ratelimit.yml:

# yaml-language-server: $schema=https://mcptest.sh/schema/v1.json

servers:
  ratelimit:
    url: "https://test.mcptest.sh/mcp?scenario=ratelimit"
    http:
      timeout: 30s
      connect_timeout: 5s

tools:
  - name: "lists tools"
    server: ratelimit
    tool: "tools/list"
    args: {}
    expect:
      - target: "error"
        matcher:
          exact: null
        message: "tools/list should not return an error"

What is happening here:

A note on the Retry-After value: mcptest parses it as a whole number of seconds. The HTTP spec also allows an HTTP-date form of Retry-After; the transport does not parse that form and falls back to its built-in backoff step when the value is not a plain integer. The hosted endpoint sends the integer form (1) so the wait is honored.

Run it

mcptest run --config tests/ratelimit.yml

If you want to watch the retries happen, turn on transport debug logging. The Streamable HTTP transport logs each non-2xx response and each retry decision:

mcptest --log-level "mcptest_core::transport::streamable_http=debug" \
        run --config tests/ratelimit.yml

To probe the endpoint on its own before wiring it into a suite, point the layered network diagnostic at it:

mcptest doctor --url "https://test.mcptest.sh/mcp?scenario=ratelimit"

The [AUTH] / [MCP-INIT] rows will not reach OK here, because the server answers 429 to the initialize probe too. That is the expected result for this scenario, not a misconfiguration on your side.

Expected output

A run against the always-429 endpoint exhausts the transport's retries and then fails the connect, since the initialize handshake never completes:

mcptest run --config tests/ratelimit.yml

  FAIL  ratelimit  connect failed
        initialize handshake failed: transport closed
        last HTTP status: 429 Too Many Requests (Retry-After: 1)

0 passed, 1 failed in 4.3s

exit code: 1

With transport debug logging on, the retries are visible before the final failure:

mcptest --log-level "mcptest_core::transport::streamable_http=debug" run --config tests/ratelimit.yml

WARN mcptest_core::transport::streamable_http: non-2xx HTTP response  status=429 Too Many Requests
DEBUG mcptest_core::transport::streamable_http: post attempt failed   transient=true
WARN mcptest_core::transport::streamable_http: non-2xx HTTP response  status=429 Too Many Requests
DEBUG mcptest_core::transport::streamable_http: post attempt failed   transient=true
WARN mcptest_core::transport::streamable_http: non-2xx HTTP response  status=429 Too Many Requests
DEBUG mcptest_core::transport::streamable_http: post attempt failed   transient=true
...
  FAIL  ratelimit  connect failed

The wall-clock time (about four seconds in the example) reflects the Retry-After: 1 waits stacked across the retry attempts. A server that sent no Retry-After would fail faster, because the default backoff steps are sub-second.

Other transport endpoints on the same host are useful for the same kind of "see how a status surfaces" check:

Troubleshooting

See also