CI integration patterns

This guide shows how to run mcptest in continuous integration. It covers three patterns (stdio, HTTP service container, deployed environment) across three platforms (GitHub Actions, GitLab CI, CircleCI), so nine worked examples in total. Every snippet is copy-pasteable. Adjust the version pins, paths, and secret names for your own repository.

Where a flag or feature is still in flight (for example the full --wait-for-ready readiness-polling behavior), the snippet shows the intended call site and notes the status.

The guide assumes you already have at least one passing test file locally. If mcptest run tests/smoke.yaml works on your laptop, the snippets below take it from there.


How to read this guide (decision tree)

Start at the top. The first answer that fits routes you to the right snippet.

                  ┌───────────────────────────────┐
                  │ How does your MCP server run? │
                  └──────────────┬────────────────┘
                                 │
        ┌────────────────────────┼────────────────────────┐
        │                        │                        │
        ▼                        ▼                        ▼
  Local subprocess          HTTP listener          Already deployed
  (stdio, command,           you can boot           (staging URL,
   one binary)                inside the              auth in env)
        │                     CI job                      │
        │                        │                        │
        ▼                        ▼                        ▼
  Pattern 1: stdio        Pattern 2: HTTP            Pattern 3: deployed
                          service container          environment
        │                        │                        │
        ▼                        ▼                        ▼
  GitHub Actions:         GitHub Actions:           GitHub Actions:
    section 2.1             section 3.1               section 4.1
  GitLab CI:              GitLab CI:                GitLab CI:
    section 2.2             section 3.2               section 4.2
  CircleCI:               CircleCI:                 CircleCI:
    section 2.3             section 3.3               section 4.3

If you want both fast feedback on every commit and one end-to-end run against a real environment, skip to section 5 (combining patterns).

Numbered jump list, in case the ASCII tree above is too cramped:

  1. The server is a local binary you launch with a command. Go to Pattern 1 (stdio) in section 2.
  2. The server is an HTTP service. You will start it as a sidecar inside the CI job. Go to Pattern 2 (HTTP service container) in section 3.
  3. The server is already running somewhere (staging, preview, a VM). Go to Pattern 3 (deployed environment) in section 4.
  4. You want a tight smoke loop on every push and one slow integration run on pull requests or nightly. Go to section 5 for the combined recipe.
  5. You want to make any of the above faster. Go to section 6 (caching).
  6. Something is failing in CI and you cannot reproduce it locally. Go to section 8 (debugging) before guessing.

The 30-second rule: if you cannot find the snippet you need in half a minute, the decision tree is broken. File a docs issue and reference this paragraph.


2. Pattern 1: stdio servers

A stdio server is a binary that speaks MCP over standard input and output. mcptest launches the binary as a child process for the duration of the test run. This is the simplest pattern and usually the fastest, because nothing listens on a port and there is no readiness race.

The test file looks like this:

# yaml-language-server: $schema=https://mcptest.sh/schema/v1.json

servers:
  local:
    command: ["./target/release/my-mcp-server"]

tools:
  - name: "lists tools without error"
    server: local
    tool: "list_directory"
    args:
      path: "/tmp"
    expect:
      - target: "result.content"
        matcher:
          schema:
            type: array
            minItems: 1

The snippets below build the server, then run mcptest. They all:

2.1 GitHub Actions (stdio)

name: mcptest-stdio
on:
  push:
    branches: [main]
  pull_request:

jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - name: Check out code
        uses: actions/checkout@v4

      - name: Install Rust toolchain
        uses: dtolnay/rust-toolchain@stable

      - name: Cache cargo registry and build output
        uses: actions/cache@v4
        with:
          path: |
            ~/.cargo/registry
            ~/.cargo/git
            target
          key: ${{ runner.os }}-cargo-${{ hashFiles('Cargo.lock') }}
          restore-keys: |
            ${{ runner.os }}-cargo-

      - name: Build server in release mode
        run: cargo build --release --bin my-mcp-server

      - name: Install mcptest
        run: curl -fsSL https://download.mcptest.sh/install.sh | MCPTEST_VERSION=1.0.0 sh

      - name: Run mcptest
        run: |
          mcptest run tests/ \
            --wait-for-ready \
            --reporter json --output target/mcptest-run.json \
            --verbose

      - name: Render the JUnit report
        if: always()
        run: mcptest report target/mcptest-run.json --format junit --output target/mcptest-junit.xml

      - name: Upload JUnit report
        if: always()
        uses: actions/upload-artifact@v4
        with:
          name: mcptest-junit
          path: target/mcptest-junit.xml

      - name: Publish test summary
        if: always()
        uses: mikepenz/action-junit-report@v4
        with:
          report_paths: "target/mcptest-junit.xml"

Notes for this snippet:

2.2 GitLab CI (stdio)

default:
  image: rust:1.81

stages:
  - build
  - test

variables:
  CARGO_HOME: "${CI_PROJECT_DIR}/.cargo"
  CARGO_TARGET_DIR: "${CI_PROJECT_DIR}/target"

cache:
  key:
    files:
      - Cargo.lock
  paths:
    - .cargo/registry
    - .cargo/git
    - target

build-server:
  stage: build
  script:
    - cargo build --release --bin my-mcp-server
  artifacts:
    paths:
      - target/release/my-mcp-server
    expire_in: 1 day

mcptest-stdio:
  stage: test
  needs: ["build-server"]
  script:
    - curl -fsSL https://download.mcptest.sh/install.sh | MCPTEST_VERSION=1.0.0 sh
    - export PATH="$HOME/.local/bin:$PATH"
    - mcptest run tests/
        --wait-for-ready
        --reporter json --output target/mcptest-run.json
        --verbose
  # after_script runs even when script fails, so the reports are rendered for
  # red runs too. It starts a fresh shell, so PATH has to be set again.
  after_script:
    - export PATH="$HOME/.local/bin:$PATH"
    - mcptest report target/mcptest-run.json --format junit --output target/mcptest-junit.xml
    - mcptest report target/mcptest-run.json --format gitlab --output target/mcptest-codequality.json
  artifacts:
    when: always
    reports:
      junit: target/mcptest-junit.xml
      codequality: target/mcptest-codequality.json
    paths:
      - target/mcptest-junit.xml
      - target/mcptest-codequality.json
    expire_in: 1 week

Notes:

2.3 CircleCI (stdio)

version: 2.1

orbs:
  rust: circleci/rust@1.6.1

jobs:
  mcptest-stdio:
    docker:
      - image: cimg/rust:1.81
    resource_class: medium
    steps:
      - checkout
      - rust/install
      - restore_cache:
          keys:
            - v1-cargo-{{ checksum "Cargo.lock" }}
            - v1-cargo-
      - run:
          name: Build server
          command: cargo build --release --bin my-mcp-server
      - save_cache:
          key: v1-cargo-{{ checksum "Cargo.lock" }}
          paths:
            - ~/.cargo/registry
            - ~/.cargo/git
            - target
      - run:
          name: Install mcptest
          command: |
            curl -fsSL https://download.mcptest.sh/install.sh | MCPTEST_VERSION=1.0.0 sh
            echo 'export PATH="$HOME/.local/bin:$PATH"' >> $BASH_ENV
      - run:
          name: Run mcptest
          command: |
            mcptest run tests/ \
              --wait-for-ready \
              --reporter json --output target/mcptest-run.json \
              --verbose
      - run:
          name: Render reports
          when: always
          command: |
            mcptest report target/mcptest-run.json --format junit --output target/mcptest-junit.xml
            mcptest report target/mcptest-run.json --format gitlab --output target/mcptest-codequality.json
      - store_test_results:
          path: target/mcptest-junit.xml
      - store_artifacts:
          path: target/mcptest-junit.xml
      - store_artifacts:
          path: target/mcptest-codequality.json

workflows:
  test:
    jobs:
      - mcptest-stdio

Notes:


3. Pattern 2: HTTP service container

When the server runs as an HTTP service, the CI job needs to start it alongside the test step. Every major platform has a service-container feature for this. The pattern is always the same:

  1. Pull (or build) a server image.
  2. Declare it as a service on the job.
  3. Point mcptest at the service hostname.
  4. Use --wait-for-ready so the test waits for /health before the first tool call.
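
Until the full readiness-polling behavior of --wait-for-ready lands, step 4 can be approximated in the job script. The sketch below is one way to do it, assuming the server answers on /health as described above. The helper name wait_for and the retry budget are illustrative and not part of mcptest.

```shell
#!/usr/bin/env bash
# Hypothetical helper, not part of mcptest: retry a probe command until it
# succeeds or the attempt budget runs out.
# usage: wait_for <max_attempts> <sleep_seconds> <command> [args...]
wait_for() {
  local max_attempts="$1" delay="$2" attempt=1
  shift 2
  while ! "$@" >/dev/null 2>&1; do
    if [ "$attempt" -ge "$max_attempts" ]; then
      echo "not ready after ${attempt} attempts: $*" >&2
      return 1
    fi
    attempt=$((attempt + 1))
    sleep "$delay"
  done
}

# In CI the probe would be the health endpoint, for example:
#   wait_for 10 5 curl -fsS http://mcp-server:8080/health
```

The probe is passed as a plain command, so the same loop works for curl against /health or for any other check the server supports.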

The test file references the server by URL:

# yaml-language-server: $schema=https://mcptest.sh/schema/v1.json

servers:
  remote:
    url: "http://mcp-server:8080/mcp"

tools:
  - name: "lists tools without error"
    server: remote
    tool: "list_directory"
    args:
      path: "/tmp"
    expect:
      - target: "result.content"
        matcher:
          schema:
            type: array
            minItems: 1

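The hostname mcp-server is the service name on each platform's network. On GitHub Actions it is the job-level service name, which resolves only when the job itself runs inside a container; a job that runs directly on the runner reaches the service at localhost through the mapped port. On GitLab it is the alias. On CircleCI the containers in a job share one network, so the service is reachable at localhost by default (the secondary image's name also resolves). Where the hostname differs from the one in the test file, set url to "${MCP_SERVER_URL}" and define that variable in the job.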

3.1 GitHub Actions (HTTP service container)

name: mcptest-http
on:
  pull_request:

jobs:
  test:
    runs-on: ubuntu-latest

    services:
      mcp-server:
        image: ghcr.io/example/my-mcp-server:0.7.3
        ports:
          - 8080:8080
        options: >-
          --health-cmd="curl -fsS http://localhost:8080/health || exit 1"
          --health-interval=5s
          --health-timeout=2s
          --health-retries=10

    steps:
      - name: Check out code
        uses: actions/checkout@v4

      - name: Cache mcptest install
        uses: actions/cache@v4
        with:
          path: ~/.local/bin/mcptest
          key: mcptest-${{ runner.os }}-1.0.0

      - name: Install mcptest
        run: |
          if [ ! -x "$HOME/.local/bin/mcptest" ]; then
            curl -fsSL https://download.mcptest.sh/install.sh | MCPTEST_VERSION=1.0.0 sh
          fi
          echo "$HOME/.local/bin" >> $GITHUB_PATH

      - name: Run mcptest
        env:
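          # This job runs directly on the runner, so the service is reached
          # through the mapped port on localhost, not by its service name.
          MCP_SERVER_URL: "http://localhost:8080/mcp"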
        run: |
          mcptest run tests/http/ \
            --wait-for-ready \
            --reporter json --output target/mcptest-run.json \
            --verbose

      - name: Render reports
        if: always()
        run: |
          mcptest report target/mcptest-run.json --format junit --output target/mcptest-junit.xml
          mcptest report target/mcptest-run.json --format gitlab --output target/mcptest-codequality.json

      - name: Upload reports
        if: always()
        uses: actions/upload-artifact@v4
        with:
          name: mcptest-reports
          path: target/mcptest-*

Notes:

3.2 GitLab CI (HTTP service container)

default:
  image: alpine:3.20

stages:
  - test

variables:
  MCPTEST_VERSION: "1.0.0"
  MCP_SERVER_URL: "http://mcp-server:8080/mcp"

mcptest-http:
  stage: test
  services:
    - name: ghcr.io/example/my-mcp-server:0.7.3
      alias: mcp-server
      command: ["serve", "--port", "8080"]
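  before_script:
    - apk add --no-cache curl bash
    # GitLab caches only paths inside the project directory, so keep the
    # binary under .cache/ and skip the download when the cache hits.
    # Assumes the installer writes a standalone binary to $HOME/.local/bin.
    - |
      if [ ! -x .cache/mcptest/mcptest ]; then
        curl -fsSL https://download.mcptest.sh/install.sh | MCPTEST_VERSION="$MCPTEST_VERSION" sh
        mkdir -p .cache/mcptest
        cp "$HOME/.local/bin/mcptest" .cache/mcptest/
      fi
    - export PATH="$CI_PROJECT_DIR/.cache/mcptest:$PATH"
  script:
    - mcptest run tests/http/
        --wait-for-ready
        --reporter json --output mcptest-run.json
        --verbose
  # after_script runs even when script fails, in a fresh shell, so the
  # reports are rendered for red runs and PATH has to be set again.
  after_script:
    - export PATH="$CI_PROJECT_DIR/.cache/mcptest:$PATH"
    - mcptest report mcptest-run.json --format junit --output mcptest-junit.xml
    - mcptest report mcptest-run.json --format gitlab --output mcptest-codequality.json
  artifacts:
    when: always
    reports:
      junit: mcptest-junit.xml
      codequality: mcptest-codequality.json
    paths:
      - mcptest-junit.xml
      - mcptest-codequality.json
    expire_in: 1 week
  cache:
    key: "mcptest-${MCPTEST_VERSION}"
    paths:
      - .cache/mcptest/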

Notes:

3.3 CircleCI (HTTP service container)

version: 2.1

jobs:
  mcptest-http:
    docker:
      - image: cimg/base:stable
      - image: ghcr.io/example/my-mcp-server:0.7.3
        name: mcp-server
        command: ["serve", "--port", "8080"]
    resource_class: medium
    environment:
      MCP_SERVER_URL: "http://localhost:8080/mcp"
    steps:
      - checkout
      - restore_cache:
          keys:
            - v1-mcptest-1.0.0
      - run:
          name: Install mcptest
          command: |
            if [ ! -x "$HOME/.local/bin/mcptest" ]; then
              curl -fsSL https://download.mcptest.sh/install.sh | MCPTEST_VERSION=1.0.0 sh
            fi
            echo 'export PATH="$HOME/.local/bin:$PATH"' >> $BASH_ENV
      - save_cache:
          key: v1-mcptest-1.0.0
          paths:
            - ~/.local/bin/mcptest
      - run:
          name: Run mcptest
          command: |
            mcptest run tests/http/ \
              --wait-for-ready \
              --reporter json --output target/mcptest-run.json \
              --verbose
      - run:
          name: Render reports
          when: always
          command: |
            mcptest report target/mcptest-run.json --format junit --output target/mcptest-junit.xml
            mcptest report target/mcptest-run.json --format gitlab --output target/mcptest-codequality.json
      - store_test_results:
          path: target/mcptest-junit.xml
      - store_artifacts:
          path: target/mcptest-junit.xml
      - store_artifacts:
          path: target/mcptest-codequality.json

workflows:
  test:
    jobs:
      - mcptest-http

Notes:


4. Pattern 3: deployed environment

When the server is already running (staging, a preview environment, a long-lived VM), the CI job does not boot anything. It just authenticates and runs tests against the live URL. The pattern is the same on every platform: read the URL and token from environment variables, pass --wait-for-ready so a deploying server has a moment to become healthy, and store the reports.

The test file looks identical to Pattern 2, except the URL points at the deployed environment and includes an auth header:

# yaml-language-server: $schema=https://mcptest.sh/schema/v1.json

servers:
  staging:
    url: "${MCP_STAGING_URL}"
    headers:
      Authorization: "Bearer ${MCP_STAGING_TOKEN}"

tools:
  - name: "responds to list_directory in staging"
    server: staging
    tool: "list_directory"
    args:
      path: "/tmp"
    expect:
      - target: "result.content"
        matcher:
          schema:
            type: array
            minItems: 1

The two environment variables come from each platform's secrets store. Never embed a token literal in YAML. See section 7 pitfall 2.
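
A job that starts with one of the two variables unset tends to fail later with a confusing URL or auth error. A minimal pre-flight sketch, assuming a POSIX-style shell in the job; require_env is a hypothetical helper, not an mcptest feature, and it prints only variable names, never their values.

```shell
#!/usr/bin/env bash
# Hypothetical pre-flight helper (not an mcptest feature): report every
# required variable that is unset or empty, then signal failure.
require_env() {
  local missing=0 name
  for name in "$@"; do
    # Indirect lookup through eval keeps this portable across POSIX shells.
    if [ -z "$(eval "printf '%s' \"\${$name:-}\"")" ]; then
      echo "missing required environment variable: $name" >&2
      missing=1
    fi
  done
  return "$missing"
}

# In the staging job, before mcptest run:
#   require_env MCP_STAGING_URL MCP_STAGING_TOKEN || exit 1
```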

4.1 GitHub Actions (deployed environment)

name: mcptest-staging
on:
  workflow_dispatch:
  schedule:
    - cron: "0 6 * * *"

jobs:
  test:
    runs-on: ubuntu-latest
    environment: staging
    steps:
      - name: Check out code
        uses: actions/checkout@v4

      - name: Cache mcptest install
        uses: actions/cache@v4
        with:
          path: ~/.local/bin/mcptest
          key: mcptest-${{ runner.os }}-1.0.0

      - name: Install mcptest
        run: |
          if [ ! -x "$HOME/.local/bin/mcptest" ]; then
            curl -fsSL https://download.mcptest.sh/install.sh | MCPTEST_VERSION=1.0.0 sh
          fi
          echo "$HOME/.local/bin" >> $GITHUB_PATH

      - name: Run mcptest against staging
        env:
          MCP_STAGING_URL: ${{ vars.MCP_STAGING_URL }}
          MCP_STAGING_TOKEN: ${{ secrets.MCP_STAGING_TOKEN }}
        run: |
          mcptest run tests/staging/ \
            --wait-for-ready \
            --reporter json --output target/mcptest-run.json \
            --verbose

      - name: Render reports
        if: always()
        run: |
          mcptest report target/mcptest-run.json --format junit --output target/mcptest-junit.xml
          mcptest report target/mcptest-run.json --format gitlab --output target/mcptest-codequality.json

      - name: Upload reports
        if: always()
        uses: actions/upload-artifact@v4
        with:
          name: mcptest-staging-reports
          path: target/mcptest-*

Notes:

4.2 GitLab CI (deployed environment)

default:
  image: alpine:3.20

stages:
  - test

variables:
  MCPTEST_VERSION: "1.0.0"

mcptest-staging:
  stage: test
  rules:
    - if: $CI_PIPELINE_SOURCE == "schedule"
    - if: $CI_PIPELINE_SOURCE == "web"
  environment:
    name: staging
    url: $MCP_STAGING_URL
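  before_script:
    - apk add --no-cache curl bash
    # GitLab caches only paths inside the project directory, so keep the
    # binary under .cache/ and skip the download when the cache hits.
    # Assumes the installer writes a standalone binary to $HOME/.local/bin.
    - |
      if [ ! -x .cache/mcptest/mcptest ]; then
        curl -fsSL https://download.mcptest.sh/install.sh | MCPTEST_VERSION="$MCPTEST_VERSION" sh
        mkdir -p .cache/mcptest
        cp "$HOME/.local/bin/mcptest" .cache/mcptest/
      fi
    - export PATH="$CI_PROJECT_DIR/.cache/mcptest:$PATH"
  script:
    - mcptest run tests/staging/
        --wait-for-ready
        --reporter json --output mcptest-run.json
        --verbose
  # after_script runs even when script fails, in a fresh shell, so the
  # reports are rendered for red runs and PATH has to be set again.
  after_script:
    - export PATH="$CI_PROJECT_DIR/.cache/mcptest:$PATH"
    - mcptest report mcptest-run.json --format junit --output mcptest-junit.xml
    - mcptest report mcptest-run.json --format gitlab --output mcptest-codequality.json
  artifacts:
    when: always
    reports:
      junit: mcptest-junit.xml
      codequality: mcptest-codequality.json
    paths:
      - mcptest-junit.xml
      - mcptest-codequality.json
    expire_in: 1 week
  cache:
    key: "mcptest-${MCPTEST_VERSION}"
    paths:
      - .cache/mcptest/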

Notes:

4.3 CircleCI (deployed environment)

version: 2.1

parameters:
  staging-only:
    type: boolean
    default: false

jobs:
  mcptest-staging:
    docker:
      - image: cimg/base:stable
    resource_class: small
    steps:
      - checkout
      - restore_cache:
          keys:
            - v1-mcptest-1.0.0
      - run:
          name: Install mcptest
          command: |
            if [ ! -x "$HOME/.local/bin/mcptest" ]; then
              curl -fsSL https://download.mcptest.sh/install.sh | MCPTEST_VERSION=1.0.0 sh
            fi
            echo 'export PATH="$HOME/.local/bin:$PATH"' >> $BASH_ENV
      - save_cache:
          key: v1-mcptest-1.0.0
          paths:
            - ~/.local/bin/mcptest
      - run:
          name: Run mcptest against staging
          command: |
            mcptest run tests/staging/ \
              --wait-for-ready \
              --reporter json --output target/mcptest-run.json \
              --verbose
      - run:
          name: Render reports
          when: always
          command: |
            mcptest report target/mcptest-run.json --format junit --output target/mcptest-junit.xml
            mcptest report target/mcptest-run.json --format gitlab --output target/mcptest-codequality.json
      - store_test_results:
          path: target/mcptest-junit.xml
      - store_artifacts:
          path: target/mcptest-junit.xml
      - store_artifacts:
          path: target/mcptest-codequality.json

workflows:
  scheduled:
    when:
      and:
        - equal: [<< pipeline.schedule.name >>, "nightly"]
    jobs:
      - mcptest-staging:
          context: mcptest-staging

Notes:


5. Combining patterns

A common shape is: fast stdio smoke on every push, plus a deployed-env integration run on pull requests or nightly. The smoke run gives a sub-minute red/green signal. The deployed run catches issues that only show up against a real network and real auth.

The example below uses GitHub Actions. The same shape works on the other two platforms with the obvious renames (workflow becomes pipeline, etc.).

name: mcptest

on:
  push:
    branches: [main]
  pull_request:
  schedule:
    - cron: "0 6 * * *"

jobs:
  smoke-stdio:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: dtolnay/rust-toolchain@stable
      - uses: actions/cache@v4
        with:
          path: |
            ~/.cargo/registry
            ~/.cargo/git
            target
          key: ${{ runner.os }}-cargo-${{ hashFiles('Cargo.lock') }}
      - run: cargo build --release --bin my-mcp-server
      - run: curl -fsSL https://download.mcptest.sh/install.sh | MCPTEST_VERSION=1.0.0 sh
      - run: |
          mcptest run tests/smoke/ \
            --wait-for-ready \
            --reporter json --output target/mcptest-smoke-run.json \
            --verbose
      - if: always()
        run: mcptest report target/mcptest-smoke-run.json --format junit --output target/mcptest-smoke-junit.xml
      - if: always()
        uses: actions/upload-artifact@v4
        with:
          name: smoke-junit
          path: target/mcptest-smoke-junit.xml

  integration-staging:
    if: github.event_name == 'pull_request' || github.event_name == 'schedule'
    needs: smoke-stdio
    runs-on: ubuntu-latest
    environment: staging
    steps:
      - uses: actions/checkout@v4
      - run: curl -fsSL https://download.mcptest.sh/install.sh | MCPTEST_VERSION=1.0.0 sh
      - env:
          MCP_STAGING_URL: ${{ vars.MCP_STAGING_URL }}
          MCP_STAGING_TOKEN: ${{ secrets.MCP_STAGING_TOKEN }}
        run: |
          mcptest run tests/integration/ \
            --wait-for-ready \
            --reporter json --output target/mcptest-integration-run.json \
            --verbose
      - if: always()
        run: |
          mcptest report target/mcptest-integration-run.json --format junit --output target/mcptest-integration-junit.xml
          mcptest report target/mcptest-integration-run.json --format gitlab --output target/mcptest-codequality.json
      - if: always()
        uses: actions/upload-artifact@v4
        with:
          name: integration-reports
          path: target/mcptest-*

The split has three benefits:

  1. The smoke job fails fast if the server cannot even start. You do not spend a slot on staging when the binary is broken.
  2. The integration job is gated by needs: smoke-stdio, so staging only sees commits that already passed the local-process tests.
  3. Staging-only flakiness no longer blocks every push, because the gating happens on PRs and on the nightly cron, not on every push to a feature branch.

The smoke and integration test sets should not overlap. Put readiness checks, schema-shape assertions, and tool surface coverage in tests/smoke/. Put authentication, network egress, real data, and slower flows in tests/integration/.


6. Caching strategy

Every platform has at least one cache layer. Picking the right one for each pattern can be the difference between, say, a 90-second CI run and a 6-minute one.

Pattern              | Best cache                                | Key                         | Effect (expected, not yet measured)
---------------------|-------------------------------------------|-----------------------------|------------------------------------
Pattern 1 (stdio)    | Cargo registry + target/                  | Cargo.lock hash             | Skips dependency rebuild on every commit. Expected to save the bulk of CI time on a clean Rust project.
Pattern 2 (HTTP)     | Docker image layer cache + mcptest binary | image tag + mcptest version | Skips image pull, skips re-downloading the mcptest release.
Pattern 3 (deployed) | mcptest binary only                       | mcptest version             | Skips the install step. The job is otherwise network-bound.

The effects above are deliberately labeled "expected." Measure on your own project before claiming a specific speedup in a release note.

What to put in the cache key

What not to put in the cache key

Cross-job restore

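GitHub Actions tries the exact key first, then each restore-keys prefix in the order listed, and takes the most recently created cache that matches. GitLab CI restores only the exact key unless you configure cache:fallback_keys. CircleCI tries the keys in the order listed, treats each as a prefix, and stops at the first hit. Order your restore_cache keys from most specific to most general so a hit on the exact Cargo.lock hash wins over a hit on the prefix.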


7. Common pitfalls

Six failures show up over and over. Each one has a one-line symptom, a one-line cause, and a fix.

7.1 Missing --wait-for-ready against an HTTP target

Symptom: the first tool call returns connection refused or 404 on the first run after the server image changed, then passes on a retry.

Cause: the service container is listed as a job-level service, but the healthcheck either is not configured or only checks the TCP port, not the MCP initialize handshake. The test runs before the server is fully ready.

Fix: pass --wait-for-ready to every mcptest run that targets an HTTP server. The flag polls the configured readiness probe and gates the first tool call. For platform-level healthchecks, also configure them on the service block so the runner does not even start the step until the container reports healthy.

7.2 Secrets in YAML instead of env vars

Symptom: a test file like

servers:
  staging:
    url: "https://staging.example.com/mcp"
    headers:
      Authorization: "Bearer sk_live_abcd1234..."

ends up committed to a public repo, GitHub flags a secret scan alert, and the on-call gets paged.

Cause: tokens were pasted into the YAML instead of interpolated from environment variables.

Fix: always use ${VAR} interpolation and store the token in the platform's secret store (GitHub Secrets, GitLab CI/CD variables, CircleCI Contexts). Rotate any token that has ever appeared in a tracked file. The schema accepts plain strings in the Authorization header for local convenience, but the linter prints a warning when it sees one that looks like a real token. Treat the warning as an error in CI.
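
As a belt-and-braces check alongside the linter warning, a CI step can grep the test files for Authorization values that are not ${VAR} interpolations. This is a rough sketch: the function name and the regular expression are assumptions, and the pattern only covers the Bearer form shown above.

```shell
#!/usr/bin/env bash
# Hypothetical guard, separate from the mcptest linter: print every YAML line
# whose Authorization header carries a literal Bearer value instead of a
# ${VAR} interpolation. Exit status is 0 when at least one such line is found.
find_literal_tokens() {
  grep -rnE --include='*.yaml' --include='*.yml' \
    'Authorization:[[:space:]]*"?Bearer[[:space:]]+[^$"[:space:]]' "$1"
}

# In a CI step, fail the job when anything is reported:
#   if find_literal_tokens tests/; then
#     echo "literal token found in test files" >&2
#     exit 1
#   fi
```

The check is deliberately narrow. It does not replace the platform's secret scanning, and a token that has already been committed still needs to be rotated.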

7.3 Transport mismatch (cassette recorded against stdio, replayed against URL)

Symptom: cassette replay fails with no matching interaction or method mismatch for requests that look almost identical to the recorded ones.

Cause: a cassette captures the wire-level traffic including the transport. A cassette recorded against a stdio server contains JSON-RPC frames over the stdio framing convention. A cassette recorded against an HTTP server contains HTTP requests with headers. The two are not interchangeable.

Fix: record one cassette per transport, name them as such (fixtures/list_dir.stdio.cassette.json, fixtures/list_dir.http.cassette.json), and reference the matching one in each test file. If your CI runs both patterns, replay against the cassette that matches the transport of the run. The cassette format records the transport in its header so mcptest can refuse to replay across transports.
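
If the naming convention above is adopted, a small lint step can catch cassettes that were saved without a transport suffix. A sketch, assuming the cassettes live under one directory and use the .stdio. and .http. suffixes from the example file names; untagged_cassettes is a hypothetical helper.

```shell
#!/usr/bin/env bash
# Hypothetical lint: list cassette files whose name does not say which
# transport they were recorded against.
untagged_cassettes() {
  find "$1" -type f -name '*.cassette.json' \
    ! -name '*.stdio.cassette.json' \
    ! -name '*.http.cassette.json'
}

# In a CI step, fail when the list is not empty:
#   [ -z "$(untagged_cassettes fixtures)" ] || exit 1
```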

7.4 Exit-code interpretation

Symptom: a CI step shows green even though tests failed, or red even though all tests passed.

Cause: the shell wrapper around mcptest run swallowed the exit code. Common offenders are bash -c "mcptest run ... | tee log.txt" (uses the exit code of tee, not mcptest) and set +e left over from debugging.

Fix: invoke mcptest directly as the last command of the step, with no pipe. If you must tee, run set -o pipefail first, or rely on the platform's log capture (none of the examples above pipe mcptest run into anything). The exit codes mcptest returns are documented in the troubleshooting guide. Treat anything non-zero as a failure unless you have a specific reason not to.
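
The effect of pipefail is easy to check in isolation. The sketch below uses a stand-in function instead of a real mcptest run (fake_failing_run is hypothetical, and its exit code of 1 is only for the demonstration). It requires bash.

```shell
#!/usr/bin/env bash
# Stand-in for a failing `mcptest run`. The real exit codes are documented in
# the troubleshooting guide; 1 is used here only for the demonstration.
fake_failing_run() { echo "1 test failed"; return 1; }

# Without pipefail the pipeline reports the status of tee, so the failure
# is lost and the CI step would show green.
set +o pipefail
status_without_pipefail=0
fake_failing_run | tee /dev/null >/dev/null || status_without_pipefail=$?

# With pipefail the pipeline reports the failing command's status.
set -o pipefail
status_with_pipefail=0
fake_failing_run | tee /dev/null >/dev/null || status_with_pipefail=$?

echo "without pipefail: $status_without_pipefail"   # prints 0
echo "with pipefail: $status_with_pipefail"         # prints 1
```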

7.5 Cache key not invalidating when tests change

Symptom: a test file change does not change the result, because CI is running an older test set out of the cache.

Cause: the cache key covers source dependencies (e.g., Cargo.lock) but not the test directory. The binary is rebuilt, but the test fixtures are restored from cache and overwrite the new ones.

Fix: do not put test fixtures inside the cached path. Cache target/, the Cargo registry, and the mcptest binary, but not tests/ or examples/. If you must cache derived test artifacts (snapshots, golden files), key the cache on the hash of the source that produced them, for example hashFiles('tests/**').
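
hashFiles is specific to GitHub Actions. On a platform without it, one option is to compute a digest of the directory in a step and key the cache on that. A sketch, assuming GNU findutils and coreutils (sort -z, sha256sum) on the runner; tree_digest is a hypothetical helper.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print one SHA-256 digest that changes whenever any file
# under the given directory is added, removed, renamed, or edited.
tree_digest() {
  (
    cd "$1" || exit 1
    # Sort the file list so the digest does not depend on directory order.
    find . -type f -print0 | LC_ALL=C sort -z | xargs -0 sha256sum | sha256sum | cut -d' ' -f1
  )
}

# Example: write the digest to a file that the cache key can reference.
#   tree_digest tests > tests.digest
```

On CircleCI, for instance, the resulting file can be referenced in the cache key as {{ checksum "tests.digest" }}.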

7.6 Different runner OS surfacing different test results

Symptom: the test suite is green on Ubuntu, red on macOS. The failure is a path comparison or a line-ending mismatch.

Cause: Linux filesystems such as ext4 are case-sensitive, while the default macOS filesystem (APFS) is case-insensitive. Linux and macOS end lines with \n, Windows ends them with \r\n. A matcher that asserts exact equality on a path or on a stdout buffer will disagree across runners.

Fix: avoid asserting exact equality on values that differ by platform. For a path or a multi-line string, use regex (anchor only the parts you care about, and write \r?\n where a line ending appears) or contains instead of exact. For environment-dependent values (temp dir, hostname), interpolate the actual environment with ${VAR} from the test runtime rather than hard-coding a literal.


8. Debugging failing CI runs

The first step is always the same: read the JUnit output. Every snippet above writes it to a known path and uploads it as an artifact. Open the file locally, find the failing case, copy the symptom, and search the troubleshooting guide.

If that does not resolve it, the steps below escalate in order.

8.1 Re-run with --verbose

Every snippet above already passes --verbose. Pull the job log and search for level=DEBUG. The verbose output includes:

If --verbose is not enough, add --debug to a one-off CI run. --debug prints raw wire bytes (with secrets redacted) and is far too noisy for default CI but is the right setting for a forensic run.

8.2 Pull the report artifacts

Most snippets above write the reports they ask for to target/ (the GitLab HTTP and staging jobs write the same files to the project root):

target/mcptest-run.json          # --reporter json --output ...
target/mcptest-junit.xml         # mcptest report ... --format junit
target/mcptest-codequality.json  # mcptest report ... --format gitlab

The JSON run file is the source of truth: it carries the full run envelope, so you can re-render any reporter format from it after the fact without re-running the suite. Snippets that upload the target/mcptest-* glob put all three files one click away. The others upload only the rendered reports, so add the JSON run file to the artifact list if you want it available after the job.

For a forensic run, add --debug to the failing job. --debug prints the resolved config and raw wire bytes (with secrets redacted) to the job log. Capture the JSON run file from a passing run and from the failing run, then diff them. The first divergence usually points at the bug.

8.3 Reproduce locally with the same env vars

The reason CI fails and your laptop passes is almost always the environment. Reproduce by copying the env block out of the workflow:

export MCP_STAGING_URL="https://staging.example.com/mcp"
export MCP_STAGING_TOKEN="$(pass show mcptest/staging)"
export CI=true
export RUST_LOG=mcptest=debug
mcptest run tests/staging/ --wait-for-ready --verbose

Three rules for this loop:

  1. Match the runner OS. If CI runs on Ubuntu and you run on macOS, use a container: docker run --rm -it -v "$PWD:/app" -w /app rust:1.81 bash.
  2. Match the mcptest version. If CI is pinned to 1.0.0, install 1.0.0 locally with curl -fsSL https://download.mcptest.sh/install.sh | MCPTEST_VERSION=1.0.0 sh, not whatever Homebrew has.
  3. Match the test file. Do not run tests/; run the exact subdirectory the failing job runs.

If the local run passes with the same versions, the same OS, and the same env, the next suspect is the network. Run with --debug (or RUST_LOG=mcptest=trace) to log every connection attempt and the raw wire bytes with secrets redacted.

8.4 When to file a bug

Open an issue against the mcptest project when:

For everything else, the troubleshooting guide entry plus the verbose log is usually enough.


Appendix: snippet index

If you got here from another page, this is the shortest path to each worked example.

Open follow-up items, as of this writing:


9. Jenkins

Jenkins is common in enterprise shops, where a Jenkinsfile already lives next to the repo and the build server is on-prem. The patterns mirror the platforms above: stdio subprocess, HTTP service container, deployed environment. Each pattern fits into both the declarative and the scripted pipeline syntax.

The Jenkinsfile snippets assume:

9.1 Declarative Jenkinsfile (stdio)

pipeline {
  agent {
    docker {
      image 'rust:1.81'
      args '-v $HOME/.cargo:/root/.cargo'
    }
  }

  environment {
    MCPTEST_VERSION = '1.0.0'
    PATH = "$HOME/.local/bin:$PATH"
  }

  stages {
    stage('Build server') {
      steps {
        sh 'cargo build --release --bin my-mcp-server'
      }
    }

    stage('Install mcptest') {
      steps {
        sh 'curl -fsSL https://download.mcptest.sh/install.sh | MCPTEST_VERSION=$MCPTEST_VERSION sh'
      }
    }

    stage('Run mcptest') {
      steps {
        sh '''
          mcptest run tests/ \
            --wait-for-ready \
            --reporter json --output target/mcptest-run.json \
            --verbose
          mcptest report target/mcptest-run.json --format junit --output target/mcptest-junit.xml
          mcptest report target/mcptest-run.json --format gitlab --output target/mcptest-codequality.json
        '''
      }
    }
  }

  post {
    always {
      junit testResults: 'target/mcptest-junit.xml', allowEmptyResults: false
      archiveArtifacts artifacts: 'target/mcptest-*', allowEmptyArchive: true
    }
  }
}

Notes:

9.2 Declarative Jenkinsfile (HTTP localhost)

pipeline {
  agent {
    docker {
      image 'docker:24'
      args '--privileged -v /var/run/docker.sock:/var/run/docker.sock'
    }
  }

  environment {
    MCPTEST_VERSION = '1.0.0'
    MCP_SERVER_URL = 'http://mcp-server:8080/mcp'
  }

  stages {
    stage('Boot server') {
      steps {
        sh '''
          docker network create mcptest-net || true
          docker run -d --rm \
            --network mcptest-net \
            --name mcp-server \
            -p 8080:8080 \
            ghcr.io/example/my-mcp-server:0.7.3
        '''
      }
    }

    stage('Run mcptest') {
      steps {
        sh '''
          docker run --rm \
            --network mcptest-net \
            -e MCP_SERVER_URL=$MCP_SERVER_URL \
            -v $WORKSPACE:/workspace \
            -w /workspace \
            --entrypoint sh \
            soapbucket/mcptest:$MCPTEST_VERSION -c '
              mcptest run tests/http/ \
                --wait-for-ready \
                --reporter json --output target/mcptest-run.json \
                --verbose
              mcptest report target/mcptest-run.json --format junit --output target/mcptest-junit.xml
              mcptest report target/mcptest-run.json --format gitlab --output target/mcptest-codequality.json
            '
        '''
      }
    }
  }

  post {
    always {
      sh 'docker stop mcp-server || true'
      sh 'docker network rm mcptest-net || true'
      junit testResults: 'target/mcptest-junit.xml', allowEmptyResults: false
      archiveArtifacts artifacts: 'target/mcptest-*', allowEmptyArchive: true
    }
  }
}
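The Boot server stage starts the container detached, so nothing in the pipeline itself waits for the listener to come up. Until the full --wait-for-ready polling behavior lands, a generic retry loop is one way to bridge the gap. This is a minimal sketch: wait_until is a hypothetical helper, and the probe command is whatever check suits your server.

```shell
# wait_until ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
# Re-runs COMMAND until it succeeds or ATTEMPTS is exhausted.
# Hypothetical helper; not part of mcptest.
wait_until() {
  attempts=$1
  delay=$2
  shift 2
  i=1
  while [ "$i" -le "$attempts" ]; do
    if "$@"; then
      return 0
    fi
    # Sleep between attempts, but not after the last one.
    if [ "$i" -lt "$attempts" ]; then
      sleep "$delay"
    fi
    i=$((i + 1))
  done
  return 1
}
```

For example, wait_until 10 5 curl -fsS "$HEALTH_URL" retries for up to about 45 seconds, where HEALTH_URL is an assumed variable pointing at a health endpoint your server exposes.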

Notes:

9.3 Declarative Jenkinsfile (deployed URL)

pipeline {
  agent any

  environment {
    MCPTEST_VERSION = '1.0.0'
    MCP_STAGING_URL = credentials('mcp-staging-url')
    MCP_STAGING_TOKEN = credentials('mcp-staging-token')
  }

  triggers {
    cron('H 6 * * *')
  }

  stages {
    stage('Install mcptest') {
      steps {
        sh 'curl -fsSL https://download.mcptest.sh/install.sh | MCPTEST_VERSION=$MCPTEST_VERSION sh'
      }
    }

    stage('Run mcptest against staging') {
      steps {
        sh '''
          mcptest run tests/staging/ \
            --wait-for-ready \
            --reporter json --output target/mcptest-run.json \
            --verbose
          mcptest report target/mcptest-run.json --format junit --output target/mcptest-junit.xml
          mcptest report target/mcptest-run.json --format gitlab --output target/mcptest-codequality.json
        '''
      }
    }
  }

  post {
    always {
      junit testResults: 'target/mcptest-junit.xml', allowEmptyResults: false
      archiveArtifacts artifacts: 'target/mcptest-*', allowEmptyArchive: true
    }
  }
}

Notes:

9.4 Scripted Jenkinsfile

For legacy Jenkins installations that still use scripted pipelines, the same stdio pattern looks like this:

node('docker') {
  def mcptestVersion = '1.0.0'

  docker.image('rust:1.81').inside {
    stage('Checkout') {
      checkout scm
    }

    stage('Build server') {
      sh 'cargo build --release --bin my-mcp-server'
    }

    stage('Install mcptest') {
      sh "curl -fsSL https://download.mcptest.sh/install.sh | MCPTEST_VERSION=${mcptestVersion} sh"
    }

    stage('Run mcptest') {
      try {
        sh """
          mcptest run tests/ \\
            --wait-for-ready \\
            --reporter json --output target/mcptest-run.json \\
            --verbose
          mcptest report target/mcptest-run.json --format junit --output target/mcptest-junit.xml
          mcptest report target/mcptest-run.json --format gitlab --output target/mcptest-codequality.json
        """
      } finally {
        junit testResults: 'target/mcptest-junit.xml', allowEmptyResults: false
        archiveArtifacts artifacts: 'target/mcptest-*', allowEmptyArchive: true
      }
    }
  }
}

The try { ... } finally { ... } block is the scripted equivalent of post { always }. It guarantees the JUnit publisher runs even when the test step fails.
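The same concern applies inside the shell step. Jenkins runs sh steps with -xe by default, so a failing mcptest run stops the script before the mcptest report lines execute, and the JUnit publisher then finds no file. One way around that is to capture the run's exit status, render the reports anyway, and return the original status. The sketch below assumes the JSON run file is still written when tests fail; run_then_report is a hypothetical helper, not an mcptest command.

```shell
# run_then_report 'RUN_COMMAND' 'REPORT_COMMAND'
# Runs both commands in order and returns the exit status of RUN_COMMAND.
# Hypothetical helper; not part of mcptest.
run_then_report() {
  status=0
  sh -c "$1" || status=$?
  # Render the report even when the run failed. A report error is logged
  # but must not mask the run's own status.
  sh -c "$2" || echo "report step failed" >&2
  return "$status"
}
```

A call might pass the mcptest run ... line from the snippet above as the first argument and the mcptest report ... --format junit line as the second.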

9.5 SARIF via Warnings Next Generation

Surface findings in Jenkins through the Warnings Next Generation plugin. Render SARIF from the JSON run file, then add a post step:

stage('Render SARIF') {
  steps {
    sh 'mcptest report target/mcptest-run.json --format sarif --output target/mcptest.sarif'
  }
}

// ... in the post block:
post {
  always {
    junit testResults: 'target/mcptest-junit.xml', allowEmptyResults: false
    recordIssues(
      enabledForFailure: true,
      tools: [
        sarif(pattern: 'target/mcptest.sarif')
      ]
    )
  }
}

The plugin's UI groups findings by rule ID and surfaces them on the build page. Quality-gate rules (fail the build if more than N high-severity findings appear) live in the plugin's configuration, not in the Jenkinsfile.

9.6 Shared library: mcptestStage()

Larger Jenkins shops with many repos converge on a shared library that exposes reusable steps. Once examples/ci-templates/ exists, we will ship a vars/mcptestStage.groovy there. The intended call site:

@Library('soapbucket-shared') _

pipeline {
  agent any
  stages {
    stage('Build') { steps { sh 'cargo build --release --bin my-mcp-server' } }
    stage('mcptest') {
      steps {
        mcptestStage(
          version: '1.0.0',
          testDir: 'tests/',
          formats: ['junit', 'gitlab']
        )
      }
    }
  }
}

The helper is meant to expand to the install and run stages from section 9.1, plus the post-step publishing, with the inputs parameterized. It hides the curl install and the post-step plumbing so each consuming pipeline reads as a single call.

The shared library source is not yet published; the snippet above is the intended call site for documentation purposes.


10. Buildkite

Buildkite pipelines are YAML files at .buildkite/pipeline.yml. The agent queue routes each step to a matching agent pool, so the same pipeline can run a Rust build on a build agent and a deployed-environment test on a network-egress agent.

10.1 Buildkite (stdio)

steps:
  - label: ":rust: Build server"
    key: build
    agents:
      queue: builders
    plugins:
      - docker#v5.10.0:
          image: rust:1.81
          mount-checkout: true
          environment:
            - CARGO_HOME=/workdir/.cargo
    commands:
      - cargo build --release --bin my-mcp-server
    artifact_paths:
      - "target/release/my-mcp-server"

  - label: ":test_tube: mcptest stdio"
    key: mcptest
    depends_on: build
    agents:
      queue: builders
    plugins:
      - artifacts#v1.9.4:
          download: "target/release/my-mcp-server"
      - docker#v5.10.0:
          image: rust:1.81
          mount-checkout: true
    commands:
      - chmod +x target/release/my-mcp-server
      - curl -fsSL https://download.mcptest.sh/install.sh | MCPTEST_VERSION=1.0.0 sh
      - export PATH="$HOME/.local/bin:$PATH"
      - |
        mcptest run tests/ \
          --wait-for-ready \
          --reporter json --output target/mcptest-run.json \
          --verbose
      - mcptest report target/mcptest-run.json --format junit --output target/mcptest-junit.xml
      - mcptest report target/mcptest-run.json --format gitlab --output target/mcptest-codequality.json
    artifact_paths:
      - "target/mcptest-*"

Notes:

10.2 Buildkite (HTTP localhost via docker-compose)

steps:
  - label: ":test_tube: mcptest http"
    agents:
      queue: builders
    plugins:
      - docker-compose#v5.10.0:
          run: mcptest
          config: .buildkite/docker-compose.yml
    artifact_paths:
      - "target/mcptest-*"

With .buildkite/docker-compose.yml:

services:
  mcp-server:
    image: ghcr.io/example/my-mcp-server:0.7.3
    ports:
      - "8080:8080"
    healthcheck:
      test: ["CMD", "curl", "-fsS", "http://localhost:8080/health"]
      interval: 5s
      timeout: 2s
      retries: 10

  mcptest:
    image: soapbucket/mcptest:1.0.0
    depends_on:
      mcp-server:
        condition: service_healthy
    environment:
      MCP_SERVER_URL: http://mcp-server:8080/mcp
    volumes:
      - .:/workspace
    working_dir: /workspace
    entrypoint: ["sh", "-c"]
    command:
      - |
        mcptest run tests/http/ \
          --wait-for-ready \
          --reporter json --output target/mcptest-run.json \
          --verbose
        mcptest report target/mcptest-run.json --format junit --output target/mcptest-junit.xml
        mcptest report target/mcptest-run.json --format gitlab --output target/mcptest-codequality.json

Notes:

10.3 Buildkite (deployed URL)

steps:
  - label: ":test_tube: mcptest staging"
    if: build.source == "schedule" || build.message =~ /\[staging\]/
    agents:
      queue: egress
    plugins:
      - docker#v5.10.0:
          image: soapbucket/mcptest:1.0.0
          entrypoint: sh
          environment:
            - MCP_STAGING_URL
            - MCP_STAGING_TOKEN
          mount-checkout: true
    commands:
      - |
        mcptest run tests/staging/ \
          --wait-for-ready \
          --reporter json --output target/mcptest-run.json \
          --verbose
      - mcptest report target/mcptest-run.json --format junit --output target/mcptest-junit.xml
      - mcptest report target/mcptest-run.json --format gitlab --output target/mcptest-codequality.json
    retry:
      automatic:
        - exit_status: -1
          limit: 2
    artifact_paths:
      - "target/mcptest-*"

Notes:

10.4 Annotate the build with JUnit summary

Buildkite's annotation API attaches Markdown to the build page. To surface mcptest results inline (without paying for Test Analytics):

  - label: ":memo: Annotate mcptest results"
    depends_on: mcptest
    allow_dependency_failure: true
    agents:
      queue: builders
    commands:
      - buildkite-agent artifact download "target/mcptest-junit.xml" .
      - |
        if grep -q 'failures="0"' target/mcptest-junit.xml; then
          buildkite-agent annotate --style success "mcptest passed."
        else
          FAIL_COUNT=$(grep -oE 'failures="[0-9]+"' target/mcptest-junit.xml | head -n1 | grep -oE '[0-9]+')
          buildkite-agent annotate --style error "mcptest failed ($FAIL_COUNT test(s)). See artifacts."
        fi

allow_dependency_failure: true runs the annotation step even when mcptest exited non-zero, so failed builds still get the inline summary.
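The grep in the step above looks only at the failures attribute. JUnit files commonly carry an errors attribute as well, and a run with failures="0" but errors="2" would be annotated as passing. A minimal sketch that adds the two counts from the first element that has them, assuming the report follows the usual JUnit attribute names (the function names are hypothetical):

```shell
# junit_attr NAME FILE : print the first NAME="N" value in FILE, or 0 if absent.
junit_attr() {
  value=$(grep -oE "$1=\"[0-9]+\"" "$2" | head -n1 | grep -oE '[0-9]+' || true)
  echo "${value:-0}"
}

# junit_problem_count FILE : failures plus errors from the first element.
junit_problem_count() {
  f=$(junit_attr failures "$1")
  e=$(junit_attr errors "$1")
  echo $((f + e))
}
```

The annotation step could then test [ "$(junit_problem_count target/mcptest-junit.xml)" -eq 0 ] instead of grepping for failures="0". If the file has no errors attribute, the count falls back to failures alone.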

10.5 Agent queue routing

Three queues are typically enough:

Queue      Used for
builders   Compile-heavy work (cargo build, npm install).
egress     Steps that need outbound network to staging or prod URLs.
mcptest    Steps that need the mcptest binary preinstalled.

The mcptest queue is optional; the snippets above install mcptest into the step on demand. A dedicated queue saves the install step on every run at the cost of an extra agent pool to maintain. For low-volume pipelines the install-on-demand pattern is simpler.


11. Azure DevOps

Azure DevOps pipelines live at azure-pipelines.yml. The platform's test-results UI consumes JUnit through PublishTestResults@2 and the SARIF surface through PublishCodeAnalysisResults@1.

11.1 Azure DevOps (stdio)

trigger:
  branches:
    include: [main]

pool:
  vmImage: ubuntu-latest

variables:
  MCPTEST_VERSION: 1.0.0
  RUST_VERSION: 1.81

steps:
  - checkout: self

  - task: Cache@2
    inputs:
      key: 'cargo | "$(Agent.OS)" | Cargo.lock'
      path: |
        $(HOME)/.cargo
        target
      restoreKeys: |
        cargo | "$(Agent.OS)"

  - script: |
      curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh -s -- -y --default-toolchain $(RUST_VERSION)
      echo "##vso[task.prependpath]$HOME/.cargo/bin"
    displayName: Install Rust toolchain

  - script: cargo build --release --bin my-mcp-server
    displayName: Build server

  - script: |
      curl -fsSL https://download.mcptest.sh/install.sh | MCPTEST_VERSION=$(MCPTEST_VERSION) sh
      echo "##vso[task.prependpath]$HOME/.local/bin"
    displayName: Install mcptest

  - script: |
      mcptest run tests/ \
        --wait-for-ready \
        --reporter json --output $(Build.ArtifactStagingDirectory)/mcptest-run.json \
        --verbose
      mcptest report $(Build.ArtifactStagingDirectory)/mcptest-run.json --format junit --output $(Build.ArtifactStagingDirectory)/mcptest-junit.xml
      mcptest report $(Build.ArtifactStagingDirectory)/mcptest-run.json --format gitlab --output $(Build.ArtifactStagingDirectory)/mcptest-codequality.json
    displayName: Run mcptest

  - task: PublishTestResults@2
    condition: succeededOrFailed()
    inputs:
      testRunner: JUnit
      testResultsFiles: "$(Build.ArtifactStagingDirectory)/mcptest-junit.xml"
      testRunTitle: mcptest stdio
      failTaskOnFailedTests: true

  - task: PublishBuildArtifacts@1
    condition: succeededOrFailed()
    inputs:
      pathToPublish: $(Build.ArtifactStagingDirectory)
      artifactName: mcptest-reports

Notes:

11.2 Azure DevOps (HTTP localhost via container resource)

resources:
  containers:
    - container: mcp-server
      image: ghcr.io/example/my-mcp-server:0.7.3
      ports:
        - 8080:8080

services:
  mcp-server: mcp-server

variables:
  MCPTEST_VERSION: 1.0.0
  MCP_SERVER_URL: http://localhost:8080/mcp   # steps run on the host agent, so the service is reached through the mapped port

pool:
  vmImage: ubuntu-latest

steps:
  - checkout: self

  - script: |
      curl -fsSL https://download.mcptest.sh/install.sh | MCPTEST_VERSION=$(MCPTEST_VERSION) sh
      echo "##vso[task.prependpath]$HOME/.local/bin"
    displayName: Install mcptest

  - script: |
      mcptest run tests/http/ \
        --wait-for-ready \
        --reporter json --output $(Build.ArtifactStagingDirectory)/mcptest-run.json \
        --verbose
      mcptest report $(Build.ArtifactStagingDirectory)/mcptest-run.json --format junit --output $(Build.ArtifactStagingDirectory)/mcptest-junit.xml
      mcptest report $(Build.ArtifactStagingDirectory)/mcptest-run.json --format gitlab --output $(Build.ArtifactStagingDirectory)/mcptest-codequality.json
    displayName: Run mcptest

  - task: PublishTestResults@2
    condition: succeededOrFailed()
    inputs:
      testRunner: JUnit
      testResultsFiles: "$(Build.ArtifactStagingDirectory)/mcptest-junit.xml"
      testRunTitle: mcptest http
      failTaskOnFailedTests: true

Notes:

11.3 Azure DevOps (deployed URL with service connection)

schedules:
  - cron: "0 6 * * *"
    displayName: Nightly staging tests
    branches:
      include: [main]
    always: true

pool:
  vmImage: ubuntu-latest

variables:
  - name: MCPTEST_VERSION
    value: 1.0.0
  - group: mcptest-staging   # Variable Group with MCP_STAGING_URL, MCP_STAGING_TOKEN

steps:
  - checkout: self

  - script: |
      curl -fsSL https://download.mcptest.sh/install.sh | MCPTEST_VERSION=$(MCPTEST_VERSION) sh
      echo "##vso[task.prependpath]$HOME/.local/bin"
    displayName: Install mcptest

  - script: |
      mcptest run tests/staging/ \
        --wait-for-ready \
        --reporter json --output $(Build.ArtifactStagingDirectory)/mcptest-run.json \
        --verbose
      mcptest report $(Build.ArtifactStagingDirectory)/mcptest-run.json --format junit --output $(Build.ArtifactStagingDirectory)/mcptest-junit.xml
      mcptest report $(Build.ArtifactStagingDirectory)/mcptest-run.json --format gitlab --output $(Build.ArtifactStagingDirectory)/mcptest-codequality.json
    displayName: Run mcptest against staging
    env:
      MCP_STAGING_URL: $(MCP_STAGING_URL)
      MCP_STAGING_TOKEN: $(MCP_STAGING_TOKEN)

  - task: PublishTestResults@2
    condition: succeededOrFailed()
    inputs:
      testRunner: JUnit
      testResultsFiles: "$(Build.ArtifactStagingDirectory)/mcptest-junit.xml"
      testRunTitle: mcptest staging
      failTaskOnFailedTests: true

Notes:

11.4 SARIF via PublishCodeAnalysisResults

Render SARIF from the JSON run file, then publish it:

  - script: |
      mcptest run tests/ \
        --reporter json --output $(Build.ArtifactStagingDirectory)/mcptest-run.json \
        --verbose
      mcptest report $(Build.ArtifactStagingDirectory)/mcptest-run.json --format sarif --output $(Build.ArtifactStagingDirectory)/mcptest.sarif
      mcptest report $(Build.ArtifactStagingDirectory)/mcptest-run.json --format junit --output $(Build.ArtifactStagingDirectory)/mcptest-junit.xml
    displayName: Run mcptest

  - task: PublishCodeAnalysisResults@1
    condition: succeededOrFailed()
    inputs:
      codeAnalysisResultsFiles: "$(Build.ArtifactStagingDirectory)/mcptest.sarif"
      codeAnalysisResultsType: SARIF

11.5 YAML templates for reuse

For orgs with many repositories, factor the mcptest steps into a YAML template. Add an mcptest.yml template to the org's shared-templates repo:

# templates/mcptest.yml
parameters:
  - name: testDir
    type: string
    default: tests/
  - name: mcptestVersion
    type: string
    default: 1.0.0
  - name: formats
    type: object
    default:
      - junit
      - gitlab

steps:
  - script: |
      curl -fsSL https://download.mcptest.sh/install.sh | MCPTEST_VERSION=${{ parameters.mcptestVersion }} sh
      echo "##vso[task.prependpath]$HOME/.local/bin"
    displayName: Install mcptest

  - script: |
      mcptest run ${{ parameters.testDir }} \
        --wait-for-ready \
        --reporter json --output $(Build.ArtifactStagingDirectory)/mcptest-run.json \
        --verbose
    displayName: Run mcptest

  # Template expressions cannot loop inside a script body, so emit one report step per format.
  - ${{ each f in parameters.formats }}:
    - script: mcptest report $(Build.ArtifactStagingDirectory)/mcptest-run.json --format ${{ f }} --output $(Build.ArtifactStagingDirectory)/mcptest-${{ f }}-report
      displayName: Render ${{ f }} report
      condition: succeededOrFailed()

  - task: PublishTestResults@2
    condition: succeededOrFailed()
    inputs:
      testRunner: JUnit
      testResultsFiles: "$(Build.ArtifactStagingDirectory)/mcptest-junit-report"
      failTaskOnFailedTests: true

Consumed from a downstream pipeline:

resources:
  repositories:
    - repository: templates
      type: git
      name: shared/templates
      ref: refs/tags/v1.0.0

steps:
  - template: mcptest.yml@templates
    parameters:
      testDir: tests/integration/
      mcptestVersion: 1.0.0

Pin the template repo to a tag, not to main. A template change silently rolls out to every consuming pipeline if the consumer points at a branch.


12. Self-hosted and air-gapped environments

Enterprise installations often run CI on isolated networks with no outbound HTTP. Every artifact (Docker image, mcptest binary, SARIF schema) has to live inside the perimeter. The snippets below adapt the patterns above for that environment.

The shape is the same on every CI platform; the difference is sourcing.

12.1 Offline install: docker save and tarball

For mcptest itself, save the Docker image on an internet-connected host and ship the tarball through the same channel you use for other controlled artifacts (S3 with bucket policies, an internal artifact store, sneakernet via removable media for the strictest shops):

# On an internet-connected host
docker pull soapbucket/mcptest:1.0.0
docker save soapbucket/mcptest:1.0.0 -o mcptest-1.0.0.tar

# Compute a checksum the receiving side can verify
sha256sum mcptest-1.0.0.tar > mcptest-1.0.0.tar.sha256

# On the air-gapped CI agent
sha256sum -c mcptest-1.0.0.tar.sha256
docker load -i mcptest-1.0.0.tar
docker tag soapbucket/mcptest:1.0.0 internal-registry.example.com/mcptest:1.0.0
docker push internal-registry.example.com/mcptest:1.0.0

For the standalone binary, mirror the GitHub release artifact to an internal artifact store and adjust the install command:

# Replaces the curl https://download.mcptest.sh/install.sh path
INTERNAL_BASE="https://artifacts.example.com/mcptest/1.0.0"
# Keep the release file name so it matches its entry in SHA256SUMS
curl -fsSL -O "$INTERNAL_BASE/mcptest-linux-x86_64.tar.gz"
# --ignore-missing (GNU coreutils) skips entries for files that were not downloaded
curl -fsSL "$INTERNAL_BASE/SHA256SUMS" | sha256sum -c --ignore-missing -
tar -xzf mcptest-linux-x86_64.tar.gz
sudo install mcptest /usr/local/bin/

The official install script (install.sh) accepts an MCPTEST_DOWNLOAD_BASE environment variable that points at the internal mirror. The install flow is otherwise identical.
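A checksum file that travels with the artifact only proves the two arrived together. Some teams also pin the expected digest in the pipeline configuration and compare against that. A minimal sketch of such a check, assuming GNU sha256sum is on the agent (verify_sha256 is a hypothetical helper, and the expected digest is whatever you recorded on the internet-connected host):

```shell
# verify_sha256 FILE EXPECTED_HEX : fail unless FILE hashes to EXPECTED_HEX.
# Hypothetical helper; not part of mcptest or its install script.
verify_sha256() {
  file=$1
  expected=$2
  actual=$(sha256sum "$file" | cut -d' ' -f1)
  if [ "$actual" != "$expected" ]; then
    echo "checksum mismatch for $file" >&2
    return 1
  fi
}
```

Used as verify_sha256 mcptest-1.0.0.tar "$PINNED_DIGEST" && docker load -i mcptest-1.0.0.tar, where PINNED_DIGEST is an assumed variable holding the digest stored with the pipeline definition.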

12.2 Internal registry mirroring

For server images, the same docker save / docker load pattern applies. Most enterprise registries (Harbor, Artifactory, Nexus, ECR behind a private endpoint) accept the saved tarball directly:

docker pull ghcr.io/example/my-mcp-server:0.7.3
docker save ghcr.io/example/my-mcp-server:0.7.3 -o my-mcp-server-0.7.3.tar

# Transfer through the controlled channel, then on the air-gapped side:
docker load -i my-mcp-server-0.7.3.tar
docker tag ghcr.io/example/my-mcp-server:0.7.3 \
  internal-registry.example.com/mcptest/my-mcp-server:0.7.3
docker push internal-registry.example.com/mcptest/my-mcp-server:0.7.3

Update the CI snippets to reference the internal registry hostname:

# GitHub Actions / Jenkins / Buildkite / Azure pattern
services:
  mcp-server:
    image: internal-registry.example.com/mcptest/my-mcp-server:0.7.3

The image digest is more robust than the tag, because tags are mutable and registries do not always enforce immutability:

services:
  mcp-server:
    image: internal-registry.example.com/mcptest/my-mcp-server@sha256:abcdef...
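If digest pinning is the policy, a small guard in CI can reject image references that still use a tag. The sketch below only checks the shape of the reference (an @sha256: suffix followed by 64 hex characters); it does not contact a registry. is_digest_pinned is a hypothetical helper.

```shell
# is_digest_pinned IMAGE_REF : succeed only if the reference ends in a full sha256 digest.
# Hypothetical helper; checks the string format only.
is_digest_pinned() {
  printf '%s\n' "$1" | grep -Eq '@sha256:[0-9a-f]{64}$'
}
```

A lint step could loop over the image references in the workflow files and fail the job on the first one that does not pass.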

12.3 No outbound HTTP

mcptest's default behavior already aligns with air-gapped environments:

If your CI agent enforces a strict deny-by-default egress policy, the only outbound calls a normal mcptest run makes are to the deployed MCP server URL (in Pattern 3) or to nothing at all (in Patterns 1 and 2, which run against local processes or sidecars).

12.4 HTTPS_PROXY and HTTP_PROXY support

For environments where outbound HTTP is allowed only through a corporate proxy, mcptest's HTTP transport honors the standard environment variables:

Variable      Effect
HTTPS_PROXY   Routes HTTPS requests through the named proxy.
HTTP_PROXY    Routes HTTP requests through the named proxy.
NO_PROXY      Comma-separated list of hostnames to bypass.

Example for a deployed-URL pattern behind a corporate proxy:

# GitHub Actions / Jenkins / Buildkite / Azure pattern
env:
  HTTPS_PROXY: http://proxy.example.com:3128
  NO_PROXY: localhost,127.0.0.1,internal-registry.example.com
  MCP_STAGING_URL: https://staging.example.com/mcp
  MCP_STAGING_TOKEN: ${{ secrets.MCP_STAGING_TOKEN }}

The reqwest-based HTTP transport reads these variables at startup. mcptest also exposes explicit proxy flags (--proxy, --http-proxy, --https-proxy, --no-proxy, and --noproxy HOSTLIST) that override the environment when a single run needs different routing.
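When a run takes an unexpected route, it can help to work out which proxy the environment variables would select for a given URL. The sketch below is a simplified model, not the matching logic mcptest or reqwest actually uses: it handles exact hostname matches in NO_PROXY only, with no domain suffixes, no whitespace around the commas, and no IPv6 literals. proxy_for is a hypothetical helper.

```shell
# proxy_for URL : print the proxy the standard variables would select, or "direct".
# Simplified model: exact-host NO_PROXY matching only.
proxy_for() {
  url=$1
  host=${url#*://}   # strip the scheme
  host=${host%%/*}   # strip the path
  host=${host%%:*}   # strip the port
  case ",${NO_PROXY:-}," in
    *",$host,"*) echo direct; return 0 ;;
  esac
  case $url in
    https://*) echo "${HTTPS_PROXY:-direct}" ;;
    http://*)  echo "${HTTP_PROXY:-direct}" ;;
    *)         echo direct ;;
  esac
}
```

With the env block above, proxy_for https://staging.example.com/mcp prints the proxy URL, while a URL on localhost prints direct.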

12.5 Internal certificate authorities

When the deployed environment uses a private certificate authority, mount the CA bundle into the agent's trust store. The standard Linux path is /etc/ssl/certs/ca-certificates.crt. mcptest's HTTP client respects SSL_CERT_FILE and SSL_CERT_DIR, so the simplest path is:

env:
  SSL_CERT_FILE: /etc/internal-ca/bundle.pem

For the Docker image, bake the CA bundle into the base image:

FROM soapbucket/mcptest:1.0.0
COPY internal-ca.pem /usr/local/share/ca-certificates/internal-ca.crt
RUN update-ca-certificates

Republish the resulting image to the internal registry and use it in place of soapbucket/mcptest:1.0.0.
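A TLS failure caused by a missing or empty bundle is easy to mistake for a server problem. A pre-flight check on the file that SSL_CERT_FILE points at can rule that out early. This is a minimal sketch: check_ca_bundle is a hypothetical helper, and it confirms only that the file is readable and contains at least one PEM certificate header, not that the chain is valid.

```shell
# check_ca_bundle [PATH] : verify the CA bundle exists and looks like PEM.
# Defaults to $SSL_CERT_FILE. Hypothetical helper; not part of mcptest.
check_ca_bundle() {
  bundle=${1:-${SSL_CERT_FILE:-}}
  if [ -z "$bundle" ]; then
    echo "SSL_CERT_FILE is not set" >&2
    return 1
  fi
  if [ ! -r "$bundle" ]; then
    echo "cannot read $bundle" >&2
    return 1
  fi
  if ! grep -q -- '-----BEGIN CERTIFICATE-----' "$bundle"; then
    echo "$bundle contains no PEM certificate" >&2
    return 1
  fi
}
```

Running check_ca_bundle as the first command of the test step fails the job with a clear message before mcptest opens any connection.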


13. TeamCity (stub)

Full TeamCity integration is not yet documented. For now, TeamCity is supported via the generic Docker image and the standard JUnit publisher.

A minimal TeamCity build step (Command Line runner):

docker run --rm \
  -v %teamcity.build.checkoutDir%:/workspace \
  -w /workspace \
  --entrypoint sh \
  soapbucket/mcptest:1.0.0 -c '
    mcptest run tests/ \
      --wait-for-ready \
      --reporter json --output target/mcptest-run.json \
      --verbose
    mcptest report target/mcptest-run.json --format junit --output target/mcptest-junit.xml
    mcptest report target/mcptest-run.json --format gitlab --output target/mcptest-codequality.json
  '

Then add an XML Report Processing build feature with:

The TeamCity tests tab consumes the JUnit report and surfaces failures on the build page. The GitLab Code Quality JSON is archived as an artifact through the Artifact Paths setting on the build configuration.

What is missing from this stub:

These items land in v1.2 if there is demand. The current Docker + JUnit path is enough to run mcptest in a TeamCity pipeline today.


14. Reusable templates

Reusable templates under examples/ci-templates/ are planned but not yet published. Once that directory exists, it will ship:

Until the directory exists, treat the snippets in this guide as the canonical source.