
GitLab Code Quality reporter

mcptest emits a JSON array compatible with GitLab Code Quality so failures surface as inline annotations on the merge request "Changes" tab. The same array can be consumed by any tool that speaks the GitLab Code Quality format (CodeClimate engines, GitLab CI custom widgets).

Run this example. Execute the reference server suite with the JSON reporter, then render the saved run as a GitLab Code Quality report:

mcptest run --config examples/reference-server/tests/smoke.yml --reporter json --output run.json
mcptest report run.json --format gitlab --output gl-code-quality-report.json

Note. CI is currently disabled for this repo during heavy development. The snippet below is written to be drop-in when CI is re-enabled. The reporter itself is exercised by cargo test --workspace, including the schema validation tests under crates/mcptest-core/tests/gitlab_validation.rs.

What the reporter emits

Every failing test becomes one finding. Skipped tests render as minor findings so reviewers still see them without the pipeline status flipping. Passing tests are omitted because the format models a defect stream.

Field                                         Source
--------------------------------------------  ------------------------------------------------------------
description                                   <test name>: <failure message>.
check_name                                    TestResult::rule_id when set (PROTO-001, EDGE-005, ...). Falls back to mcptest.assertion.
fingerprint                                   SHA-256 hex of rule_id + file + line (or name + file + line when no rule is attached). Excludes the failure message and test name so cosmetic edits do not churn fingerprints across runs.
severity                                      MUST -> critical, SHOULD -> major, MAY -> minor. Test failure with no rule attached -> major. Skipped tests -> minor.
location.path                                 TestResult::file.
location.lines.begin                          TestResult::line.
properties.duration_ms, properties.cache_hit  Per-test telemetry. GitLab ignores unknown keys, so this is safe to attach.
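
To make the field mapping concrete, here is a minimal Python sketch that assembles one finding from a test-result dictionary. The input shape, the to_finding helper, and every sample value are hypothetical and used only for illustration; the real reporter works from TestResult inside mcptest, and its exact output may differ.

```python
import json

def to_finding(result: dict) -> dict:
    """Map one test-result dict (hypothetical shape) to a Code Quality finding."""
    return {
        "description": f"{result['name']}: {result['message']}",
        # Fall back to the generic check name when no rule is attached.
        "check_name": result.get("rule_id") or "mcptest.assertion",
        "fingerprint": result["fingerprint"],
        "severity": result["severity"],
        "location": {
            "path": result["file"],
            "lines": {"begin": result["line"]},
        },
        # Extra telemetry; GitLab ignores keys it does not know.
        "properties": {
            "duration_ms": result["duration_ms"],
            "cache_hit": result["cache_hit"],
        },
    }

# Illustrative input only; none of these values come from a real run.
example = {
    "name": "tools/list returns an array",
    "message": "expected array, got object",
    "rule_id": "PROTO-001",
    "fingerprint": "0" * 64,  # placeholder, not a real hash
    "severity": "critical",   # assumed for the example
    "file": "tests/mcp.yaml",
    "line": 12,
    "duration_ms": 48,
    "cache_hit": False,
}

report = [to_finding(example)]  # the report is a JSON array of findings
print(json.dumps(report, indent=2))
```

The report is always a top-level array, so a run with no failures and no skips would serialize as an empty array.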

Severity mapping rationale

GitLab orders severities info < minor < major < critical < blocker. Mapping RFC 2119 severity directly to that ladder keeps the merge request badge meaningful: a MUST regression is critical (block the merge), a SHOULD regression is major (review required), and a MAY regression is minor (informational). When the test is not a compliance check, the safe default is major so reviewers still see the finding.
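
The mapping described above can be sketched as a small lookup with a default. This is an illustration in Python, not the reporter's source; the function name gitlab_severity and its parameters are assumptions made for the example.

```python
from typing import Optional

# RFC 2119 requirement level -> GitLab Code Quality severity.
SEVERITY_BY_LEVEL = {
    "MUST": "critical",
    "SHOULD": "major",
    "MAY": "minor",
}

def gitlab_severity(level: Optional[str], skipped: bool = False) -> str:
    """Pick a GitLab severity for one test result.

    `level` is the RFC 2119 keyword of the attached rule, or None when the
    test is not a compliance check. Skipped tests are always minor.
    """
    if skipped:
        return "minor"
    # No rule attached (or an unrecognized level): default to major so the
    # finding stays visible to reviewers.
    return SEVERITY_BY_LEVEL.get(level, "major")

print(gitlab_severity("MUST"))              # critical
print(gitlab_severity(None))                # major
print(gitlab_severity("MUST", skipped=True))  # minor
```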

Fingerprint stability

Fingerprints stay stable across reruns of the same test even when the failure message changes. Two distinct rules at the same location still produce different fingerprints. The hash inputs are documented in the reporter source so external tools can reproduce them if needed.
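
The stability properties can be illustrated with a short Python sketch. It hashes the same three inputs named in the field table, but the separator and encoding used here are assumptions; the reporter source defines the exact byte layout, so this sketch is not guaranteed to reproduce mcptest's fingerprints.

```python
import hashlib
from typing import Optional

def fingerprint(rule_id: Optional[str], name: str, file: str, line: int) -> str:
    """SHA-256 hex over (rule_id or name) + file + line.

    The failure message is deliberately not an input, and the test name is
    used only when no rule is attached.
    """
    key = rule_id if rule_id else name
    # Assumption: fields joined with a unit-separator byte, UTF-8 encoded.
    payload = "\x1f".join([key, file, str(line)])
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()

# Renaming the test does not change the fingerprint while a rule is attached.
a = fingerprint("PROTO-001", "old test name", "tests/mcp.yaml", 12)
b = fingerprint("PROTO-001", "new test name", "tests/mcp.yaml", 12)

# A different rule at the same location yields a different fingerprint.
c = fingerprint("EDGE-005", "old test name", "tests/mcp.yaml", 12)

print(a == b, a != c)
```

Using an explicit separator avoids ambiguous concatenations, such as a file name ending in a digit running into the line number.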

GitLab CI

stages:
  - test

mcptest:
  stage: test
  image: rust:1.85
  script:
    - curl -sSL https://download.mcptest.sh/install.sh | sh
    - mcptest run
        --config tests/mcp.yaml
        --reporter gitlab
        --output gl-code-quality-report.json
  artifacts:
    when: always
    reports:
      codequality: gl-code-quality-report.json
    paths:
      - gl-code-quality-report.json
    expire_in: 1 week
  allow_failure: true

reports.codequality is the well-known artifact GitLab consumes to populate the merge request widget. artifacts: when: always uploads the report even when mcptest exits non-zero, and allow_failure: true keeps that failing job from failing the pipeline; remove allow_failure once you want a failing run to block the merge.

Local rendering from a saved JSON run

If you already capture the JSON run envelope, re-render it without a second run:

mcptest report run.json --format gitlab --output gl-code-quality-report.json

The redaction policy is re-applied at the dispatch site, so secrets sealed by the JSON reporter stay sealed in the GitLab Code Quality output.

Validation

crates/mcptest-core/tests/gitlab_validation.rs validates every fixture render (sample, sample_full, sample_all_pass) against the GitLab Code Quality required-field schema. Extend the test fixture when the reporter starts emitting a new field; the format is documented at docs.gitlab.com/ee/ci/testing/code_quality.html.
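
For consumers outside the Rust workspace, a comparable required-field check could look like the Python sketch below. It is not a port of gitlab_validation.rs; it assumes the required fields are the ones listed in the field table (description, check_name, fingerprint, severity, location.path, location.lines.begin) and the five severities named earlier. The validate_report helper is hypothetical.

```python
import json
from typing import List

ALLOWED_SEVERITIES = {"info", "minor", "major", "critical", "blocker"}

def validate_report(text: str) -> List[str]:
    """Return a list of problems; an empty list means the report looks valid."""
    findings = json.loads(text)
    if not isinstance(findings, list):
        return ["report must be a JSON array"]
    problems = []
    for i, finding in enumerate(findings):
        if not isinstance(finding, dict):
            problems.append(f"[{i}] finding must be an object")
            continue
        # Required non-empty string fields.
        for key in ("description", "check_name", "fingerprint"):
            value = finding.get(key)
            if not isinstance(value, str) or not value:
                problems.append(f"[{i}] missing or empty {key}")
        severity = finding.get("severity")
        if not isinstance(severity, str) or severity not in ALLOWED_SEVERITIES:
            problems.append(f"[{i}] invalid severity")
        # Required location: a path plus a starting line number.
        location = finding.get("location")
        if not isinstance(location, dict):
            location = {}
        if not isinstance(location.get("path"), str):
            problems.append(f"[{i}] missing location.path")
        lines = location.get("lines")
        begin = lines.get("begin") if isinstance(lines, dict) else None
        if not isinstance(begin, int):
            problems.append(f"[{i}] missing location.lines.begin")
    return problems

# An all-pass run serializes as an empty array, which is valid.
print(validate_report("[]"))
```

A typical use would be to read gl-code-quality-report.json after mcptest report and fail a local check when the returned list is non-empty.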