mcp-assert Blackwell Systems

cli developer-tools mcp model-context-protocol testing

Use this command to install mcp-assert:

winget install --id=BlackwellSystems.mcp-assert -e

mcp-assert: Ensuring Robust MCP Server Compatibility

Primary Purpose:
mcp-assert tests Model Context Protocol (MCP) servers over the real protocol instead of against mocks. It connects the same way clients such as Claude or Cursor do, so a passing suite reflects how the server behaves in actual use.

Key Features:

Cross-Language Support: Works with servers written in Go, TypeScript, Python, Rust, Java, and more.
YAML-Based Assertions: Define tests in YAML and run them in Continuous Integration (CI) pipelines for repeatable results.
Real Transport Testing: Connects over stdio, SSE, or HTTP, the same transports an actual client uses.
CI and Test Framework Integration: Works with GitHub Actions and with test frameworks such as pytest and Jest, with no additional toolchain required.

Audience & Benefits:
Built for developers and teams building MCP servers. Because mcp-assert exercises the real protocol over the wire, it can catch issues that mocks and in-process unit tests miss, and it helps confirm that a server behaves consistently across MCP clients.

Installation:
On Windows, mcp-assert can be installed with winget using the command shown above.

README

Test your MCP server against the real protocol. No mocks. No imports. No language lock-in.

mcp-assert connects to your server exactly like Claude, Cursor, or any MCP client would: real stdio/SSE/HTTP transport, full initialize handshake, actual tool calls. It checks responses against expectations you define in YAML. If it passes mcp-assert, it works with every MCP client.

> [!WARNING]
> We scanned 102 MCP servers and found 4,794 schema issues (2,239 errors) across 55 servers including AWS, Serena, and Grafana. The most common failure: parameters missing type definitions cause agents to send wrong value types. See the scorecard.
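To make the "missing type definitions" failure concrete, here is a minimal Python sketch of that kind of check. It is not mcp-assert's own lint implementation, which the source does not show. The `untyped_params` helper and the `create_issue` tool are hypothetical; the only assumption taken from MCP itself is that each tool returned by tools/list carries a JSON Schema under `inputSchema`.

```python
from typing import Any


def untyped_params(tool: dict[str, Any]) -> list[str]:
    """Return names of top-level input parameters that declare no type."""
    schema = tool.get("inputSchema") or {}
    props = schema.get("properties") or {}
    # JSON Schema keywords that tell a caller what kind of value to send.
    typing_keys = ("type", "enum", "const", "anyOf", "oneOf", "allOf", "$ref")
    return sorted(
        name
        for name, spec in props.items()
        if not (isinstance(spec, dict) and any(k in spec for k in typing_keys))
    )


# Hypothetical tool definition, shaped like one entry of a tools/list result.
tool = {
    "name": "create_issue",
    "description": "Create an issue",
    "inputSchema": {
        "type": "object",
        "properties": {
            "title": {"type": "string"},
            "priority": {"description": "1 (low) to 5 (high)"},  # no type given
            "labels": {"type": "array", "items": {"type": "string"}},
        },
        "required": ["title"],
    },
}

print(untyped_params(tool))  # ['priority']
```

With no declared type for `priority`, an agent has to guess between `3` and `"3"`, which is the wrong-value-type failure described in the warning.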

Your YAML        ──→  mcp-assert  ──→  MCP Server
(inputs + assertions)    (client)        (any language)
                            │
                        Pass / Fail

Your server can't tell the difference

mcp-assert speaks the full MCP protocol: initialize handshake, tools/list discovery, tools/call with real arguments. It finds bugs that unit tests miss because it tests over the wire, not in-process.
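For readers unfamiliar with what goes over the wire, the sketch below builds the JSON-RPC 2.0 messages a client sends for that sequence and frames them for the stdio transport (one JSON message per line). It illustrates the protocol, not mcp-assert's internals. The protocol revision string, the client name, and the `echo` tool with its arguments are assumptions chosen for illustration.

```python
import json
from itertools import count

_ids = count(1)  # JSON-RPC request ids must be unique within a session


def request(method, params=None):
    """Build a JSON-RPC request (expects a response, so it carries an id)."""
    msg = {"jsonrpc": "2.0", "id": next(_ids), "method": method}
    if params is not None:
        msg["params"] = params
    return msg


def notification(method, params=None):
    """Build a JSON-RPC notification (no id, no response expected)."""
    msg = {"jsonrpc": "2.0", "method": method}
    if params is not None:
        msg["params"] = params
    return msg


def frame(msg) -> bytes:
    """stdio framing: compact JSON on a single line, newline-terminated."""
    return (json.dumps(msg, separators=(",", ":")) + "\n").encode("utf-8")


session = [
    request("initialize", {
        "protocolVersion": "2024-11-05",  # assumed revision
        "capabilities": {},
        "clientInfo": {"name": "example-client", "version": "0.0.1"},
    }),
    # A real client sends this only after receiving the initialize response.
    notification("notifications/initialized"),
    request("tools/list"),
    # Hypothetical tool name and arguments.
    request("tools/call", {"name": "echo", "arguments": {"text": "hi"}}),
]

for msg in session:
    print(frame(msg).decode(), end="")
```

A real client also reads the server's replies from stdout and matches them to requests by `id`; that half is omitted here to keep the sketch short.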

Adopted in production

Wyre Technology: 25 MCP servers tested via shared baseline workflow using mcp-assert-action
Ant Group (AntV): integrated into CI within 3 days of launch
Vera: recommended test harness on project roadmap (#529)
Fix PRs merged: Google, Grafana, LangChain, official MCP SDKs

| Command | What it does | Setup required |
| --- | --- | --- |
| `audit --server "..."` | Scan any server, classify every tool as healthy/crashed/timed out | None |
| `fuzz --server "..."` | Throw adversarial inputs at every tool, find crashes and hangs | None |
| `init --server "..."` | Generate a complete test suite from tools/list + capture snapshots | None |
| `run --suite evals/` | Run YAML assertions, report pass/fail | YAML files |
| `ci --suite evals/` | Run with thresholds, baselines, JUnit XML, GitHub Step Summary | YAML files |
| `coverage --suite evals/ --server "..."` | Report which tools have assertions and which don't | YAML files |
| `snapshot --suite evals/ --update` | Capture responses as golden files for regression detection | YAML files |
| `watch --suite evals/` | Re-run on YAML changes, show diffs when status flips | YAML files |
| `matrix --languages go:gopls,ts:tsserver` | Same suite across multiple language servers | YAML files |
| `intercept --server "..." --trajectory t.yaml` | Proxy between agent and server, capture live tool call trace | Trajectory YAML |
| `lint --server "..."` | Check tool schemas for missing descriptions, untyped params, and agent usability issues | None |

| Dimension | LLM-as-judge eval frameworks | mcp-assert |
| --- | --- | --- |
| Best for | Subjective outputs (prose, creative content) | Deterministic outputs (data, state, validation) |
| Grading | Language model scoring (flexible, costly) | Assertion-based (exact, free) |
| Speed | Seconds per test (LLM round-trip) | Milliseconds per test (no LLM) |
| CI cost | API calls on every run | Zero external dependencies |
| Reliability | Not measured | pass@k / pass^k per assertion |
| Regression | Not supported | Baseline comparison, fail on backslide |
| Multi-language | Not supported | Same assertion across N language servers |
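The Reliability row mentions pass@k and pass^k without defining them. Under the usual reading, pass@k asks whether at least one of k trials passed, and pass^k asks whether all k trials passed. The sketch below assumes those definitions; the source does not say how mcp-assert computes or reports them, and the function names are hypothetical.

```python
def pass_at_k(trials: list[bool]) -> bool:
    """pass@k: at least one of the k trials passed."""
    return any(trials)


def pass_hat_k(trials: list[bool]) -> bool:
    """pass^k: every one of the k trials passed."""
    return all(trials)


def expected_rates(p: float, k: int) -> tuple[float, float]:
    """Expected (pass@k, pass^k) if each trial passes independently with rate p."""
    return 1 - (1 - p) ** k, p ** k


# Five hypothetical runs of the same assertion, one of them flaky.
trials = [True, True, False, True, True]
print(pass_at_k(trials), pass_hat_k(trials))  # True False

at_k, hat_k = expected_rates(0.9, 5)
print(round(at_k, 5), round(hat_k, 5))  # 0.99999 0.59049
```

The gap between the two numbers is why pass^k is the stricter reliability signal: an assertion that passes 90% of the time almost always passes at least once in five runs, but passes all five only about 59% of the time.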
