THREAT MODEL

What mcpwall Does
and Doesn’t
Protect Against

transparencySecurity tools that hide their limitations aren’t security tools. Here’s what mcpwall covers, what it doesn’t, and what we’re building next.
01 / The Short Version

Where mcpwall sits

mcpwall is a transparent stdio proxy. It intercepts every JSON-RPC message from your AI coding tool to the MCP server. Rules are YAML, evaluated top-to-bottom, first match wins.

The key word is bidirectional firewall. Since v0.2.0, mcpwall inspects both what your AI agent asks to do and what the server sends back. Secrets in responses are redacted, prompt injection patterns are blocked.

Inbound: inspected & filtered
Claude CodemcpwallMCP Server
Outbound: inspected since v0.2.0
Claude Codemcpwall (passthrough)MCP Server
02 / What’s Covered

8 attack classes blocked out of the box

No configuration needed. These default rules apply automatically and scan every argument value recursively.

Covered
SSH key theft
Blocks .ssh/, id_rsa, id_ed25519, id_ecdsa in any argument.
Covered
.env file access
Blocks .env and all variants (.env.local, .env.production).
Covered
Credential files
AWS credentials, .npmrc, Docker config, kube config, .gnupg.
Covered
Browser data
Chrome, Firefox, Safari profile paths, cookies, login data stores.
Covered
Destructive commands
rm -rf, mkfs, dd if=, format C: and variants.
Covered
Pipe-to-shell
curl/wget/fetch piped to bash/sh/python/node.
Covered
Reverse shells
netcat, /dev/tcp/, bash -i, mkfifo, socat.
Covered
Secret / API key leakage
10 patterns (AWS, GitHub, OpenAI, Stripe, etc.) + Shannon entropy.

Plus: JSON-RPC batch bypass fixed (C1), ReDoS mitigation at config load, symlink resolution for path traversal, crash protection with fail-open behavior.

03 / What’s Not Covered

Known limitations

These are attack classes that mcpwall does not yet mitigate. We’re publishing them because hiding limitations is worse than having them.

HIGH
Response-side attacks
Server responses are now scanned by outbound rules. Secrets are redacted, prompt injection patterns are blocked, and suspicious content is flagged. Shipped in v0.2.0.
HIGH
Base64 / URL encoding bypass
Rules match literal strings only. Base64-encoded secrets or URL-encoded commands pass through. Decoding before matching would add latency and complexity.
HIGH
Rate limiting / DoS
No throttling on tool call volume. A runaway agent can make unlimited calls. Planned: v0.4.0.
MEDIUM
Tool description poisoning / rug pulls
mcpwall does not inspect tool metadata from tools/list. A server can change descriptions after trust is established. Planned: v0.3.0.
MEDIUM
Prompt injection
mcpwall can’t detect semantic manipulation of the LLM. It sees the resulting tool call, not the manipulation — but it may still catch the dangerous arguments.
MEDIUM
Shell metacharacter bypass
Pipes (|) caught. Semicolons (;), &&, backticks, and $() not covered by default rules. Custom rules can address this.
MEDIUM
Unicode / DNS exfiltration / env leakage
Homoglyph attacks, DNS subdomain encoding, and inherited environment variables are out of scope.
04 / Defense in Depth

One layer, not the whole stack

mcpwall is not a complete security solution. It’s one layer in a defense-in-depth strategy. We recommend combining it with:

LAYER 1
Install-time scanning
Check tool descriptions for suspicious content before you use a server. (mcp-scan, etc.)
LAYER 2
Runtime firewall
Enforce policy on every tool call as it happens. Catch runtime arguments, secrets, dangerous commands. (mcpwall)
LAYER 3
Container isolation
Limit blast radius by running MCP servers in containers or sandboxes.
05 / What’s Next

Closing the gaps

Every “not covered” item above has a plan:

v0.2.0Response inspection — outbound rules scan responses for secrets, injection, and suspicious content (shipped)
v0.3.0Tool integrity / rug pull detection — hash descriptions, detect changes
v0.3-4HTTP/SSE proxy mode — support remote MCP servers
v0.4.0Rate limiting — throttle excessive tool calls
The full threat model includes component-by-component analysis, all default rule details, severity ratings, and the complete list of assumptions.
Read the full threat model →