# 🐱 PrismCat
English | 简体中文

> You never know how much junk your SDK silently injects into your prompts — until you use PrismCat.
PrismCat is a self-hosted, transparent proxy and debugging console for LLM APIs.
Change one line — your base_url — and instantly see every request and response between your app and OpenAI / Claude / Gemini / Ollama / any LLM API, including streaming (SSE).

## ⚡ Get Started in 30 Seconds

### 1. Launch
Grab the binary for your system from Releases.
| Platform | How to Start |
|---|---|
| Windows | Run `prismcat.exe` — it lives in your system tray |
| Linux / macOS | Run `./prismcat` |
| Docker | See Docker Deployment below |

Open http://localhost:8080 in your browser.
### 2. Add an Upstream
In the Settings page, add an upstream. For example:
| Name | Target |
|---|---|
| openai | https://api.openai.com |

PrismCat gives you a proxy address: `http://openai.localhost:8080`
### 3. Change One Line, Start Capturing
```python
from openai import OpenAI

client = OpenAI(
    base_url="http://openai.localhost:8080/v1",  # ← change only this
    api_key="sk-...",
)

# everything else stays exactly the same
response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Hello!"}],
)
```
Go back to the dashboard. Your full request and response are already there. That's it.
## 🧩 How It Works

PrismCat uses subdomain routing for truly transparent proxying. When you add an upstream named `openai`:
```
Your App                     PrismCat                        OpenAI
   │                             │                              │
   │   openai.localhost:8080     │        api.openai.com        │
   │ ───────────────────────────>│ ────────────────────────────>│
   │                             │   logs request ✓             │
   │<────────────────────────────│<─────────────────────────────│
   │                             │   logs response ✓            │
```
Why subdomains? Because they make the proxy truly transparent — your request paths (like /v1/chat/completions) stay exactly the same. No path rewriting, no SDK quirks. Any language, any SDK, any LLM — as long as it lets you set a base_url, it just works. You can even chain proxies (App → PrismCat → relay → OpenAI) with zero friction.
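
For example, the same one-line change works with Anthropic's SDK. A minimal sketch, assuming you add an upstream named `anthropic` targeting `https://api.anthropic.com` (the model name is illustrative):

```python
import anthropic  # pip install anthropic

client = anthropic.Anthropic(
    base_url="http://anthropic.localhost:8080",  # ← the only change
    api_key="sk-ant-...",
)

message = client.messages.create(
    model="claude-3-5-sonnet-latest",  # illustrative model name
    max_tokens=256,
    messages=[{"role": "user", "content": "Hello!"}],
)
```
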
> 💡 About *.localhost: Modern browsers and most operating systems automatically resolve *.localhost to 127.0.0.1 — no hosts file editing required. If your environment doesn't support this, see Path Routing Mode or add a hosts entry manually.
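
A quick way to check what your OS resolver does (browsers may resolve `*.localhost` themselves even if this fails):

```python
import socket

# Expect 127.0.0.1 if your system resolves *.localhost automatically
print(socket.gethostbyname("openai.localhost"))
```
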
## ✨ Key Features

### 📊 Full Traffic Observability
- Complete request/response headers and bodies
- SSE streaming captured in full — view raw chunks or the merged result (see the sketch after this list)
- Auto-formatted JSON, smart Base64 folding (no more drowning in image data) with one-click image preview
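
A minimal streaming sketch, reusing the proxied `base_url` from Get Started; each SSE chunk flows through PrismCat and is recorded:

```python
from openai import OpenAI

client = OpenAI(
    base_url="http://openai.localhost:8080/v1",  # same proxied base_url
    api_key="sk-...",
)

# stream=True: every SSE chunk passes through PrismCat and is logged
stream = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Write a haiku about proxies."}],
    stream=True,
)
for chunk in stream:
    delta = chunk.choices[0].delta.content
    if delta:
        print(delta, end="", flush=True)
```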

### 🎮 One-Click Replay (Playground)
See a failed request? Hit Replay, tweak the prompt or parameters right in your browser, and resend instantly. No need to re-run your Python/Node script.
### 🛠️ Request Override (Opt-In)
Mutate outbound JSON request bodies with JSON Patch rules — cap max_tokens globally, swap models, strip fields a framework auto-injected, all without touching your code. Each rule is matched by method / path / JSON content; the log detail page shows a side-by-side diff of the original vs. final request.
> 🔒 Strictly opt-in. PrismCat is a transparent proxy by default and never touches your requests unless you (1) flip the master switch, (2) define rules, and (3) bind them to specific upstreams. Skip any of those steps and every byte is forwarded untouched.
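
The patch body follows standard JSON Patch (RFC 6902). As a sketch of what a rule does to a request body (the rule schema itself lives in the Settings UI; the third-party `jsonpatch` package here is only for local illustration, since PrismCat applies rules server-side):

```python
import jsonpatch  # pip install jsonpatch

original_body = {
    "model": "gpt-4o",
    "max_tokens": 4096,
    "messages": [{"role": "user", "content": "Hello!"}],
}

patch = jsonpatch.JsonPatch([
    {"op": "replace", "path": "/max_tokens", "value": 512},      # cap max_tokens
    {"op": "replace", "path": "/model", "value": "gpt-4o-mini"}, # swap the model
])

final_body = patch.apply(original_body)  # what the upstream actually receives
print(final_body)
```
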
### 🔐 Privacy & Security
- Fully local — data stays in local SQLite + filesystem, no third-party servers
- Automatic masking of sensitive headers (`Authorization`, `api-key`)
### 🏷️ Log Tagging

Add an `X-PrismCat-Tag: my-tag` header to any request to categorize logs in the UI. Perfect for shared proxies with multiple users or projects.
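
With the OpenAI Python SDK, for example, you can attach the tag to every request via `default_headers`:

```python
from openai import OpenAI

client = OpenAI(
    base_url="http://openai.localhost:8080/v1",
    api_key="sk-...",
    # The tag value is up to you: per user, per project, per experiment
    default_headers={"X-PrismCat-Tag": "alice/checkout-bot"},
)
```
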
### 📦 Dead-Simple Deployment
Single binary, zero dependencies. Windows system tray support. Native Docker image available.
### 🔄 Always-On, Always Reviewable
PrismCat is designed to run as a silent, 24/7 LLM black box. You don't need to "remember to start capturing" when a bug happens — it's already recording. Automatic log retention cleanup and large-body offloading keep storage healthy over months of continuous operation. Perfect for monitoring autonomous Agents that you can't fully predict — just go back and review what they actually sent and received, days after the fact.
## 🎯 Who Needs PrismCat?
| Your Problem | How PrismCat Helps |
|---|---|
| "Why is my token usage so high? My prompt is short!" | See the hidden system prompts and few-shot examples your SDK/framework silently injects |
| "Function Calling keeps returning broken JSON" | Capture the raw model output, tweak your prompt in the Playground, and retry instantly |
| "Streaming output sometimes freezes or gets truncated" | Every SSE chunk is recorded — pinpoint whether the issue is the model, the gateway, or the client |
| "I run local models with Ollama and want to inspect the traffic" | Add an upstream pointing to http://localhost:11434 — it's a universal HTTP proxy (see the sketch after this table) |
| "Multiple people share one API key — whose request failed?" | Use X-PrismCat-Tag to tag by user and find the culprit in seconds |
| "My Agent went rogue and I have no idea what it did" | PrismCat silently logs every API call — review the full behavior chain anytime |
| "I want to cap max_tokens globally / strip a field LangChain auto-injects" | Write a JSON Patch rule in Request Override (opt-in; transparent by default) |
## 🤔 PrismCat vs. Alternatives

| | PrismCat | mitmproxy | Langfuse / Helicone |
|---|---|---|---|
| Deployment | Single binary / Docker | Local install + certs | SaaS or complex self-host |
| LLM-Optimized | ✅ JSON formatting, Base64 folding, SSE merge | ❌ Generic HTTP inspector | ✅ But geared toward production monitoring |
| One-Click Replay | ✅ Built-in Playground | ❌ | Partial |
| Integration | Change base_url | System-wide proxy / certs | Instrument SDK code |
| Data Ownership | Fully local | Fully local | Third-party dependent |
| Stream Playback | ✅ Raw + merged view | Poor UX | Partial |
| Long-Term Running | ✅ Auto-cleanup, silent background | Ad-hoc debugging tool | ✅ But requires external infra |
## 🐳 Docker Deployment

### Docker Compose

Create a `docker-compose.yml`:
```yaml
services:
  prismcat:
    image: ghcr.io/paopaoandlingyia/prismcat:latest
    container_name: prismcat
    ports:
      - "8080:8080"
    environment:
      # Dashboard hosts. Use localhost locally; use your domain or IP on a server.
      - PRISMCAT_UI_HOSTS=localhost,127.0.0.1
      # Base domain for subdomain routing. For bare-IP deployments, enable path routing instead.
      - PRISMCAT_PROXY_DOMAINS=localhost
      # For bare-IP / no-wildcard-domain deployments: set PRISMCAT_UI_HOSTS to your IP and enable path routing.
      # - PRISMCAT_UI_HOSTS=YOUR_IP
      # - PRISMCAT_ENABLE_PATH_ROUTING=true
      # Recommended for public-facing deployments; leave empty to set it on first UI access.
      - PRISMCAT_UI_PASSWORD=your_strong_password
      - PRISMCAT_RETENTION_DAYS=30
    volumes:
      - ./data:/app/data
    restart: always
```

```bash
docker compose up -d
```
### Docker Run
```bash
docker run -d --name prismcat \
  -p 8080:8080 \
  -e PRISMCAT_UI_HOSTS=localhost,127.0.0.1 \
  -e PRISMCAT_PROXY_DOMAINS=localhost \
  -e PRISMCAT_UI_PASSWORD=your_strong_password \
  -e PRISMCAT_RETENTION_DAYS=30 \
  -v "$(pwd)/data:/app/data" \
  --restart always \
  ghcr.io/paopaoandlingyia/prismcat:latest
```
## 🔀 Fallback: Path Routing Mode
If your environment can't resolve *.localhost, or you're deploying to a bare IP without a wildcard domain, enable path routing mode in Settings to route by URL path instead of subdomain:
```python
# Path routing mode — no subdomain resolution needed
client = OpenAI(
    base_url="http://localhost:8080/_proxy/openai/v1",  # on a server: http://YOUR_IP:8080/_proxy/openai/v1
    api_key="sk-...",
)
```
Enable it via `config.yaml`:

```yaml
server:
  enable_path_routing: true
  path_routing_prefix: "/_proxy"
```

Or via environment variable:

```bash
PRISMCAT_ENABLE_PATH_ROUTING=true
```
> Note: Path routing adds a prefix to your request URL (e.g., /_proxy/openai/...), which may require extra care with how some SDKs construct paths. Subdomain mode doesn't have this caveat.
## 🌐 Production Deployment (Nginx + Wildcard Domain)

For public-facing deployments, use a wildcard domain (e.g., *.prismcat.example.com) with Nginx:
```nginx
server {
    listen 80;
    server_name prismcat.example.com *.prismcat.example.com;

    location / {
        proxy_pass http://127.0.0.1:8080;
        proxy_set_header Host $host;  # Required: pass the original Host for subdomain routing

        # Required for SSE / streaming
        proxy_http_version 1.1;
        proxy_set_header Connection "";
        proxy_buffering off;

        client_max_body_size 50M;
    }
}
```
Then add prismcat.example.com to PrismCat's proxy_domains. The dashboard is available at prismcat.example.com, and your upstream openai is available at openai.prismcat.example.com.
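
Clients then point at the upstream's subdomain. A minimal sketch, assuming the plain-HTTP Nginx config above (add TLS termination for `https://`):

```python
from openai import OpenAI

client = OpenAI(
    base_url="http://openai.prismcat.example.com/v1",
    api_key="sk-...",
)
```
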
## ⚙️ Configuration Reference

The config file lives at `data/config.yaml` and is created on first launch. Most settings can also be changed from the Settings page in the UI.

Full config example:
```yaml
server:
  port: 8080
  ui_password: ""      # Console password; leave empty to set it on first UI access
  proxy_domains:       # Base domains for subdomain routing
    - localhost

logging:
  max_request_body: 5242880        # Save request bodies up to 5 MB
  max_response_body: 33554432      # Save response bodies up to 32 MB
  sensitive_headers:               # Headers to auto-mask
    - Authorization
    - api-key
    - x-api-key
  detach_body_over_bytes: 2097152  # Load bodies larger than 2 MB on demand
  early_request_body_snapshot: false

storage:
  retention_days: 30  # Log retention in days; 0 = keep forever

upstreams:
  openai:
    target: "https://api.openai.com"
    timeout: 120
    outbound_proxy: "env"  # env, direct, or a proxy URL such as http://127.0.0.1:7890
  gemini:
    target: "https://generativelanguage.googleapis.com"
    timeout: 120
    outbound_proxy: "http://127.0.0.1:7890"
```
## 🧩 FAQ

### Q: `openai.localhost` doesn't work?
Most modern systems resolve *.localhost to 127.0.0.1 automatically. If yours doesn't:
- Add `127.0.0.1 openai.localhost` to your hosts file
- Or enable Path Routing Mode as a workaround
- Or use your own wildcard domain (see Production Deployment)
### Q: Streaming feels "stuck"?

If you're behind a reverse proxy (e.g., Nginx), make sure you have:

```nginx
proxy_buffering off;
proxy_http_version 1.1;
```
Nginx buffers entire responses by default, making streaming look like it's hanging.
### Q: Which LLM services are supported?
PrismCat is a generic HTTP proxy — it's not tied to any specific LLM provider. Any HTTP/HTTPS API works, including:
- OpenAI / Azure OpenAI
- Anthropic Claude
- Google Gemini
- Ollama / LM Studio (local models)
- API relay services / aggregators
### Q: Does it add latency?
PrismCat uses asynchronous log writing. The proxy overhead is typically under 1ms. Logging never blocks request forwarding.
## ❤️ Support PrismCat
If PrismCat helps you debug LLM apps or saves you time, you can support the project here:
Support PrismCat on Afdian
## 🛡️ License
MIT License