OpenClaw Guardrails icon

OpenClaw Guardrails Coming Soon

Prompt injection detection for autonomous AI agents

OpenClaw Guardrails scans every web page for prompt injection attacks before an autonomous AI agent can read it. It detects instruction overrides, role manipulation, delimiter injection, system prompt leaks, and encoding attacks — then warns or blocks the agent in real time. All detection runs locally in your browser using the vard library. No data ever leaves your device.

Features

How to Use

  1. Install the extension from the Chrome Web Store and pin it to your toolbar.
  2. Open the popup by clicking the OpenClaw Guardrails icon. The extension is enabled by default and will start scanning pages immediately.
  3. Configure thresholds. Adjust the warning and block severity thresholds to control how aggressively the extension responds. Lower values catch more potential threats.
  4. Set up your lists. Add trusted domains to the whitelist (never scanned) and dangerous domains to the blocklist (always blocked). Enter one hostname per line.
  5. Toggle features. Enable or disable specific threat types (instruction override, role manipulation, etc.) and the email button hiding feature.
  6. Let your agent browse. When the agent visits a page, the extension scans it automatically. Warnings and blocks are injected into the DOM so the agent reads them as part of the page content.
  7. Review scan history. Check the popup to see a log of recent scans, which threats were detected, and what action was taken.
Tip: The extension injects warnings and blocks directly into the page DOM. This means AI agents that read page content will see the warning text and can act accordingly — for example, stopping and asking the user for guidance before proceeding.