How AI-Guardian Works
Four layers of protection — all running locally on your device. Here's the full technical picture.
Architecture Overview
Browser Extension
Chrome / Firefox / Edge content script + service worker
Desktop Agent
Electron-based OS-level input hook (macOS, Windows)
Detection Engine
Compiled regex + entropy analysis + lightweight on-device ML
Audit Pipeline
Anonymised event metadata → Supabase (no raw text ever transmitted)
Step-by-Step Breakdown
AI-Guardian's browser extension injects a lightweight content script into every AI web interface (ChatGPT, Claude, Gemini, Perplexity, Cursor, etc.). The script hooks into the browser's native input events — beforeinput, paste, and submit — to capture the full text of any prompt before the browser constructs the outbound HTTP request.
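The interception idea can be sketched as a small content-script handler. This is an illustrative sketch only, not AI-Guardian's actual source: `scanAndRedact` is a hypothetical stand-in for the full detection pipeline, and the single AWS-key pattern is just a placeholder.

```javascript
// Hypothetical stand-in for the full detection pipeline described below.
function scanAndRedact(text) {
  // Placeholder rule: the real engine layers regex, entropy and ML checks.
  return text.replace(/AKIA[0-9A-Z]{16}/g, "[REDACTED:AWS_KEY]");
}

// Wire up the hook only when running inside a page context.
if (typeof document !== "undefined") {
  document.addEventListener(
    "paste",
    (event) => {
      const pasted = event.clipboardData.getData("text/plain");
      const clean = scanAndRedact(pasted);
      if (clean !== pasted) {
        event.preventDefault(); // stop the raw paste from landing in the input
        document.execCommand("insertText", false, clean); // insert redacted text instead
      }
    },
    true // capture phase: runs before the page's own handlers see the event
  );
}
```

Registering the listener in the capture phase is the key design point: the redacted text is what the page's own JavaScript observes, so the sensitive value never enters the outbound request.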
For native desktop AI applications (Claude Desktop, Cursor, VS Code extensions), the companion desktop agent operates at the OS input layer using platform-specific accessibility APIs (macOS Accessibility API, Windows UI Automation). This ensures coverage extends beyond the browser to every application on the device.
Why local interception matters
Network-level DLP tools inspect traffic after it has left the endpoint — often encrypted and already en route. By intercepting at the input layer, AI-Guardian catches sensitive data before any TLS handshake occurs, making it effective even when employees use personal hotspots, VPNs, or connections that bypass corporate proxies.
Captured text is passed through a multi-layer detection stack that runs entirely on-device. All computation completes in under 20 milliseconds for typical prompt lengths, keeping the user experience seamless.
Regex Pattern Library
40+ compiled patterns covering AWS/GCP/Azure keys, GitHub tokens, Stripe secrets, database DSNs, PEM headers, and more.
Entropy Analysis
High-entropy string detection catches bespoke credentials that don't match known formats — e.g. internal secrets or custom auth tokens.
Context-Aware PII
Natural language analysis identifies PII in prose (names, emails, phone numbers, national IDs) even when not in structured fields.
Custom Rules
Enterprise teams define organisation-specific patterns: internal codenames, customer account prefixes, proprietary data formats.
Detection categories include: API keys and credentials, PII (GDPR-defined personal data), financial account identifiers, internal network topology, and custom enterprise patterns. Each detected item is tagged with a category label used in redaction and audit events.
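To make the first two layers concrete, here is a minimal sketch of regex matching combined with Shannon-entropy screening. The patterns, the 20-character minimum, and the 4.0 bits-per-character threshold are illustrative assumptions, not AI-Guardian's shipped values.

```javascript
// Example patterns only; the real library compiles 40+ of these.
const PATTERNS = [
  { category: "AWS_KEY", re: /AKIA[0-9A-Z]{16}/g },
  { category: "GITHUB_TOKEN", re: /ghp_[A-Za-z0-9]{36}/g },
];

// Shannon entropy in bits per character.
function entropy(s) {
  const counts = {};
  for (const ch of s) counts[ch] = (counts[ch] || 0) + 1;
  return Object.values(counts).reduce((h, n) => {
    const p = n / s.length;
    return h - p * Math.log2(p);
  }, 0);
}

function detect(text) {
  const hits = [];
  // Layer 1: known credential formats.
  for (const { category, re } of PATTERNS) {
    for (const m of text.matchAll(re)) hits.push({ category, value: m[0] });
  }
  // Layer 2: long, high-entropy tokens that match no known format
  // (bespoke secrets, custom auth tokens).
  for (const token of text.split(/\s+/)) {
    if (
      token.length >= 20 &&
      entropy(token) > 4.0 &&
      !hits.some((h) => h.value === token)
    ) {
      hits.push({ category: "HIGH_ENTROPY", value: token });
    }
  }
  return hits;
}
```

Each hit carries the category label that later drives the redaction token and the audit event.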
When sensitive content is detected, AI-Guardian does not block the request. Blocking creates friction that drives Shadow AI adoption — employees simply route around the control. Instead, sensitive spans are replaced with typed redaction tokens in the prompt before it is transmitted.
Before redaction
"Debug this: AKIAIOSFODNN7EXAMPLE connecting to prod-db.internal:5432 pw=s3cr3t"
After redaction (what the AI receives)
"Debug this: [REDACTED:AWS_KEY] connecting to [REDACTED:HOSTNAME]:5432 pw=[REDACTED:PASSWORD]"
The AI model receives a structurally intact prompt and can still provide useful assistance. The engineer sees the redacted version in their browser UI, giving them immediate feedback that sensitive content was detected. No interruption, no blocked workflow — just silent, automatic protection.
Redaction tokens are typed and category-labelled ([REDACTED:AWS_KEY], [REDACTED:EMAIL], [REDACTED:PII]) so the AI understands what was redacted and can respond appropriately.
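A redaction pass over the example above can be sketched as a rule-driven string replacement. The rule list here is hypothetical and trimmed to the three categories in the example; it is not the production rule set.

```javascript
// Illustrative rules matching the worked example above.
const RULES = [
  { category: "AWS_KEY", re: /AKIA[0-9A-Z]{16}/g },
  { category: "HOSTNAME", re: /\b[\w-]+\.internal\b/g },
  { category: "PASSWORD", re: /(?<=pw=)\S+/g },
];

// Replace each sensitive span with a typed, category-labelled token.
function redact(prompt) {
  let out = prompt;
  for (const { category, re } of RULES) {
    out = out.replace(re, `[REDACTED:${category}]`);
  }
  return out;
}
```

Because the tokens preserve the prompt's structure ("there was a key here, a hostname there"), the model can still reason about the debugging question without ever seeing the underlying values.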
The single most important design principle behind AI-Guardian: we never see your sensitive data. The entire detection and redaction pipeline runs locally on the endpoint. Our servers receive only anonymised telemetry.
What stays on your device
- The full text of every prompt (pre- and post-redaction)
- The raw content of any detected sensitive item
- User identifiers linked to specific inputs
- Any data subject to GDPR, HIPAA, or SOC 2 controls
What reaches our servers
- Anonymised event counts (e.g. '3 AWS_KEY events today')
- Hashed user ID (no name, email, or identifiable data)
- Detection category labels (not the detected values)
- Timestamp and extension version for analytics
This architecture means AI-Guardian can be deployed in environments with the strictest data residency requirements — including EU-only data processing under GDPR, HIPAA-covered healthcare environments, and regulated financial services. We are GDPR-compliant, our SOC 2 Type II certification is in progress, and the architecture is fully compatible with EU AI Act requirements for AI system governance.
Built for Regulated Industries
AI-Guardian's privacy-first architecture satisfies the data minimisation and purpose limitation requirements of GDPR, the system governance mandates of the EU AI Act, and the access control requirements of SOC 2. Detailed compliance documentation is available for enterprise customers.