Security

Deterministic boundaries around non-deterministic AI.

Your AI assistant has access to your most sensitive data — email, calendar, contacts, and messages. Unlike standard AI assistants, Frida Kàiló runs on your dedicated VPS with enforcement boundaries that protect you from AI mistakes, manipulation, and data leakage.

Defense Matrix

Threats & Mitigations

01

Irreversible Actions

AI assistants can make catastrophic mistakes — deleting entire inboxes, wiping calendars, or mass-unsubscribing from important services. Most assistants operate with unrestricted access, trusting that prompts alone will prevent disasters.

MITIGATION

$ Bulk destructive operations are automatically blocked. Actions like mass deletion or exports require explicit confirmation, with approval requests sent to your active messaging channels. You stay in control of high-impact decisions.
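As a sketch, this enforcement layer can be pictured as a policy gate in front of every tool call. The threshold, action names, and return values below are illustrative assumptions, not the shipped policy:

```python
# Hypothetical bulk-operation guard: destructive actions above a size limit
# are held for out-of-band human approval instead of executing immediately.
BULK_THRESHOLD = 10  # assumed limit; a real policy would be configurable

def review_action(action: str, item_count: int, confirmed: bool = False) -> str:
    """Return 'execute' or 'pending' for a requested action."""
    destructive = action in {"delete", "export", "unsubscribe"}
    if not destructive or item_count < BULK_THRESHOLD:
        return "execute"
    # Bulk destructive: only proceed once the user has confirmed.
    return "execute" if confirmed else "pending"

print(review_action("delete", 3))          # small deletion passes
print(review_action("delete", 500))        # bulk deletion held for approval
print(review_action("delete", 500, True))  # proceeds once confirmed
```

The key property is that the gate sits outside the model: no prompt output can change `BULK_THRESHOLD` or skip the confirmation branch.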

02

Data Exfiltration

An AI could forward your personal data to unauthorized recipients, either through manipulation or by following malicious instructions hidden in emails and messages you've received.

MITIGATION

$ Outbound data transfers are monitored for anomalies. Sending to new recipients triggers confirmation requests. Data from external sources is automatically marked as untrusted, escalating scrutiny on any action that uses it.
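One way to picture the outbound check, with a hypothetical recipient history and return values chosen for illustration:

```python
# Illustrative outbound-mail check: messages to never-before-contacted
# recipients, or messages carrying untrusted external data, need confirmation.
known_recipients = {"alice@example.com", "bob@example.com"}  # assumed history

def outbound_check(recipient: str, contains_untrusted: bool) -> str:
    new_recipient = recipient not in known_recipients
    if new_recipient or contains_untrusted:
        return "confirm"  # push an approval request to the user's channel
    return "send"

print(outbound_check("alice@example.com", False))  # known, clean: send
print(outbound_check("eve@attacker.test", False))  # new recipient: confirm
```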

03

Prompt Injection

Malicious content in emails, messages, or web pages can manipulate AI assistants into violating their own instructions. Attackers embed hidden commands that the AI follows — a vulnerability that prompt-based safety alone cannot prevent.

MITIGATION

$ Code-level enforcement boundaries operate independently of AI beliefs. Even if manipulation succeeds at the prompt level, dangerous actions are blocked before execution. Your protection doesn't rely on the AI "remembering" to be safe.
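A minimal sketch of what "enforcement independent of AI beliefs" means in code, assuming a hypothetical tool allowlist (the tool names are invented for illustration):

```python
# The policy check wraps tool execution itself. Even a fully manipulated
# model can only ask for a tool; the wrapper decides whether it runs.
ALLOWED_TOOLS = {"read_email", "draft_reply"}  # hypothetical allowlist

def run_tool(name: str, handler, *args):
    if name not in ALLOWED_TOOLS:
        raise PermissionError(f"tool '{name}' blocked by policy")
    return handler(*args)

print(run_tool("read_email", lambda: "ok"))
try:
    run_tool("delete_mailbox", lambda: "gone")
except PermissionError as e:
    print(e)  # blocked before execution, whatever the prompt said
```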

04

Social Engineering

Attackers can build trust over multiple conversations, then exploit that relationship to request sensitive actions. Authority impersonation and emotional manipulation are common tactics that have succeeded against frontier AI models in security research.

MITIGATION

$ Identity verification happens outside the AI's decision space. High-impact operations require nonce-based challenges sent to your authenticated channels — impossible for the AI to self-approve, even under manipulation.
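A hedged sketch of the nonce flow: the gateway issues a random challenge out of band, and only the exact echoed value approves the action. Function names here are illustrative:

```python
import hmac
import secrets

def issue_challenge() -> str:
    # Random nonce, delivered to an authenticated channel the AI cannot read.
    return secrets.token_urlsafe(16)

def verify(challenge: str, response: str) -> bool:
    # Constant-time comparison avoids timing side channels.
    return hmac.compare_digest(challenge, response)

nonce = issue_challenge()
print(verify(nonce, nonce))    # True: the human echoed the real nonce
print(verify(nonce, "guess"))  # False: the AI cannot self-approve
```

Because the nonce never enters the model's context, no amount of conversational manipulation can reproduce it.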

05

Credential Theft

AI assistants with filesystem access can read configuration files containing API keys, tokens, and credentials. Once exfiltrated, these secrets enable full account takeover.

MITIGATION

$ Secrets are stored outside the AI's reach, injected at runtime rather than stored in readable files. Access to configuration paths is blocked at the code level. Audit logs record all file operations for operator visibility.
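A hypothetical path guard illustrates the idea; the denylist paths below are assumptions, not the shipped configuration:

```python
import os

# Credential locations are refused at the file-access layer, so a manipulated
# agent cannot read them regardless of what its prompt says.
BLOCKED_PREFIXES = ("/etc/frida/secrets", "/home/frida/.config/credentials")

def safe_read(path: str) -> str:
    resolved = os.path.realpath(path)  # defeats ../ and symlink tricks
    if resolved.startswith(BLOCKED_PREFIXES):
        raise PermissionError(f"read of {resolved} denied and logged")
    with open(resolved) as f:
        return f.read()

try:
    safe_read("/etc/frida/secrets/api.key")
except PermissionError as e:
    print(e)  # the attempt is blocked and recorded
```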

06

Sandbox Escape

Even containerized agents can attempt to escape their isolation through policy bypass or race conditions. Without proper containment, an escaped agent gains full host access.

MITIGATION

$ Minimal bind mounts, network isolation, and container security options prevent escape. File operations run inside containers, not on the host. Critical files are root-owned and read-only.
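In Docker Compose terms, the containment described above might look like the following fragment; the service name, image, and mount path are assumptions for illustration:

```yaml
# Illustrative compose fragment, not the production definition.
services:
  agent:
    image: agent-image
    read_only: true                        # immutable root filesystem
    cap_drop: [ALL]                        # drop every Linux capability
    security_opt: ["no-new-privileges:true"]
    network_mode: none                     # no network unless granted
    volumes:
      - /srv/agent/work:/work:rw           # single minimal bind mount
```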

07

Malicious Plugins

Community plugins or skills can introduce backdoors, exfiltration channels, or bypass tool restrictions. Malicious code executes with the agent's privilege level.

MITIGATION

$ Community plugins require explicit confirmation for all actions. Plugin integrity is verified via checksums. Source-aware classification separates bundled from community tools.
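Checksum verification can be sketched in a few lines; the plugin name and pinned digest below are invented for the example:

```python
import hashlib

# A SHA-256 digest is pinned at install time; any later modification of the
# plugin code fails the comparison and the plugin is refused.
PINNED = {"weather-plugin": hashlib.sha256(b"original plugin code").hexdigest()}

def verify_plugin(name: str, code: bytes) -> bool:
    return hashlib.sha256(code).hexdigest() == PINNED.get(name)

print(verify_plugin("weather-plugin", b"original plugin code"))  # True
print(verify_plugin("weather-plugin", b"code + backdoor()"))     # False
```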

08

Gateway Exploitation

Compromised credentials or tokens enable attackers to control your AI infrastructure remotely. A single leaked token can expose your entire VPS and all connected services.

MITIGATION

$ Quarterly token rotation, loopback-only binding, and device registration monitoring prevent unauthorized access. All gateway requests are logged with anomaly detection.
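The loopback-only binding can be shown with a minimal socket example; the port choice is arbitrary:

```python
import socket

# A gateway bound to 127.0.0.1 is reachable only by local processes (such as
# the tunnel daemon), never directly from the public internet.
srv = socket.socket()
srv.bind(("127.0.0.1", 0))  # port 0: pick any free port, for demonstration
host, port = srv.getsockname()
print(host)  # 127.0.0.1
srv.close()
```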

Differentiation

Why This Matters

Standard AI Assistants

Prompt-only safety that can be manipulated. Your data lives on shared infrastructure. No visibility into operations. One breach affects millions.

Frida Kàiló

Code-level enforcement that cannot be bypassed. Your dedicated VPS with isolated execution. Complete audit trail. Protection that doesn't trust the AI itself.

Infrastructure

What We've Implemented So Far


01

Dedicated VPS per client

Every client runs on an isolated Hetzner Cloud VPS — no shared runtime, no cross-tenant data access. The AI agent and all credentials are contained within your machine.

02

SSH hardening at boot

Password authentication and root password login are disabled by policy — not just by convention. SSH key-only access with strict attempt limits is enforced via a hardening config drop-in, applied before the server is reachable.
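A hardening drop-in of this kind might look like the fragment below; the filename and exact values are assumptions, not the shipped config:

```
# /etc/ssh/sshd_config.d/99-hardening.conf (illustrative)
PasswordAuthentication no
PermitRootLogin prohibit-password
PubkeyAuthentication yes
MaxAuthTries 3
```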

03

Brute-force protection

fail2ban runs with aggressive rate limiting — repeated failed attempts trigger automatic IP bans. Automated scanners are blocked before they can probe the surface.
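A fail2ban jail with this behavior could be expressed roughly as follows; the thresholds shown are illustrative, not the deployed policy:

```ini
# /etc/fail2ban/jail.local (illustrative values)
[sshd]
enabled  = true
maxretry = 3
findtime = 10m
bantime  = 1h
```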

04

Dual-layer firewall

UFW enforces deny-all inbound by default — only SSH and VPN traffic are allowed through minimal exceptions. A Hetzner Cloud Firewall is attached at the hypervisor level before cloud-init begins, closing the boot window before UFW activates.
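In cloud-init terms, the UFW layer might be configured like this; the VPN port shown is Tailscale's default WireGuard port and is an assumption here:

```yaml
# cloud-init runcmd fragment (illustrative)
runcmd:
  - ufw default deny incoming
  - ufw default allow outgoing
  - ufw allow 22/tcp        # SSH, key-only
  - ufw allow 41641/udp     # WireGuard/Tailscale
  - ufw --force enable
```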

05

Cloud metadata endpoint blocked

The Hetzner instance metadata service (169.254.169.254) is dropped via iptables on every VPS. Even if an agent achieves code execution, it cannot reach infrastructure credentials through IMDS.
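The rule itself is small; as an iptables-restore fragment it could read:

```
# iptables rules fragment (illustrative): every packet addressed to the
# instance metadata service is dropped before it leaves the host.
*filter
-A OUTPUT -d 169.254.169.254/32 -j DROP
COMMIT
```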

06

Tailscale-only gateway access

All external traffic to the AI gateway is routed through Tailscale Funnel — end-to-end encrypted over a WireGuard mesh. The gateway process binds to localhost only; no port is directly exposed to the public internet.

07

Secrets scrubbed after provisioning

Cloud-init user-data — which contains API keys, tokens, and the Tailscale auth key — is deleted from disk immediately after provisioning completes. Credential files are stored with restricted permissions under a dedicated unprivileged user, inaccessible to other processes.
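The scrub step amounts to deleting the provisioning payload and locking the credential file to owner-only access. The sketch below uses temporary stand-in paths, not the real VPS locations:

```python
import os
import stat
import tempfile

def scrub(user_data_path: str, env_path: str) -> None:
    if os.path.exists(user_data_path):
        os.remove(user_data_path)  # secrets no longer on disk
    # 0600: readable and writable by the owning service user only.
    os.chmod(env_path, stat.S_IRUSR | stat.S_IWUSR)

with tempfile.TemporaryDirectory() as d:
    ud, env = os.path.join(d, "user-data.txt"), os.path.join(d, ".env")
    with open(ud, "w") as f:
        f.write("api_key=...")
    with open(env, "w") as f:
        f.write("TOKEN=...")
    scrub(ud, env)
    print(os.path.exists(ud))                 # False
    print(oct(os.stat(env).st_mode & 0o777))  # 0o600
```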

08

Unprivileged runtime user

The AI gateway runs as a dedicated unprivileged user — never as root. Automatic security updates keep the OS patched without operator intervention.
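Expressed as a systemd service fragment, the privilege drop might look like this; the unit, user name, and hardening directives shown are illustrative assumptions:

```ini
# gateway.service fragment (illustrative)
[Service]
User=frida
Group=frida
NoNewPrivileges=true
ProtectSystem=strict
```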

/ai-sec-report

Deep Dive: Full Threat Model & Mitigations

Read the complete IT security report with all 8 attack vectors, research sources, and detailed technical mitigations.

Read Full Report