Skip to content
All tags

#security

11 posts
ai deep-dive

The Single Crack in Agent Security: From Prompt Injection to Trust Boundaries to Multi-Agent Worms

Three seemingly distinct agent security problems — tool output injection, trust boundaries, malicious agents — share the same root cause: LLMs flatten instructions and data into a single token stream, making them architecturally unable to distinguish between the two. Understand this through-line and you can trace every attack from EchoLeak (CVE-2025-32711, zero-click) to the Morris II AI worm, and see why 'making the model behave' doesn't work — only architectural constraints (six design patterns, CaMeL) do.

tech deep-dive

Bumblebee: A Design Teardown of Perplexity's Read-Only Supply Chain Endpoint Scanner

A Go read-only scanner open-sourced by Perplexity in May 2026 (v0.1.1, zero non-stdlib dependencies). It inventories npm/PyPI/Go/RubyGems/Composer/MCP/editor and browser extensions into NDJSON, matches against a custom exposure catalog, and answers the question 'which machines in my fleet are currently affected' the moment a supply chain incident hits. It deliberately never invokes any package manager and is not an EDR.

ai

OpenAI's Codex Secure Deployment Strategy: Sandboxing, Auto-review, and Enterprise Governance

In May 2026, OpenAI published its internal Codex deployment practices: sandboxes define technical boundaries, approval policies determine when to pause, Auto-review delegates approval decisions to a sub-agent instead of a human, and Managed configuration lets enterprise admins enforce policies top-down. The core philosophy: zero friction for low-risk actions, mandatory review for high-risk ones.

ai guide

Lessons from the Trenches: What AI Native Teams Must Get Right

Not everyone should use a coding agent to modify code directly. AI Native teams need interface specs, test-first development, monorepo, security guardrails, human-in-the-loop, and token budget controls. Building an agent platform layer on top of coding agents and clearly redefining developer roles is the right path forward.

ai guide

OpenClaw Access Control: Authentication, Secrets, and OAuth

API Key is the most stable option; OAuth uses PKCE + token sink pattern; SecretRef supports env/file/exec sources; Trusted Proxy delegates authentication to a reverse proxy.

ai guide

OpenClaw Sandbox Mechanism: Docker, SSH, and OpenShell

OpenClaw's sandbox has three layers of control: Sandbox determines where code runs (Docker/SSH/OpenShell), Tool Policy determines which tools are available, and Elevated is the host escape hatch for exec.

ai guide

OpenClaw Threat Model: MITRE ATLAS Security Analysis and Formal Verification

OpenClaw uses the MITRE ATLAS framework to analyze AI system threats, identifying three Critical risks (prompt injection, malicious skills, credential theft), and employs TLA+ formal verification for security properties.

tech guide

False positives in Node.js image vulnerability scans? Separate app packages from npm built-ins first

When reviewing vulnerability scan results for a Node.js Docker image, you can't just look at package names. First distinguish between project dependencies and the packages bundled with npm inside the base image — otherwise you'll fix the wrong thing.

tech guide

What Is Vulnerability Scanning? A Quick Intro to Docker and Package Scanning with Trivy

Vulnerability scanning isn't just about generating reports — it helps you discover known risks in your system before they become incidents. This post uses Trivy as a hands-on example to explain what scanners actually look for, how to read the results, and how to get started.

Claude Code Permission Modes Explained: Five Modes from Default to Auto

Claude Code has five permission modes: default (confirm each step), acceptEdits (auto-accept edits), plan (read-only planning), auto (background AI classifier review), and bypassPermissions (YOLO, skip everything). Switch with Shift+Tab or configure via settings.json. Auto mode is the sweet spot — no step-by-step confirmations, but with safety guardrails.

ai guide

RAG Guardrails: Adding a Defense Layer to Inputs and Outputs

The attacks RAG systems face go beyond the technical level — Prompt Injection and Jailbreak are real threats. Both inputs and outputs need independent protection layers.