Loop Engineering is the practice of designing systems that prompt AI agents automatically instead of prompting them by hand. Boris Cherny runs hundreds of agents, Addy Osmani coined the term, and Blake Crosley identified verification cost as the real bottleneck. This article covers primary sources, the five building blocks, applicability boundaries, and criticisms.
Tests run by AI agents are not reproducible, and hand-written Playwright tests are hard to maintain. Four tools that emerged in 2024-2025 each tackle this dilemma with a very different design philosophy.
CodeGraph uses tree-sitter to extract a codebase into a local SQLite/FTS5 knowledge graph, letting AI coding agents query the graph instead of scanning files. The official end-to-end benchmark (7 repos, median of 4 runs) averages 35% cost savings and 70% fewer tool calls -- but only if the agent actually walks the graph. Delegating exploration to a file-reading subagent that ignores CodeGraph turns it into pure overhead.
Anthropic shipped Claude Design on 2026-04-17. On 04-28, nexu-io/open-design went public with the same artifact-first loop, under Apache-2.0, running on the 16 coding-agent CLIs you already have. It went from 0.1 to 0.7 in two weeks and collected 40k+ stars. It marks a paradigm shift that flattens AI design tools from vertical SaaS into a skill bundle.
AI agents can operate video generation tools through three approaches — Skills, MCP Connectors, and direct APIs. Choosing the right integration method matters more than choosing the right tool.
Spin up a local OpenAI-compatible endpoint at localhost:20128 that automatically routes requests from Claude Code / Cursor / Cline / Codex / Copilot through a Subscription → Cheap → Free 3-tier fallback to 40+ providers. Built-in RTK compresses tool_result (saving 20–40% of input tokens) and Caveman mode compresses output; OAuth auto-refresh and multi-account round-robin are also included. Install with npm install -g 9router and two commands.
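The tiered fallback idea can be sketched in a few lines. This is a minimal illustration under stated assumptions, not 9router's actual implementation: the function `route_with_fallback`, the `AllProvidersFailed` exception, and the three stand-in providers are all hypothetical names invented for this example.

```python
from typing import Callable, List, Sequence, Tuple

# Hypothetical type: a provider is any callable that takes a prompt and returns text.
Provider = Callable[[str], str]


class AllProvidersFailed(Exception):
    """Raised when every provider in every tier has failed."""


def route_with_fallback(prompt: str, tiers: Sequence[Tuple[str, Sequence[Provider]]]):
    """Try tiers in order, and providers within a tier in order.

    Returns (tier_name, response) from the first provider that succeeds.
    """
    errors: List[str] = []
    for tier_name, providers in tiers:
        for provider in providers:
            try:
                return tier_name, provider(prompt)
            except Exception as exc:  # any failure moves on to the next provider
                errors.append(f"{tier_name}: {exc}")
    raise AllProvidersFailed("; ".join(errors))


# Stand-in providers for the demo (not real API clients).
def subscription_provider(prompt: str) -> str:
    raise RuntimeError("quota exhausted")  # simulates a subscription that hit its limit


def cheap_provider(prompt: str) -> str:
    return f"cheap:{prompt}"


def free_provider(prompt: str) -> str:
    return f"free:{prompt}"


TIERS = [
    ("subscription", [subscription_provider]),
    ("cheap", [cheap_provider]),
    ("free", [free_provider]),
]
```

With these stand-ins, `route_with_fallback("hi", TIERS)` skips the failing subscription tier and answers from the cheap tier.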
Stripe Minions says 'The walls matter more than the model,' but the case studies from four Silicon Valley companies never explained how to actually build those walls. This post breaks down the 15 walls we implemented in the daodao auto-dev agent: what each wall prevents, where the files live, and what the tradeoffs are. Tier 1 is mandatory, Tier 2 strengthens governance, Tier 3 is serious governance.
Build a Notion task → GitHub issue → spec PR → code PR auto-dev agent from scratch. Using the daodao case as a template, this guide walks through every step — what to do, what to verify, and how to handle problems. Notion DB schema → bin/ scaffold → two Claude Code routines → cloud env vars → staging tests.
5 rounds of consensus to write the plan, then team mode with 5 workers running 12 tasks in parallel — with plenty of pitfalls along the way. Writing it down for my future self and anyone else trying the same thing.
/loop is Claude Code's native cron feature — set schedules in plain English and let Claude monitor, auto-fix PRs, and run recurring tasks in the background. Session-scoped and expires after 7 days; for cross-session scheduling, use Routines or Desktop scheduled tasks.
Routines is Claude Code's cloud automation system (formerly Cloud Scheduled Tasks). Beyond cron scheduling, you can trigger runs via API endpoint or GitHub events — scan issues, review PRs, run checks, open PRs — all while your computer is off.
When using AI agents like Claude Code or Cursor, the built-in WebFetch / WebSearch tools often get blocked by Cloudflare, geo-restrictions, or rate limits. Connecting a search MCP server is the most direct fix. This post compares the options actually available in 2026.
goose is an open-source AI Agent maintained by the Linux Foundation's AAIF, supporting 15+ LLM providers and 70+ MCP extensions, built with Rust as a Desktop App + CLI + API. It positions itself as a vendor-neutral, self-hostable alternative to Claude Code.
Cloudflare ran a Multi-Agent Code Review system internally for 30 days — 131K reviews, median 3 minutes. This post breaks down their architecture and compares it with solutions from Anthropic, GitHub, CodeRabbit, Greptile, and others.
AI models rationalize their own code when reviewing it. Using three different CLIs for independent review effectively catches blind spots -- this post covers the design philosophy and practical workflow patterns behind the approach.
Agentic AI is not just autocomplete — it is an AI system capable of autonomously executing multi-step tasks. This article breaks down the five phases of the SDLC, explaining where to plug in agents at each phase, how to progress from CLI tools to full-pipeline automation, and the most valuable external resources to track right now.
Encyclopedia of Agentic Coding Patterns catalogues 190 patterns to help you make the right software decisions in the age of AI-written code — and the book itself is autonomously written and maintained by an AI agent.
MCP is not going away, but its effective scope is narrower than most people think. For local development, CLI and raw API almost always beat MCP. MCP's truly irreplaceable niche is a narrow one: the 'cross-agent shared local tool layer.'
Different AI engines process web pages in vastly different ways. Some only read the body; others rely on pre-built indexes. JSON-LD and schema markup are not universally effective — body content quality and structure are the only cross-platform foundations that hold.
Claude Octopus is a Claude Code plugin that simultaneously calls Codex, Gemini, Copilot, Qwen, Ollama, Perplexity, OpenRouter, and Claude to review the same code, using a 75% consensus threshold to catch single-model blind spots. It ships with 32 personas, 48 /octo:* slash commands, 51 skills, and a Dark Factory fully autonomous spec-to-code pipeline.
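A consensus threshold of this kind can be sketched as a simple vote count. The example below is an illustration only: how Claude Octopus actually matches and scores findings is not described here, so the function `consensus_findings`, the finding ids, and the reviewer names are all assumptions.

```python
from collections import Counter


def consensus_findings(reviews, threshold=0.75):
    """Return findings flagged by at least `threshold` of the reviewers.

    `reviews` maps a reviewer name to the set of finding ids it reported.
    Assumes findings from different reviewers have already been normalized
    to shared ids, which is the hard part in practice.
    """
    if not reviews:
        return []
    counts = Counter()
    for findings in reviews.values():
        counts.update(set(findings))  # each reviewer votes once per finding
    needed = threshold * len(reviews)
    return sorted(f for f, n in counts.items() if n >= needed)


# Hypothetical review results from four models.
REVIEWS = {
    "model_a": {"sql-injection", "naming"},
    "model_b": {"sql-injection"},
    "model_c": {"sql-injection", "missing-test"},
    "model_d": {"missing-test"},
}
```

Here `sql-injection` is reported by three of four reviewers (75%) and passes, while `missing-test` (50%) and `naming` (25%) are filtered out.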
Better Agent Terminal (BAT) is an Electron desktop app that unifies multiple project workspaces, terminals, and Claude Code Agents into a single window — solving the everyday pain of exploding iTerm tabs and the lack of a proper GUI container for agents. MIT License, available on macOS, Windows, and Linux.
Graphify uses tree-sitter AST to extract code structure, then applies LLM semantic analysis to documents and images, compressing an entire project into a queryable knowledge graph. It claims to use 71.5x fewer tokens per query than reading raw files.
Claw Code is a from-scratch Rust rewrite of the Claude Code CLI, featuring 48K lines of code, 40 tools, and MIT licensing. Most remarkably, the entire project was built by multiple AI agents collaborating over just 5 days, surpassing 170K GitHub stars within a week of launch.
oh-my-claudecode (OMC) adds 8 collaboration modes, 19 specialized agents, and cross-model orchestration (Claude + Codex + Gemini) on top of Claude Code, transforming a single-user CLI tool into a multi-agent development platform. Features include Deep Interview for requirement clarification, Smart Model Routing that saves 30-50% on tokens, and automatic rate limit recovery.
Claude Code only reads CLAUDE.md; Codex only reads AGENTS.md. Teams using both end up maintaining two identical files. Fix: make CLAUDE.md a symlink pointing to AGENTS.md — one source of truth.
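The symlink fix can be scripted. The sketch below uses only the standard library; the helper name `link_claude_to_agents` is made up for this example, and it assumes a platform where creating symlinks is permitted (on Windows this may require Developer Mode or elevated privileges).

```python
import os
import tempfile
from pathlib import Path


def link_claude_to_agents(repo_dir):
    """Make CLAUDE.md a relative symlink to AGENTS.md inside repo_dir."""
    repo = Path(repo_dir)
    agents = repo / "AGENTS.md"
    claude = repo / "CLAUDE.md"
    if not agents.is_file():
        raise FileNotFoundError(f"{agents} does not exist")
    if claude.is_symlink():
        claude.unlink()  # replace a stale link
    elif claude.exists():
        # Refuse to delete a real file: merge its content into AGENTS.md first.
        raise FileExistsError(f"{claude} is a regular file")
    # A relative target keeps the link valid when the repo is moved or cloned.
    os.symlink("AGENTS.md", claude)
    return claude


# Demo in a throwaway directory.
with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, "AGENTS.md").write_text("# Rules\n")
    link = link_claude_to_agents(tmp)
    demo_target = os.readlink(link)
    demo_text = link.read_text()
```

Reading `CLAUDE.md` through the link returns the content of `AGENTS.md`, so only one file needs editing.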
There are already 6,400+ .claude/agents/*.md files on GitHub. We dissected 4 representative projects — ChemistryTimes (content production pipeline), claude-sub-agent (document-driven development pipeline), agentic (Temporal.io DAG parallel execution), and vs-copilot-multi-agent (hook-enforced memory persistence) — plus ruflo's enterprise-grade swarm architecture, distilling 6 design patterns and 5 practical trends.
Andrej Karpathy proposed a framework for compiling personal knowledge wikis with LLMs — collect raw data, have the LLM compile it into .md wiki pages, run Q&A against the wiki, and file outputs back. This post compares three practical approaches: Karpathy's knowledge vault model, the community's experience vault model, and quidproquo's blog model.
After dissecting Claude Code's 18+ caching mechanisms, I found that you can't touch provider-level prompt cache, but embedding cache, tool result cache, and entity cache are not only within your reach — they deliver even better results. Includes a complete AgentCache interface design and per-tool TTL strategy.
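A per-tool TTL strategy for a tool result cache might look like the sketch below. It is not the AgentCache interface from the post, which is not reproduced here; the class `ToolResultCache`, the tool names, and the TTL values are assumptions chosen for illustration.

```python
import time


class ToolResultCache:
    """In-memory cache where each tool has its own time-to-live in seconds."""

    def __init__(self, ttl_by_tool, default_ttl=60.0, clock=time.monotonic):
        self._ttl_by_tool = dict(ttl_by_tool)
        self._default_ttl = default_ttl
        self._clock = clock  # injectable so expiry can be tested without sleeping
        self._entries = {}  # (tool, args_key) -> (expires_at, value)

    def put(self, tool, args_key, value):
        ttl = self._ttl_by_tool.get(tool, self._default_ttl)
        if ttl <= 0:
            return  # a TTL of 0 marks the tool as uncacheable
        self._entries[(tool, args_key)] = (self._clock() + ttl, value)

    def get(self, tool, args_key):
        """Return the cached value, or None on a miss or after expiry."""
        entry = self._entries.get((tool, args_key))
        if entry is None:
            return None
        expires_at, value = entry
        if self._clock() >= expires_at:
            del self._entries[(tool, args_key)]
            return None
        return value


# Demo with a fake clock; tool names and TTLs are illustrative.
now = [0.0]
cache = ToolResultCache(
    {"read_file": 300.0, "web_search": 30.0, "run_shell": 0.0},
    clock=lambda: now[0],
)
cache.put("read_file", "a.py", "source text")
cache.put("web_search", "q=x", "results")
cache.put("run_shell", "ls", "output")
```

The design choice is that volatility differs per tool: file reads can stay cached for minutes, search results for seconds, and side-effecting shell commands not at all.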
Every one of Claude Code's 45 tools uses a prompt() method that dynamically adjusts based on user type, feature flags, and system capabilities. Applied to a ReAct agent, this pattern generates tool descriptions dynamically along three dimensions: orchestrator model capability, locale, and available tools. Small models automatically get few-shot examples, while large models skip them and save tokens.
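As a rough sketch of dynamically generated tool descriptions, the function below varies its output by model tier and by which other tools are available. It omits the locale dimension for brevity, and every name in it (`build_tool_description`, the `"small"` tier label, the tool dictionary layout) is a hypothetical choice for this example rather than Claude Code's actual structure.

```python
def build_tool_description(tool, model_tier, available_tools):
    """Assemble a tool description tailored to the orchestrator model.

    Small models get few-shot examples; larger models get the short form.
    Related tools are mentioned only if they are actually available.
    """
    parts = [tool["summary"]]
    if model_tier == "small":
        parts.extend(f"Example: {example}" for example in tool.get("examples", []))
    related = [name for name in tool.get("related", []) if name in available_tools]
    if related:
        parts.append("Often used with: " + ", ".join(related))
    return "\n".join(parts)


# Hypothetical tool definition.
GREP_TOOL = {
    "summary": "Search file contents with a regular expression.",
    "examples": ['grep(pattern="TODO", path="src/")'],
    "related": ["read_file", "glob"],
}
```

Calling it with `model_tier="large"` and only `read_file` available yields two lines: the summary and a single related-tool hint.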
Claude Code plans range from $20/mo Pro to $200/mo Max 20x. Opus 4.6 delivers the strongest reasoning depth in the industry, and the Max plan's flat-rate pricing saves heavy users over 90% compared to API costs.
This post compares six major Agent CLI subscription plans in 2026 (Claude Code, Cursor CLI, Codex, Kiro, Gemini CLI, OpenCode) and explores multi-model routing patterns: simple tasks go to cheaper models and complex tasks go to flagship models, with real-world savings of 40-85%.
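The routing idea can be illustrated with a deliberately naive heuristic. Real routers typically use richer signals; the function `pick_model`, the marker words, the length cutoff, and the model names below are all placeholders assumed for this sketch.

```python
# Words that hint at a complex task in this toy heuristic (an assumption).
COMPLEX_MARKERS = ("refactor", "architecture", "debug", "migrate")


def pick_model(task, simple_model="cheap-model", flagship_model="flagship-model",
               max_simple_chars=400):
    """Route short, routine tasks to the cheap model and the rest to the flagship."""
    text = task.lower()
    looks_complex = len(text) > max_simple_chars or any(
        marker in text for marker in COMPLEX_MARKERS
    )
    return flagship_model if looks_complex else simple_model
```

For example, `pick_model("Rename variable foo to bar")` selects the cheap model, while `pick_model("Refactor the auth module")` selects the flagship.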
Skill paths are almost always runtime-specific. AGENTS.md is the reliable way to share rules across agents. Put personal reusable capabilities in each agent's supported global directory; put project workflows inside the repo.
Agent CLIs are not smarter autocomplete tools -- they are AI agents that can read your codebase, execute multi-step tasks, and operate in real environments. Claude Code, Codex CLI, Gemini CLI, OpenCode, Aider, Pi, Kiro, Amp, Cursor CLI... the tools keep multiplying, but they all share a common set of design principles -- understanding these principles is how you actually get good at using them.
Use Claude Code as an orchestrator to chain Playwright screenshots, catbox.moe image hosting, Meta Graph API publishing, and Telegram notifications — generate and publish an IG carousel from a single sentence.
Claude Code is Anthropic's agentic coding tool that runs in the terminal, IDEs, Slack, GitHub, and on the web. Its core extension system has six layers: CLAUDE.md (persistent context), Skills (on-demand workflows), Hooks (deterministic automation), Subagents (isolated delegation), MCP (external tool connections), and Agent Teams (multi-agent collaboration).
A Skill is a prompt template you invoke manually. A Subagent is an independent agent that Claude routes to automatically. They look similar, but differ completely in trigger mechanism, tool isolation, and context management.
When processing requests, Claude Code randomly displays one of 185 built-in verbs (like Thinking, Brewing, Clauding), then picks one of 8 completion verbs with elapsed time. You can customize these via spinnerVerbs in settings.json, using either replace or append mode. All data in this post is verified directly from cli.js source code.
gstack is Garry Tan's open-source Claude Code skills toolkit. Its 20 specialized skills transform a solo developer into an entire engineering team — automating everything from product planning and design review to code review, QA, and deployment.
The model is the CPU, the harness is the operating system, and the agent is the application. No matter how powerful a model is, without a good harness it's just a demo. Phil Schmid argues that the harness is the most critical infrastructure in AI engineering for 2026.
Global skills live in ~/.claude/skills/, yet they go missing in new sessions or in the Desktop App. The problem usually isn't a missing file; the skill descriptions simply aren't being loaded into context. This post clarifies the CLI vs Desktop App differences, the role of settings.json, and the most reliable fix.
Use OpenSpec to break requirements into engineering tasks, Claude Code to implement them, hooks to auto-format and protect, local review before committing, three AI reviewers running in parallel on each PR, and auto-deploy after merge. This entire workflow lets one person maintain quality across six sub-projects.
Hooks are Claude Code's event system. They trigger shell commands, HTTP requests, or LLM evaluations automatically before/after tool execution, when a prompt is submitted, or when a task ends. Use them to block dangerous operations, run automated reviews, inject context, or write audit logs.
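A hook that blocks dangerous operations could be sketched as a small script. This example assumes the hook receives a JSON payload on stdin containing `tool_name` and `tool_input`, and that exit code 2 signals a block with the reason written to stderr; check both assumptions against the current hooks documentation. The blocked snippets and the function name `run_hook` are illustrative.

```python
import io
import json

# Illustrative deny-list; a real policy would be more careful than substring checks.
BLOCKED_SNIPPETS = ("rm -rf /", "git push --force")


def run_hook(stdin, stderr):
    """Read a hook payload from stdin and return the exit code to use.

    Returns 2 (assumed to mean "block") for a Bash command that contains a
    blocked snippet, and 0 (allow) otherwise.
    """
    payload = json.load(stdin)
    if payload.get("tool_name") != "Bash":
        return 0
    command = payload.get("tool_input", {}).get("command", "")
    for snippet in BLOCKED_SNIPPETS:
        if snippet in command:
            stderr.write(f"Blocked: command contains '{snippet}'\n")
            return 2
    return 0


# Demo with in-memory streams instead of real stdin/stderr.
demo_payload = {"tool_name": "Bash", "tool_input": {"command": "git push --force origin main"}}
demo_err = io.StringIO()
demo_code = run_hook(io.StringIO(json.dumps(demo_payload)), demo_err)
```

As a standalone script, the entry point would be `sys.exit(run_hook(sys.stdin, sys.stderr))`, registered as a command hook for the pre-tool event.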
A Skill is an SOP written for AI. Define the steps in a Markdown file and Claude follows them. No coding required, no frameworks to learn — just write down what an experienced person would do.
Stuck mid-debug and can't fix it right now? Use /file-bug-issue to package the error analysis, reproduction steps, and attempted fixes from your conversation into a well-structured GitHub issue. Pair it with a Remote Agent to let AI automatically take over the fix.
Use Claude Code's Scheduled Remote Agent to scan GitHub issues every 2 hours, implement features, open PRs, and address review feedback, all without human intervention. Humans only write issues and click merge. Pair it with the custom /publish-tasks skill to push OpenSpec engineering tasks directly to GitHub issues.
Hooks are automated safety nets (blocking bad commits), Skills are interactive workflows (running checks + auto-fixing), and instruction files (CLAUDE.md / AGENTS.md) are behavioral guidelines. Each layer operates independently, but together they enable an AI agent to automatically run lint, typecheck, and build checks before every commit.
A complete study guide for Claude's official architect certification: five exam domains, six scenario types, common anti-patterns, and hands-on preparation strategies.
The MCP tool returned a description field with every listing, so a response containing 1,033 job listings exceeded the token limit. The fix: exclude description by default and add pagination.
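The two-part fix (drop the heavy field by default, then paginate) can be sketched as a plain function. The name `list_jobs`, the parameter names, the page size of 50, and the shape of the returned dictionary are assumptions for illustration, not the post's actual tool schema.

```python
def list_jobs(jobs, page=1, page_size=50, include_description=False):
    """Return one page of job listings, omitting `description` unless requested."""
    if page < 1 or page_size < 1:
        raise ValueError("page and page_size must be >= 1")
    start = (page - 1) * page_size
    items = []
    for job in jobs[start:start + page_size]:
        if include_description:
            items.append(dict(job))
        else:
            # Strip the large field so the response stays within the token budget.
            items.append({k: v for k, v in job.items() if k != "description"})
    total = len(jobs)
    total_pages = (total + page_size - 1) // page_size  # ceiling division
    return {
        "items": items,
        "page": page,
        "total": total,
        "total_pages": total_pages,
        "has_more": page < total_pages,
    }


# Demo data: 1,033 synthetic listings with a long description each.
JOBS = [{"id": i, "title": f"Job {i}", "description": "x" * 2000} for i in range(1033)]
first_page = list_jobs(JOBS)
last_page = list_jobs(JOBS, page=21)
```

With 1,033 listings and 50 per page there are 21 pages, and the last page holds the remaining 33 items. A caller that needs the full text can request it per page with `include_description=True`.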
Claude Code has five permission modes: default (confirm each step), acceptEdits (auto-accept edits), plan (read-only planning), auto (background AI classifier review), and bypassPermissions (YOLO, skip everything). Switch with Shift+Tab or configure via settings.json. Auto mode is the sweet spot — no step-by-step confirmations, but with safety guardrails.
After finishing a debug session, just say 'write this up as a post' — Claude Code extracts content from the conversation, applies a template, generates frontmatter, and commits it to the repo. No extra writing required.
To consolidate scattered notes and showcase diverse interests, I built a personal blog using Astro, Cloudflare Workers, and D1, paired with a Claude post skill for zero-friction writing.