Skip to content
All tags

#agent-skills

7 posts
ai deep-dive

The Skill Management Revolution for LLM Agents: A Complete Landscape of Skill Lifecycle from Voyager to MUSE-Autoskill

MUSE-Autoskill (2026) introduces a five-stage skill lifecycle framework. Self-created skills achieve 60.35% (+7.16%) on SkillsBench overall, and an impressive 87.94% on tasks where skill generation succeeds — surpassing the human-authored skill ceiling. This post synthesizes six arXiv papers to map the full landscape of skill evolution research.

ai deep-dive

browse.sh: Turning What Browser Agents Learn into a Skill Catalog

browse.sh, launched by Browserbase in May 2026, is two things: a browser skill catalog and the Browse CLI. The core thesis: the bottleneck for browser agents isn't reasoning — it's amnesia. By storing learned site-specific workflows as plain-text SKILL.md files, Autobrowse cut Craigslist task costs from ~$0.22 to ~$0.12 by their own metrics. Note: this has nothing to do with the 2018 Browsh text-mode browser.

ai deep-dive

How Claude Reads and Writes PDF / DOCX / PPTX: Deconstructing the Three-Layer Architecture of Skills + Sandbox

Claude has no docx_tool or pdf_tool -- it relies on bash + file tools, plus SKILL.md instructions and pre-installed libraries like pdfplumber / python-pptx inside the container, assembling file handling capabilities from three layers.

ai deep-dive

How Others Use LLMs to Write: Trade-off Notes from Karpathy's LLM-wiki to Multi-Agent Pipelines

A survey of 11 public LLM writing pipelines, distilled into three dominant patterns: multi-agent (researcher -> writer -> critic), Karpathy LLM-wiki (raw + wiki + LLM writes, humans don't), and quality guardrails (technical verifier + never fabricate + brief gate). The Princeton GEO paper (KDD 2024) quantifies the impact: inline citations +28%, adding statistics +33%, quoting source text +41%, keyword stuffing -9%.

ai guide

Agent Skills: A Skill Framework That Makes AI Agents Work Like Senior Engineers

Agent Skills is Addy Osmani's open-source collection of 19 production-grade engineering skills that drive AI agents to follow senior engineering discipline through /spec → /plan → /build → /test → /review → /ship commands, instead of cutting corners.

ai guide

OpenClaw Tools (Part 2): Skills System and Sub-Agents

Skills are AgentSkills-compatible SKILL.md folders with a 6-tier loading priority. ClawHub is the public marketplace. Sub-agents can nest up to 5 levels deep.

Claude Code Skills: A Complete Guide to Turning Repetitive Workflows into Single Commands

A Skill is an SOP written for AI. Define the steps in a Markdown file and Claude follows them. No coding required, no frameworks to learn — just write down what an experienced person would do.