#agent-skills

7 posts

ai deep-dive Jun 6, 2026

The Skill Management Revolution for LLM Agents: A Complete Landscape of Skill Lifecycle from Voyager to MUSE-Autoskill

MUSE-Autoskill (2026) introduces a five-stage skill lifecycle framework. Self-created skills achieve 60.35% (+7.16%) on SkillsBench overall, and an impressive 87.94% on tasks where skill generation succeeds — surpassing the human-authored skill ceiling. This post synthesizes six arXiv papers to map the full landscape of skill evolution research.

#agent-skills #ai-agent #llm #self-refinement #memory #arxiv #paper-review

ai deep-dive May 23, 2026

browse.sh: Turning What Browser Agents Learn into a Skill Catalog

browse.sh, launched by Browserbase in May 2026, is two things: a browser skill catalog and the Browse CLI. The core thesis: the bottleneck for browser agents isn't reasoning — it's amnesia. By storing learned site-specific workflows as plain-text SKILL.md files, Autobrowse cut Craigslist task costs from ~$0.22 to ~$0.12 by their own metrics. Note: this has nothing to do with the 2018 Browsh text-mode browser.

#browse-sh #browser-agent #agent-skills #browserbase #autobrowse

ai deep-dive May 19, 2026

How Claude Reads and Writes PDF / DOCX / PPTX: Deconstructing the Three-Layer Architecture of Skills + Sandbox

Claude has no docx_tool or pdf_tool -- it relies on bash + file tools, plus SKILL.md instructions and pre-installed libraries like pdfplumber / python-pptx inside the container, assembling file handling capabilities from three layers.

#claude #agent-skills #anthropic #code-interpreter #sandbox #document-skills

ai deep-dive May 10, 2026

How Others Use LLMs to Write: Trade-off Notes from Karpathy's LLM-wiki to Multi-Agent Pipelines

A survey of 11 public LLM writing pipelines, distilled into three dominant patterns: multi-agent (researcher -> writer -> critic), Karpathy LLM-wiki (raw + wiki + LLM writes, humans don't), and quality guardrails (technical verifier + never fabricate + brief gate). The Princeton GEO paper (KDD 2024) quantifies the impact: inline citations +28%, adding statistics +33%, quoting source text +41%, keyword stuffing -9%.

#llm-writing #content-pipeline #claude-code #agent-skills #llm-wiki #geo #multi-agent #harness-engineering

ai guide Apr 10, 2026

Agent Skills: A Skill Framework That Makes AI Agents Work Like Senior Engineers

Agent Skills is Addy Osmani's open-source collection of 19 production-grade engineering skills that drive AI agents to follow senior engineering discipline through /spec → /plan → /build → /test → /review → /ship commands, instead of cutting corners.

#agent-skills #ai-agent #harness-engineering #claude-code #cursor #gemini-cli #development-workflow

ai guide Mar 28, 2026

OpenClaw Tools (Part 2): Skills System and Sub-Agents

Skills are AgentSkills-compatible SKILL.md folders with a 6-tier loading priority. ClawHub is the public marketplace. Sub-agents can nest up to 5 levels deep.

#openclaw #skills #clawhub #sub-agents #skill-md #agent-skills

tech guide Claude Code Automation Guide Mar 27, 2026

Claude Code Skills: A Complete Guide to Turning Repetitive Workflows into Single Commands

A Skill is an SOP written for AI. Define the steps in a Markdown file and Claude follows them. No coding required, no frameworks to learn — just write down what an experienced person would do.

#claude-code #skill #ai-agent #dx #automation #workflow #agent-skills