
Agent Skills: A Skill Framework That Makes AI Agents Work Like Senior Engineers

Apr 10, 2026 1 min
TL;DR Agent Skills is Addy Osmani's open-source collection of 19 production-grade engineering skills that drive AI agents to follow senior engineering discipline through /spec → /plan → /build → /test → /review → /code-simplify → /ship commands, instead of cutting corners.


AI coding agents are powerful, but they share a common tendency: taking the shortest path. If it runs, that’s good enough — tests can wait, error handling can be skipped. This is fine during prototyping, but it’s a disaster in production.

Agent Skills, by Addy Osmani (Google Chrome team), attempts to solve this problem by encoding senior engineering discipline into structured Markdown skills, so that AI agents follow production-grade standards at every step.


Core Concept: Skills Are Not Prompts — They Are Workflows

Agent Skills are different from typical system prompts or coding guidelines. Each skill is a complete workflow that includes concrete steps, quality gates, verification requirements, and even lists of “excuses you might use to skip this step.”
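
To make the "workflow, not prompt" distinction concrete, here is a minimal Python sketch of what one skill carries. The `Skill` class and its field names are hypothetical, chosen for illustration; the actual skills are Markdown documents, and this is not the project's schema.

```python
from dataclasses import dataclass, field


@dataclass
class Skill:
    """Hypothetical in-memory model of one skill document (illustration only)."""
    name: str
    steps: list[str]
    quality_gates: list[str]
    verification: list[str]
    excuses: dict[str, str] = field(default_factory=dict)  # excuse -> rebuttal

    def missing_sections(self) -> list[str]:
        """Return the names of workflow sections that are still empty."""
        sections = {
            "steps": self.steps,
            "quality_gates": self.quality_gates,
            "verification": self.verification,
            "excuses": self.excuses,
        }
        return [name for name, content in sections.items() if not content]


# Example content is made up; only the skill name comes from the article.
tdd = Skill(
    name="Test-Driven Development",
    steps=["Write a failing test", "Make it pass", "Refactor"],
    quality_gates=["All tests pass"],
    verification=["Attach the test runner output"],
)
print(tdd.missing_sections())  # ['excuses']
```

A plain prompt would stop at the name and a sentence of advice; the point of the sketch is that a skill is incomplete until every section is filled in.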

This is closer to the harness engineering philosophy — rather than just telling an agent to “write good code,” you design an environment where it’s hard to write bad code.


Seven Development Stage Commands

The entire framework is designed around the development lifecycle with seven slash commands:

/spec          →  Define requirements, clarify boundaries
/plan          →  Break down tasks, mark dependencies
/build         →  Incremental implementation, step-by-step verification
/test          →  Browser testing, debugging
/review        →  Code review, quality gates
/code-simplify →  Simplify code
/ship          →  Deploy to production

The key point is that this doesn’t just suggest you “should plan before coding” — each command has a corresponding skill document that defines exactly what to do, how to verify completion, and what constitutes a red flag.
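
As a rough illustration of the lifecycle ordering, the Python sketch below treats the seven commands as a strict sequence and reports which earlier stages are still open. Strict ordering is an assumption made for this example; the article does not say the framework enforces it, and `blocked_by` is a hypothetical helper.

```python
# Command order as listed in the article.
LIFECYCLE = ["/spec", "/plan", "/build", "/test", "/review", "/code-simplify", "/ship"]


def blocked_by(command: str, completed: set[str]) -> list[str]:
    """Return earlier lifecycle commands that have not been completed yet."""
    position = LIFECYCLE.index(command)
    return [c for c in LIFECYCLE[:position] if c not in completed]


# Only /spec is done, so /build is still waiting on /plan.
print(blocked_by("/build", {"/spec"}))  # ['/plan']
```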


19 Core Skills

Categorized by development stage:

Define (2)

  • Idea Refinement: Guides exploration and idea clarification, preventing you from diving straight into implementation
  • Specification-Driven Development: Write specs before code, defining inputs, outputs, and boundary conditions

Plan (1)

  • Task Breakdown: Decompose specs into atomic tasks, arrange dependency order, and keep each change around 100 lines
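
Dependency ordering of this kind is a topological sort. The following Python sketch uses the standard library's `graphlib` on a made-up task list; the task names, line estimates, and the hard 100-line cutoff are assumptions for illustration (the skill says "around 100 lines", not a strict limit).

```python
from graphlib import TopologicalSorter

# Hypothetical tasks: name -> (estimated changed lines, prerequisite tasks)
tasks = {
    "add-schema": (40, set()),
    "add-endpoint": (90, {"add-schema"}),
    "add-ui-form": (260, {"add-endpoint"}),
}

MAX_LINES = 100  # assumed cutoff standing in for "around 100 lines"

# TopologicalSorter expects a mapping of node -> predecessors.
graph = {name: deps for name, (_, deps) in tasks.items()}
order = list(TopologicalSorter(graph).static_order())

# Tasks above the cutoff are candidates for further splitting.
too_big = [name for name, (lines, _) in tasks.items() if lines > MAX_LINES]

print(order)    # ['add-schema', 'add-endpoint', 'add-ui-form']
print(too_big)  # ['add-ui-form']
```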

Build (5)

  • Incremental Implementation: Don’t write everything at once — each step has a verification checkpoint
  • Test-Driven Development: Write tests first, following the test pyramid (80% unit / 15% integration / 5% E2E)
  • Context Engineering: Manage the agent’s context to ensure sufficient background information at every step
  • Frontend UI Engineering: A frontend-specialized build workflow
  • API and Interface Design: Incorporates design principles like Hyrum’s Law
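
To see what the 80% / 15% / 5% test pyramid means in numbers: a suite of 200 tests would hold roughly 160 unit, 30 integration, and 10 E2E tests. The Python sketch below checks a suite against that ratio; the `pyramid_gaps` helper and its 5-point tolerance are assumptions for this example, not part of the skill.

```python
# Target shares from the Test-Driven Development skill description.
TARGET = {"unit": 0.80, "integration": 0.15, "e2e": 0.05}


def pyramid_gaps(counts: dict[str, int], tolerance: float = 0.05) -> dict[str, float]:
    """Return layers whose share differs from the target by more than `tolerance`."""
    total = sum(counts.values())
    if total == 0:
        return {}
    gaps = {}
    for layer, target in TARGET.items():
        share = counts.get(layer, 0) / total
        if abs(share - target) > tolerance:
            gaps[layer] = round(share - target, 2)
    return gaps


# 160 / 30 / 10 matches the pyramid exactly, so nothing is flagged.
print(pyramid_gaps({"unit": 160, "integration": 30, "e2e": 10}))  # {}

# A top-heavy suite: too few unit tests, too many integration and E2E tests.
print(pyramid_gaps({"unit": 100, "integration": 60, "e2e": 40}))
```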

Verify (2)

  • Browser Testing: Browser testing with Chrome DevTools
  • Debugging and Error Recovery: A structured debugging process, not trial and error

Review (4)

  • Code Review and Quality Gates: Review by senior engineer standards
  • Code Simplification: Apply Chesterton’s Fence principle — understand why something exists before deciding whether to remove it
  • Security and Hardening: Security scanning and hardening
  • Performance Optimization: Performance targets and measurement

Ship (5)

  • Git Workflow: Trunk-based development, feature flags
  • CI/CD and Automation: Automated pipelines
  • Deprecation and Migration: Deprecation and migration strategies
  • Documentation and ADR: Documentation and Architecture Decision Records
  • Shipping Procedures: Go-live checklists

Three Expert Roles

Beyond skills, Agent Skills provides three switchable review perspectives:

| Role | Perspective | Focus Areas |
| --- | --- | --- |
| Code Reviewer | Senior Staff Engineer | Architecture, readability, maintainability |
| Test Engineer | QA Expert | Test coverage, edge cases, test pyramid |
| Security Auditor | Security Engineer | OWASP Top 10, injection attacks, access control |

This lets you perform multi-angle reviews on the same code, rather than relying on a single perspective.


Design Philosophy: Making It Hard for Agents to Cut Corners

Several design details in Agent Skills are particularly noteworthy:

Anti-Rationalization Tables

Each skill has a built-in table listing common “excuses for skipping this step” along with corresponding rebuttals. For example:

| Excuse | Rebuttal |
| --- | --- |
| “This is just a small change, no need for tests” | Small changes account for the majority of regressions |
| “I’ll add documentation later” | Later never comes |
| “I’m in a rush, let’s ship first” | Fixing production bugs takes far longer than writing tests |

This design directly targets a weakness of LLMs: they’re excellent at rationalizing their own shortcuts. With explicit rebuttals, agents find it harder to convince themselves to skip critical steps.
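
Because skills are Markdown, such a table is simply text the agent reads alongside the workflow steps. As a small sketch, the Python below renders excuse/rebuttal pairs into a Markdown table; the `render_section` helper is hypothetical, and the pairs are the three examples quoted in this article.

```python
# Excuse/rebuttal pairs taken from the example in this article.
ANTI_RATIONALIZATION = [
    ("This is just a small change, no need for tests",
     "Small changes account for the majority of regressions"),
    ("I'll add documentation later", "Later never comes"),
    ("I'm in a rush, let's ship first",
     "Fixing production bugs takes far longer than writing tests"),
]


def render_section(pairs: list[tuple[str, str]]) -> str:
    """Render excuse/rebuttal pairs as a Markdown table."""
    lines = ["| Excuse | Rebuttal |", "| --- | --- |"]
    lines += [f'| "{excuse}" | {rebuttal} |' for excuse, rebuttal in pairs]
    return "\n".join(lines)


print(render_section(ANTI_RATIONALIZATION))
```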

Verification Requirements Are Mandatory

Each skill ends with verification checkpoints that require concrete evidence — test results, build output, runtime data. Not “I think it’s done,” but “here’s the output proving it’s done.”
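
One way to picture an evidence-based checkpoint is a gate that accepts only captured command output. The Python sketch below is an assumption about how such a gate could look, not the framework's implementation; `Evidence`, `collect_evidence`, and `checkpoint_passed` are hypothetical names, and the command is a stand-in for a real test runner.

```python
import subprocess
import sys
from dataclasses import dataclass


@dataclass
class Evidence:
    command: list[str]
    exit_code: int
    output: str


def collect_evidence(command: list[str]) -> Evidence:
    """Run a verification command and keep its output as proof."""
    result = subprocess.run(command, capture_output=True, text=True)
    return Evidence(command, result.returncode, result.stdout + result.stderr)


def checkpoint_passed(evidence: Evidence) -> bool:
    """Accept "done" only if the command succeeded and produced output."""
    return evidence.exit_code == 0 and evidence.output.strip() != ""


# Stand-in for a real test command such as the project's test runner.
evidence = collect_evidence([sys.executable, "-c", "print('3 passed')"])
print(checkpoint_passed(evidence))  # True
```

The design choice worth noting is that the gate never takes a boolean from the agent; it takes the output itself, so "I think it's done" has nothing to submit.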

Progressive Disclosure

The modular design allows each skill to operate independently — you don’t need to load all 19 at once. This controls token usage and prevents the context window from being filled with irrelevant instructions.
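
The token saving can be sketched in Python as loading only the skills a given command needs. Everything here is illustrative: the skill bodies are placeholders, the command-to-skill mapping is assumed, and the four-characters-per-token estimate is a rough heuristic rather than a real tokenizer.

```python
# Placeholder skill bodies; real skills are Markdown files.
SKILLS = {
    "test-driven-development": "Write tests first. " * 50,
    "security-and-hardening": "Check inputs. " * 50,
    "shipping-procedures": "Run the go-live checklist. " * 50,
}

# Assumed mapping from command to the skills it needs.
COMMAND_SKILLS = {
    "/build": ["test-driven-development"],
    "/review": ["security-and-hardening"],
    "/ship": ["shipping-procedures"],
}


def rough_tokens(text: str) -> int:
    """Very rough token estimate: about four characters per token."""
    return len(text) // 4


def context_for(command: str) -> str:
    """Concatenate only the skills mapped to this command."""
    return "\n\n".join(SKILLS[name] for name in COMMAND_SKILLS.get(command, []))


loaded = rough_tokens(context_for("/build"))
everything = rough_tokens("\n\n".join(SKILLS.values()))
print(loaded < everything)  # True
```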


Platform Support

Agent Skills are essentially structured Markdown, so they work with virtually any AI coding tool:

| Platform | Installation |
| --- | --- |
| Claude Code | Marketplace install or --plugin-dir for local loading |
| Cursor | Place in .cursor/rules/ |
| Gemini CLI | gemini skills install |
| Windsurf | Copy to rules settings |
| GitHub Copilot | Place in .github/copilot-instructions.md |
| Others | Any agent that accepts Markdown instructions |

Overall Assessment

Agent Skills addresses a clear problem: AI agents optimize for speed by default, but production environments require reliability.

This framework is a good fit for:

  • Teams already using AI agents for development but dissatisfied with output quality
  • Those who want to maintain engineering discipline in AI-assisted development, rather than fixing things after the fact
  • Those who need a standardized process that’s cross-platform and incrementally adoptable

Not a good fit for:

  • Pure prototyping or hackathons where speed matters more than quality
  • Teams with mature internal engineering standards, where the skills may conflict with existing processes

The core trade-off is exchanging tokens and step count for quality assurance. If your AI agents frequently produce code that “runs but isn’t production-ready,” Agent Skills is worth trying.
