TL;DR

Anthropic published lessons from running hundreds of Claude Code Skills across its engineering organization, describing Skills as folders that bundle instructions, scripts, references, templates and guardrails. The report says verification-focused Skills had the biggest measured effect on output quality, though broader adoption practices are still developing.

Anthropic has published lessons from running hundreds of Claude Code Skills across its engineering organization, saying the reusable units helped move teams away from repeated ad-hoc prompting and toward shared, versioned workflows for AI coding agents.

The post, titled “Lessons from building Claude Code: How we use skills” and attributed to Thariq Shihipar, describes a Skill as a folder the agent can discover, read and run, rather than a saved prompt. Anthropic says such folders can include instructions, scripts, references, templates, configuration files and hooks.

According to the source material, Anthropic grouped its internal Skills into nine categories: library and API reference, product verification, data fetching and analysis, business-process automation, code scaffolding and templates, code quality and review, CI/CD and deployment, runbooks, and infrastructure operations.

The strongest claim in the material is attributed to Anthropic’s own measurement: verification Skills, which check whether work is correct, had the largest effect on output quality. The exact measurement method, sample size and benchmark design were not included in the provided source text.

At a glance

reportWhen: Published June 3, 2026; discussed in Th…

The developmentAnthropic published a June 3, 2026 Claude blog post detailing what it learned from using hundreds of reusable Skills across its engineering organization.

AI Dispatch · Insights · 1 July 2026

A Skill is a folder, not a prompt

Anthropic published what it learned running hundreds of Skills across its own engineering org. Read as a business memo, the point is bigger than a coding trick: this is how ad-hoc prompting becomes durable institutional capability — the SOPs your agents actually follow, versioned and shared.

✕ The misconception

“A Skill is just a clever markdown prompt you save in a file.”

✓ What it actually is

A folder the agent can discover, read & run — instructions, scripts, references, templates, config & on-demand hooks.

Anatomy of a Skill — the file system is context engineering

my-skill/the unit you share & version

├─ SKILL.mdroot instructions + a description written for the model (its trigger)

├─ references/deep detail pulled in only when needed — progressive disclosure

├─ scripts/real code, so the agent composes instead of rebuilding boilerplate

├─ assets/templates & files to copy into the output

├─ config.jsonsetup the agent asks for if it’s missing (e.g. which Slack channel)

└─ hooks + memoryon-demand guardrails + an append-only log so it remembers

Why it matters: the folder itself is the knowledge base. The agent reads the root, then reaches deeper only when the task demands it — the same way you’d hand a new hire a one-pager that points to the detailed docs.

The nine types — a gap-analysis map for your own library

1Library / API reference

2Product verification ★ top impact

3Data fetching & analysis

4Business-process automation

5Code scaffolding & templates

6Code quality & review

7CI/CD & deployment

8Runbooks

9Infrastructure operations

By Anthropic’s own measurement, verification Skills — the ones that check the work — moved output quality the most. If you build one category well, build that one.

The craft — what separates a good Skill from a useless one

Gotchas = highest-signal section Describe for the model, not humans (it’s the trigger) Don’t state the obvious Ship scripts, not just prose On-demand guardrail hooks (/careful, /freeze) Let it remember (log / SQLite) Don’t railroad — leave room to adapt

The take

The knowledge of how your organization actually operates can be captured, versioned, shared & executed — and the thing capturing it is a humble folder with a script and a gotchas list inside. For the builder, that’s context engineering with real tools attached. For whoever owns the budget, it’s the difference between AI that starts from zero every morning and an asset that compounds. Caveats: best practices are still evolving, checked-in Skills cost context, and curation beats accumulation. Start with one Skill, one gotcha, and the category that catches your mistakes.

Source: “Lessons from building Claude Code: How we use skills,” Thariq Shihipar (Anthropic), Claude blog, 3 June 2026. Categories, examples & measured claims are Anthropic’s; framing is the author’s. Docs: code.claude.com/docs/en/skills.

thorstenmeyerai.com

From Prompts to Shared Workflows

The report matters because it frames AI agent setup as organizational infrastructure, not just individual prompt craft. If a Skill stores a team’s instructions, scripts and caveats in one reusable folder, the same working knowledge can be shared, reviewed and improved like other engineering assets.

For companies using coding agents, the practical point is consistency. Anthropic’s account suggests Skills can reduce the need for workers to restate the same rules each day and may help agents apply team-specific standards, especially around verification and review.

Amazon

AI development automation tools

As an affiliate, we earn on qualifying purchases.

Claude Code’s Folder Model

The Thorsten Meyer AI write-up describes the key correction as simple: a Skill is a folder, not a clever markdown file. In that model, SKILL.md provides root instructions and a trigger description, while subfolders such as references, scripts and assets supply deeper material only when needed.

The source compares the design to giving a new hire a short operating guide that points to detailed documentation. Anthropic’s suggested practices include writing descriptions for the model, adding scripts instead of only prose, using on-demand guardrails, and keeping room for the agent to adapt.

“A Skill is a folder, not a prompt.”
— Thorsten Meyer AI summary

Amazon

AI code review software

As an affiliate, we earn on qualifying purchases.

Measurement Details Still Missing

Several points remain unclear from the provided material. The source does not give the underlying data for Anthropic’s quality measurement, how many teams were included, how output quality was scored, or whether the results apply outside Claude Code and Anthropic’s own engineering practices.

It is also not yet clear how quickly Skills become hard to maintain at scale. The source notes that best practices are still evolving, that checked-in Skills can cost context, and that curation matters more than accumulation.

Amazon

AI workflow automation software

As an affiliate, we earn on qualifying purchases.

Teams Test Skill Libraries

The next step for adopters is likely to be small pilots, especially around verification Skills that catch mistakes in existing workflows. The source recommends starting with one Skill, one known caveat and the category most likely to prevent repeated errors.

Anthropic’s documentation at code.claude.com/docs/en/skills is the cited reference for implementation details. Readers should expect practices around Skill design, memory, hooks and governance to keep changing as more teams test agent workflows in production settings.

Amazon

AI scripting and reference tools

As an affiliate, we earn on qualifying purchases.

Key Questions

What did Anthropic publish?

Anthropic published a June 3, 2026 Claude blog post about how it uses Claude Code Skills across its engineering organization.

What is a Skill in this report?

A Skill is described as a folder an agent can discover, read and run. It can contain instructions, scripts, references, templates, configuration and hooks.

Which Skill type had the biggest reported effect?

According to the source material, verification Skills had the largest measured effect on output quality, though the provided material does not include the full measurement details.

Why does this matter for engineering teams?

The report suggests teams can turn repeated instructions into shared workflows that are versioned and reused, reducing reliance on one-off prompting.

What is still uncertain?

The source does not confirm how well Anthropic’s findings generalize to other companies, other agents or non-engineering teams. Maintenance costs and governance practices are also still developing.

Source: Thorsten Meyer AI

Wellness content on this site is informational and not a substitute for professional medical guidance.

A Skill Is a Folder, Not a Prompt: What Anthropic Learned Running Hundreds of Them

Up next

7 Best Beauty Products in 2026

Author

The Blissful Studio Team

A Skill is a folder, not a prompt