
Three core design principles determine whether your skill works smoothly or wastes tokens and confuses Claude.

Principle 1: Progressive disclosure

Instead of loading everything into Claude's context upfront, information is revealed in three layers, each more detailed than the last, each loaded only when needed.

Level 1, YAML frontmatter (always loaded)

Always in Claude's system prompt, just enough for Claude to decide whether this skill is relevant to the current task.

yaml
---
name: sprint-planner
description: Manages sprint planning in Linear. Use when user says "plan sprint",
  "create sprint tasks", or "Linear sprint".
---

Costs 50-100 tokens. Even with 20 skills installed, that's only 1,000-2,000 tokens of overhead, which is negligible.

Level 2, SKILL.md body (loaded when relevant)

The full instructions, loaded only when Claude decides the skill is relevant. This is where your workflow steps, examples, and domain knowledge live. Typically 500-2,000 tokens, only paid when actually needed.

markdown
# Sprint Planner

## Instructions
1. Fetch current backlog from Linear via MCP
2. Check team capacity for the sprint period
3. Suggest task prioritization based on urgency and effort
4. Create tasks with proper labels and story points

Level 3, Linked files (loaded on demand)

Additional files in your skill folder that Claude navigates only when needed. A 50-page API guide in references/api-guide.md won't cost tokens unless Claude needs to look something up.

your-skill/
├── SKILL.md                  ← Level 2
├── references/
│   ├── api-guide.md          ← Level 3 (loaded on demand)
│   └── error-codes.md        ← Level 3 (loaded on demand)
└── scripts/
    └── validate.py           ← Level 3 (run when needed)
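A layout like this can be scaffolded with a few lines of Python. This is a minimal sketch: the function name `scaffold_skill` and the placeholder file contents are illustrative assumptions, not part of any official tooling.

```python
from pathlib import Path


def scaffold_skill(root: Path, name: str) -> Path:
    """Create an empty skill folder with the three-level layout shown above."""
    skill_dir = root / name
    # Level 3 folders: reference docs and helper scripts.
    (skill_dir / "references").mkdir(parents=True, exist_ok=True)
    (skill_dir / "scripts").mkdir(exist_ok=True)

    # Levels 1 and 2 share one file: frontmatter first, then the body.
    skill_md = skill_dir / "SKILL.md"
    if not skill_md.exists():
        skill_md.write_text(
            f"---\nname: {name}\n"
            "description: TODO describe what the skill does and when to use it.\n"
            "---\n\n"
            f"# {name}\n\n## Instructions\n",
            encoding="utf-8",
        )
    return skill_dir
```

Calling `scaffold_skill(Path("."), "sprint-planner")` would create the folder in the current directory and leave an existing SKILL.md untouched.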

Here's how the three levels compare:

| Level | What | When loaded | Typical size | Token cost |
|---|---|---|---|---|
| Level 1 | YAML frontmatter | Every message | 50-100 tokens | Always paid |
| Level 2 | SKILL.md body | When skill is relevant | 500-2,000 tokens | Paid per activation |
| Level 3 | Linked files | When Claude needs details | Unlimited | Paid on demand |

A skill with 10,000 tokens of reference documentation only costs 50 tokens when it's not being used.
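The arithmetic behind that claim can be sketched in a few lines. The token figures are the rough estimates quoted above, and the `Skill` class and `context_cost` function are hypothetical names used only for illustration. The model also assumes, for simplicity, that reading references loads all of them at once.

```python
from dataclasses import dataclass


@dataclass
class Skill:
    name: str
    frontmatter_tokens: int  # Level 1: always in context
    body_tokens: int         # Level 2: loaded only when the skill is relevant
    reference_tokens: int    # Level 3: loaded only when Claude reads the files


def context_cost(skills, active=(), references_read=()):
    """Estimate tokens in context for a given set of active skills."""
    total = 0
    for skill in skills:
        total += skill.frontmatter_tokens
        if skill.name in active:
            total += skill.body_tokens
        if skill.name in references_read:
            total += skill.reference_tokens
    return total


# Twenty installed skills, each with a 10,000-token reference folder.
skills = [Skill(f"skill-{i}", 50, 1_000, 10_000) for i in range(20)]

idle = context_cost(skills)                            # 20 * 50 = 1,000
one_active = context_cost(skills, active={"skill-0"})  # 1,000 + 1,000 = 2,000
with_refs = context_cost(
    skills, active={"skill-0"}, references_read={"skill-0"}
)                                                      # 2,000 + 10,000 = 12,000
```

Under these assumptions, the 200,000 tokens of installed reference material contribute nothing until one skill actually reads its files.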

AI pitfall
A common mistake is putting everything in the SKILL.md body (Level 2) instead of using references (Level 3). If your skill has a long API guide, move it to references/api-guide.md and reference it from SKILL.md. This keeps Level 2 lean.

Principle 2: Composability

Claude can load multiple skills simultaneously. Your skill should coexist with others; don't assume you're the only skill installed.

What this means in practice:

  • Handle your domain and step aside for everything else
  • Keep your skill's purpose narrow enough to not conflict with others
  • Don't override general behaviors: a code review skill that also tries to handle documentation will conflict with a dedicated documentation skill

Here's what good composability looks like:

✅ "code-reviewer" - reviews PRs using team standards
✅ "release-notes" - generates changelog from commits
✅ "deploy-helper" - guides through deployment checklist

❌ "dev-assistant" - tries to do code review, docs, AND deployment

Good to know
When multiple skills are relevant, Claude loads all of them and synthesizes their instructions. This works well when skills are composable. It breaks down when skills contradict each other.
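One way to spot likely contradictions before installing a set of skills is to look for trigger phrases that more than one description claims. The sketch below is a rough heuristic, not a description of how Claude selects skills. The function names and the sample descriptions are made up for illustration, and it assumes trigger phrases appear in double quotes, as in the frontmatter example earlier.

```python
import re
from collections import defaultdict


def trigger_phrases(description: str) -> set[str]:
    """Return the double-quoted trigger phrases in a description, lowercased."""
    return {phrase.lower() for phrase in re.findall(r'"([^"]+)"', description)}


def overlapping_triggers(descriptions: dict[str, str]) -> dict[str, list[str]]:
    """Map each trigger phrase claimed by more than one skill to those skills."""
    owners = defaultdict(list)
    for skill_name, description in descriptions.items():
        for phrase in trigger_phrases(description):
            owners[phrase].append(skill_name)
    return {
        phrase: sorted(names) for phrase, names in owners.items() if len(names) > 1
    }


# Hypothetical descriptions for illustration only.
descriptions = {
    "code-reviewer": 'Reviews PRs using team standards. Use when user says "review this PR".',
    "release-notes": 'Generates changelog from commits. Use when user says "write release notes".',
    "dev-assistant": 'Does everything. Use when user says "review this PR" or "write release notes".',
}
conflicts = overlapping_triggers(descriptions)
```

Here the catch-all "dev-assistant" collides with both narrow skills, which is the pattern the ❌ example above warns against.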

Principle 3: Portability

A skill works the same way across Claude.ai, Claude Code, and the API. The folder structure, YAML format, and instruction style are identical everywhere.

If your skill has scripts that need Python or specific packages, note that in the compatibility field:

yaml
compatibility: |
  Requires Python 3.10+ and the pandas library.
  Scripts will not run in Claude.ai (web only) - Claude Code or API required.

Claude will still follow the markdown instructions even if it can't execute the scripts.
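Scripts can also check their own requirements and report what is missing instead of failing with an import traceback. The following is a minimal sketch, assuming the Python 3.10+ and pandas requirements from the example above; `missing_requirements` is a hypothetical helper, not part of any skill API.

```python
import importlib.util
import sys


def missing_requirements(min_python=(3, 10), packages=("pandas",)):
    """Return human-readable problems; an empty list means the script can run."""
    problems = []
    if sys.version_info < min_python:
        needed = ".".join(str(part) for part in min_python)
        problems.append(f"Python {needed}+ is required")
    for package in packages:
        # find_spec returns None when a top-level package is not installed.
        if importlib.util.find_spec(package) is None:
            problems.append(f"package '{package}' is not installed")
    return problems
```

A script in scripts/ could call `missing_requirements()` first and print the result, which gives Claude a clear message to relay when the environment cannot run it.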


Naming rules

These rules are strict. Violations break skill loading silently: your skill just won't appear.

| What | Rule | Valid | Invalid |
|---|---|---|---|
| Folder name | kebab-case only | notion-project-setup | NotionProjectSetup |
| Folder name | No spaces | sprint-planner | Sprint Planner |
| Folder name | No underscores | code-reviewer | code_reviewer |
| SKILL.md | Exact name, case-sensitive | SKILL.md | skill.md, Skill.md |
| Skill name in YAML | No "claude" or "anthropic" | my-reviewer | claude-reviewer |
| Folder name | Lowercase only | data-processor | Data-Processor |

Edge case
The folder name and the name field in your YAML frontmatter should always match. If they don't, some platforms may fail to load the skill. Always update both when renaming.
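Because violations fail silently, a small pre-flight check can help. The sketch below covers the rules listed in this section. `naming_problems` is a hypothetical helper, and two details are assumptions rather than documented behavior: digits are allowed in kebab-case names, and the reserved words are rejected anywhere in the name, not only as a prefix.

```python
import re
from pathlib import Path

# Assumption: lowercase letters and digits separated by single hyphens.
KEBAB_CASE = re.compile(r"^[a-z0-9]+(-[a-z0-9]+)*$")
RESERVED_WORDS = ("claude", "anthropic")


def naming_problems(skill_dir: Path, yaml_name: str) -> list[str]:
    """Check a skill folder and its YAML name against the naming rules."""
    problems = []
    folder = skill_dir.name
    if not KEBAB_CASE.fullmatch(folder):
        problems.append(f"folder '{folder}' is not lowercase kebab-case")
    # Compare against the directory listing so the check stays
    # case-sensitive even on case-insensitive filesystems.
    if "SKILL.md" not in {entry.name for entry in skill_dir.iterdir()}:
        problems.append("SKILL.md is missing or has the wrong case")
    for word in RESERVED_WORDS:
        if word in yaml_name.lower():
            problems.append(f"name '{yaml_name}' contains reserved word '{word}'")
    if yaml_name != folder:
        problems.append(f"name '{yaml_name}' does not match folder '{folder}'")
    return problems
```

An empty list means the folder passes these checks. It does not guarantee that a given platform will load the skill.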

The skills + MCP combination

If you're building MCP integrations, skills add a critical layer on top:

| MCP alone | Skills on top of MCP |
|---|---|
| Provides the tools (API access, data retrieval) | Provides the workflows (how to use those tools) |
| What Claude can do | How Claude should do it |
| User must know the right sequence of calls | Skill encodes the sequence automatically |
| Each user discovers patterns independently | Best practices are shared via the skill |
| Raw tool access, trial-and-error | Guided, optimized workflows |

For example, a Linear MCP server gives Claude access to create issues, update statuses, and query projects. But without a skill, Claude doesn't know your team's sprint planning workflow, the specific labels you use, how you estimate story points, or what order to process the backlog. A sprint-planner skill encodes all of that domain knowledge.

If you're an MCP server builder, bundling a skill with your integration improves adoption. Users who connect your MCP and immediately get guided workflows will stick with your product.

Good to know
You don't need to build the MCP server yourself to build a skill on top of one. Skills and MCP servers are independently authored and distributed; they just work well together.