· AI

CEO of Tallyfy · AI advisor at Blue Sheen for mid-size companies

Subagent vs parallel agent vs skill in Claude Code

Subagent, parallel agent and skill get used as if they mean the same thing in Claude Code. They do not. A skill is reusable instructions that cost almost nothing until invoked. A subagent is delegated work in a fresh isolated context. Parallel agent is not a primitive at all. Picking the wrong one wastes tokens or floods your context.

Key takeaways

  • A skill is reusable instructions - a SKILL.md file whose body loads only when used, so it costs almost nothing at rest
  • A subagent is delegated work - it runs in a fresh, isolated context window and returns only a summary
  • Parallel agent is not a real primitive - the phrase just means running several subagents at the same time
  • Choose by what you are protecting - context, repeatability, or wall-clock time

What is the difference between a subagent and a skill? People ask it constantly, usually right after a Claude Code session has done something slow and expensive that a smaller move would have handled.

Here is the answer before the detail. A skill is reusable instructions. A subagent is delegated work. A parallel agent is not a separate thing at all; it is just several subagents running at once. Three different answers to three different problems. Claude Code makes it easy to grab the wrong one, because all three feel like the same act of getting the AI to do more.

The deeper question, and the one worth the rest of this post, is what each one costs. The cost is where a wrong choice actually hurts. A skill you never needed is a rounding error. A subagent you did not need is a whole context window you paid to spin up and throw away. So the goal is not to memorize definitions. It is to know which lever you are pulling and what it bills you.

The three things people conflate

Three names, three different mechanisms. A skill is a SKILL.md file: a written instruction set that Claude loads when it is relevant, or when you type its slash command. The body sits dormant and costs almost nothing until something invokes it. A subagent is a unit of delegated work. Claude hands a task to a worker that runs in its own fresh context window, with its own tools and permissions, and sends back only a summary. The main conversation never sees the worker’s mess. A parallel agent is not a configured object at all. It is a description of timing: several subagents running at the same time instead of one after another. You do not create a parallel agent. You run subagents in parallel. Conflate these three and you will reach for an isolated worker when a skill would do, or run things in sequence that should have run at once.

The reason the confusion persists is that all three are reached the same way, by talking to Claude in plain language. You do not import a library or call a constructor. You say “review this code” and Claude might apply a skill, might delegate to a subagent, might do neither. The mechanism is invisible at the moment you trigger it. That is good for ease of use and bad for cost intuition, because the thing you cannot see is the thing you cannot budget for.

What a subagent is

A subagent is the heavyweight option, and it is heavy for a reason. The official Claude Code subagents documentation describes it plainly: each subagent runs in its own context window with a custom system prompt, specific tool access, and independent permissions. It starts fresh. It does not see your conversation history, the skills you have already invoked, or the files Claude has already read. Claude writes a short delegation message describing the task, and the subagent works from there.

That isolation is the entire point. When a side task would flood your main conversation with search results, logs, or file contents you will never look at again, a subagent does that work somewhere else and hands back only the summary. Your main context stays clean. In long Claude Code sessions, that cleanliness is worth real money, because a context window that fills up forces compaction, and compaction is where detail quietly goes missing.

It is worth knowing the naming history here, because it trips people up. The tool that spawns this kind of worker used to be called the Task tool. As of Claude Code version 2.1.63 it was renamed to the Agent tool, and old Task(...) references still work as aliases. A custom subagent, the kind you define once and reuse, is a Markdown file in .claude/agents/ with frontmatter for its name, description, tools, and model. You can even point a subagent at a cheaper model like Haiku to control cost. If you want the deeper comparison of one-off delegation versus a defined, reusable specialist, I wrote that up separately in the Task tool versus subagents piece.

One nuance that matters for the cost conversation: there is a variant called a fork. A fork is a subagent that inherits the entire conversation so far instead of starting fresh. It trades away the input isolation, since it sees everything the main session sees, but its own tool calls still stay out of your conversation. Use a fork when a clean subagent would need so much background that re-explaining the situation costs more than the isolation saves.

Parallel is not a primitive

Here is the part that the phrase “parallel agent” gets wrong. There is no parallel agent. There is no setting, no file, no object you configure with that name. What exists is subagents, and subagents can run two ways: in the foreground, where the main session blocks and waits, or in the background, where they run concurrently while you keep working. Background subagents are the concurrency. “Run parallel research across the authentication, database, and API modules” is just three subagents started at once.

So when someone says parallel agent, translate it in your head to “several subagents at the same time.” That translation matters because it tells you the cost. Running three subagents in parallel does not cost less than running them in sequence. It costs the same in tokens, three separate context windows, and saves you only wall-clock time. Parallelism buys speed, not efficiency. If you were hoping that “going parallel” would make a job cheaper, it will not. It will make it faster and bill you the same.

Two larger structures sit beyond a single session, and they are worth knowing. Background agents let you run many independent Claude Code sessions at once and watch them from one place. Agent teams go further and let separate sessions communicate. Those are separate primitives and a real step up in commitment. But they are a step up in commitment, and most people reaching for “a parallel agent” do not need them. They need two or three subagents started together inside the session they already have. If you are trying to get this right across a team and the token bill is the symptom that sent you looking, my door is open.

What a skill is

A skill is the lightweight option, and the contrast with a subagent is the whole story. A skill is a SKILL.md file: YAML frontmatter that tells Claude when the skill applies, plus Markdown instructions Claude follows when it runs. Custom slash commands were merged into skills, so a skill is also how you build your own /command. Where a subagent is a worker, a skill is a recipe.

The cost difference is structural. The official Claude Code skills documentation puts it directly:

“Unlike CLAUDE.md content, a skill’s body loads only when it’s used, so long reference material costs almost nothing until you need it.” — Claude Code documentation

Read that carefully, because it explains the resting cost. A skill’s short description sits in context so Claude knows the skill exists and when to reach for it. The full body, which can be hundreds of lines of procedure, does not load until the skill is actually invoked. A subagent spends a whole context window the moment it runs. A skill spends almost nothing until the moment you need it, and even then the content enters the conversation once and stays, rather than being paid for again on every turn. For anything you do repeatedly, a checklist, a house style, a deployment procedure, a skill is the correct home. Pasting the same instructions into chat every time is the pattern a skill exists to kill. This sits alongside plugins and connectors as one of the ways Claude is extended, and I have mapped how those plugins, connectors and skills relate if you want the wider picture.

Skills and subagents are not rivals. They compose. A skill can be set to run in a forked subagent with one frontmatter line, which gives you a reusable recipe that also executes in isolation. The point of telling them apart is not to pick a side. It is to know which cost you are signing up for.

Which one to reach for

Decision tree for choosing between a skill, a subagent, and running subagents in parallel in Claude Code

The decision comes down to one question asked three ways. What are you protecting?

If you are protecting repeatability, write a skill. The signal is that you keep typing the same instructions, or a section of your CLAUDE.md has quietly turned from a fact into a procedure. A skill captures that once and costs you almost nothing until it fires. This is the most underused of the three, because it does not feel like “using an agent,” and people came looking for an agent.

If you are protecting context, spawn a subagent. The signal is a side task that will dump material into your main conversation that you will never reference again: a wide search, a pile of logs, the full text of twenty files. Send that work into an isolated window and take back the summary. The cost is real, a fresh context window, so do it when the isolation is worth more than the spin-up.

If you are protecting wall-clock time, run subagents in parallel. The signal is several independent investigations that do not depend on each other’s results. Start them together. Just remember the bill is the same as running them one by one; you are buying speed, not savings. And if none of those signals is present, the real answer is the fourth box on the diagram: do not reach for any of them. Just ask Claude. The cheapest agent is the one you did not spawn.

The mistake worth avoiding is treating “more machinery” as “more capable.” A skill, a subagent, and three subagents in parallel are not a ladder you climb toward better results. They are three tools on a wall, and the craft is the same as any workshop. You do not pick up the heaviest tool because the job feels important. You pick up the one shaped like the problem in front of you.

About the Author

Amit Kothari is an experienced consultant, advisor, coach, and educator specializing in AI and operations for executives and their companies. With 25+ years of experience, he is the Co-Founder & CEO of Tallyfy® (raised $3.6m, the Workflow Made Easy® platform) and Partner at Blue Sheen, an AI advisory firm for mid-size companies. He helps companies identify, plan, and implement practical AI solutions that actually work. Originally British and now based in St. Louis, MO, Amit combines deep technical expertise with real-world business understanding. Read Amit's full bio →

Disclaimer: The content in this article represents personal opinions based on extensive research and practical experience. While every effort has been made to ensure accuracy through data analysis and source verification, this should not be considered professional advice. Always consult with qualified professionals for decisions specific to your situation.

Related Posts

View All Posts »
The built-in agent types in Claude Code

The built-in agent types in Claude Code

Claude Code ships with five built-in agent types: Explore, Plan, general-purpose, statusline-setup, and claude-code-guide. Most people know two of them. The other three run constantly and shape how much your sessions cost. This is the full catalog, what each one is for, and why knowing them changes how you read your own terminal.

How the general-purpose agent works in Claude Code

How the general-purpose agent works in Claude Code

The general-purpose agent in Claude Code is not the main agent and not something you pick. It is a built-in subagent Claude routes to on its own for complex, multi-step work. It inherits your model and, in current Claude Code, spawns as a fork of your conversation. This post explains how it actually works and what that costs you.

What is a subagent in Claude Code

What is a subagent in Claude Code

A subagent in Claude Code is a specialized worker that runs in its own fresh, isolated context window, with its own tools and permissions, and reports back only a summary. It is how Claude does a noisy side task without flooding your main conversation. Here is what a subagent is, what file defines it, and when it earns its cost.

How to debug Claude Code subagents

How to debug Claude Code subagents

When a Claude Code subagent fails, you cannot open it and look inside. It ran in its own isolated context and handed back a summary. Debugging a subagent is the skill of reading that summary, recognizing context-isolation failures, and designing subagents that report enough to be diagnosed. Here is how to do it.

AI advisory services via Blue Sheen.
Contact me Follow 10k+