Claude Code - When to use task tool vs subagents
Stop guessing. Based on community patterns and documentation analysis, the difference is clear: Tasks for parallel search, subagents for persistent expertise. This is the decision framework emerging from real production patterns and user experiences.

Quick answers
Why does this matter? Tasks are ephemeral workers, subagents are persistent specialists - Tasks spin up lightweight Claude instances for one-off parallel work, while subagents keep their configurations across sessions
What does it cost? Each approach carries roughly 20k tokens of overhead - both Tasks and subagents load about 20,000 tokens of context before your actual work begins
What is the biggest constraint? Parallelism caps at 10 concurrent operations - you can queue more, but only 10 Tasks or subagents run simultaneously, executing in batches
Where do most people go wrong? Treating context isolation as free - separate 200k-token windows prevent pollution, but sharing results between them requires careful orchestration
The confusion that costs you speed
Use Tasks for parallel file searches. Use subagents for code review.
Done. Blog post over.
Except that’s what everyone says, and then you watch your token count explode while Claude spawns 50 Tasks to read three files. Or you carefully configure a subagent that can’t spawn its own workers, leaving you wondering why your “parallel” processing feels so… sequential. I’ve sat there genuinely frustrated, watching usage balloon past 160k tokens for work I expected to cost 3k.
Users are reporting patterns where subagents consume 160k tokens for work that takes 3k in the main context. The documentation covers the basics but not these edge cases. The official best practices help, though they don’t address token overhead in detail.
The Task tool and subagents aren’t just different interfaces to the same thing. They’re fundamentally different execution models with opposing strengths. And most people are using them backwards. It’s another example of how enterprises fragment their AI implementations instead of thinking about the whole picture.
What the Task tool actually does
The Task tool doesn’t create “subagents.” It spawns ephemeral Claude workers. Think temporary contractors who show up, do one specific job, then vanish. Each Task gets its own 200k context window, completely isolated from everything else.
Watch what actually happens when you run multiple Tasks:
# What you think happens:
# Task 1 starts -> Task 2 starts -> Task 3 starts -> all run together
# What happens:
# Batch 1: Tasks 1-10 start -> all must complete
# Batch 2: Tasks 11-20 start -> all must complete
# Batch 3: Tasks 21-30 start...

Community testing shows Claude doesn’t dynamically pull from the queue as Tasks complete. It waits for the entire batch to finish before starting the next one. The parallelism level caps at 10, according to user reports.
Tasks are fast for the right job. Need to search for a pattern across 500 files? Spawn 10 parallel Tasks, each handling 50 files. They can’t talk to each other (that’s the point), but they all report back to you. The main thread stays clean while the workers dig through the mess.
The problem? Each Task starts with that 20k token overhead. Your “quick file search” just cost you 200k tokens before any actual work began. Active multi-agent sessions can consume 3-4x more tokens than single-threaded operations. This is where cost optimization strategies matter most.
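The overhead math is worth making explicit. A back-of-envelope sketch, using the rough 20k-per-worker figure quoted above (illustrative estimates, not measured values):

```python
OVERHEAD_PER_WORKER = 20_000  # approximate context-loading cost per Task/subagent


def parallel_cost(num_workers: int, work_tokens_per_worker: int) -> int:
    """Total tokens when fanning work out to isolated workers."""
    return num_workers * (OVERHEAD_PER_WORKER + work_tokens_per_worker)


def main_thread_cost(total_work_tokens: int) -> int:
    """Total tokens when doing the same work in the primary context."""
    return total_work_tokens


# A "quick search" split across 10 Tasks, ~5k tokens of real work each:
fan_out = parallel_cost(10, 5_000)    # 250,000 tokens
in_place = main_thread_cost(50_000)   # 50,000 tokens
print(f"fan-out: {fan_out:,}  main thread: {in_place:,}")
```

The fixed overhead dominates whenever per-worker work is small, which is exactly why tiny jobs belong in the main thread.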
Subagents aren’t what you think
Subagents aren’t faster Tasks. They’re not even really “sub” anything.
Subagents are specialized Claude instances with their own system prompts, tool permissions, and persistent configurations. Think department heads in your organization. The Security Reviewer, the Test Writer, the API Designer. They exist as Markdown files in your .claude/agents/ folder, ready to be called into service.
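The configuration itself is just a Markdown file with YAML frontmatter. A minimal sketch of what a reviewer might look like - the `name`, `description`, and `tools` fields follow the documented format, while the prompt body is purely illustrative:

```markdown
---
name: security-reviewer
description: Reviews code changes for security issues. Use after any change touching auth, input handling, or dependencies.
tools: Read, Grep, Glob
---

You are a security reviewer. Examine the changed files for OWASP Top 10
issues, hard-coded credentials, and unsafe input handling. Report findings
as a prioritized list. Do not modify any files.
```

Saved as .claude/agents/security-reviewer.md, it becomes available in every session for that project. Note the tools line: this reviewer can read and search but never write, which is the permission restriction discussed below.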
What I think most people miss: subagents can’t spawn other subagents. This limitation is by design, not a bug. When a subagent tries to use the Task tool, it gets nothing. No nested hierarchies, no recursive task decomposition. One level of delegation, full stop.
What they can do now: background subagents run concurrently while you keep working. Permissions get pre-approved before launch, and the subagent executes without blocking your main thread. Parallel execution without the communication overhead of Tasks.
That constraint creates clarity. Your main Claude instance becomes an orchestrator, and subagents become specialists. The code reviewer doesn’t suddenly decide to refactor your entire codebase. It reviews code. That’s it.
The real power is consistency. Configure a subagent once, use it across every project. Community-shared security-auditor subagents demonstrate how standardized configurations can catch common OWASP Top 10 vulnerabilities consistently. Same configuration, same results, every time.
When to use which
Forget the theory. A practical framework based on documented patterns and community experience looks like this.
Use Tasks when:
- You need to search without a target (“find all database connections” across 1,000 files)
- Parallel reads dominate (reading 50 config files to build a dependency map)
- Context isolation matters (analyzing competitor codebases without contamination)
- It’s truly one-off work you’ll never need again
- Speed matters more than token cost
Use subagents when:
- Expertise requires consistency (code review with specific style guides)
- Tool access needs restriction (reviewer can read, can’t write)
- Workflows repeat predictably (every PR gets the same security check)
- Teams need standardization (everyone uses the same test-writer agent)
- Context persistence matters across tasks
Use neither when:
- You’re working with 2-3 specific files. Stay in the main thread.
- Simple sequential operations. Keep it in primary context.
- Tasks need to communicate. Rethink your architecture.
- You need nested parallelism. Write a bash script.
Is it really worth spending more than 30 seconds on this decision for most operations? Probably not. The performance difference is often negligible. The token cost difference isn’t.
Real patterns worth stealing
These patterns come from documented use cases where speed and cost both matter.
The repository explorer pattern
When exploring a new codebase, everyone’s instinct is to spawn one Task per directory. Wrong move. Feature-based splitting works better:
# DON'T: One task per directory (fails on cross-references)
"Explore src/, tests/, docs/ using 3 parallel tasks"
# DO: Feature-based exploration
"Use 4 parallel tasks:
- Auth system: find all auth/login/session code
- Data models: locate all database schemas
- API endpoints: map all routes and handlers
- Test coverage: analyze test patterns"

Each Task hunts for a concept, not a location. This approach handles cross-directory dependencies that directory-based splitting misses entirely.
The code review pipeline
This is where subagents dominate. A typical effective setup uses three specialized agents:
- style-checker: Runs first, catches formatting and naming issues
- security-reviewer: OWASP Top 10, credential scanning, injection vectors
- test-validator: Ensures tests cover the changes
They run sequentially, not in parallel. Each writes findings to a markdown file that the next one reads. No context pollution, no token explosion. The sequential workflow with file-based communication beats parallel execution for complex reviews.
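The handoff mechanism is nothing fancier than files on disk. A rough Python sketch of the pattern, where `run_agent` is a hypothetical stand-in for invoking the corresponding subagent (in practice, a prompt like "Use the style-checker subagent on this diff"):

```python
from pathlib import Path


def run_agent(name: str, instructions: str) -> str:
    """Stand-in for a subagent invocation; returns its findings as text."""
    # A real call would delegate to the named subagent with these
    # instructions and capture its report.
    return f"[{name} findings]\n"


def review_pipeline(workdir: str = "review") -> Path:
    """Run the three review stages sequentially, handing off via files."""
    out = Path(workdir)
    out.mkdir(exist_ok=True)
    previous = ""
    for agent in ("style-checker", "security-reviewer", "test-validator"):
        report = run_agent(agent, f"Review the diff. Prior findings:\n{previous}")
        path = out / f"{agent}.md"
        path.write_text(report)
        previous = path.read_text()  # next stage reads this file, not context
    return out / "test-validator.md"
```

Because each stage reads the previous stage's markdown file rather than its conversation context, the stages stay cheap and independent, and you can rerun any single stage without replaying the whole pipeline.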
The hybrid orchestration
For large refactoring, combine both:
- Main thread identifies all affected files
- Tasks (parallel) read current implementations
- Subagent (architect) designs the refactoring approach
- Tasks (parallel) implement changes in isolated files
- Subagent (test-writer) creates integration tests
- Main thread coordinates git operations
This pattern can significantly cut refactoring time compared to sequential processing, though tokens typically increase 3-4x. Sometimes that trade-off is worth it.
Limitations that will catch you off guard
Both approaches have failure modes worth knowing before they bite you.
Task tool gotchas: No visibility into running Tasks. You fire off 10 parallel operations and then wait. No progress bars, no intermediate output, nothing until they all complete or time out. Users have been requesting better progress tracking in GitHub discussions for months.
Task results can be truncated. When a Task returns findings from 100 files, you might only see summaries. Critical details like stack traces can get lost in the handoff.
No error recovery within Tasks. If Task 7 of 10 fails, the others continue, but Task 7 won’t retry or provide useful failure info. You get a generic “task failed” and nothing more.
Subagent surprises: Subagents can’t see each other’s work. You can’t have a designer agent pass specs directly to a coder agent. Everything routes through the main thread, adding latency and token overhead.
Configuration drift is real. That carefully tuned subagent from six months ago? Its behavior shifts subtly as Claude’s base model updates. Version control your agent configs and test them periodically.
The 20k token overhead isn’t negotiable. Even a subagent that reads one file and returns “LGTM” costs 20k tokens. For small tasks, staying in the main thread is 10x cheaper.
Three questions that replace every decision matrix
Stop optimizing for elegance. Optimize for getting work done.
Question 1: Will I run this exact operation again?
- Yes. Create a subagent.
- No. Continue to Question 2.
Question 2: Do I need to search or read more than 10 files?
- Yes. Use Tasks.
- No. Stay in the main thread.
Question 3: Must operations share context?
- Yes. Stay in the main thread.
- No. Use Tasks if parallel, subagent if specialized.
Three questions. Five seconds.
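The three questions above can be encoded as one small function. This is one reasonable encoding, not official behavior; note it checks shared context before file count, since Tasks are the wrong tool for context-sharing work no matter how many files are involved:

```python
def choose_execution_mode(will_repeat: bool,
                          files_to_read: int,
                          must_share_context: bool) -> str:
    """One encoding of the three-question decision framework."""
    if will_repeat:              # Q1: exact operation again? -> subagent
        return "subagent"
    if must_share_context:       # Q3: operations share context? -> main thread
        return "main thread"
    if files_to_read > 10:       # Q2: search/read more than 10 files? -> Tasks
        return "tasks"
    return "main thread"         # small, one-off, isolated work
```

For example, a one-off search across 500 files lands on "tasks", while a recurring style check lands on "subagent" regardless of size.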
The teams that fail with Claude Code design elaborate multi-agent choreographies before writing a single line of code. It’s similar to how AI readiness assessments can lie to you: over-engineering before understanding the actual constraints. The teams that succeed start simple, measure performance, then fix only the bottlenecks that actually matter.
Your token budget will thank you. Your deadlines will too. Most importantly, you’ll ship features instead of debugging agent communication protocols.
The real insight isn’t choosing between Tasks and subagents. It’s recognizing that the main thread is still the best orchestrator Claude Code has. Everything else is a tool for moving faster when you know exactly what you need. When you layer this into a full project management system with persistent CLAUDE.md files and structured folder hierarchies, even non-code work benefits from the same parallel execution patterns.
About the Author
Amit Kothari is an experienced consultant, advisor, coach, and educator specializing in AI and operations for executives and their companies. With 25+ years of experience and as the founder of Tallyfy (raised $3.6m), he helps mid-size companies identify, plan, and implement practical AI solutions that actually work. Originally British and now based in St. Louis, MO, Amit combines deep technical expertise with real-world business understanding.
Disclaimer: The content in this article represents personal opinions based on extensive research and practical experience. While every effort has been made to ensure accuracy through data analysis and source verification, this should not be considered professional advice. Always consult with qualified professionals for decisions specific to your situation.