Claude Code - When to use Task tool vs subagents (the real difference)
Stop guessing. Based on community patterns and documentation analysis, the difference is clear - Tasks for parallel search, subagents for persistent expertise. Here is the decision framework emerging from real usage.

Key takeaways
- Tasks are ephemeral workers, subagents are persistent specialists - Tasks spin up lightweight Claude instances for one-off parallel work, while subagents maintain configurations across sessions
- Each approach has a 20k token overhead cost - Both Tasks and subagents start with roughly 20,000 tokens of context loading before your actual work begins
- Parallelism caps at 10 concurrent operations - You can queue more, but only 10 Tasks or subagents run simultaneously, executing in batches
- Context isolation is both the strength and weakness - Separate 200k token windows prevent pollution but require careful orchestration to share results
The confusion that costs you speed
Use Tasks for parallel file searches. Use subagents for code review.
Done. Blog post over.
Except… that’s what everyone says, and then you watch your token count explode while Claude spawns 50 Tasks to read three files. Or you carefully configure a subagent that can’t spawn its own workers, leaving you wondering why your “parallel” processing feels so… sequential.
Here’s what’s actually happening. Users are reporting patterns where subagents consume 160k tokens for work that takes 3k in the main context. The documentation doesn’t cover this clearly. Neither do the official best practices.
The Task tool and subagents aren’t just different interfaces to the same thing. They’re fundamentally different execution models with opposing strengths. And everyone’s using them backwards. It’s another example of how enterprises fragment their AI implementations instead of thinking holistically.
What Tasks actually do (hint: not what you think)
The Task tool doesn’t create “subagents.” It spawns ephemeral Claude workers - think of them as temporary contractors who show up, do one specific job, then vanish. Each Task gets its own 200k context window, completely isolated from everything else.
Watch what happens when you run multiple Tasks:
# What you think happens:
# Task 1 starts → Task 2 starts → Task 3 starts → all run together
# What actually happens:
# Batch 1: Tasks 1-10 start → all must complete
# Batch 2: Tasks 11-20 start → all must complete
# Batch 3: Tasks 21-30 start...
Community testing shows that Claude doesn’t dynamically pull from the queue as Tasks complete. It waits for the entire batch to finish before starting the next one. The parallelism level caps at 10, according to user reports.
But here’s the magic: Tasks are fast for the right job. Need to search for a pattern across 500 files? Spawn 10 parallel Tasks, each handling 50 files. They can’t talk to each other (that’s the point), but they all report back to you. The main thread stays clean while the workers dig through the mess.
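A minimal prompt sketch of that fan-out (the search target, file counts, and paths here are illustrative, not from any official example):
# Concept-based fan-out: one search, ten workers
"Use 10 parallel Tasks to find every call to createConnection()
under src/, each Task covering a different slice of the files.
Each Task reports back: file path, line number, enclosing function.
Read-only - do not modify anything."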
The problem? Each Task starts with that 20k token overhead. Your “quick file search” just cost you 200k tokens before any actual work began. Active multi-agent sessions consume 3-4x more tokens than single-threaded operations. This is where cost optimization strategies become critical.
Subagents - the misunderstood powerhouse
Subagents aren’t faster Tasks. They’re not even really “sub” anything.
Subagents are specialized Claude instances with their own system prompts, tool permissions, and persistent configurations. Think of them as department heads in your organization - the Security Reviewer, the Test Writer, the API Designer. They exist as Markdown files in your .claude/agents/ folder, ready to be called into service.
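Here’s what one looks like on disk - a minimal sketch of .claude/agents/security-reviewer.md (the tool list and prompt body are illustrative; the frontmatter fields follow the documented format):
---
name: security-reviewer
description: Reviews changed code for OWASP Top 10 issues. Use on any PR touching auth or data handling.
tools: Read, Grep, Glob
---
You are a security reviewer. Scan the diff for injection vectors,
hardcoded credentials, and unsafe deserialization. Report findings
as a severity-ranked list. Never modify files.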
Here’s what nobody mentions: subagents can’t spawn other subagents. This limitation is by design, not a bug. When a subagent tries to use the Task tool, it gets nothing. No nested hierarchies, no recursive task decomposition. Just one level of delegation.
But that constraint creates clarity. Your main Claude instance becomes an orchestrator, and subagents become specialists. The code reviewer doesn’t suddenly decide to refactor your entire codebase. It reviews code. That’s it.
The real power? Consistency. Configure a subagent once, use it across every project. Community-shared security-auditor subagents demonstrate how standardized configurations can catch common OWASP Top 10 vulnerabilities consistently. Same configuration, same results, every time.
The decision matrix that actually works
Forget the theory. Here’s a practical framework based on documented patterns and community experiences:
Use Tasks when:
- Searching without a target: “Find all database connections” across 1,000 files
- Parallel reads dominate: Reading 50 config files to build a dependency map
- Context isolation matters: Analyzing competitor codebases without contamination
- It’s truly one-off work: You’ll never need this exact operation again
- Speed beats token cost: You’ve got 2 hours, not 2 days
Use Subagents when:
- Expertise requires consistency: Code review with specific style guides
- Tool access needs restriction: Reviewer can read, can’t write
- Workflows repeat predictably: Every PR gets the same security check
- Teams need standardization: Everyone uses the same test-writer agent
- Context persistence matters: Maintaining conversation history across tasks
Never use either when:
- Working with 2-3 specific files → Use the main thread
- Simple sequential operations → Keep it in primary context
- Tasks need to communicate → Rethink your architecture
- You need nested parallelism → Write a bash script instead (see the sketch below)
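For that last case, the bash route is less exotic than it sounds - a sketch assuming the claude CLI’s headless -p (print) mode, with illustrative paths:
# Nested parallelism outside Claude Code: fan out headless runs yourself
mkdir -p reports
for dir in services/*/; do
  claude -p "Summarize the public API surface in $dir" > "reports/$(basename "$dir").md" &
done
wait  # each run gets its own fresh context; gather reports before the next stage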
Here’s the brutal truth: if you’re spending more than 30 seconds deciding, you’re overthinking it. The performance difference for most operations is negligible. The token cost difference? That’s where you feel it.
Real patterns from production
Here are patterns that emerge from documented use cases where speed and cost both matter.
The Repository Explorer Pattern
When exploring a new codebase, everyone’s instinct is to spawn one Task per directory. Wrong move. Here’s what works:
# DON'T: One task per directory (fails on cross-references)
"Explore src/, tests/, docs/ using 3 parallel tasks"
# DO: Feature-based exploration
"Use 4 parallel tasks:
- Auth system: find all auth/login/session code
- Data models: locate all database schemas
- API endpoints: map all routes and handlers
- Test coverage: analyze test patterns"
Each Task hunts for a concept, not a location. This approach handles cross-directory dependencies that directory-based splitting misses entirely.
The Code Review Pipeline
This is where subagents dominate. A typical effective setup includes three specialized agents:
- style-checker: Runs first, catches formatting/naming issues
- security-reviewer: OWASP Top 10, credential scanning, injection vectors
- test-validator: Ensures tests actually test the changes
The key? They run sequentially, not in parallel. Each writes findings to a markdown file that the next one reads. No context pollution, no token explosion. The sequential workflow with file-based communication beats parallel execution for complex reviews.
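In practice the whole pipeline can be one instruction to the main thread - a sketch assuming the three agents above exist, with illustrative file paths:
"Review this PR in three passes:
1. Use the style-checker subagent; write findings to review/style.md
2. Use the security-reviewer subagent; read review/style.md first,
   then write findings to review/security.md
3. Use the test-validator subagent; read both files,
   then write the final verdict to review/summary.md"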
The Hybrid Orchestration
For large refactoring, combine both:
- Main thread identifies all affected files
- Tasks (parallel) read current implementations
- Subagent (architect) designs the refactoring approach
- Tasks (parallel) implement changes in isolated files
- Subagent (test-writer) creates integration tests
- Main thread coordinates git operations
This pattern can significantly reduce refactoring time compared to sequential processing, though token usage typically increases 3-4x. Sometimes that trade-off is worth it.
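Spelled out as a single instruction, the sequence might read like this sketch (the agent names and the PaymentClient target are illustrative):
"Refactor the legacy PaymentClient:
1. List every file that imports PaymentClient
2. Spawn parallel Tasks to read each call site
3. Use the architect subagent to design the replacement interface
4. Spawn parallel Tasks to apply the change, one per file
5. Use the test-writer subagent to add integration tests
6. Stage the changes and summarize for commit"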
The limitations nobody talks about
Both approaches have failure modes that’ll ruin your day if you don’t know about them.
Task Tool Gotchas
The worst one? No visibility into running Tasks. You fire off 10 parallel operations and then… wait. No progress bars, no intermediate output, nothing until they all complete or time out. GitHub issue #3013 has been requesting progress tracking for months.
Task results can be truncated. When a Task returns findings from 100 files, you might only see summaries. The full context doesn’t transfer back - just what the Task decides is important. Critical details like stack traces can be lost this way.
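A common community workaround - not an official feature - is to have each Task persist its full output and hand back only a pointer (paths are illustrative):
"For each Task: write complete findings, including stack traces,
to scratch/task-results/<topic>.md, and return only a one-line
summary plus the file path."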
No error recovery within Tasks. If Task 7 of 10 fails, the others continue, but Task 7 won’t retry or provide detailed failure info. You get a generic “task failed” message.
Subagent Surprises
The worst surprise? Subagents can’t see each other’s work. You can’t have a designer agent pass specs to a coder agent directly. Everything routes through the main thread, adding latency and token overhead.
Configuration drift is real. That carefully tuned subagent you created six months ago? Its behavior subtly changes as Claude’s base model updates. Best practice is to version control agent configs and test them regularly.
The 20k token overhead isn’t negotiable. Even a subagent that just reads one file and returns “LGTM” costs you 20k tokens. For small tasks, staying in the main thread is 10x cheaper.
The framework you can actually implement
Stop optimizing for elegance. Optimize for getting work done. Here’s a dead-simple decision process:
Question 1: Will I run this exact operation again?
- Yes → Create a subagent
- No → Continue to Question 2
Question 2: Do I need to search/read more than 10 files?
- Yes → Use Tasks
- No → Stay in main thread
Question 3: Must operations share context?
- Yes → Stay in main thread
- No → Use Tasks if parallel, subagent if specialized
That’s it. Three questions. Takes 5 seconds.
The teams that fail with Claude Code are the ones designing elaborate multi-agent choreographies before writing a single line of code. It’s similar to how AI readiness assessments can lie to you - over-engineering before understanding the real constraints. The ones that succeed? They start simple, measure performance, then optimize only the bottlenecks that actually matter.
Your token budget will thank you. Your deadlines will thank you. Most importantly, you’ll ship features instead of debugging agent communication protocols.
The real insight isn’t choosing between Tasks and subagents. It’s recognizing that the main thread - yes, the boring, sequential, single-threaded one - is still the best orchestrator Claude Code has. Everything else is just a tool to help you move faster when you know exactly what you need.
About the Author
Amit Kothari is an experienced consultant, advisor, and educator specializing in AI and operations. With 25+ years of experience and as the founder of Tallyfy (raised $3.6m), he helps mid-size companies identify, plan, and implement practical AI solutions that actually work. Originally British and now based in St. Louis, MO, Amit combines deep technical expertise with real-world business understanding.
Disclaimer: The content in this article represents personal opinions based on extensive research and practical experience. While every effort has been made to ensure accuracy through data analysis and source verification, this should not be considered professional advice. Always consult with qualified professionals for decisions specific to your situation.