
CEO of Tallyfy · AI advisor at Blue Sheen for mid-size companies

Anthropic managed agents are not office agents

Anthropic managed agents and office agents are different products with confusingly similar names. Managed agents is a developer API for running autonomous Claude agents on managed infrastructure. The interesting part is the brain-hands split: Anthropic runs the agent loop, while the sandbox can run in your own environment. This is what it is, and when to use it.

What you will learn

  1. Why managed agents and office agents are different products that share a confusing name
  2. What the managed harness handles for you, and the four concepts it is built on
  3. How the brain-hands split lets the agent loop and the code execution run in different places
  4. Where managed agents fit against the Messages API and Claude Code subagents
  5. When the managed harness earns its place, and the lock-in question to ask first

Anthropic now has two agent products with names so close that people mix them up constantly. Office agents and managed agents. They are not two versions of one thing. They are different products, for different people, doing different jobs.

Office agents is a consumer feature. It is a toggle inside Claude’s Cowork settings that lets Claude carry one conversation across the Excel and PowerPoint add-ins, so it reads your spreadsheet in one app and builds slides in the other. An end user flips it on. No code. I covered it in a separate post on what office agents actually do.

Managed agents is a developer product. It is an API for building and running autonomous agents on Anthropic’s infrastructure, so Claude can read files, run commands, browse the web, and execute code for minutes or hours without you building the machinery underneath. A developer calls it. All code. The audiences barely overlap.

So if you searched “Anthropic managed agents” wanting the Office feature, you are in the wrong place, and the reverse is true too. This post is about the developer product: what the managed harness actually is, the brain-and-hands architecture that makes it worth attention, and when it is the right call over the alternatives.

What managed agents is, and what it is not

Claude Managed Agents is a pre-built agent harness that runs on managed infrastructure. Anthropic’s own docs set it against the Messages API: the Messages API gives you direct model access and you build your own agent loop, tool execution, and runtime; managed agents hands you that whole loop already built. You define what the agent is, and Anthropic runs it. Claude can read files, run shell commands, search the web, and execute code inside a secure environment, across a session that lasts minutes or hours. The harness also carries the performance work you would otherwise do by hand, including prompt caching and context compaction. The product reached public beta on April 9, 2026, and it is on by default for every Claude API account. That last detail is worth pausing on. There is no waitlist and no sales call. A normal API key plus one beta header is the entire entry requirement.

That openness is unusual, and it inverts what people expect. An enterprise-grade agent product usually arrives behind a gate: a tier you must qualify for, a quota to clear, a contract to sign. Managed agents has none of that. The reason is plain once you see it. Anthropic wants developers building on this harness, and a gate would only slow that down.

A managed agent is also not Claude Code. Claude Code is the interactive terminal tool a developer drives by hand. Managed agents is closer in spirit to a piece of cloud infrastructure: a place to run an autonomous Claude agent that you do not operate yourself. Hold the three apart. Office agents shares context across two Office apps for a person clicking around a spreadsheet. Claude Code is a coding tool you sit in front of. Managed agents runs a long autonomous task for a program that called it. Same company, adjacent weeks of announcements, three different buyers.

How the managed harness works

The harness rests on four ideas, and they are the whole mental model.

An agent is the definition you write once: the model, the system prompt, the tools, any MCP servers or skills, all referenced later by ID. The environment decides where sessions run, and it is either an Anthropic-managed cloud container or a self-hosted sandbox on infrastructure you control. A session is one live run of an agent inside an environment, working a single task. And events are the traffic between your application and the agent: your instructions, the agent’s tool results, the status updates along the way.
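To make the four ideas concrete, here is a minimal sketch of how they relate to each other. Every class and field name below is my own illustration, not Anthropic's API schema or SDK types; the point is only the shape: an agent and an environment are defined once and referenced by ID, a session ties one of each together, and events accumulate on the session.

```python
from dataclasses import dataclass, field
from typing import Literal

# Hypothetical illustration only: these classes mirror the four concepts in
# the text. They are NOT the real Anthropic API schema or SDK types.


@dataclass(frozen=True)
class Agent:
    """The definition you write once and reference later by ID."""
    id: str
    model: str
    system_prompt: str
    tools: tuple[str, ...] = ()
    mcp_servers: tuple[str, ...] = ()


@dataclass(frozen=True)
class Environment:
    """Where sessions run: Anthropic's cloud or your own sandbox."""
    id: str
    kind: Literal["anthropic_cloud", "self_hosted"]


@dataclass(frozen=True)
class Event:
    """One message between your application and the agent."""
    source: Literal["user", "agent"]
    kind: str  # e.g. "message", "tool_result", "status" (made-up labels)
    payload: str


@dataclass
class Session:
    """One live run of an agent inside an environment."""
    agent_id: str
    environment_id: str
    events: list[Event] = field(default_factory=list)

    def send(self, event: Event) -> None:
        self.events.append(event)


agent = Agent(
    id="agent_demo",
    model="placeholder-model",
    system_prompt="Triage support tickets.",
    tools=("bash", "web_search"),
)
env = Environment(id="env_demo", kind="self_hosted")
session = Session(agent_id=agent.id, environment_id=env.id)
session.send(Event("user", "message", "Summarize the open tickets."))
```

The session holds only IDs, which is why the same agent definition can be reused across many runs and why the environment can be swapped without touching the agent.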

The working loop is short to describe, and that is the point. You create an agent, create an environment, and start a session that references both. You send the task as an event, and Claude runs autonomously, calling tools and streaming results back over server-sent events. The session history is kept on Anthropic’s servers, so you can fetch the full record later, and you can send more events mid-run to steer the agent or interrupt it to change direction. Every Managed Agents request carries one beta header, managed-agents-2026-04-01, and the Anthropic SDK adds it for you. Sessions are stateful by design: they hold a filesystem and a conversation history, they survive pauses, and they resume cleanly. That statefulness is the product’s strength, and, as the next sections show, the root of its sharpest limitation.
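Two pieces of that loop are easy to sketch without guessing at endpoints: the beta header and the server-sent events stream. The header value is the one quoted above. The header names follow the public Messages API convention, and I am assuming the same names apply here; the SDK handles this for you anyway. The event names and payloads in the sample stream are invented for illustration. The parser itself only follows the standard SSE wire format.

```python
import json
from typing import Iterable, Iterator

BETA_HEADER_VALUE = "managed-agents-2026-04-01"  # value quoted in the text


def build_headers(api_key: str) -> dict[str, str]:
    """Headers for a raw HTTP call. Header names are an assumption carried
    over from the Messages API; the SDK adds the beta header itself."""
    return {
        "x-api-key": api_key,
        "anthropic-version": "2023-06-01",
        "anthropic-beta": BETA_HEADER_VALUE,
        "content-type": "application/json",
    }


def parse_sse(lines: Iterable[str]) -> Iterator[tuple[str, dict]]:
    """Yield (event_name, json_payload) pairs from raw SSE lines."""
    event_name = "message"  # the SSE default when no event: field is sent
    data_parts: list[str] = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line ends the current event.
            if data_parts:
                yield event_name, json.loads("\n".join(data_parts))
            event_name, data_parts = "message", []
        elif line.startswith(":"):
            continue  # comment line, often used as a keep-alive
        elif line.startswith("event:"):
            event_name = line[len("event:"):].strip()
        elif line.startswith("data:"):
            value = line[len("data:"):]
            data_parts.append(value[1:] if value.startswith(" ") else value)


# Invented sample stream; real event names and payloads may differ.
sample = [
    "event: status\n",
    'data: {"state": "running"}\n',
    "\n",
    ": keep-alive\n",
    "event: tool_result\n",
    'data: {"tool": "bash", "exit_code": 0}\n',
    "\n",
]
events = list(parse_sse(sample))
```

In practice you would feed `parse_sse` the line iterator from your HTTP client's streaming response and react to each event as it arrives.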

Inside that loop, Claude reaches for a fixed set of built-in tools: a Bash tool for shell commands, file operations for reading and editing files, and web search and fetch for pulling in outside information. You extend the set with your own MCP servers. The same harness is offered on the Claude Platform on AWS, with small differences in feature availability, so a team already standardized on AWS is not pushed onto a separate path.

The harness is still growing. Two features sit in research preview rather than general availability: MCP tunnels, for reaching private tools, and “dreaming,” where an agent reviews its past sessions to find patterns and improve. You request access to those separately. The core loop, though, is what most people will use, and it is available to everyone today.

The brain and the hands

Here is the part of managed agents actually worth slowing down for. The environment, the place the agent’s code runs, does not have to be Anthropic’s.

Anthropic describes the architecture as “decoupling the brain from the hands.” The brain is the agent loop: the reasoning, the choice of which tool to call next, the model inference, the prompt caching. That always runs on Anthropic. The hands are the execution layer: the sandbox where shell commands actually run, where files are read and written, where code executes. The hands can run elsewhere. On May 19, 2026, Cloudflare and Anthropic announced exactly this. With Cloudflare environments, the agent loop stays on Anthropic, but every tool call executes inside a Cloudflare sandbox, either a lightweight V8 isolate that boots in milliseconds or a full Linux microVM for heavier work. The split lets an enterprise keep the agent’s execution next to its own data and inside its own network controls, while still using Anthropic’s reasoning.

Figure: Claude managed agents. Anthropic runs the agent loop; an environment runs the sandbox and tools.

Why does the split matter so much? Because the thing most enterprises are nervous about is not Claude’s reasoning. It is what an autonomous agent touches when it runs. An agent that can run shell commands and read files can do real work, and real damage if handled carelessly. The brain-hands split lets a security team put the hands where they can watch them: a sandbox in their own cloud account, behind their own egress rules, with their own audit logging and credential injection. Cloudflare’s version adds outbound proxy policies and private tunnels so the agent reaches only the services you allowlist. The reasoning still happens at Anthropic, but the blast radius of the execution sits inside your perimeter. Containment like that is where reliable agent design starts.
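A toy model shows why the split helps with containment. In the sketch below, the "brain" only decides which tool calls to make, and the "hands" are a separate object that executes them, logs them, and enforces an allowlist. All names are hypothetical, the hosts are made up, and nothing here touches a real network or reflects how Anthropic or Cloudflare actually implement this.

```python
from dataclasses import dataclass, field
from typing import Protocol
from urllib.parse import urlparse


class Hands(Protocol):
    """Execution layer: anything that can carry out a tool call."""

    def execute(self, tool: str, argument: str) -> str: ...


@dataclass
class SelfHostedSandbox:
    """Hypothetical stand-in for a sandbox inside your own perimeter."""
    allowed_hosts: frozenset[str]
    audit_log: list[str] = field(default_factory=list)

    def execute(self, tool: str, argument: str) -> str:
        # Every call is logged before anything else happens.
        self.audit_log.append(f"{tool}: {argument}")
        if tool == "fetch":
            host = urlparse(argument).hostname or ""
            if host not in self.allowed_hosts:
                return f"blocked: {host} is not on the allowlist"
            # A real sandbox would make the request here.
            return f"fetched {argument}"
        return f"ran {tool}"  # placeholder for other tools


def brain(plan: list[tuple[str, str]], hands: Hands) -> list[str]:
    """Stand-in for the agent loop: it chooses tool calls but never
    executes them itself."""
    return [hands.execute(tool, argument) for tool, argument in plan]


sandbox = SelfHostedSandbox(allowed_hosts=frozenset({"tickets.internal.example"}))
results = brain(
    [
        ("fetch", "https://tickets.internal.example/42"),
        ("fetch", "https://unknown.example/x"),
    ],
    sandbox,
)
```

Because the loop depends only on the `Hands` interface, the policy and the audit trail live entirely on the side you control, whatever the reasoning side decides to attempt.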

Working out where the hands should run is an architecture decision worth getting right before you write code. If you want to think it through for your own setup, my door is open.

How this compares to building it yourself

Managed agents is one of several ways to put Claude to work, and the way to choose is to see what each one is for. The Messages API is the raw option: you get model access and you build the agent loop, the tool execution, and the runtime yourself. Maximum control, maximum work. The Claude Agent SDK sits in between, giving you Anthropic’s loop logic as a library you run on your own infrastructure, so you keep operational control without writing the loop from scratch. Managed agents goes furthest: Anthropic runs the loop and, by default, the infrastructure too. Claude Code subagents are a different thing again. A subagent is not a deployment product; it is a way to delegate work inside an interactive Claude Code session. If you are weighing those, I have written separately on subagents, parallel agents, and skills. The rule of thumb stays plain: the more of the harness you hand to Anthropic, the less you operate and the less you can change.

That tradeoff is the whole decision. Building on the Messages API means you own every part and can tune every part, which is right when your agent does something unusual that a generic harness would get wrong. Managed agents means you own almost none of the plumbing, which is right when your agent does something fairly ordinary and you would rather ship than maintain. Anthropic’s pitch for managed agents is “prototype to production in days rather than months,” and for a common-shaped agent that is a fair claim, because the months usually go into exactly the harness work managed agents removes. The question is never which option is best in the abstract. It is how unusual your agent is.

A worked example makes that concrete. Say you want an agent that triages incoming support tickets: read the ticket, check two internal systems, draft a reply, hand it to a human. That is an ordinary shape, it runs for a bounded stretch, and nothing about it needs a custom loop. Managed agents fits it cleanly. Now say you want an agent that runs a long, branching research process with its own bespoke memory model and an unusual control flow. A generic harness will fight you at every turn. That is Messages API or Agent SDK territory. The shape of the agent decides, not the size of the company building it.

When managed agents earns its place

Managed agents earns its place when three things hold at once: the agent runs long, its shape is fairly ordinary, and you would rather not operate sandbox infrastructure. A nightly job that investigates production errors, a research task that runs for an hour, an agent that builds and tests code. Those fit, and the early adopters point the same way. Notion is using managed agents for workspace AI. Asana uses it for automatic task planning, and Sentry runs it to analyze stack traces and diagnose bugs. None of those is an exotic agent. They are ordinary long-running tasks that nobody wanted to build a harness for.

But two limits belong in the decision before you commit. The first is data retention. Because sessions are stateful and stored on Anthropic’s servers, managed agents is not currently eligible for Zero Data Retention or for HIPAA Business Associate Agreement coverage. If your workload needs either, the Anthropic-managed cloud container is out, and a self-hosted environment moves from optional to mandatory. The second is lock-in. The more of the harness Anthropic runs, the more your agent depends on Anthropic-shaped concepts, and the harder it is to move later. That is not a reason to avoid the product. It is a reason to know what you are trading.

It is worth stating the anti-cases directly, because the wrong fit is expensive. Do not reach for managed agents for a short, synchronous task: a single prompt-and-response, a quick classification, anything that finishes in one model call. The managed harness is built for long, autonomous, multi-step work, and using it for a one-shot task pays the setup overhead of a full managed session to do something the Messages API does in a single call. Do not reach for it when your agent’s logic is unusual at its core either, because a managed harness is opinionated, and an unusual agent spends its life fighting those opinions. And do not reach for it for a workload that cannot meet the data-retention limits above. Managed agents is a sharp tool with a specific shape, and forcing a different shape into it is how a fast start becomes a slow rebuild.
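The fit test and the anti-cases can be written down as a short checklist. This is only my framing of the argument as code, with made-up field names; it returns the reasons a workload does not fit, so an empty list means the managed harness is a reasonable default.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Workload:
    """Hypothetical description of an agent workload."""
    runs_long: bool               # minutes or hours of multi-step work
    ordinary_shape: bool          # no bespoke loop or memory model
    wants_to_run_sandboxes: bool  # team is happy operating infrastructure
    needs_zdr_or_hipaa_baa: bool  # Zero Data Retention or HIPAA BAA required


def blockers(w: Workload) -> list[str]:
    """Reasons the managed harness is a poor fit; empty means it fits."""
    reasons: list[str] = []
    if not w.runs_long:
        reasons.append("short, synchronous task: use the Messages API")
    if not w.ordinary_shape:
        reasons.append("unusual core logic: consider the Messages API or Agent SDK")
    if w.wants_to_run_sandboxes:
        reasons.append("you want to operate the runtime: consider the Agent SDK")
    if w.needs_zdr_or_hipaa_baa:
        reasons.append("needs ZDR or a HIPAA BAA: the Anthropic-managed container is out")
    return reasons


ticket_triage = Workload(
    runs_long=True,
    ordinary_shape=True,
    wants_to_run_sandboxes=False,
    needs_zdr_or_hipaa_baa=False,
)
quick_classifier = Workload(
    runs_long=False,
    ordinary_shape=True,
    wants_to_run_sandboxes=False,
    needs_zdr_or_hipaa_baa=False,
)

triage_blockers = blockers(ticket_triage)
classifier_blockers = blockers(quick_classifier)
```

The ticket-triage workload comes back with no blockers, while the one-shot classifier is sent to the Messages API.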

Cost deserves a clear word too. You pay for the model tokens the agent consumes plus the managed runtime it uses, and an autonomous agent left to run is a different spending shape from a single API call. One Messages API request is one request. A managed session that runs for an hour, calling tools in a loop, can consume many times that while you are not watching. This is not an argument against the product. It is an argument for putting a ceiling on how long a session may run and checking the spend early, before a long-running agent quietly becomes a long-running invoice.
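The ceiling can be as simple as a running estimate checked against two limits. In this sketch the prices are placeholders I invented so the arithmetic is easy to follow, not Anthropic's rates, and the function names are hypothetical. The shape is tokens plus runtime, with a stop condition on either spend or elapsed time.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Prices:
    """Placeholder prices for illustration only; not real rates."""
    usd_per_million_input_tokens: float
    usd_per_million_output_tokens: float
    usd_per_runtime_hour: float


def session_cost(
    prices: Prices, input_tokens: int, output_tokens: int, runtime_seconds: float
) -> float:
    """Estimated spend so far: model tokens plus managed runtime."""
    token_cost = (
        input_tokens / 1_000_000 * prices.usd_per_million_input_tokens
        + output_tokens / 1_000_000 * prices.usd_per_million_output_tokens
    )
    runtime_cost = runtime_seconds / 3600 * prices.usd_per_runtime_hour
    return token_cost + runtime_cost


def should_stop(
    cost_usd: float, runtime_seconds: float, max_usd: float, max_seconds: float
) -> bool:
    """True once either ceiling is reached."""
    return cost_usd >= max_usd or runtime_seconds >= max_seconds


# Invented round numbers so the sum is easy to check by hand.
prices = Prices(
    usd_per_million_input_tokens=1.0,
    usd_per_million_output_tokens=2.0,
    usd_per_runtime_hour=0.5,
)
cost = session_cost(
    prices, input_tokens=2_000_000, output_tokens=500_000, runtime_seconds=3600
)
stop = should_stop(cost, runtime_seconds=3600, max_usd=3.0, max_seconds=7200)
```

With these placeholder numbers, an hour-long session costs 2.00 for input, 1.00 for output, and 0.50 for runtime, so it crosses a 3.00 ceiling well before the two-hour time limit would have stopped it.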

So where does this go? My read is that the brain-hands split is the part that lasts. The fully-managed version, everything on Anthropic, is the convenient on-ramp, and plenty of teams will start there because it is fast. But the teams that stay will be the ones that put the hands in their own environment, because data control and the freedom to leave both live on that side of the line. Anthropic clearly knows this. It built self-hosted sandboxes into the product on day one and partnered with Cloudflare within weeks of launch. The prediction is not that managed agents wins or loses as a whole. It is that “managed” quietly comes to mean managed reasoning with self-hosted execution for any company that has something to protect. The harness was always the easy part to give away. The execution is the part worth keeping.

About the Author

Amit Kothari is an experienced consultant, advisor, coach, and educator specializing in AI and operations for executives and their companies. With 25+ years of experience, he is the Co-Founder & CEO of Tallyfy® (raised $3.6m, the Workflow Made Easy® platform) and Partner at Blue Sheen, an AI advisory firm for mid-size companies. He helps companies identify, plan, and implement practical AI solutions that actually work. Originally British and now based in St. Louis, MO, Amit combines deep technical expertise with real-world business understanding.

Disclaimer: The content in this article represents personal opinions based on extensive research and practical experience. While every effort has been made to ensure accuracy through data analysis and source verification, this should not be considered professional advice. Always consult with qualified professionals for decisions specific to your situation.
