Amit Kothari CEO of Tallyfy, AI advisor at Blue Sheen

Self-hosted vs managed AI agents is a governance call

In brief

The choice between self-hosted and managed AI agents gets treated as build versus buy, a cost question. It is not. It is a governance decision about where your data goes, what you can audit, and whether you can leave. Here is what each path gives you and how to decide.

May 20, 2026 · AI

Build versus buy. For most software that decision turns on cost and time: is it cheaper and faster to run our own, or to pay for a managed service? People reach for AI agents the same way and ask the same question. The question is not wrong. It is just not the one that decides anything.

The choice between self-hosted and managed AI agents is a governance decision. It is about where your data goes when an agent runs, what record you can produce of what the agent did, and whether you can leave the arrangement later without rebuilding everything. Cost sits downstream of all three. A managed service you are not allowed to use for regulated data is not cheap. It is unavailable, at any price.

So this post does not rank the two on price (I do that in a separate piece on the managed-agent cost crossover). It compares them on what actually settles the decision. Two models exist: managed agents, where a vendor runs the harness for you, and self-hosted agents, where you run the framework yourself. What follows is what each one really gives you, what really decides between them, and how to make the call.

Why this is not a cost question

The build-versus-buy reflex comes from ordinary software, where the two options really are interchangeable and price really is the variable. A managed database and a self-hosted database store the same rows; you pick on cost and effort. AI agents break that assumption. An agent is not a passive store. It reads data, makes decisions, and takes actions, often on systems that matter. Where that happens, and who can see a record of it, are not cost line-items. They are governance facts. A managed agent service might be the cheapest and fastest option available and still be the wrong one, because your regulator, your contract, or your own risk policy does not allow the data to leave your control. Cost cannot rescue an option that compliance has already ruled out. So the first question is never “which is cheaper.” It is “which is allowed,” and only then “which is better.”

This is why the cost-first instinct misleads people. They run a price comparison, pick the cheaper option, start building, and only later discover a constraint that was always there. A data-residency clause in a customer contract. An audit requirement from a regulator. A security policy that forbids sending certain records to an outside service. Discovered late, every one of those forces a rebuild. The cost comparison was real, and it was also irrelevant, because it answered a question that came second. Governance comes first, and the rest of this post is about reading your governance position clearly enough to choose well.

The pull toward the cost frame is understandable, because cost is the one variable that is easy to put in a spreadsheet. Governance is not. There is no clean number for “can we legally do this” or “what happens to us in an audit,” so those questions get deferred while the comparison that does fit a spreadsheet gets done first. That is the trap in one sentence: the easy-to-quantify question crowds out the one that actually decides. A useful discipline is to refuse to open the cost spreadsheet until the governance questions have a written answer. Not a vague sense, a written answer, because a constraint nobody wrote down is a constraint somebody will forget.

The managed path

Take Anthropic’s Claude Managed Agents as the clearest example of the managed model. You define the agent, and Anthropic runs the harness: the agent loop, the tool execution, the sandbox, the state. Anthropic’s pitch is prototype to production in days rather than months, and for an ordinary agent that is a fair claim, because the months usually go into building exactly the harness a managed service hands you. What you give up is location. By default the agent loop and the execution both run on Anthropic’s infrastructure, and the session state, including a filesystem and a full conversation history, is stored on Anthropic’s servers. That storage has a concrete consequence: because managed agents is stateful by design, it is not currently eligible for Zero Data Retention or for HIPAA Business Associate Agreement coverage. For some workloads that single line ends the conversation. Healthcare data under HIPAA is the obvious case, but it is not the only one. A financial firm whose regulator expects records to stay inside a controlled boundary can hit the same wall. So can a government contractor with data-handling clauses, or a company whose own customers were promised their data would not pass to sub-vendors without consent. None of those is exotic.

Notice what the managed path is good at, because it is real. It removes an entire category of work. You do not build the loop, you do not operate the sandbox, you do not carry the reliability and scaling burden. For a team whose constraint is shipping speed, and whose data has no special handling rules, that is a strong offer and the cheaper option in any fair accounting, once you price your own engineering time. The managed path is not a weak choice. It is a choice with a governance bill attached, and the bill is paid in control.

Be concrete about who the managed path fits, because the answer is a lot of teams. A startup shipping its first AI feature fits it. So does an internal tool that touches no regulated data, and so does a team small enough that operating its own agent infrastructure would consume the very engineers who should be building the product. For all of those, managed is not a compromise; it is the correct call. The mistake is not choosing managed. The mistake is choosing it without checking whether a governance constraint quietly rules it out. Run the check, get a clean result, and the speed the managed path buys is yours to keep with no asterisk attached.

The self-hosted path

Self-hosted means you run the agent framework yourself, on infrastructure you control. The mature options are open source. LangGraph, from the LangChain team, models an agent as a graph of states and has become a common default for stateful production workflows in regulated industries. CrewAI organizes work around role-based agents and is quick to get a prototype running. LlamaIndex grew out of retrieval and is strongest when pulling from your own data is the central job. Whichever you pick, the shape of the deal is the same. The agent loop runs in your virtual private cloud, the data never leaves your boundary, and every action the agent takes is logged where your own tools can read it. The price is operational. You own the sandboxing, the upgrades, the scaling, and the reliability work. Nobody hands you that harness. You build and maintain it, and that is real, ongoing engineering.
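To make “the harness” concrete, here is a minimal sketch of the kind of loop a self-hosted team ends up owning, with every action appended to a log file on a disk you control. It is not how LangGraph, CrewAI, or LlamaIndex implement their loops. The names run_agent, stub_policy, and the "shout" tool are hypothetical, and stub_policy is a stand-in for a real model call.

```python
import json
import tempfile
import time
from pathlib import Path
from typing import Callable

# Hypothetical types for this sketch: a tool maps text to text, and the
# policy (a stand-in for a model call) picks the next tool and its input.
Tool = Callable[[str], str]
Policy = Callable[[str], tuple[str, str]]


def run_agent(task: str, decide: Policy, tools: dict[str, Tool],
              log_path: Path, max_steps: int = 5) -> str:
    """Run the loop locally, appending one JSON line per action to log_path."""
    observation = task
    with log_path.open("a", encoding="utf-8") as log:
        for step in range(max_steps):
            tool_name, tool_input = decide(observation)
            finished = tool_name == "finish"
            result = tool_input if finished else tools[tool_name](tool_input)
            # The record lands in a file you own, readable by your own tools.
            log.write(json.dumps({
                "ts": time.time(), "step": step, "tool": tool_name,
                "input": tool_input, "result": result,
            }) + "\n")
            if finished:
                return result
            observation = result
    raise RuntimeError("step budget exhausted without a final answer")


def stub_policy(observation: str) -> tuple[str, str]:
    """Stand-in for the model: uppercase the text once, then finish."""
    if observation.isupper():
        return ("finish", observation)
    return ("shout", observation)


with tempfile.TemporaryDirectory() as tmp:
    log_file = Path(tmp) / "actions.jsonl"
    answer = run_agent("ship it", stub_policy, {"shout": str.upper}, log_file)
    records = [json.loads(line)
               for line in log_file.read_text(encoding="utf-8").splitlines()]

print(answer)
print([r["tool"] for r in records])
```

Even this toy has no sandboxing, retries, or scaling story, which is the point: those are the parts you sign up to build and keep running.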

People underrate that operational price, and then resent it. A self-hosted agent platform is not a one-time build. It is a system you run: patched, monitored, scaled, kept reliable while the frameworks underneath it move fast. If you want the agent libraries compared directly, my piece on LangChain and LlamaIndex goes deeper. The self-hosted path buys you control over your data and your own exit. It charges you in standing engineering effort. That is the trade, stated plainly, and it is a fair trade for the team that actually needs what it buys.

It is worth naming what self-hosting actually demands, because teams underestimate it twice. The first underestimate is the build: standing up a self-hosted agent platform is a real project, not a weekend. The second, larger one is the run: the frameworks move fast, the model providers change their APIs, the security patches keep coming, and someone has to own all of that indefinitely. A self-hosted agent platform with no clear owner does not stay self-hosted in any useful sense; it slowly rots into a liability. So the real question for the self-hosted path is not “can we build it.” It is “will we still be maintaining it well in two years.” If the answer is no, the control it offered was never real.

What actually decides it

Four factors decide this, and none of them is price. Data residency: does regulation or contract require the data an agent touches to stay inside a boundary you control? Audit: when someone asks what the agent did six months ago, can you produce the record, or does it sit in a vendor system you cannot fully query? Control: when the agent needs an unusual loop or a custom guardrail, can you change the machinery, or are you held to what the managed harness exposes? Exit: if you have to move later, how much rebuilding does leaving cost? Run those four questions over your own situation. If all four come back comfortable, governance is not constraining you, and you should choose on speed and effort, where managed usually wins. If even one comes back hard, that factor, not cost, has already made your decision, and it points toward self-hosted execution.
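The four questions can be written down as a checklist that refuses to return a result until each one has a written answer. This is only a sketch of the discipline, not a compliance tool: FactorAnswer, hard_factors, and the two example positions are hypothetical, and deciding whether a factor is actually hard remains a human judgment.

```python
from dataclasses import dataclass

# The four factors named in the text, in the order they are asked.
FACTORS = ("data_residency", "audit", "control", "exit")


@dataclass(frozen=True)
class FactorAnswer:
    hard: bool            # True when this factor is a binding constraint
    written_answer: str   # the written reasoning, not a vague sense


def hard_factors(position: dict[str, FactorAnswer]) -> list[str]:
    """Return the factors that bind; refuse if any lacks a written answer."""
    unanswered = [
        f for f in FACTORS
        if f not in position or not position[f].written_answer.strip()
    ]
    if unanswered:
        raise ValueError("no written answer for: " + ", ".join(unanswered))
    return [f for f in FACTORS if position[f].hard]


# Hypothetical example: an internal tool with no special handling rules.
startup = {
    "data_residency": FactorAnswer(False, "No contract or regulator restricts location."),
    "audit": FactorAnswer(False, "No external audit applies to this tool."),
    "control": FactorAnswer(False, "A standard loop is enough."),
    "exit": FactorAnswer(False, "A rewrite later is acceptable."),
}

# Hypothetical example: a clinic handling patient records.
clinic = {
    "data_residency": FactorAnswer(True, "Patient data must stay inside our boundary."),
    "audit": FactorAnswer(True, "We must produce agent records on demand."),
    "control": FactorAnswer(False, "A standard loop is enough."),
    "exit": FactorAnswer(False, "A rewrite later is acceptable."),
}

print(hard_factors(startup))  # → []
print(hard_factors(clinic))   # → ['data_residency', 'audit']
```

An empty list means governance is not constraining you and speed and effort can decide. Any entry in the list means that factor has already made the call.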

There is also a middle path, and it is worth knowing before you treat this as binary. Anthropic’s managed agents can run with a self-hosted sandbox: the agent loop, the brain, stays on Anthropic’s infrastructure, while the sandbox where code actually executes, the hands, runs on infrastructure you control. That hybrid resolves the data-residency factor without making you build the whole loop. It does not resolve all four. The reasoning still happens at the vendor, so a workload that cannot send anything at all to an outside model is still a fully self-hosted job. But for the common case, where the constraint is about where execution and stored data sit rather than the model call itself, the hybrid is often the right answer, and it is why this is not a clean two-way split.

Mapping your own four factors onto a real architecture is the kind of work Blue Sheen does with clients.

Of the four factors, audit is the one teams discover too late, so it deserves a closer look. The question is not whether logs exist; both paths produce logs. The question is whether you can produce, on demand, a complete and queryable record of what an agent did, in a form an auditor accepts, without depending on a vendor’s cooperation and a vendor’s export format. On the self-hosted path that record is yours by construction. On the managed path it is yours only to the extent the vendor exposes it. For a workload that will face a real audit, that difference is not a detail. It is the factor, and it usually points the same way data residency does.

“By construction” means something specific here. If every action the agent takes lands as a commit, the audit trail is the commit log: what changed, when, and the reasoning sitting next to the change itself, queryable offline with no vendor in the loop. When an auditor asks what an agent did on a given day, the answer is a git history they can read directly, and it survives a change of infrastructure because it was never the vendor’s to hand back.
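A rough sketch of that idea, assuming git is installed and on the PATH: each agent action writes the new state to a file and commits it, with the action as the commit subject and the reasoning as the commit body. The helper names record_action and audit_trail, the state.json file, and the example actions are all hypothetical. A real system would also need to decide what goes into the repository and who may rewrite history.

```python
import json
import shutil
import subprocess
import tempfile
from pathlib import Path

# Identity and signing are set per command so the sketch does not depend
# on any global git configuration. The identity is a placeholder.
GIT = ["git", "-c", "user.name=agent", "-c", "user.email=agent@example.invalid",
       "-c", "commit.gpgsign=false"]


def git(repo: Path, *args: str) -> str:
    """Run one git command inside repo and return its stdout."""
    done = subprocess.run([*GIT, *args], cwd=repo, check=True,
                          capture_output=True, text=True)
    return done.stdout


def record_action(repo: Path, action: str, reasoning: str, state: dict) -> None:
    """Write the new state and commit it, reasoning in the commit body."""
    (repo / "state.json").write_text(
        json.dumps(state, indent=2, sort_keys=True) + "\n", encoding="utf-8")
    git(repo, "add", "state.json")
    git(repo, "commit", "-q", "--allow-empty", "-m", action, "-m", reasoning)


def audit_trail(repo: Path) -> list[dict]:
    """Read the trail back, oldest first, entirely offline."""
    # %x1f separates fields and %x1e separates commits in the log output.
    out = git(repo, "log", "--reverse", "--format=%H%x1f%aI%x1f%s%x1f%b%x1e")
    trail = []
    for chunk in out.split("\x1e"):
        if not chunk.strip():
            continue
        commit, when, action, reasoning = (p.strip() for p in chunk.split("\x1f"))
        trail.append({"commit": commit, "when": when,
                      "action": action, "reasoning": reasoning})
    return trail


def demo() -> list[dict]:
    with tempfile.TemporaryDirectory() as tmp:
        repo = Path(tmp)
        git(repo, "init", "-q")
        record_action(repo, "refund order 1042",
                      "customer reported a duplicate charge",
                      {"order_1042": "refunded"})
        record_action(repo, "close ticket 77",
                      "refund confirmed by the payment system",
                      {"order_1042": "refunded", "ticket_77": "closed"})
        return audit_trail(repo)


if shutil.which("git"):
    for entry in demo():
        print(entry["when"], entry["action"], "|", entry["reasoning"])
```

The same record is available to an auditor through plain git log, with git show on any commit giving the exact change that the action made.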

Making the call

Put it together and the method is straightforward. Do not start from a feature comparison. Start from your governance position, the four factors, and let that position narrow the field before you compare anything else.

Figure: a decision path. Hard governance rules point to self-hosted or hybrid; no rules and a standard loop point to managed.

If you have hard governance rules, data that must stay put or actions that must be auditable on your terms, you need self-hosted execution, either fully self-hosted or the hybrid. If you have no hard rules, the question becomes engineering: an unusual agent loop favors a self-hosted framework for the control it gives, and an ordinary loop favors the managed harness for the speed. Cost enters only here, at the end, as a tie-breaker between options that governance has already cleared. When you reach that tie-breaker, my separate piece on the managed-agent cost crossover covers how the cost actually works out, and why the hourly rate is the wrong number to start from.
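The order of operations matters more than the branches, so here it is as a small sketch: governance filters first, the shape of the loop decides when governance is silent, and cost only breaks a tie among options governance has cleared. AgentPath, cleared_by_governance, and choose_path are hypothetical names, the cost figures are made-up units, and treating “may the workload call an outside model” as the test between hybrid and full self-hosting is my simplification of the hybrid caveat above.

```python
from enum import Enum


class AgentPath(Enum):
    MANAGED = "managed"
    HYBRID = "hybrid"
    SELF_HOSTED = "self-hosted"


def cleared_by_governance(hard_rules: bool,
                          outside_model_allowed: bool) -> list[AgentPath]:
    """Step 1: governance narrows the field before anything else is compared."""
    if not hard_rules:
        return [AgentPath.MANAGED, AgentPath.HYBRID, AgentPath.SELF_HOSTED]
    if outside_model_allowed:
        # Execution must be self-hosted, but reasoning may stay at the vendor.
        return [AgentPath.HYBRID, AgentPath.SELF_HOSTED]
    return [AgentPath.SELF_HOSTED]


def choose_path(hard_rules: bool, outside_model_allowed: bool,
                unusual_loop: bool, cost: dict[AgentPath, float]) -> AgentPath:
    cleared = cleared_by_governance(hard_rules, outside_model_allowed)
    if len(cleared) == 1:
        return cleared[0]
    if not hard_rules:
        # Step 2: with no hard rules the question is engineering, not price.
        return AgentPath.SELF_HOSTED if unusual_loop else AgentPath.MANAGED
    # Step 3: cost is only a tie-breaker among options already cleared.
    return min(cleared, key=lambda path: cost[path])


# Illustrative, made-up cost units (engineering time included); not real prices.
cost = {AgentPath.MANAGED: 1.0, AgentPath.HYBRID: 2.0, AgentPath.SELF_HOSTED: 5.0}

print(choose_path(False, True, False, cost).value)  # → managed
print(choose_path(False, True, True, cost).value)   # → self-hosted
print(choose_path(True, True, False, cost).value)   # → hybrid
print(choose_path(True, False, False, cost).value)  # → self-hosted
```

Notice that the cost table is never consulted until the last line of choose_path, and in the fully constrained case it is never consulted at all.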

For a lot of teams the answer really is managed, and they should not talk themselves out of it. A startup building an internal tool has no data-residency clause to honor and no auditor waiting. For that team governance is not a constraint, and self-hosting would just be operational cost with no governance return. The method here is not a bias toward self-hosting. It is a filter. For the unconstrained team it returns managed quickly, and that is the right answer, cleanly reached.

The hybrid deserves one more push, because it is the option teams skip and should not. Most organizations that think they need full self-hosting actually have one binding constraint, usually data residency or execution control, and a single binding constraint is exactly what the hybrid resolves: managed reasoning, self-hosted execution. Reaching straight for full self-hosting when the hybrid would have done means taking on the entire operational burden to solve a problem a partial solution already solved. Before committing to run everything yourself, work out precisely which factors are hard. If it is one, the hybrid is probably your answer, and it is a far lighter thing to own.

The implication is the part worth holding onto. A team that picks managed because it is cheap and fast, and then discovers a governance constraint that was there the whole time, pays twice: once to build on the managed service, and again to rebuild self-hosted when the constraint finally surfaces. The cost comparison they ran at the start did not save them money. It cost them a rebuild. Decide the governance question first, deliberately, while it is still cheap to decide. That is the whole discipline. The track switch is easy to throw before the train arrives and very hard to throw after.

About the Author

Amit Kothari is an experienced consultant, advisor, coach, and educator specializing in AI and operations for executives and their companies. With 25+ years of experience, he is the Co-Founder & CEO of Tallyfy® (raised $3.6m, the Workflow Made Easy® platform) and Partner at Blue Sheen, an AI advisory firm for mid-size companies. He helps companies identify, plan, and implement practical AI solutions that actually work. Originally British and now based in St. Louis, MO, Amit combines deep technical expertise with real-world business understanding.

Disclaimer: The content in this article represents personal opinions based on extensive research and practical experience. While every effort has been made to ensure accuracy through data analysis and source verification, this should not be considered professional advice. Always consult with qualified professionals for decisions specific to your situation.
