The short version
A signed BAA makes a Claude healthcare workflow legal. It does not make it safe. The engineering work is keeping protected health information away from the model wherever the task allows, and recording what the model saw when exposure cannot be avoided. Three design patterns do most of that work:
- De-identify before the model, using the HIPAA Safe Harbor or Expert Determination standard
- Keep PHI local, sending the model the smallest slice the task needs
- Log what the model saw, so an auditor can reconstruct every access
Search for how to use Claude in a healthcare setting and the guidance converges on three words: get a BAA. The advice is correct, and it is also where the useful part stops. A Business Associate Agreement is a contract. It is not an architecture. It tells you that protected health information is legally allowed to reach the vendor. It tells you nothing about how to build the workflow so the model sees as little of that PHI as the task can tolerate.
That gap is the subject of this post. The BAA is the legal floor. The design patterns are the engineering that sits on top of it, and they are what decides whether a breach of the model becomes a breach of your patients. There are three patterns worth knowing, plus the reference architecture they add up to. None of them is exotic. All of them come down to one instinct: treat PHI as something the model passes through, never something the model holds.
The BAA is the floor
A BAA and a safe workflow are different things, and conflating them is the most common mistake in healthcare AI. A Business Associate Agreement is the legal instrument that permits protected health information to reach a vendor, and Anthropic offers one. I have written separately on what a BAA covers for Claude Code, and the short version is that it is narrower than people assume. But even at its widest, a BAA only changes what is permitted. It does not change what is wise. A signed BAA makes it lawful to send a patient’s full record to a model. It does not make doing so a good idea, because every field of PHI that reaches the model is a field that now lives in one more place, travels one more hop, and appears in one more log. The legal floor says you may. The engineering question is how little you can get away with sending, and that question is answered by design patterns, not paperwork.
None of this makes the BAA optional. Without it, sending PHI to Claude is a violation on its own, and the broader picture of Claude and HIPAA starts there. The point is narrower and easy to miss. The BAA is a yes-or-no gate, and once you are through it, it stops helping. It does not shrink the data you send or watch how you send it, and it does not record what happened. Everything past the gate is architecture, and architecture is built from patterns.
The pull toward stopping at the BAA is understandable. The BAA is the part with a clear finish line. You request it, you sign it, and it is done, and a signed agreement feels like a milestone reached. The design patterns have no signing ceremony. They are ongoing engineering, and engineering does not announce its own completion. So the BAA gets the attention and the architecture gets deferred, which is exactly backward, because the architecture is the part that fails quietly.
De-identify before the model
The strongest pattern is the oldest one: do not send PHI you can avoid sending. HIPAA gives a precise standard for this. Its de-identification rule defines two methods. Safe Harbor removes eighteen specified categories of identifiers (names, dates more specific than the year, contact details, record numbers, biometric identifiers, and the rest) and requires that you have no actual knowledge the remaining data could identify someone; data that meets both conditions is no longer PHI under HIPAA. Expert Determination has a qualified expert, typically a statistician, determine and document that the re-identification risk is very small, which allows gentler techniques like date-shifting that keep more of the data useful. Applied to a Claude workflow, the pattern is to run de-identification before the model call. The model receives data that has already cleared one of those two standards, so what reaches Claude is not PHI at all. This does not fit every task. Some clinical work needs the real dates or the real identifiers. But for the large class of tasks that do not, de-identifying first turns a compliance problem into a non-problem.
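For structured records, the Safe Harbor side of this can be sketched in a few lines. What follows is a minimal illustration, not a compliant implementation: the field names, the `DROP_FIELDS` set, and the `populous_zip3` argument are all assumptions made up for the example, and a real record will carry identifiers this list does not anticipate.

```python
from datetime import date

# Hypothetical field names for illustration; a real schema needs its own review.
DROP_FIELDS = {"name", "address", "phone", "email", "ssn", "mrn", "health_plan_id"}


def safe_harbor_view(record: dict, populous_zip3: set) -> dict:
    """Return a copy of `record` with identifiers dropped or generalized.

    `populous_zip3` is an assumed, caller-supplied set of three-digit ZIP
    prefixes whose areas hold more than 20,000 people.
    """
    out = {}
    for key, value in record.items():
        if key in DROP_FIELDS:
            continue  # direct identifiers never leave
        if isinstance(value, date):
            out[f"{key}_year"] = value.year  # keep the year, drop month and day
        elif key == "age":
            out["age"] = "90+" if value >= 90 else value  # aggregate ages over 89
        elif key == "zip":
            prefix = str(value)[:3]
            out["zip3"] = prefix if prefix in populous_zip3 else "000"
        else:
            out[key] = value  # passes through; assumes the field was reviewed
    return out
```

The sketch deliberately skips harder corners, such as a birth year that would itself reveal an age over 89, and it only helps if it runs on your infrastructure before any prompt is assembled.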
The ordering is the part people get wrong. De-identification has to happen before the data reaches Claude, on infrastructure you control, not as something you ask the model to do. A model can be asked to strip names from text, and it will mostly succeed, but mostly is not a standard, and a model that is removing identifiers has by definition already received them. The pattern only works as a gate in front of the model. Run it after, and the PHI has already made the trip you were trying to prevent.
There is a hard case inside this pattern worth naming. De-identifying structured data, a database column of dates or a field of record numbers, is mechanical. De-identifying free-text clinical notes is not. A discharge summary mentions the patient by name in a sentence, references a spouse, names the referring physician, gives a town. The eighteen identifiers are all in there, scattered through prose. Stripping them reliably from narrative text is its own engineering problem, and it is the reason de-identification sometimes fails in practice even when the intent was right. If your healthcare workflow runs on free-text notes, budget real effort for that step. It is not a one-liner.
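To make the difficulty concrete, here is a rough sketch of the naive approach: regular expressions for identifiers that have a recognizable shape, plus a list of names already known from the structured record. The patterns, the `scrub_note` function, and the placeholder labels are assumptions for illustration only.

```python
import re

# Shape-based patterns only; these are illustrative, not exhaustive.
PATTERNS = {
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "PHONE": re.compile(r"\b\d{3}[-. ]\d{3}[-. ]\d{4}\b"),
    "EMAIL": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.-]+\b"),
    "DATE": re.compile(r"\b\d{1,2}/\d{1,2}/\d{2,4}\b"),
}


def scrub_note(text: str, known_names: list) -> str:
    """Replace shape-matched identifiers and known names with placeholders."""
    for label, pattern in PATTERNS.items():
        text = pattern.sub(f"[{label}]", text)
    # Longest names first so "Jane Doe" is replaced before "Doe".
    for name in sorted(known_names, key=len, reverse=True):
        if not name.strip():
            continue  # an empty name would match everywhere
        text = re.sub(rf"\b{re.escape(name)}\b", "[NAME]", text, flags=re.IGNORECASE)
    return text


note = "Jane Doe seen 03/14/2024. Call 555-123-4567. Husband Tom drove her from Salem."
print(scrub_note(note, ["Jane Doe"]))
```

Run on that sample, the patient's name, the date, and the phone number are replaced, but the spouse's name and the town survive, because nothing in their shape marks them as identifiers. That residue is the part that needs real effort, whether from a dedicated clinical de-identification tool, human review, or both.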
A second caution belongs with de-identification: it is not a one-time decision. The HIPAA Safe Harbor list is fixed, but your data is not. A new field gets added to the record, a new free-text section appears in the intake form, an integration starts pulling a column nobody reviewed, and a de-identification step that was correct last quarter now lets an identifier through. The pattern is only as good as its coverage of the data as it actually is today. So the de-identification step needs the same treatment as any other piece of safety-critical code: a test suite, a named owner, and a review every time the schema changes. Set-and-forget is exactly how PHI leaks.
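One way to catch schema drift is to make the de-identification step fail closed on any field nobody has classified. A minimal sketch, assuming a hypothetical `REVIEWED_FIELDS` registry that your team maintains alongside the schema:

```python
# Hypothetical registry: every field the de-identification step was reviewed for.
REVIEWED_FIELDS = {
    "name": "drop",
    "mrn": "drop",
    "admitted": "year_only",
    "diagnosis": "keep",
}


class UnreviewedFieldError(ValueError):
    """Raised when a record carries a field nobody has classified."""


def assert_schema_covered(record: dict) -> None:
    """Refuse to proceed if the record has fields missing from the registry."""
    unreviewed = sorted(set(record) - set(REVIEWED_FIELDS))
    if unreviewed:
        raise UnreviewedFieldError(f"unreviewed fields: {unreviewed}")
```

Called before de-identification, and exercised in the test suite against a sample of current records, a check like this turns a silent leak into a loud failure the day a new column appears.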
Keep PHI local
Some tasks need real PHI; a model summarizing an actual patient’s chart cannot work on a de-identified copy. For those, the second pattern is restraint about how much goes and how long it lingers. HIPAA already names the principle: minimum necessary, the rule that anyone handling PHI should touch only the slice a task requires. A model is no exception. The pattern is to keep the full record on infrastructure you control and send the model only the fields one specific step needs, rather than uploading the whole chart because uploading the whole chart is easier. Pair that with a Zero Data Retention arrangement, so the slice that does go is not retained after the request finishes. The combination is what local-first means in practice: the system of record stays yours, the model receives a deliberately small and short-lived view, and PHI never accumulates on the vendor side.
This is where retrieval architecture earns its place. Instead of handing the model a whole record, the workflow holds the record in a store you own and pulls only the specific fields a step needs, assembling a small, purpose-built prompt. The same discipline runs through any sound data-privacy implementation: the question is never what could the model use, it is what does this step actually require. A model asked to check a drug interaction needs the medication list. It does not need the name, the address, or the visit history, so it should never be handed them.
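In code, minimum necessary can be as plain as a per-task allowlist. The sketch below assumes a hypothetical `TASK_FIELDS` registry and made-up field names; the point is the shape, where a task with no declared allowlist gets nothing rather than everything.

```python
# Hypothetical task registry: each step names the only fields it may receive.
TASK_FIELDS = {
    "drug_interaction_check": ("medications",),
}


def minimum_necessary_view(task: str, record: dict) -> dict:
    """Project the full record down to the fields this task is allowed to see."""
    if task not in TASK_FIELDS:
        # Fail closed: an undeclared task receives no data at all.
        raise KeyError(f"no field allowlist defined for task {task!r}")
    return {field: record[field] for field in TASK_FIELDS[task] if field in record}


chart = {
    "name": "Jane Doe",
    "address": "1 Main St",
    "visit_history": ["2023-11-02", "2024-03-14"],
    "medications": ["warfarin", "aspirin"],
}
prompt_fields = minimum_necessary_view("drug_interaction_check", chart)
```

Only `prompt_fields` is used to assemble the prompt; the full `chart` never leaves the store it came from.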
There is a limit to how local you can keep things, and it is worth being plain about. The model itself runs on Anthropic’s infrastructure; local-first does not mean nothing leaves. It means the smallest possible slice leaves, for the shortest possible time, and the system of record never does. A team that hears local-first and pictures an air-gapped model has the wrong picture. The realistic version is a boundary you draw deliberately: the full record on your side, the smallest useful view crossing to the model and then gone. That is not the model running locally. It is the data discipline running locally, and for almost every healthcare workflow that is the boundary that actually matters.
Log what the model saw
When PHI does reach the model, the third pattern makes that fact reconstructable. HIPAA’s Security Rule requires audit controls, mechanisms that record activity on systems holding PHI, and a model call that handled PHI is exactly such activity. The pattern is to log it yourself, on your side of the boundary, rather than assuming the vendor’s logs will answer an auditor’s question months later. Each entry records the same few things: which user or process made the call, what PHI it included, when, and for what purpose. Done well, this log lets you answer the question every healthcare audit eventually asks, who accessed this patient’s information and why, even when the accessor was an automated step rather than a person. The audit pattern does not prevent exposure. It makes exposure accountable, and accountability is a HIPAA requirement in its own right, not an optional nicety.
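A minimal sketch of such an entry, using only the standard library. The class, function, and field names are assumptions for illustration. One design choice to flag: this version records which PHI fields were sent and a patient reference, not the values themselves, so the log does not become a second copy of the chart. Whether that level of detail satisfies your auditors is a question for your compliance team.

```python
import io
import json
from dataclasses import asdict, dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class ModelAccessEntry:
    actor: str          # user or process that made the call
    patient_ref: str    # internal reference to the patient
    phi_fields: tuple   # names of the PHI fields included in the prompt
    purpose: str        # why the call was made
    timestamp: str      # when, in UTC, ISO 8601


def make_entry(actor, patient_ref, phi_fields, purpose, now=None):
    """Build one audit entry; `now` is injectable so the function is testable."""
    now = now or datetime.now(timezone.utc)
    return ModelAccessEntry(actor, patient_ref, tuple(sorted(phi_fields)), purpose, now.isoformat())


def append_entry(entry: ModelAccessEntry, log_file) -> None:
    """Append the entry as one JSON line to any writable text stream."""
    log_file.write(json.dumps(asdict(entry)) + "\n")


# Usage with an in-memory stream standing in for durable storage.
log = io.StringIO()
entry = make_entry("interaction-checker", "patient-0042", ["medications"], "drug interaction check")
append_entry(entry, log)
```

In a real workflow the write would happen in the same code path as the model call, so a call that touched PHI cannot succeed without leaving an entry behind.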
One detail decides whether this pattern works: the log has to be yours. A vendor keeps its own records, but those records were built to answer the vendor’s questions, not an auditor’s, and you cannot count on being able to query them, export them, or trust them to be complete for your purposes. The audit log that satisfies a HIPAA reviewer is the one your own system writes, at your own boundary, in a shape you control. Build it into the workflow from the start, not as something to reconstruct later.
A useful audit log also has to outlive the workflow that wrote it. HIPAA requires compliance documentation to be kept for six years, and audit records are commonly held to a similar horizon, so the log cannot be a rolling buffer that ages out after a sprint. It needs its own durable storage and its own retention policy, and it needs access controls of its own, because the log itself describes who saw PHI and is sensitive for that reason. The pattern is small to state and easy to underbuild: write the entry, store it somewhere durable, protect it like the data it describes, and be ready to search it on the day an auditor asks.
Putting the patterns together
Stack the three patterns and a reference architecture appears. The system of record, the database holding patient data, stays on infrastructure you control. A de-identification step sits between that store and any model call: for the tasks that allow it, PHI is stripped to the Safe Harbor or Expert Determination standard before anything leaves. For the tasks that need real identifiers, a minimum-necessary step sends only the required slice, under a Zero Data Retention arrangement, and any re-identification happens locally after the model returns. Every call that touched PHI writes an entry to an audit log you own. Claude sits in the middle of that flow doing the reasoning work, and at no point does it become the place PHI lives. That is the whole design goal in one line: the model is a processor that PHI passes through, never a store where PHI rests.
One step in that flow deserves a second look: re-identification. When a task ran on de-identified data but the result needs to point back to a real patient, the mapping from the de-identified token to the real identity has to live somewhere, and that somewhere is sensitive. Keep that mapping on your side, never in the prompt and never in the model’s view. The model produces a result keyed to an anonymous token; your own system, after the model is done, joins that token back to the patient. Done that way, the model never holds both the analysis and the identity at once, which is the whole point of de-identifying in the first place.
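The token round trip can be sketched as follows. Everything here is hypothetical: `TokenVault` is an in-memory stand-in for a durable, access-controlled mapping store, and `stand_in_model` is a placeholder for the real model call, not any actual API.

```python
import secrets


class TokenVault:
    """Holds the token-to-patient mapping on your side of the boundary."""

    def __init__(self) -> None:
        self._by_token = {}
        self._by_patient = {}

    def tokenize(self, patient_id: str) -> str:
        if patient_id not in self._by_patient:
            token = "pt_" + secrets.token_hex(8)  # random, not derived from the ID
            self._by_patient[patient_id] = token
            self._by_token[token] = patient_id
        return self._by_patient[patient_id]

    def resolve(self, token: str) -> str:
        return self._by_token[token]


def stand_in_model(payload: dict) -> dict:
    """Placeholder for the real model call; returns a finding keyed to the token."""
    return {"token": payload["token"], "finding": "follow-up recommended"}


def run_task(vault: TokenVault, patient_id: str, deidentified_data: dict) -> dict:
    token = vault.tokenize(patient_id)
    # Only the token and de-identified data cross the boundary.
    result = stand_in_model({"token": token, **deidentified_data})
    # The join back to the real patient happens locally, after the model returns.
    return {"patient_id": vault.resolve(result["token"]), "finding": result["finding"]}
```

The token is random rather than a hash of the patient ID, so nothing on the model's side can be reversed into an identity; the cost is that the vault itself becomes sensitive and needs the same protection as the record.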
Designing that data flow, and deciding task by task which PHI the model actually needs, is the kind of architecture work worth getting right early. Blue Sheen helps teams think it through.
Where does this go? My read is that PHI-minimizing architecture stops being a specialist concern and becomes the default shape of healthcare AI. The early approach, send the model everything and rely on the BAA, works until the first audit or the first breach, and then it stops working all at once. The teams building healthcare AI that lasts are designing as though the model should see the least PHI the task can run on, because that is both the safer system and the cheaper one to defend. That holds whether you are a hospital network or a small practice adopting AI for the first time; the patterns scale down as cleanly as they scale up. A BAA will always be necessary. It will also, increasingly, be the least interesting part of a competent healthcare AI design. The interesting part is the architecture that keeps a breach of the model from becoming a breach of your patients.



