A sales leader types one line into your shiny new analytics agent: “why did Apex drop last quarter?”
Six words. Feels trivial. It is the hardest thing in the whole system.
Which Apex? You have Apex Industries, Apex Logistics, and a parent group, also known as Apex, that rolls up nine subsidiaries, some of which are not really Apex at all. Drop in what: revenue, order volume, or margin? Last quarter on what calendar, fiscal or calendar-year, and relative to what: last year, budget, or the prior quarter? The agent has to settle every one of those before it touches a single row, and if it guesses wrong on any of them it will still produce a clean, formatted, completely wrong answer.
That gap, between what the user said and what the user meant, is where analytics AI actually lives or dies. We spend our attention on the model’s reasoning. The reasoning is rarely the problem.
Why this is worse than normal search
Google forgives a vague query because it shows you ten results and lets you pick. You do the disambiguation, in your head, in half a second, by scanning.
An analytics agent does not get that grace. It returns one number, or one narrative, and people act on it. The whole value proposition is “you do not have to go digging.” So the agent has to do the digging-and-deciding that you used to do yourself, and it has to do it before it answers, not after.
This is the part that surprises teams. They assume the intelligence is in the analysis. Most of the intelligence has to be in the intake.
The three things that have to be pinned
Strip a business question down and you find three kinds of ambiguity, every time.
- The entity. Which customer, which product, which plant, which rep. This is the nastiest because names are a swamp. The same company shows up as Apex Inc., Apex Inc with no period, Apex International, and a misspelling someone typed into the order system at 4pm on a Friday.
- The window. What date range, on which calendar. “This quarter” is not a fact. It is a question.
- The measure and its scope. Revenue gross or net of credits. Including or excluding intercompany. At the group level or the legal-entity level.
Get all three right and the analysis is almost mechanical. Get one wrong and you have built a very expensive way to mislead people.
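To make the window point concrete, here is a minimal sketch of how "last quarter" resolves to different dates depending on the calendar. The function names and the February fiscal start are assumptions for illustration, not anything from a real system.

```python
from datetime import date, timedelta
from typing import Tuple


def quarter_bounds(start_year: int, quarter: int, fiscal_start_month: int = 1) -> Tuple[date, date]:
    """First and last day of a quarter, for a year that begins in fiscal_start_month.

    start_year is the calendar year in which that (fiscal) year begins.
    """
    # Zero-based month offset of the quarter's first month, counted from January of start_year
    offset = (fiscal_start_month - 1) + 3 * (quarter - 1)
    start = date(start_year + offset // 12, offset % 12 + 1, 1)
    end_offset = offset + 3
    next_start = date(start_year + end_offset // 12, end_offset % 12 + 1, 1)
    return start, next_start - timedelta(days=1)


def last_quarter(today: date, fiscal_start_month: int = 1) -> Tuple[date, date]:
    """Resolve 'last quarter' relative to today on the given calendar."""
    months_into_year = (today.month - fiscal_start_month) % 12
    start_year = today.year if today.month >= fiscal_start_month else today.year - 1
    current_quarter = months_into_year // 3 + 1
    if current_quarter == 1:
        # Previous quarter is Q4 of the prior (fiscal) year
        return quarter_bounds(start_year - 1, 4, fiscal_start_month)
    return quarter_bounds(start_year, current_quarter - 1, fiscal_start_month)


today = date(2024, 5, 15)
print(last_quarter(today))                        # calendar year: Jan 1 to Mar 31, 2024
print(last_quarter(today, fiscal_start_month=2))  # fiscal year starting in February: Feb 1 to Apr 30, 2024
```

Same words, same day, two different date ranges. The agent has no way to know which one you meant unless the calendar is pinned somewhere.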
Entity resolution is the swamp
The hardest of the three, by a distance, is the entity. And it is hard for a reason that has nothing to do with AI.
Your customers do not have one true name. They have a name in each system they touch, plus the name the salesperson uses, plus the name on the invoice, plus whatever the buyer’s parent company is called this year after the last reorg. The word “Delta” could be an airline, a faucet brand, or a dental plan. The word “Jaguar” is an animal and a car. Inside one company’s data, “Apex” might be four unrelated buyers and one umbrella group that contains three of them.
This is the master data problem, and people have been fighting it since long before language models. The usual tools are an identity service like Dun & Bradstreet to anchor real-world entities by address, a master record that ties variants together, and a group name that bundles subsidiaries. None of that is AI. It is patient, unglamorous data work, and it is the foundation the agent stands on.
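A rough sketch of what that foundation looks like in its simplest form: an alias table that ties name variants to one master record, plus a group table for rollups. Every ID, name, and table here is hypothetical. A real setup would live in a database and be anchored to an external identity service.

```python
import re
from typing import Optional, Set

# Hypothetical alias table: normalized name variant -> master record ID
ALIASES = {
    "apex inc": "M-001",
    "apex incorporated": "M-001",
    "apex industries": "M-001",
    "apex logistics": "M-002",
    "apex international": "M-003",
}

# Hypothetical group rollup: group ID -> master records that belong to it
GROUPS = {"G-APEX": {"M-001", "M-002"}}


def normalize(name: str) -> str:
    """Lowercase, drop punctuation, collapse whitespace."""
    cleaned = re.sub(r"[^\w\s]", "", name.lower())
    return " ".join(cleaned.split())


def master_id(name: str) -> Optional[str]:
    """Return the master record for a name variant, or None if it is unknown."""
    return ALIASES.get(normalize(name))


def group_members(group_id: str) -> Set[str]:
    """Return the master records that roll up into a group."""
    return GROUPS.get(group_id, set())


print(master_id("Apex Inc."))      # M-001
print(master_id("APEX   Inc"))     # M-001
print(master_id("Apex Holdings"))  # None
```

Note that "Apex International" is deliberately left out of the group here. Whether it belongs is a business decision someone has to make and record, not something the lookup can infer from the name.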
Here is the trap. If you skip it and let the agent “just figure out” which Apex, it will sometimes silently roll up unrelated companies into one total. The answer will look right. It will be nonsense. And nobody will catch it, because the whole point was that nobody was going to check the underlying rows.
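Here is a small illustration of that trap, using made-up rows. The naive version matches on the name and sums whatever it finds. The guarded version, one possible approach among several, refuses to total rows that span more than one master record.

```python
from collections import defaultdict

# Hypothetical order rows: (customer_name, master_id, revenue)
ORDERS = [
    ("Apex Industries", "M-001", 120.0),
    ("Apex Logistics", "M-002", 80.0),
    ("Apex Dental Plan", "M-017", 45.0),
]


class AmbiguousEntity(Exception):
    """Raised when a search term matches more than one master record."""


def naive_total(term: str) -> float:
    """Sum every row whose name contains the term. Looks fine, silently mixes companies."""
    return sum(rev for name, _, rev in ORDERS if term.lower() in name.lower())


def guarded_total(term: str) -> float:
    """Sum matching rows only when they all belong to a single master record."""
    by_master = defaultdict(float)
    for name, mid, rev in ORDERS:
        if term.lower() in name.lower():
            by_master[mid] += rev
    if len(by_master) > 1:
        raise AmbiguousEntity(f"'{term}' matches {sorted(by_master)}")
    return sum(by_master.values())


print(naive_total("apex"))  # 245.0, three unrelated companies in one number
try:
    guarded_total("apex")
except AmbiguousEntity as exc:
    print("needs confirmation:", exc)
```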
Where AI genuinely helps, and where it must not
So is AI useless at the front of the pipeline? No. It is excellent at the fuzzy, forgiving parts.
Spelling and near-matches are exactly its strength. Type “coke” and a decent model knows you might mean Coca-Cola. Type a name with a transposed letter and it recovers. Ask in a half-formed way and it can propose a clean reading. This is real, and it is better than the rigid keyword matching we used to bolt onto search boxes.
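As a stand-in for what a model does here, this sketch uses Python's standard-library `difflib` for near-matches and a tiny nickname table for cases like "coke". A language model handles far more variety than this, so treat it only as an illustration of the suggest step. The customer names and the nickname table are assumptions.

```python
from difflib import get_close_matches
from typing import List

# Hypothetical list of known customer names
KNOWN = ["Apex Industries", "Apex Logistics", "Coca-Cola", "Delta Faucet"]

# Hypothetical nickname table for names that no string distance would recover
NICKNAMES = {"coke": "Coca-Cola"}


def suggest(term: str, limit: int = 3) -> List[str]:
    """Propose candidate names for a fuzzy or misspelled term. Suggest only, never decide."""
    lowered = term.strip().lower()
    if lowered in NICKNAMES:
        return [NICKNAMES[lowered]]
    by_lower = {name.lower(): name for name in KNOWN}
    hits = get_close_matches(lowered, list(by_lower), n=limit, cutoff=0.6)
    return [by_lower[hit] for hit in hits]


print(suggest("coke"))               # ['Coca-Cola']
print(suggest("Apex Logsitics")[0])  # Apex Logistics
```

The output is a list of candidates, not an answer. What happens to that list is the part that matters.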
What AI must not do is resolve a high-stakes ambiguity silently. There is a bright line here. The model can suggest. It cannot decide on its own that your six-word question meant one specific legal entity out of forty candidates and then run a number on it without telling you. The cost of a wrong guess is too high and too invisible.
The right shape is a conversation with a guardrail. The agent narrows the field using its fuzzy matching and your master data, then it stops and asks. “Apex could mean Apex Industries, Apex Logistics, or the Apex group of nine companies. Which?” Confirm first, query second.
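One way to sketch "confirm first, query second" is to make the resolver return either a resolved entity or a request for confirmation, and to make the query unreachable in the second case. The class names, the candidate table, and the `run_query` callback are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, List, Union


@dataclass
class Resolved:
    entity: str


@dataclass
class NeedsConfirmation:
    prompt: str
    options: List[str]


# Hypothetical output of fuzzy matching plus master data: term -> candidate entities
CANDIDATES = {
    "apex": ["Apex Industries", "Apex Logistics", "Apex group (9 companies)"],
    "initech": ["Initech"],
}


def resolve_entity(term: str) -> Union[Resolved, NeedsConfirmation]:
    """Resolve only when exactly one candidate exists. Otherwise hand the choice back."""
    options = CANDIDATES.get(term.strip().lower(), [])
    if len(options) == 1:
        return Resolved(options[0])
    if not options:
        return NeedsConfirmation(f"I could not find '{term}'. Which customer did you mean?", [])
    return NeedsConfirmation(f"'{term}' could mean more than one customer. Which?", options)


def answer(term: str, run_query: Callable[[str], object]) -> object:
    """Run the query only after the entity is pinned."""
    outcome = resolve_entity(term)
    if isinstance(outcome, NeedsConfirmation):
        return outcome  # stop and ask; no query has been run
    return run_query(outcome.entity)


result = answer("Apex", run_query=lambda entity: f"revenue for {entity}")
print(result.prompt, result.options)
```

The design choice is that ambiguity is a return value, not a log line. The caller cannot get a number without dealing with it.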
Chat is a bad place to disambiguate
This is the design point almost everyone gets wrong, and it is why so many of these agents feel clumsy.
Disambiguation in a pure chat box is painful. If the agent has to dump forty customer names into the conversation and ask you to type the right one back, you will not read forty names. You will give up, or worse, you will pick wrong because you skimmed.
People do not resolve ambiguity by reading prose. They resolve it by picking from a list. A dropdown, a set of checkboxes, a few radio buttons. That is the natural interface for “which of these did you mean,” and it is exactly what a chat-only surface cannot give you cleanly.
This is one of the quiet reasons I push teams toward agents they actually control rather than a locked chat widget. The day you need to show the user a real picker instead of a wall of text, you want to be able to build it. If your agent can only talk, your disambiguation will always be worse than it should be.
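If you do control the surface, the agent can emit a structured picker instead of prose. This is a minimal sketch of such a payload. The field names, the control types, and the threshold of five are assumptions, and the front end that renders it is left out.

```python
from typing import Dict, List


def picker_spec(question: str, options: List[str], radio_max: int = 5) -> Dict:
    """Describe a disambiguation picker for the UI to render.

    A few options become radio buttons. Many options become a searchable dropdown,
    so nobody has to read forty names in a chat transcript.
    """
    control = "radio" if len(options) <= radio_max else "searchable_dropdown"
    return {
        "type": "disambiguation",
        "question": question,
        "control": control,
        "options": [{"id": i, "label": label} for i, label in enumerate(options)],
    }


few = picker_spec("Which Apex?", ["Apex Industries", "Apex Logistics", "Apex group"])
many = picker_spec("Which customer?", [f"Customer {n}" for n in range(40)])
print(few["control"])   # radio
print(many["control"])  # searchable_dropdown
```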
What this means if you are building one
Treat intake as a first-class stage, not an afterthought you bolt onto the prompt. Budget real design time for it. In my experience it deserves more attention than the analysis step, because the analysis is the part the model is already good at.
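One way to make intake a real stage is to give it its own data structure and refuse to start the analysis until every slot is filled. A minimal sketch, with hypothetical field names for the entity, the window, and the measure:

```python
from dataclasses import dataclass, fields
from typing import List, Optional


@dataclass
class Intake:
    """What must be pinned before any query runs. None means still unresolved."""
    entity_id: Optional[str] = None  # e.g. a master record or group ID
    window: Optional[str] = None     # e.g. "2024-01-01..2024-03-31"
    measure: Optional[str] = None    # e.g. "net_revenue_excl_intercompany"

    def unresolved(self) -> List[str]:
        """Names of the slots that still need an answer from the user."""
        return [f.name for f in fields(self) if getattr(self, f.name) is None]


def run_analysis(intake: Intake) -> str:
    """Placeholder for the analysis step. It will not run on a partial intake."""
    missing = intake.unresolved()
    if missing:
        raise ValueError(f"intake incomplete, still need: {', '.join(missing)}")
    return f"analysing {intake.measure} for {intake.entity_id} over {intake.window}"


partial = Intake(entity_id="M-001")
print(partial.unresolved())  # ['window', 'measure']
```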
Invest in the boring master data. The synonym tables, the entity anchoring, the group rollups. This is where the durable advantage is, and it is the part no model can manufacture for you. The companies that win at analytics AI will be the ones whose data was clean enough to disambiguate against, and that work predates the AI by years.
Make the agent ask. Build the confirmation step in, with a real picker where the choices are many. Slower by a second or two. Right far more often.
And test it on messy questions, not clean ones. The demo always uses a perfectly specified query because that is what makes the demo land. Real users type five vague words and expect magic. Your test set should be full of the ugly, half-formed, ambiguous questions people actually ask, because those are the ones that will break it.
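A test set like that can be very simple. The sketch below pairs each messy question with the slots the agent should ask about before answering, and reports the cases where it did not. The questions, the slot labels, and the two stand-in intake functions are all hypothetical. You would plug in your own agent's intake step.

```python
from typing import Callable, Iterable, List, Set, Tuple

# Hypothetical messy questions, each with the slots the agent must ask about
MESSY_CASES: List[Tuple[str, Set[str]]] = [
    ("why did Apex drop last quarter?", {"entity", "window", "measure"}),
    ("delta numbers this quarter", {"entity", "window", "measure"}),
    ("apex inc revenue vs last yr", {"entity", "measure"}),
]


def evaluate(intake_fn: Callable[[str], Iterable[str]],
             cases: List[Tuple[str, Set[str]]]) -> List[Tuple[str, List[str]]]:
    """Return (question, slots the agent failed to ask about) for every failing case."""
    failures = []
    for question, must_ask in cases:
        asked = set(intake_fn(question))
        skipped = must_ask - asked
        if skipped:
            failures.append((question, sorted(skipped)))
    return failures


def overconfident_intake(question: str) -> List[str]:
    """Stand-in for an agent that never asks and just guesses."""
    return []


def cautious_intake(question: str) -> List[str]:
    """Stand-in for an agent that asks about everything."""
    return ["entity", "window", "measure"]


print(len(evaluate(overconfident_intake, MESSY_CASES)))  # 3
print(evaluate(cautious_intake, MESSY_CASES))            # []
```

An agent that asks about everything passes this check but would be exhausting to use, so a fuller harness would also penalize unnecessary questions. This version only catches the dangerous failure, which is the silent guess.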
I have run Tallyfy for over a decade, and the lesson that stuck hardest is that software fails at the edges, not the center. The happy path always works in the demo. The value, and the danger, is in what the system does when the input is messy. Analytics AI is the same. The model can answer almost anything. The whole game is making sure it answered the question you actually asked.





