There is a strange habit in teams putting AI on their data. They treat it as a greenfield project. New prompts, new definitions, new everything, as if the company had never measured anything before.
Then you look two clicks away and find a Power BI workspace with forty dashboards that hundreds of people use every day, dashboards that have been corrected and re-corrected for years until the numbers were finally trusted. That is not nothing. That is the most validated business logic in the entire company, and the AI project is ignoring it to start from scratch.
Stop doing that. Your existing dashboards are the answer key. Here is how to use them.
Why a shipped dashboard is so valuable
A dashboard that has survived in production is a strange and precious thing. Every number on it has been wrong at some point, someone complained, and it got fixed. That cycle ran for years. What is left is a set of definitions that the business has, in effect, signed off on by using them to run the place.
The data lead I respect most put it simply once: if a dashboard number is off, a hundred tickets show up by lunch. That pressure is a feature. It means the surviving dashboards are battle-tested in a way no fresh prompt can be on day one.
So the dashboard gives you two gifts. It tells you what the correct number is, and it tells you how that number is calculated. Both are exactly what an AI agent needs and exactly what is hardest to produce from a blank page.
Gift one: the measure definitions
The calculations behind a Power BI dashboard live in its semantic model, written as DAX measures. This is the real intellectual property, the part that encodes which costs count toward margin, how credits net out, what happens at a subtotal. An agent that re-invents these will drift. An agent that reuses them will match the dashboard your people already trust.
So extract them. Pull the measure definitions out of the semantic model and write them into a plain reference file, a metric dictionary. For each measure, capture its name, what it means in business terms, and the exact logic. Gross margin is not “revenue minus cost.” It is the specific formula your finance team agreed on, with all its edge cases, and that specific version is what you want the agent to use.
This file becomes a skill the agent reads before it answers. Now when someone asks for margin, the agent does not guess. It uses the same definition the dashboard uses. The number ties out, every time, because it is literally the same calculation.
There is a useful side effect here. The act of writing the dictionary forces you to find measures that disagree with each other, the five slightly different “sales” fields, the two definitions of an active customer. Most companies have these landmines and do not know it until an AI starts stepping on them. Better to find them while you are reading your own dashboards than after the agent has shipped a contradictory answer to an executive.
Gift two: the test set, for free
This is the part almost nobody exploits, and it is the best idea in this whole post.
Every tile on every dashboard is a known-correct question-and-answer pair.
Look at a tile that says revenue was a certain figure for the quarter, filtered to a region. That is a question (“what was revenue last quarter in this region”) with a verified answer (the number on the tile). You have hundreds of these already rendered and trusted. They are a regression test set you did not have to build.
So build the catalog. For each meaningful tile, write down the question it answers and the value it shows. Then run your agent against that catalog and compare. Where the agent matches the dashboard, you have evidence. Where it diverges, you have a bug to chase, before a user finds it.
And keep the catalog. Every time you change a prompt, a view, or a model, re-run it. Now every change is something you can check instead of something you hope about. This is the difference between an analytics agent you can responsibly put in front of fifty people and one you are quietly praying over. I have watched teams ship without this and spend the next month firefighting trust. The dashboards would have caught most of it.
One thing to add to the catalog that the dashboards will not give you: deliberately ugly questions. The vague ones, the misspelled ones, the ones with an ambiguous customer name. Dashboards are all clean, specified queries. Real users are not. So pad the catalog with the messy questions people actually ask, and record what a good answer looks like for each.
Gift three: the filter patterns
There is a quieter third thing dashboards teach you: how people actually slice the business.
The slicers and filters on a dashboard are a record of the questions that matter. If every dashboard filters by region, plant, and customer tier, then those are the dimensions people think in, and your agent should know to offer them. The default date range on a tile tells you what “recently” usually means here. The way a dashboard rolls customers into groups tells you how to resolve a name.
You can read all of this straight off the reports. It is free design research for how the agent should behave, written by years of real usage rather than guesswork in a planning meeting.
A working sequence
If you want this as a concrete order of operations, here is the one I would run.
Start with your three or four most-used dashboards, the ones that would generate the angriest tickets if they broke. Those carry the most-validated logic, so they are the highest-value to mine first.
Extract their measure definitions into a metric dictionary, in plain language plus the exact formula, and have a human who knows the business read it for sanity. This is also where you surface the conflicting-definition landmines.
Turn the dashboard tiles into a test catalog of question-and-answer pairs, then add a batch of messy real-world questions with expected answers.
Wire the metric dictionary into the agent as a skill it reads before answering, so it reuses the agreed definitions instead of improvising.
Run the agent against the catalog, fix the divergences, and keep the catalog as your regression suite for every future change.
That is most of the unglamorous work of a trustworthy analytics agent, and you got the raw material for free because you already did the hard part years ago.
The mindset shift
The instinct to start fresh comes from treating AI as a new kind of thing that needs new kinds of inputs. It does not. It needs the same thing every analyst needs: correct definitions and a way to check the work. You have both, already, embedded in the reports you shipped.
Running Tallyfy for over a decade taught me to distrust the blank page. The blank page feels clean and it is usually a waste, because somewhere in the building someone already solved the hard part and wrote it down, even if they wrote it down in a dashboard instead of a document. The skill is finding that work and reusing it, not redoing it.
Your dashboards are not legacy to be replaced. They are the training material and the answer key for whatever you build next. Read them before you write a single prompt.





