The Department of Provable Answers – An Internal Audit Manager’s Story
Assumed deployment posture: Tenant Platform Fee: Tier 0 enabled. Prod PED: Tier 0.
It’s the Tuesday before Audit Committee, and James is staring at an agenda item that looks deceptively tidy: “AI controls: independent assurance update.”
Seven words. Behind them, three weeks of work that James already knows will follow the same script it always follows. He’ll send evidence requests. Control owners will promise to respond by Wednesday and actually respond the following Monday.
He’ll get screenshots pasted into emails. He’ll get a spreadsheet someone populated by hand with no source trail. He’ll interview people who sincerely believe the controls are working but can’t show him how they know.
He’ll assemble it all into workpapers that look rigorous but feel fragile, because the underlying evidence could be challenged by any non-exec with a pointed question and ten minutes of patience.
James has been in internal audit for eleven years. He doesn’t get paid to be diplomatic about this. He gets paid to be right, in a way that’s testable.
And the honest truth is that most AI assurance today isn’t testable. It’s a collection of interviews and artefacts that prove the control was designed, not that it operated.
There’s a world of difference between those two things, and audit committees are starting to notice.
Then, during a pre-audit walkthrough, a control owner casually mentions that the vendor pushed a model update mid-sprint.
Someone else says the system is now using a new tool-call path, but only for a subset of cases.
Two throwaway sentences, and James feels the ground shift under his assurance plan. The question is no longer “are the controls designed correctly?” It’s “are the controls actually operating, consistently, across versions that nobody told audit about?”
In the old version of this story, James spends the next week chasing the change. Who approved it? When did it go live? Is there a change ticket?
The change ticket references a Jira epic that references a Confluence page that hasn’t been updated.
The model risk team says they were informed. The control owner says they informed them. Nobody has a signed record of what the control actually did during the transition period.
But this isn’t the old version of this story.
James opens PARCIS XAI-Lite and treats it as what it is: a System of Evidence that produces an audit trail at decision time, not an after-the-fact explanation scrapbook.
XAI-Lite wraps the AI stack at the decision boundary without touching the model. Enforcement lives on the synchronous path.
Every governed decision emits a QiTraceID: the identifier of a cryptographic receipt anchored in the tamper-evident QiLedger. For James, that changes the entire shape of his work: the QiTraceID becomes the audit join-key. Not a spreadsheet row someone typed by hand.
A stable, verifiable case reference minted at the moment the decision was made.
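The tamper-evident property those receipts rely on can be illustrated with a minimal hash-chain sketch. Nothing below is the actual QiLedger implementation; the field names (`qitrace_id`, `anchor`), the SHA-256 chaining, and the genesis value are all assumptions, chosen only to show why altering any earlier record invalidates every later anchor:

```python
import hashlib
import json

def anchor(body: dict, prev_anchor: str) -> str:
    """Integrity anchor: SHA-256 over the previous anchor plus the canonicalised record body."""
    canonical = json.dumps(body, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_anchor + canonical).encode()).hexdigest()

def mint(body: dict, prev_anchor: str) -> dict:
    """Mint a receipt: the record body plus its anchor, chained to the previous receipt."""
    return {**body, "anchor": anchor(body, prev_anchor)}

def verify_chain(receipts: list[dict], genesis: str = "0" * 64) -> bool:
    """Re-derive every anchor in order; tampering with any record breaks the chain."""
    prev = genesis
    for receipt in receipts:
        body = {k: v for k, v in receipt.items() if k != "anchor"}
        if anchor(body, prev) != receipt["anchor"]:
            return False
        prev = receipt["anchor"]
    return True
```

Because each anchor folds in the previous one, an auditor who trusts only the latest anchor can detect any edit to any earlier record, which is what makes the receipt usable as evidence rather than as a claim.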
He doesn’t send evidence requests. He doesn’t book interviews. He asks the system for a testable population: “Give me a sample of governed decisions for this model and this business process over the last quarter. For each one: show me the QiTraceID, the model and tool identifiers and versions, the policy set and version, the gate outcome, and the integrity anchors.”
The sample arrives in minutes. Not as a narrative. As a population of verifiable cases. He clicks into a handful and the feeling is unfamiliar: calm.
For each decision, the system has captured timestamps, endpoint alias, jurisdiction, policy set and version, the governance fingerprint before and after the decision, the Ethics Gate action, and cryptographic integrity anchors.
It’s “what happened” rendered as a verifiable record, not a story someone reconstructed from memory.
Now he goes after the sore spot—the mid-sprint model change. “Show me all model and policy changes during the period, and for each one: where is the promotion evidence?”
Because XAI-Lite expects explicit change hygiene—policy and model changes emit a signed promotion record with mandatory fields: who, what, when, why, integrity hash—James can see the change, see whether it was authorised, and see what the control did during the transition.
Silent upgrades stop being invisible. The review focuses on whether the control worked, not on whether someone remembered to tell the truth.
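A promotion record with mandatory fields is easy to test mechanically rather than by interview. This sketch is hypothetical: it assumes the record is a flat mapping and that the mandatory fields are exactly the ones named above (who, what, when, why, integrity hash):

```python
# Mandatory fields on a signed promotion record, per the change-hygiene rule.
REQUIRED = ("who", "what", "when", "why", "integrity_hash")

def validate_promotion(record: dict) -> list[str]:
    """Return the mandatory fields that are missing or empty on a promotion record."""
    return [field for field in REQUIRED if not record.get(field)]
```

An empty return list means the change carries its own authorisation evidence; a non-empty one is itself a finding, attributable to a specific change rather than to a vague process gap.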
Then James does the thing that makes internal audit genuinely valuable to a board: he re-performs. Not by replaying the raw AI conversations—he doesn’t need to.
He picks a sample of QiTraceIDs and tests the control chain. Did the receipt get minted? Is the integrity anchor intact? Does the model version match what was approved for production? Was the correct policy version in force? Did the Policy & Ethics Gate fire, and what was the outcome? Were exceptions logged and attributable? All of this is Tier 0—the baseline: governance-minimal receipts without retaining raw data. James doesn’t need payload vaults or forensic kits. He needs to verify that the control operated, not replay what the AI said. And if a receipt is missing, or a gate outcome doesn’t match the declared policy, or a model version appears that shouldn’t be in production—that’s not an awkward conversation.
That’s a measurable control effectiveness finding he can enumerate by QiTraceID and surface as an assurance KPI.
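The re-performance loop reduces to a plain function: take a sample of receipts, run the control-chain checks, and return exceptions keyed by QiTraceID. The receipt fields, check names, and gate outcomes here are illustrative assumptions, not the product's actual schema:

```python
def reperform(sample: list[dict], approved_models: set[str], policy_in_force: str) -> dict:
    """Run the Tier 0 control-chain checks over a sample of receipts.

    Returns exceptions keyed by QiTraceID, each a list of failed checks.
    """
    exceptions: dict[str, list[str]] = {}
    for receipt in sample:
        failures = []
        if not receipt.get("anchor_intact"):
            failures.append("integrity anchor broken")
        if receipt.get("model_version") not in approved_models:
            failures.append("unapproved model version")
        if receipt.get("policy_version") != policy_in_force:
            failures.append("wrong policy version in force")
        if receipt.get("gate_outcome") not in {"allow", "block", "escalate"}:
            failures.append("gate outcome missing or invalid")
        if failures:
            exceptions[receipt["qitrace_id"]] = failures
    return exceptions

def exception_rate(sample: list[dict], exceptions: dict) -> float:
    """The assurance KPI: share of sampled decisions with at least one control failure."""
    return len(exceptions) / len(sample) if sample else 0.0
```

The output is exactly the shape a workpaper wants: every failure tied to a case reference, and a single rate the committee can track quarter over quarter.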
Now comes the moment that usually steals an entire week: building the audit pack. James clicks Export and generates an assurance bundle shaped for his workpapers—CSV, PDF, JSON outputs and zipped evidence bundles per QiTraceID, stored with versioning and WORM retention where required, anchored back into QiLedger so anyone can verify the pack against the cryptographic record.
Evidence retrieval, not screenshot theatre.
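Verifying an exported bundle against its ledger anchor reduces to recomputing a digest. A minimal sketch, under the assumption that the anchor stored in QiLedger for a bundle is simply the SHA-256 hex digest of the bundle bytes:

```python
import hashlib

def verify_bundle(bundle_bytes: bytes, ledger_anchor: str) -> bool:
    """Recompute the bundle digest and compare it to the anchor recorded in the ledger."""
    return hashlib.sha256(bundle_bytes).hexdigest() == ledger_anchor
```

This is what lets a third party check the pack without trusting the exporter: the bytes either match the ledger record or they don't.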
The Audit Committee meeting lands differently this time.
Instead of saying “we interviewed the control owners and reviewed supporting documentation,” James says: “We tested operating effectiveness by sampling governed decisions, verifying ledger-anchored receipts, confirming policy and version lineage, and checking gate behaviour across the model change. Exceptions are enumerated by QiTraceID and mapped to specific control gaps.”
The non-exec who always asks the hardest question looks at him. “And can a third party verify this independently?” James shows her the ledger anchors, the integrity hashes, the replay pointers.
“Yes. Without asking us to explain it.”
Here’s what James has learned: internal audit doesn’t fail because auditors aren’t thorough.
It fails because the evidence infrastructure forces thoroughness to depend on human memory, manual artefacts, and the goodwill of control owners who are busy doing their actual jobs.
Fix the evidence—make it machine-generated, decision-time, tamper-evident, and independently verifiable—and audit stops being the department of awkward questions.
It becomes the department of provable answers.