Send Me the Pack – An FCA Supervisor’s Story
It’s 16:35 on a Friday, and Amara’s inbox looks like a small floodplain.
Three firms. Three “urgent” updates. One board-level consumer issue that’s been escalating quietly for a week. And a statistical wobble in outcomes data from a firm that recently deployed an AI-driven decisioning pathway—might be drift, might be noise, might be something uglier. Amara doesn’t know yet.
That’s the problem.
She’s been a supervisor for nine years. The part of the job that nobody outside the authority understands is the asymmetry. The firm has the system. The firm has the data. The firm has the engineers who built it and the risk team who approved it.
Amara has what the firm is willing to tell her, wrapped in PDFs with careful language and selective metrics. Her job is to look at that curated version of reality and decide—on behalf of the public—whether the firm’s controls are adequate. And she has to get it right. Intervene too early on weak evidence and she’s overreaching. Intervene too late and consumers are harmed.
The window between those two failures is narrow, and it’s built entirely on the quality of the evidence she can access.
She’s lived the old version of this Friday too many times.
She sends the firm a formal information request. The firm takes a week to align internally on what they’re comfortable sharing. They send logs from one system, a model card from another, and a narrative document that explains what the controls are designed to do without showing whether they actually operated.
Amara asks for decision-level evidence. The firm says retention varies. She asks what version of the model was running when the outcomes shifted. The firm says they’ll check.
They come back with a spreadsheet that doesn’t quite reconcile with the logs they sent the previous week. She asks whether the policy gate was active during the period. Nobody is sure.
Three weeks in, Amara has a thick folder of partial information and a growing suspicion that the firm isn’t being evasive—they genuinely can’t reconstruct what happened.
And that’s the outcome that keeps supervisors awake: not bad actors, but good firms that can’t prove they’re good. Because when the evidence is fragmented, every supervisory conversation becomes a negotiation about what the data means rather than what actually happened.
Intervention starts to feel subjective. And subjective intervention is the one thing a regulator can’t defend—not to the firm, not to the tribunal, not to the public.
But this Friday is different.
Amara doesn’t ask for another narrative. She asks for evidence in a format that can be verified.
“Send me the pack.”
Not a policy document. Not a model card. A decision-level evidence pack: what went in, what ran, which checks passed, who changed what, and how to replay the decision under the same conditions. The firm replies: “We can do that. We’re running PARCIS XAI-Lite.”
XAI-Lite wraps the firm’s AI stack at the decision boundary without touching the model. Every governed decision emits a QiTraceID—a cryptographic receipt backed by a tamper-evident audit spine. The governance view is derived from the same integration hooks and decision context as the underlying AI, not a shadow copy assembled after the request arrived.
The pack arrives as both machine-readable JSON and a signed human-readable bundle, and Amara feels something rare in supervisory work: relief. Because instead of “trust us,” the pack gives falsifiable structure.
For each sampled decision, there’s a replayable proof capsule with the anatomy a supervisor actually needs: QiTraceID with timestamps, entity, and jurisdiction. Model and version lineage.
The policy regime in force at decision time, with applicable controls, operator bounds, and Ethics Gate status.
A role-appropriate rationale. And cryptographic integrity—ledger anchor plus evidence integrity hash—so a third party can verify the pack without relying on the firm’s word.
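To make the verification idea concrete, here is a minimal sketch of how a third party might check a capsule's evidence integrity hash against the anchored value. The field names and hashing scheme below are illustrative assumptions, not the actual PARCIS XAI-Lite schema:

```python
import hashlib
import json

# Hypothetical proof-capsule structure; every field name here is
# illustrative, not the real PARCIS XAI-Lite format.
capsule = {
    "qi_trace_id": "QIT-2025-0147-00392",
    "timestamp": "2025-01-17T16:35:02Z",
    "entity": "ExampleLendingLtd",
    "jurisdiction": "UK",
    "model": {"name": "credit-decisioner", "version": "3.4.1"},
    "policy": {"version": "2025-01-A", "ethics_gate": "passed"},
    "rationale": "Declined: affordability threshold not met.",
}

def integrity_hash(payload: dict) -> str:
    """Hash over a canonical JSON serialisation, so re-ordering
    keys still yields the same digest."""
    canonical = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()

# Value anchored in the tamper-evident ledger at decision time.
anchored = integrity_hash(capsule)

# Verification: recompute and compare. Any edit breaks the match.
assert integrity_hash(capsule) == anchored
tampered = dict(capsule, rationale="Approved.")
assert integrity_hash(tampered) != anchored
```

The point of the canonical serialisation is that verification depends only on the capsule's content, not on how a given exporter ordered its keys.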
Amara’s questions become sharper. And, unexpectedly, kinder. Sharper because they’re anchored to evidence. Kinder because the firm can answer them without assembling a war room.
She asks: “Show me whether the control was operating, not whether it exists.” The pack answers with gate outcomes and policy references recorded at the boundary, tied to the exact policy version in force at decision time. Not a document that says “we do X.” Receipts showing X happened.
She asks: “Did anything change during the period the outcomes shifted?” The evidence shows the model, its version, and the policy version for each decision, with promotion records that prevent silent upgrades from going unnoticed. Amara can see whether a vendor update or configuration change coincides with the wobble. Not a hand-wavy explanation. Traceability.
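With version lineage recorded per decision, the "did anything change?" question becomes a simple scan rather than an interview. A hedged sketch, with hypothetical field names standing in for whatever the real pack exports:

```python
# Illustrative decision-level evidence: per-decision version lineage.
# Field names are hypothetical, not a real export format.
decisions = [
    {"ts": "2025-01-10", "model_version": "3.4.0", "policy_version": "A"},
    {"ts": "2025-01-12", "model_version": "3.4.1", "policy_version": "A"},
    {"ts": "2025-01-13", "model_version": "3.4.1", "policy_version": "B"},
]

def version_changes(decisions):
    """Yield (timestamp, field, old, new) wherever the model or
    policy version differs from the previous decision in the sample."""
    for prev, curr in zip(decisions, decisions[1:]):
        for field in ("model_version", "policy_version"):
            if prev[field] != curr[field]:
                yield (curr["ts"], field, prev[field], curr[field])

for change in version_changes(decisions):
    print(change)
# ('2025-01-12', 'model_version', '3.4.0', '3.4.1')
# ('2025-01-13', 'policy_version', 'A', 'B')
```

Lining these change points up against the outcome series is what lets a supervisor say "the shift begins at the 3.4.1 promotion" instead of asking the firm to remember.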
Now she can make a proportionate decision: intervene, require remediation, or allow continued operation with constraints, grounded in evidence rather than inference.
The firm’s tiered evidence model helps: they run Tier 1 day-to-day on the in-scope decisioning route, providing inspectable, exportable receipts plus an encrypted payload vault for documentary replay when legal or audit thresholds require it.
Tier 2, the time-bounded forensic kit, is enabled on-demand for scoped incident windows to produce richer artefacts and a defensible timeline under an explicit incident basis.
Amara can request Tier 1 replay on a scoped set immediately and request Tier 2 capture for the next window without forcing the firm into permanent over-collection.
Proportionate supervision, matched by proportionate evidence.
And here’s the part that changes Amara’s job structurally, not just on this case.
The following week, a second firm reports a similar pattern. Not identical, but rhyming. Under the old approach, this would look like a coincidence until it becomes a headline. Under the evidence-pack approach, Amara can cluster the signals—groups of QiTraceIDs with time-aligned deltas and replay pointers—without demanding raw data transfers.
Because the pack schema is standardised, she can compare like-for-like across firms and systems. She stops adjudicating presentation quality and starts adjudicating control effectiveness.
Thematic reviews become evidence exercises, not dialect translation.
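Because the packs share a schema, clustering signals across firms can be as mechanical as bucketing trace references by time window and keeping the buckets that span more than one firm. A sketch under invented data, with hypothetical IDs:

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical signals from two firms: (firm, qi_trace_id, timestamp).
# A shared pack schema means these group cleanly without any raw
# data transfer - only trace references and timestamps move.
signals = [
    ("FirmA", "QIT-A-001", "2025-01-12T09:15:00"),
    ("FirmB", "QIT-B-104", "2025-01-12T09:40:00"),
    ("FirmA", "QIT-A-017", "2025-01-12T10:05:00"),
    ("FirmB", "QIT-B-110", "2025-01-14T14:30:00"),
]

def cluster_by_day(signals):
    """Bucket trace IDs by calendar day; a bucket touching more
    than one firm is a candidate thematic signal."""
    buckets = defaultdict(list)
    for firm, trace_id, ts in signals:
        day = datetime.fromisoformat(ts).date().isoformat()
        buckets[day].append((firm, trace_id))
    return {
        day: entries
        for day, entries in buckets.items()
        if len({firm for firm, _ in entries}) > 1  # multi-firm only
    }

print(cluster_by_day(signals))
# {'2025-01-12': [('FirmA', 'QIT-A-001'),
#                 ('FirmB', 'QIT-B-104'),
#                 ('FirmA', 'QIT-A-017')]}
```

The single-firm bucket on 2025-01-14 drops out; the 2025-01-12 bucket, rhyming across two firms, is the one worth a closer look.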
By Monday morning, Amara’s briefing to her seniors is no longer built on “the firm says.”
It’s built on “the evidence shows”: what happened at the case level, when it happened in the time-indexed regime, under what controls with policy references and gate outcomes, and what changed with versioned lineage—all backed by artefacts that can be independently verified.
Here’s what Amara knows after nine years of supervision: regulators don’t fail because they lack authority. They fail because information asymmetry turns every intervention into a judgement call that’s hard to defend.
The firm knows more than the supervisor. The supervisor knows less than the public assumes. And in that gap, harm accumulates while evidence is negotiated.
Fix the evidence—make it decision-time, tamper-evident, standardised, and independently verifiable—and you don’t just speed up supervision. You make it objective.
Interventions become defensible because they’re grounded in receipts, not persuasion. And the firms that invest in that evidence infrastructure aren’t just compliant. They’re supervisable.
In the end, that’s the thing that protects everyone.