Every large organisation now has an AI policy. It's been through Legal, been signed off by the CRO, and lives in the governance portal. And almost none of these policies actually govern anything at runtime.
The gap between "we have a policy" and "the policy is enforced" is where incidents live. It's also where audit findings come from six months later.
Paper policy vs runtime policy
A paper policy says "sensitive data must not be sent to third-party model providers." A runtime policy enforces that at the gateway — requests carrying PII are blocked or re-routed before they reach the model. The first is a statement of intent. The second is a guarantee.
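To make the distinction concrete, here is a minimal sketch of the runtime version: a gateway-side check that blocks a request before it reaches the model. The function name, the two regexes, and the decision strings are all illustrative assumptions; a real deployment would use a proper PII classifier, not a pair of patterns.

```python
import re

# Illustrative only: two crude PII patterns. A production gateway would
# call a dedicated PII-detection service, not regexes.
PII_PATTERNS = {
    "email": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.]+\b"),
    "us_ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}

def route_request(prompt: str) -> str:
    """Return 'blocked' if the prompt appears to carry PII, else 'forwarded'.

    This is the enforcement point: the decision happens before the request
    reaches any third-party model provider.
    """
    for pattern in PII_PATTERNS.values():
        if pattern.search(prompt):
            return "blocked"  # or re-route to an approved internal model
    return "forwarded"
```

The point of the sketch is the placement, not the detection logic: the check sits in the request path, so the policy is a guarantee rather than a statement of intent.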
Most organisations have the first and think they have the second. They find out they don't the first time something goes wrong, or the first time audit asks to see evidence.
What runtime enforcement requires
It requires four things: a policy hub that models your rules as machine-readable artefacts, a gateway that routes every inference through those rules, an evidence store that captures each policy decision, and an exception path with a named approver for anything that needs to bypass the rule.
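The four pieces above can be sketched as data structures and one gateway check. Everything here is an assumption for illustration: the field names, the `evaluate` function, and the decision strings are not a standard, just one way the rule, the exception with its named approver, and the resulting policy decision fit together.

```python
from dataclasses import dataclass
from typing import Dict, List

@dataclass
class Rule:
    """A machine-readable policy artefact, as held in the policy hub."""
    rule_id: str
    description: str
    blocked_destinations: List[str]

@dataclass
class PolicyException:
    """An approved bypass. The approver is a named human, not a team alias."""
    rule_id: str
    approver: str

def evaluate(rule: Rule, destination: str,
             exceptions: List[PolicyException]) -> Dict[str, str]:
    """Gateway check: allow, or block unless a named approver granted an exception."""
    if destination not in rule.blocked_destinations:
        decision = "allow"
    elif any(e.rule_id == rule.rule_id and e.approver for e in exceptions):
        decision = "allow-with-exception"
    else:
        decision = "block"
    # The returned record is what the evidence store captures: the decision
    # joined to the rule that produced it.
    return {"rule_id": rule.rule_id, "destination": destination,
            "decision": decision}
```

Note that the exception path is part of the same evaluation, so a bypass still produces an evidence record rather than disappearing into email threads.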
If any of those four is missing, you don't have runtime governance. You have aspiration.
The three runtime events you must log
For every live AI service, you need to be able to answer in minutes: what did the system do? Under what policy? Who approved any exceptions? That means every prompt, every model response, and every approval event has to be captured, joined to the policy that governed it, and retained for a period that satisfies your regulator.
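A minimal evidence record covering those three questions might look like the sketch below. The field names are assumptions, and the prompt and response are hashed here only to keep the example small; whether you store full text or digests depends on your retention and privacy requirements.

```python
import hashlib
import json
from datetime import datetime, timezone
from typing import Optional

def evidence_record(prompt: str, response: str, policy_id: str,
                    approval: Optional[str] = None) -> str:
    """One JSON evidence record per interaction: what the system did,
    under what policy, and who approved any exception."""
    record = {
        "ts": datetime.now(timezone.utc).isoformat(),
        "prompt_sha256": hashlib.sha256(prompt.encode()).hexdigest(),
        "response_sha256": hashlib.sha256(response.encode()).hexdigest(),
        "policy_id": policy_id,       # joins the event to the governing rule
        "approval": approval,          # named approver, or None if no exception
    }
    return json.dumps(record)
```

Because each record carries the policy id, answering "under what policy?" is a lookup rather than a forensic reconstruction.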
This is not expensive or exotic. It's the baseline. If you can't do this today, your AI is running without a control plane — and the only reason nothing has gone wrong is luck.
Design-time reviews are necessary, not sufficient
Design reviews catch the obvious problems before deployment. They don't catch drift. They don't catch a prompt that was fine last month and isn't fine this month. They don't catch a user who's been quietly exfiltrating data in small queries for six weeks.
Runtime governance is how you catch what design reviews can't. The two are complementary — one prevents, the other detects and responds.
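As one example of a check only runtime governance can perform, here is a sketch of the "small queries for six weeks" case: a per-user counter that flags cumulative response volume crossing a threshold. The class name and the byte limit are placeholders; a real detector would use a rolling window and a tuned baseline rather than a flat total.

```python
from collections import defaultdict

class EgressMonitor:
    """Flags a user whose cumulative response volume exceeds a limit --
    the slow-exfiltration pattern no design review can see."""

    def __init__(self, limit_bytes: int = 1_000_000):
        self.limit = limit_bytes
        self.totals = defaultdict(int)

    def record(self, user: str, response_bytes: int) -> bool:
        """Record one response; return True once the user crosses the limit."""
        self.totals[user] += response_bytes
        return self.totals[user] > self.limit
```

No individual query trips this check; only the running total does, which is exactly why it has to live in the runtime path rather than in a pre-deployment review.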
Where to start if you're behind
Pick one live service. Wire the gateway. Turn on policy enforcement with a minimal rule set. Capture evidence per interaction. Run it for 30 days. You'll learn more in that month about what your real risk surface looks like than in any amount of design-time review. Then extend to the next service.
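For the pilot, a minimal rule set can be as simple as the sketch below. The rule ids, check names, and the prompt-length threshold are hypothetical placeholders to tune against what the 30 days actually surface; only the length check is implemented here, as an example of the enforcement shape.

```python
# A hypothetical starting rule set for a 30-day pilot on one service.
MINIMAL_RULES = [
    {"id": "R1", "check": "pii_egress", "action": "block"},
    {"id": "R2", "check": "prompt_length", "max_chars": 8000, "action": "block"},
    {"id": "R3", "check": "evidence_capture", "action": "require"},
]

def enforce(prompt: str, rules=MINIMAL_RULES) -> list:
    """Return the ids of rules this prompt violates.

    Only the prompt-length check is implemented in this sketch; the other
    checks stand in for the PII and evidence controls described above.
    """
    violations = []
    for rule in rules:
        if rule["check"] == "prompt_length" and len(prompt) > rule["max_chars"]:
            violations.append(rule["id"])
    return violations
```

Starting this small keeps the pilot honest: every rule is enforceable on day one, and the month's findings tell you which rules to add next.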