AI engineering patterns worth stealing for risk engines

Mar 20

AI development is fascinating, not just for the outcomes but for the development patterns.

Derivatives risk models were built in the 1990s. They were brilliant for the 1990s. They are incredibly slow now.

The AI infrastructure people have spent five years solving a structurally similar problem under different constraints. How do you get a very expensive computation to run in near-real-time without losing the correctness properties that make the output trustworthy? The answers they arrived at are worth stealing. Not the hype, though I would kill for their multiples. The engineering patterns.

Speculative execution has caught my attention today. I keep coming back to it. In large language model inference, where it usually goes by the name speculative decoding, a cheap model proposes outputs and an expensive model verifies them. The expensive model only does full work where the cheap one fails. For derivatives margining, the analogy is direct: a fast surrogate proposes the margin move, and a high-fidelity verifier intervenes only at the boundary. Most updates pass. Tail events do not. You get asymmetric compute spend with bounded error. That is a better architecture than running everything approximately all the time and hoping the tails are quiet, or, worse, running a perfect analysis but only in hourly batches.
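To make the propose-and-verify idea concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the function names, the toy delta-only surrogate, the toy full revaluation, and the error bound, which a real engine would have to measure rather than assert. The surrogate's number is accepted only when it sits further from the margin-call threshold than the surrogate could possibly be wrong. Anything closer goes to the verifier.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class MarginResult:
    margin: float
    verified: bool  # True if the expensive model produced this number


def speculative_margin(
    move: float,
    surrogate: Callable[[float], float],
    verifier: Callable[[float], float],
    error_bound: float,
    threshold: float,
) -> MarginResult:
    """Accept the cheap estimate unless it is within error_bound of the threshold."""
    estimate = surrogate(move)
    if abs(estimate - threshold) > error_bound:
        # The surrogate cannot be wrong about which side of the threshold we are on.
        return MarginResult(estimate, verified=False)
    # Too close to call: pay for the high-fidelity number.
    return MarginResult(verifier(move), verified=True)


def full_reval(move: float) -> float:
    # Hypothetical expensive model: delta plus gamma term.
    return 100.0 * move + 50.0 * move * move


def delta_only(move: float) -> float:
    # Hypothetical cheap surrogate: delta term only.
    return 100.0 * move


# Assumed bound: |full_reval - delta_only| = 50 * move^2 <= 2 for |move| <= 0.2.
ERROR_BOUND = 2.0
THRESHOLD = 10.0  # hypothetical margin-call level

for m in (0.01, 0.05, 0.09, 0.2):
    r = speculative_margin(m, delta_only, full_reval, ERROR_BOUND, THRESHOLD)
    print(m, r.margin, "verified" if r.verified else "surrogate")
```

Only the 0.09 move lands close enough to the threshold to trigger the verifier. The other three are settled by the surrogate, and in each case the breach decision is the same one the full model would have made, because the error bound rules out a disagreement.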

The other patterns that translate well are sparse routing, cache-centric design, and fused memory kernels. None of them are massive steps on their own. All of them make the engine faster. The reason FlashAttention is fast is not primarily clever mathematics. It is because it understands how memory moves around hardware. Risk engines have the same bottleneck and largely ignore it. We recompute the same results again and again.
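The cache-centric point is the easiest of these to show in a few lines. The sketch below is a toy under stated assumptions, not a description of any real engine: `scenario_pnl` and `account_worst_pnl` are hypothetical names, and the pricing function is a stand-in for an expensive revaluation. It illustrates only the caching idea, not sparse routing or fused kernels. Two accounts hold the same future, so the per-instrument, per-scenario result is computed once and reused.

```python
from functools import lru_cache
from typing import Iterable, Tuple


@lru_cache(maxsize=None)
def scenario_pnl(instrument: str, scenario: int) -> float:
    # Stand-in for an expensive revaluation of one unit of the instrument
    # under one scenario. Keyed on (instrument, scenario), so it is shared
    # across every account that holds the instrument.
    return len(instrument) * 0.5 * scenario


def account_worst_pnl(
    holdings: Tuple[Tuple[str, float], ...],
    scenarios: Iterable[int],
) -> float:
    """Worst (lowest) portfolio P&L across the scenario set."""
    return min(
        sum(qty * scenario_pnl(inst, s) for inst, qty in holdings)
        for s in scenarios
    )


scenarios = range(-2, 3)  # five hypothetical scenarios
account_a = (("FUT_A", 10.0), ("OPT_B", -4.0))
account_b = (("FUT_A", 3.0),)

worst_a = account_worst_pnl(account_a, scenarios)
worst_b = account_worst_pnl(account_b, scenarios)

# Fifteen lookups in total, but only ten distinct (instrument, scenario) pairs.
print(scenario_pnl.cache_info())
```

Across a clearing house with thousands of accounts holding the same few hundred contracts, the ratio of reuse to fresh computation would be far higher than in this toy. Deciding when a cached result goes stale is the hard part, and is not addressed here.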

This isn’t a solution for everything. A lot of AI tricks work because text generation tolerates approximation in places where clearing and risk cannot. You cannot borrow the approximation directly. You can borrow the verification architecture. The principle is simple: only import patterns that either preserve exactness through a verifier, or come with explicit, measured uncertainty and promotion rules.

We really are stuck in the 90s with things like SPAN. We are not going to unlock a high-velocity tokenised market with highly agile margin if we don’t start to adopt these massive pieces of progress. We should not accept mediocrity in risk and clearing.

Loads more to come.

This first appeared on LinkedIn on 20 March 2026. If you want to comment or discuss, that’s the place.

James Davies
