The Old Trick Behind Every Playmaker

Research Grants

The Old Trick Behind Every Playmaker

This Utrecht University lab merged a 200-year-old numerical method with cooperative game theory. It led to the fastest, most stable way yet to explain what a machine learning model just predicted.

By Linda Petrini

2026-06-03

8 mins to read

At Utrecht University, researchers from two life sciences groups kept running into the same problem. Each of their samples produced thousands of separate protein readings at once, and a model trained on those readings could be reacting to any of them. So one evident question kept coming up: why did the model decide that?

“That’s what we need to know to trust it, ” says Pavel Sinitcyn, who leads research at both the AI Technology for Life and the Biomolecular Mass Spectrometry and Proteomics groups at Utrecht. “That’s what science is for.” So they hired a postdoctoral researcher, Majid Mohammadi, to build that explanation.

It is tempting to assume the explainability problem lives where the hype does: inside large language models. But many of the decisions that get regulated, who gets the loan, the diagnosis, the insurance rate, still run on tabular data. And there the strongest models are still tree ensembles like XGBoost or CatBoost, not the fashionable deep networks.

“They outperform transformers, deep neural networks, Gaussian processes, ” says Grigory Reznikov, who built the project’s tree implementation. “Tabular data is still very common in industrial settings.”

The pressure to explain those models is about to jump. From August 2026, the EU AI Act gives anyone hit by certain high-risk automated decisions the right to a “clear and meaningful” explanation. The law never says the word Shapley, but feature attribution on a tree is what compliance teams reach for.

The standard tool here is the Shapley value, from cooperative game theory, where it settles a fairness question: if a team pulls something off together, how much credit does each player deserve? You weigh every possible way the team could have formed, measure what each player adds as they join, and average it.

Swap the vocabulary and you have model explainability: the players are the features, the team’s score is the prediction, a feature’s Shapley value is its fair share. That fairness guarantee is why it anchors SHAP, the most widely used explainability toolkit.

Significantly improving this standard remained an open problem until Sinitcyn’s team, supported by Nebius Academy, set out to solve it. The result, QuadraSHAP, is what benchmarks now show as the fastest and most reliable method to explain the decisions of ML models used in medicine, finance, and industry.

Majid Mohammadi describes himself as a “kernel freak”

Grigory Reznikov built the tree implementation

Research lead Pavel Sinitcyn

Pick a Side

The trouble with Shapley values hides in “every possible way the team could have formed.” With d features, that means 2^(d-1) coalitions to weigh, and the count detonates. Ten features give you 512 teams. A hundred give you more than a hundred billion billion billion. A few hundred push the coalition count beyond anything brute force could hope to touch, and proteomics works with twenty thousand.

So the demand collides with the arithmetic. Specialized methods sidestep the explosion by exploiting a model’s structure, but until recently they made you pick a side. The fast methods compress the combinatorial sum into a polynomial representation, which is efficient but numerically fragile: as trees get deeper the polynomial’s degree climbs, small rounding errors get amplified, and eventually the numbers stop holding together.

The careful ones stay exact, but their cost climbs with the size and depth of the model, until on the largest forests and the highest-dimensional data they slow to a crawl or fail to finish at all. Fast or trustworthy, not both.

An Odd Thing in Common

Kernel methods are a classic family of prediction models, the machinery inside support vector machines and Gaussian processes, that make predictions by comparing new inputs to known examples. Many of them share a convenient property: the model’s response to a group of features is just the individual pieces multiplied together.

“I’m a kernel freak, ” Mohammadi says. “In a specific class of kernel methods you have this product structure — the relation between the decision and the features can be written as a product.”

Decision trees hide the same shape; a tree reaches its verdict through a chain of yes/no questions whose contributions combine the very same way. It is an odd thing for a kernel and a tree to have in common, and exactly why one method can explain both.

A Shapley value averages a feature’s contribution over every possible team the others could form, and the averaging isn’t even: coalitions are weighted differently depending on their size, through a weight built from factorials. That weight, knotted into the sum over all those teams, is what makes the whole thing hard, and Mohammadi had been circling it since an earlier paper.

“I’d always been thinking about how to get rid of that weight, ” he says. “If you get rid of it, you can compute everything in linear time. ” His way out was to recognize the weight in disguise. Those factorial terms can be rewritten as a Beta function, a standard way of turning discrete counting into a smooth integral.

Once the weight becomes an integral, the multiply-together structure pays off. A sum over every possible team turns into a single product with one factor per feature, so no team has to be listed at all. The combinatorial sum collapses into a one-dimensional integral.

And integrating a polynomial exactly is one of the most thoroughly solved problems in mathematics. Gauss-Legendre quadrature has done it since the early 1800s. Feed it enough points and the newly developed method, QuadraSHAP, converges to the exact Shapley value up to floating-point precision.

The unglamorous half, turning that into fast code for real tree ensembles, fell to Reznikov. “Things related to trees don’t really have an aha moment, ” he says. “It’s a lot of step-by-step optimizations, each one quite well known. You just need to do it carefully, many times in a row.”

Having It Both Ways

Previous methods often traded off speed against numerical stability. QuadraSHAP delivers both.

The trustworthy half is where the drama lives. A basic promise of Shapley values is that the per-feature shares add up to the prediction itself, the way line items add up to a bill; when they don’t, the explanation is broken. On deep decision forests trained on text, the fast interpolation method breaks that promise spectacularly: its attribution sums can diverge catastrophically from the model output. QuadraSHAP, on the same forests, lands within about one part in ten trillion, which is essentially the limit of floating-point precision.

Method	Speed	Stable at Depth?
SHAP (TreeSHAP)	Baseline	Yes, but slow
Linear TreeSHAP	Fast	No: error blows past 10^200
QuadraSHAP	2–5× faster	Yes: error ~10⁻¹³

In practice, the speed improvements may matter more day to day. On tree ensembles, QuadraSHAP is the fastest stable method in every setup tested, three to five times quicker than the standard SHAP library. On kernels the margin is larger: twenty-five times faster at five hundred features, ninety-five times at a thousand. Past two thousand, the previous best method times out while QuadraSHAP finishes in seconds.

Causality Is Downstream

There is a quiet warning under the result, and the team is firm about it. A Shapley value is not a cause. It tells you how much a feature contributed to a prediction, its share, not what would change if you altered it.

If you change the value of a feature, a counterfactual tells you the prediction can flip, Mohammadi says. “Shapley doesn’t give you that hint at all. It’s merely an importance indicator. ” Sinitcyn puts the rest where it belongs: “Causality is downstream experimental validation. ”

This distinction matters in practice. A widely cited 2020 paper argued that Shapley-based explanations don’t serve the human goals we want from explanations, and that fixing their flaws drags you, unavoidably, into causal reasoning. Yet regulators and users want exactly the causal version, the “change this and the loan clears, ” which an attribution can’t promise.

Sinitcyn points at mistaking background correlations for meaningful signals. “You can build whatever you like, ” he says, “but without figuring out what is actually happening, it’s useless. ” The team now wants to test whether the same approach extends beyond trees and kernels, including recurrent architectures and biological applications.

Assisted by Krikamol Muandet from CISPA Helmholtz Center for Information Security and Siu Lun Chau from Nanyang Technological University QuadraSHAP came together in about twenty days, a sprint before the NeurIPS deadline. Then, on submission day, a message arrived: another group had posted a paper titled Quadrature-TreeSHAP, and a third team, from an entirely different derivation, had reached the same Gauss-Legendre rule.

So why now, and why all at once? Partly scale: many groups were circling the same bottleneck at once. But mostly timing: the fast-versus-stable tradeoff only started to hurt after a few years of bigger, deeper models and approaching regulation, and pain is what sends researchers back to nineteenth century maths.