Machine learning models based on trees are the most popular nonlinear models in use today1,2. Random forests, gradient boosted trees and other tree-based models are used in finance, medicine, biology, customer retention, advertising, supply chain management, manufacturing, public health and other areas to make predictions based on sets of input features (Fig. 1a, left). For these applications, models often must be both accurate and interpretable, where interpretability means that we can understand how the model uses input features to make predictions3. However, despite the rich history of global interpretation methods for trees, which summarize the impact of input features on the model as a whole, much less attention has been paid to local explanations, which reveal the impact of input features on individual predictions (that is, for a single sample) (Fig. 1a, right).
Fig. 1 Local explanations based on TreeExplainer enable a wide variety of new ways to understand global model structure.
a, A local explanation based on assigning a numeric measure of credit to each input feature. b, By combining many local explanations, we can represent global structure while retaining local faithfulness to the original model. We demonstrate this by using three medical datasets to train gradient boosted decision trees and then compute local explanations based on SHAP values3. Computing local explanations across all samples in a dataset enables development of many tools for understanding global model structure.
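The workflow summarized in Fig. 1b can be sketched with the open-source shap package, which implements TreeExplainer: fit a gradient boosted tree model, compute one vector of SHAP values per sample, then aggregate those local explanations into a global summary. The synthetic data and model settings below are illustrative stand-ins, not the medical datasets used in the figure.

```python
# Sketch: per-sample SHAP values for a gradient boosted tree model,
# aggregated into a global importance summary.
# Illustrative data, not the medical datasets from Fig. 1.
import numpy as np
import shap
from sklearn.datasets import make_regression
from sklearn.ensemble import GradientBoostingRegressor

# Train a gradient boosted tree model on synthetic data.
X, y = make_regression(n_samples=500, n_features=8, random_state=0)
model = GradientBoostingRegressor(n_estimators=100, random_state=0).fit(X, y)

# One local explanation (a vector of per-feature SHAP values) per sample.
explainer = shap.TreeExplainer(model)
shap_values = explainer.shap_values(X)   # shape: (n_samples, n_features)

# Local explanations sum to the model output, up to the expected value.
assert np.allclose(shap_values.sum(axis=1) + explainer.expected_value,
                   model.predict(X), atol=1e-4)

# Combining all local explanations yields a global importance summary.
global_importance = np.abs(shap_values).mean(axis=0)
print(global_importance)
```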
Current local explanation methods include: (1) reporting the decision path, (2) using a heuristic approach that assigns credit to each input feature4 and (3) applying various model-agnostic approaches that require repeatedly executing the model for each explanation3,5–8. Each current method has limitations. First, simply reporting a prediction’s decision path is unhelpful for most models, particularly those based on multiple trees. Second, the behaviour of the heuristic credit allocation has yet to be carefully analysed; we show here that it is strongly biased to alter the impact of features based on their tree depth. Third, since model-agnostic methods rely on post hoc modelling of an arbitrary function, they can be slow and suffer from sampling variability.
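To make the heuristic credit allocation of method (2) concrete, the following is a minimal sketch of a Saabas-style attribution for a single scikit-learn regression tree: it walks a sample's decision path and credits each split feature with the change in the node's expected value. The function name saabas_attribution and the synthetic data are ours, for illustration only. Because credit depends on where along the path a feature is encountered, the allocation is sensitive to split depth, which is the bias noted above.

```python
# Sketch of the heuristic ("Saabas") credit allocation: walk a sample's
# decision path and credit each split feature with the change in the
# node's mean prediction. Names and data are illustrative.
import numpy as np
from sklearn.datasets import make_regression
from sklearn.tree import DecisionTreeRegressor

def saabas_attribution(tree_model, x):
    """Per-feature credit for one sample from a fitted DecisionTreeRegressor."""
    t = tree_model.tree_
    credit = np.zeros(x.shape[0])
    node = 0                                   # start at the root
    while t.children_left[node] != -1:         # stop at a leaf
        f = t.feature[node]
        child = (t.children_left[node]
                 if x[f] <= t.threshold[node]
                 else t.children_right[node])
        # Credit the split feature with the change in expected value.
        credit[f] += t.value[child][0][0] - t.value[node][0][0]
        node = child
    return credit

X, y = make_regression(n_samples=300, n_features=5, random_state=0)
tree = DecisionTreeRegressor(max_depth=4, random_state=0).fit(X, y)

phi = saabas_attribution(tree, X[0])
# The credits plus the root's expected value recover the tree's prediction.
print(phi.sum() + tree.tree_.value[0][0][0], tree.predict(X[:1])[0])
```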
We present TreeExplainer, an explanation method for trees that enables the tractable computation of optimal...