
Shap Smart

Streamline your workflow with this skill for explaining machine learning model predictions using SHAP, covering model interpretability and explainability. Includes structured workflows, validation checks, and reusable patterns for scientific computing.

Skill · Cliptics · scientific · v1.0.0 · MIT


Explain machine learning model predictions using SHAP (SHapley Additive exPlanations), which applies game-theoretic Shapley values to attribute feature contributions to predictions. This skill covers TreeExplainer, DeepExplainer, KernelExplainer, summary plots, dependence plots, force plots, and interaction effects.

When to Use This Skill

Choose Shap Smart when you need to:

  • Explain why a model made a specific prediction (local interpretation)
  • Identify the most important features globally across all predictions
  • Detect feature interactions and non-linear effects in complex models
  • Satisfy regulatory requirements for model transparency (GDPR, FCRA)

Consider alternatives when:

  • You need simple feature importance from tree models (use model.feature_importances_)
  • You need counterfactual explanations (use DiCE or Alibi)
  • You need attention-based explanations for transformers (use BertViz or Captum)

Quick Start

pip install shap scikit-learn xgboost matplotlib

import shap
import xgboost as xgb
from sklearn.model_selection import train_test_split
import matplotlib.pyplot as plt
import warnings
warnings.filterwarnings('ignore')

# Train a model
X, y = shap.datasets.california()
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
model = xgb.XGBRegressor(n_estimators=100, max_depth=5, random_state=42)
model.fit(X_train, y_train)

# Create explainer (TreeExplainer for tree models)
explainer = shap.TreeExplainer(model)
shap_values = explainer.shap_values(X_test)

# Global feature importance
shap.summary_plot(shap_values, X_test, plot_type="bar", show=False)
plt.tight_layout()
plt.savefig("shap_importance.pdf")
plt.close()

# Detailed summary with direction of effect
shap.summary_plot(shap_values, X_test, show=False)
plt.tight_layout()
plt.savefig("shap_summary.pdf")
plt.close()

# Explain single prediction
print(f"Prediction: {model.predict(X_test.iloc[:1])[0]:.2f}")
print(f"Base value: {explainer.expected_value:.2f}")
print("Top contributing features:")
idx = 0
feature_effects = list(zip(X_test.columns, shap_values[idx]))
feature_effects.sort(key=lambda x: abs(x[1]), reverse=True)
for feat, val in feature_effects[:5]:
    print(f"  {feat}: {val:+.3f}")

Core Concepts

Explainer Types

| Explainer | Supported Models | Speed | Accuracy |
| --- | --- | --- | --- |
| TreeExplainer | XGBoost, LightGBM, RandomForest, CatBoost | Very fast | Exact |
| LinearExplainer | Linear/logistic regression, SVM (linear) | Fast | Exact |
| DeepExplainer | PyTorch, TensorFlow neural networks | Medium | Approximate |
| GradientExplainer | PyTorch, TensorFlow neural networks | Medium | Approximate |
| KernelExplainer | Any model (model-agnostic) | Slow | Approximate |
| PermutationExplainer | Any model (model-agnostic) | Slow | Exact (asymptotic) |

Comprehensive Model Explanation

import shap
import numpy as np
import matplotlib.pyplot as plt

def explain_model(model, X_train, X_test, feature_names=None, output_dir='.'):
    """Generate comprehensive SHAP explanations for a tree model."""
    explainer = shap.TreeExplainer(model)
    shap_values = explainer.shap_values(X_test)

    # 1. Global importance (bar)
    plt.figure(figsize=(8, 6))
    shap.summary_plot(shap_values, X_test, plot_type="bar", show=False)
    plt.tight_layout()
    plt.savefig(f"{output_dir}/global_importance.pdf")
    plt.close()

    # 2. Beeswarm (summary with direction)
    plt.figure(figsize=(8, 6))
    shap.summary_plot(shap_values, X_test, show=False)
    plt.tight_layout()
    plt.savefig(f"{output_dir}/beeswarm.pdf")
    plt.close()

    # 3. Dependence plots for top features
    mean_abs_shap = np.abs(shap_values).mean(axis=0)
    top_features = np.argsort(mean_abs_shap)[-3:][::-1]
    for feat_idx in top_features:
        fig, ax = plt.subplots(figsize=(6, 4))
        shap.dependence_plot(feat_idx, shap_values, X_test, show=False, ax=ax)
        plt.tight_layout()
        feat_name = X_test.columns[feat_idx] if hasattr(X_test, 'columns') else f'feature_{feat_idx}'
        plt.savefig(f"{output_dir}/dependence_{feat_name}.pdf")
        plt.close()

    # 4. Interaction values (if dataset is small enough)
    if X_test.shape[0] <= 500:
        interaction_values = explainer.shap_interaction_values(X_test[:100])
        plt.figure(figsize=(8, 6))
        shap.summary_plot(interaction_values[:, :, 0], X_test[:100], show=False)
        plt.tight_layout()
        plt.savefig(f"{output_dir}/interactions.pdf")
        plt.close()

    return shap_values, explainer

# Usage:
# shap_values, explainer = explain_model(model, X_train, X_test)

Configuration

| Parameter | Description | Default |
| --- | --- | --- |
| feature_perturbation | How TreeExplainer handles feature dependence | "tree_path_dependent" |
| model_output | What model output to explain (raw, probability, log_loss) | "raw" |
| nsamples | Number of samples for KernelExplainer | "auto" (2048) |
| link | Link function for KernelExplainer | "identity" |
| check_additivity | Verify SHAP values sum to prediction | True |
| approximate | Use faster approximate TreeExplainer | False |
| max_display | Maximum features shown in summary plots | 20 |
| plot_size | Figure size for SHAP plots | "auto" |

Best Practices

  1. Use the most specific explainer for your model type — TreeExplainer for tree-based models gives exact Shapley values in polynomial time. KernelExplainer is model-agnostic but exponentially slower. Using the wrong explainer wastes computation or sacrifices accuracy when a fast exact method exists.

  2. Always check that SHAP values sum to the prediction — For any sample, the sum of all SHAP values plus the base value should equal the model output. If check_additivity=True raises an error, there's a version incompatibility or model type mismatch. This sanity check catches misconfigured explainers.

  3. Sample your background data for KernelExplainer — KernelExplainer uses a background dataset to compute expectations. Using the full training set is prohibitively slow. Use shap.kmeans(X_train, 100) or shap.sample(X_train, 100) to create a representative background set.

  4. Use dependence plots to find interactions — After identifying important features via summary plots, create dependence plots to see how feature values relate to SHAP values. SHAP automatically colors by the strongest interacting feature, revealing non-linear effects and interactions that global importance misses.

  5. Report both global and local explanations — Global importance (summary plots) shows which features matter overall. Local explanations (force plots, waterfall plots) show why the model made a specific decision. Both perspectives are needed for comprehensive model understanding, especially in regulated domains.

Common Issues

SHAP values are slow for large datasets — TreeExplainer is fast but still scales with dataset size. Compute SHAP values on a representative subsample (500-1000 points) rather than the entire test set. For KernelExplainer, use nsamples=500 for initial exploration and increase only if explanations are noisy.

Multi-class SHAP values have unexpected shapes — For multi-class models, shap_values is a list of arrays (one per class). shap_values[0] gives SHAP values for class 0. Use shap.summary_plot(shap_values[class_idx], X_test) for a specific class, or pass the full list for a multi-class summary.

Force plot doesn't display in Jupyter — Call shap.initjs() at the top of your notebook to enable JavaScript-based force plots. For non-Jupyter environments, use shap.plots.waterfall(shap.Explanation(values=shap_values[0], base_values=explainer.expected_value, data=X_test.iloc[0])) which renders as a static matplotlib figure.
