Feature importance: SHAP vs Gain

Карьерник is a quiz trainer in Telegram with 1500+ questions for analyst interviews. SQL, Python, A/B testing, metrics. Free.

Why you need to know this

"Which features matter most in the model?" is a question stakeholders ask at every ML project presentation. An analyst who answers "here is the gain importance" will get side-eye from the ML engineer, because gain can be misleading.

SHAP is the modern standard for interpretability. A middle+ analyst should know the difference.

What feature importance is

A measure of how much each feature contributes to the model's predictions.

It helps you:

  • Interpret the model
  • Explain it to stakeholders
  • Debug (a weirdly important feature → data issue?)
  • Do feature selection

Methods

1. Gain (tree-based)

For tree models (Random Forest, XGBoost).

The sum of the information gain from every split a feature participates in. The default in XGBoost.

Python:

model.feature_importances_

Plus: fast, built-in. Minus:

  • Biased toward high-cardinality features
  • Global, not per-prediction
  • Not intuitive (what is "gain", exactly?)

2. Split

Counts how often a feature is used in splits.

Has the same issues as gain.
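The cardinality bias is easy to reproduce. A sketch below (synthetic data, made-up feature names) fits a random forest where one feature is a genuinely predictive binary flag and the other is pure continuous noise; impurity-based importance still hands the noise feature a noticeable share:

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)
n = 2000
x_binary = rng.integers(0, 2, n)  # informative, low-cardinality flag
x_noise = rng.random(n)           # pure noise, effectively unique per row
# target follows the flag with 20% label noise
y = np.where(rng.random(n) < 0.2, 1 - x_binary, x_binary)

X = np.column_stack([x_binary, x_noise])
model = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)

imp = dict(zip(["binary", "noise"], model.feature_importances_))
print(imp)  # the noise feature grabs a noticeable share of the "importance"
```

With deep trees, the noise column soaks up impurity reductions from fitting the label noise, which is exactly how gain-style scores end up inflated for high-cardinality features.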

3. Permutation importance

Shuffle a column's values → check how much performance drops.

from sklearn.inspection import permutation_importance

r = permutation_importance(model, X_val, y_val, n_repeats=10)

Plus: model-agnostic, less biased. Minus: computationally expensive.
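Permutation importance scores a pure-noise feature near zero, because shuffling it barely changes held-out performance. A runnable sketch (synthetic data, made-up feature names):

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n = 2000
x_binary = rng.integers(0, 2, n)  # informative flag
x_noise = rng.random(n)           # pure noise
y = np.where(rng.random(n) < 0.2, 1 - x_binary, x_binary)
X = np.column_stack([x_binary, x_noise])

X_tr, X_val, y_tr, y_val = train_test_split(X, y, random_state=0)
model = RandomForestClassifier(n_estimators=100, random_state=0).fit(X_tr, y_tr)

# score drop on held-out data when each column is shuffled
r = permutation_importance(model, X_val, y_val, n_repeats=10, random_state=0)
print(r.importances_mean)  # large drop for the flag, near zero for the noise
```

Because the drop is measured on validation data, a feature the model overfit during training gets no credit it didn't earn.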

4. SHAP (SHapley Additive exPlanations)

Based on game theory: each feature gets a fair share of the credit.

For each prediction:

prediction(x) = baseline + sum(SHAP values)

A SHAP value is a feature's contribution to that specific prediction; the baseline is the average model output.

import shap

explainer = shap.Explainer(model)
shap_values = explainer(X)

# Global importance
shap.summary_plot(shap_values, X)
# Per-prediction
shap.waterfall_plot(shap_values[0])

Plus: theoretically sound, works per-prediction. Minus: slow on big data.
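The additive formula above can be checked by hand. A stdlib-only sketch (the toy model, the point x, and the zero reference are all made up for illustration) computes exact Shapley values by enumerating coalitions and verifies that baseline + sum of the values equals the prediction:

```python
from itertools import combinations
from math import factorial

def shapley_values(f, x, reference):
    """Exact Shapley values for f at point x against a reference point."""
    n = len(x)
    players = range(n)

    def v(S):
        # features in coalition S take their actual value, the rest the reference value
        return f([x[i] if i in S else reference[i] for i in players])

    phi = []
    for i in players:
        others = [j for j in players if j != i]
        total = 0.0
        for k in range(n):
            for S in combinations(others, k):
                w = factorial(k) * factorial(n - k - 1) / factorial(n)
                total += w * (v(set(S) | {i}) - v(set(S)))  # weighted marginal contribution
        phi.append(total)
    return phi

f = lambda x: 2 * x[0] + x[1] * x[2]  # toy model with an interaction term
x, ref = [1.0, 2.0, 3.0], [0.0, 0.0, 0.0]
phi = shapley_values(f, x, ref)
baseline = f(ref)
# additivity: baseline + sum of SHAP values reconstructs the prediction
print(phi, baseline + sum(phi), f(x))
```

This brute force enumerates 2^n coalitions per feature, which is exactly why practical SHAP relies on approximations such as TreeSHAP and KernelSHAP.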

Gain vs SHAP: the key difference

Gain

  • Tree-specific
  • Global
  • Biased (high-cardinality features look "more important")
  • One scalar per feature

SHAP

  • Model-agnostic
  • Per-prediction + aggregate
  • Theoretically principled
  • Direction of effect (positive vs negative)

An example of the difference

A churn prediction model.

Gain:

  • age — 25%
  • days_since_last_login — 20%
  • country — 15%
  • income — 10%

The SHAP summary shows that:

  • days_since_last_login actually has the highest impact
  • age is inflated because of its high cardinality

SHAP often corrects where gain misleads.

Per-prediction insights

SHAP can explain an individual prediction:

"Why was this user predicted to churn?"

  • High: days_since_login = 30 (+20% churn)
  • Medium: no premium = yes (+5%)
  • Low: recent support ticket (-2%)

Gain can't do this. SHAP can.

Tools

shap

import shap

# Explain model
explainer = shap.Explainer(model, X_train)
shap_values = explainer(X_test)

# Plots
shap.summary_plot(shap_values, X_test)
shap.waterfall_plot(shap_values[0])
shap.dependence_plot('feature_name', shap_values.values, X_test)  # expects a raw array

eli5

An alternative with simpler syntax.

lime

Local explanations, similar in spirit to SHAP, but with weaker theoretical guarantees.

In analytics

Business communication

SHAP plots are easy to show to the business: "these factors drive churn".

Model debugging

Weird SHAP values → data issue?

Feature engineering

SHAP reveals interactions and non-linearities → ideas for new features.

Causal inference?

SHAP shows correlation, not causation. Be careful!

Common mistakes

Interpreting gain as causal

"Age increases churn" does not follow from gain. It's correlation.

Ignoring context

A single number can be misleading. Always look at the distributions.

Skipping per-prediction analysis

Global importance often masks useful per-user patterns.

Forgetting direction

A feature being "important" does not mean a positive impact. It could go either way.

In the interview

"How do you interpret feature importance?" Gain for a quick look, SHAP for rigorous interpretation.

"Problems with gain?" Biased toward high cardinality, tree-specific, global only.

"SHAP in a nutshell?" Game theory, fair attribution, per-prediction + global.

"Causation?" No. SHAP shows correlation with the prediction.

FAQ

SHAP for large datasets?

Slow. Subsample, or use TreeSHAP (much faster for tree models).

Is permutation importance enough?

A good baseline, often sufficient. Reach for SHAP when you need per-prediction explanations.

Neural nets?

SHAP works (DeepSHAP). Or integrated gradients.


Practice ML: open the trainer with 1500+ interview questions.