Как посчитать MAE в SQL

Закрепи формулу mae в Карьернике
Запомнить надолго — 5 коротких сессий с задачами на эту тему. Бесплатно
Тренировать mae в Telegram

Зачем MAE

MAE = Mean Absolute Error. Average |predicted - actual|. Robust to outliers (linear penalty). Easy interpret: «average error in same units as target».

Формула

MAE = mean(|predicted - actual|)

Базовый расчёт

SELECT
    COUNT(*) AS predictions,
    AVG(ABS(predicted_value - actual_value)) AS mae,
    PERCENTILE_CONT(0.5) WITHIN GROUP (ORDER BY ABS(predicted_value - actual_value)) AS median_error,
    MIN(ABS(predicted_value - actual_value)) AS min_error,
    MAX(ABS(predicted_value - actual_value)) AS max_error
FROM predictions
WHERE prediction_date >= CURRENT_DATE - INTERVAL '30 days';

MAE vs RMSE

SELECT
    AVG(ABS(error)) AS mae,
    SQRT(AVG(POWER(error, 2))) AS rmse,
    SQRT(AVG(POWER(error, 2))) - AVG(ABS(error)) AS rmse_mae_gap
FROM (SELECT predicted_value - actual_value AS error FROM predictions) e;

Gap large → outliers present. Gap small → errors uniform.

Свойство MAE RMSE
Outliers Robust Sensitive
Penalizes large Linearly Quadratically
Interpretation Direct Same units

Bias check

MAE doesn't tell direction. Use signed error:

SELECT
    AVG(predicted_value - actual_value) AS mean_error,  -- bias
    AVG(ABS(predicted_value - actual_value)) AS mae,
    CASE
        WHEN AVG(predicted_value - actual_value) > 0 THEN 'OVER-predicting'
        WHEN AVG(predicted_value - actual_value) < 0 THEN 'under-predicting'
        ELSE 'unbiased'
    END AS direction
FROM predictions;

Bias close to 0, MAE high → noisy. Bias high, MAE high → systematic error.

Закрепи формулу mae в Карьернике
Запомнить надолго — 5 коротких сессий с задачами на эту тему. Бесплатно
Тренировать mae в Telegram

По segments

SELECT
    segment,
    COUNT(*) AS predictions,
    AVG(predicted_value - actual_value) AS bias,
    AVG(ABS(predicted_value - actual_value)) AS mae,
    AVG(actual_value) AS avg_actual,
    AVG(ABS(predicted_value - actual_value)) / NULLIF(AVG(actual_value), 0) AS mae_pct_of_mean
FROM predictions p
JOIN entities e USING (entity_id)
WHERE prediction_date >= CURRENT_DATE - INTERVAL '30 days'
GROUP BY segment
ORDER BY mae DESC;

Per-segment MAE / mean ratio — comparable across scales.

Weighted MAE

When some predictions more important:

SELECT
    SUM(weight * ABS(predicted_value - actual_value)) / SUM(weight) AS weighted_mae
FROM predictions;

E.g., predict revenue: weight = actual_value (large transactions matter more).

MAE over time

SELECT
    DATE_TRUNC('week', prediction_date) AS week,
    AVG(ABS(predicted_value - actual_value)) AS mae,
    AVG(predicted_value - actual_value) AS bias
FROM predictions
WHERE prediction_date >= CURRENT_DATE - INTERVAL '12 weeks'
GROUP BY 1
ORDER BY 1;

Drift over time = retrain trigger.

Частые ошибки

Ошибка 1. MAE = RMSE assumption. Generally MAE < RMSE. Outliers push RMSE way up.

Ошибка 2. Direction lost. MAE only magnitude. Track bias separately.

Ошибка 3. MAE on percentage. Percentage of mean — MAPE. Different metric.

Ошибка 4. Outlier influence. MAE robust but ne immune. One huge error pulls.

Ошибка 5. Per-segment ignored. Aggregate MAE 5%. Some segments 50%. Drill.

Связанные темы

FAQ

MAE vs RMSE?

MAE — robust, MAE — outlier-sensitive. Pick based on use case.

MAE units?

Same as target. $5 MAE on price predictions = avg $5 error.

MAE vs MAPE?

MAE — absolute. MAPE — percentage (divides by actual).

MAE для probability predictions?

Yes но обычно use log-loss / Brier для classification.

MAE 0 — что значит?

Perfect predictions. Suspicious — verify.