Как посчитать MAE в SQL
Содержание:
Зачем MAE
MAE = Mean Absolute Error. Average |predicted - actual|. Robust to outliers (linear penalty). Easy interpret: «average error in same units as target».
Формула
MAE = mean(|predicted - actual|)Базовый расчёт
SELECT
COUNT(*) AS predictions,
AVG(ABS(predicted_value - actual_value)) AS mae,
PERCENTILE_CONT(0.5) WITHIN GROUP (ORDER BY ABS(predicted_value - actual_value)) AS median_error,
MIN(ABS(predicted_value - actual_value)) AS min_error,
MAX(ABS(predicted_value - actual_value)) AS max_error
FROM predictions
WHERE prediction_date >= CURRENT_DATE - INTERVAL '30 days';MAE vs RMSE
SELECT
AVG(ABS(error)) AS mae,
SQRT(AVG(POWER(error, 2))) AS rmse,
SQRT(AVG(POWER(error, 2))) - AVG(ABS(error)) AS rmse_mae_gap
FROM (SELECT predicted_value - actual_value AS error FROM predictions) e;Gap large → outliers present. Gap small → errors uniform.
| Свойство | MAE | RMSE |
|---|---|---|
| Outliers | Robust | Sensitive |
| Penalizes large | Linearly | Quadratically |
| Interpretation | Direct | Same units |
Bias check
MAE doesn't tell direction. Use signed error:
SELECT
AVG(predicted_value - actual_value) AS mean_error, -- bias
AVG(ABS(predicted_value - actual_value)) AS mae,
CASE
WHEN AVG(predicted_value - actual_value) > 0 THEN 'OVER-predicting'
WHEN AVG(predicted_value - actual_value) < 0 THEN 'under-predicting'
ELSE 'unbiased'
END AS direction
FROM predictions;Bias close to 0, MAE high → noisy. Bias high, MAE high → systematic error.
По segments
SELECT
segment,
COUNT(*) AS predictions,
AVG(predicted_value - actual_value) AS bias,
AVG(ABS(predicted_value - actual_value)) AS mae,
AVG(actual_value) AS avg_actual,
AVG(ABS(predicted_value - actual_value)) / NULLIF(AVG(actual_value), 0) AS mae_pct_of_mean
FROM predictions p
JOIN entities e USING (entity_id)
WHERE prediction_date >= CURRENT_DATE - INTERVAL '30 days'
GROUP BY segment
ORDER BY mae DESC;Per-segment MAE / mean ratio — comparable across scales.
Weighted MAE
When some predictions more important:
SELECT
SUM(weight * ABS(predicted_value - actual_value)) / SUM(weight) AS weighted_mae
FROM predictions;E.g., predict revenue: weight = actual_value (large transactions matter more).
MAE over time
SELECT
DATE_TRUNC('week', prediction_date) AS week,
AVG(ABS(predicted_value - actual_value)) AS mae,
AVG(predicted_value - actual_value) AS bias
FROM predictions
WHERE prediction_date >= CURRENT_DATE - INTERVAL '12 weeks'
GROUP BY 1
ORDER BY 1;Drift over time = retrain trigger.
Частые ошибки
Ошибка 1. MAE = RMSE assumption. Generally MAE < RMSE. Outliers push RMSE way up.
Ошибка 2. Direction lost. MAE only magnitude. Track bias separately.
Ошибка 3. MAE on percentage. Percentage of mean — MAPE. Different metric.
Ошибка 4. Outlier influence. MAE robust but ne immune. One huge error pulls.
Ошибка 5. Per-segment ignored. Aggregate MAE 5%. Some segments 50%. Drill.
Связанные темы
- Как посчитать RMSE в SQL
- Как посчитать MAPE в SQL
- Как посчитать R² в SQL
- Linear regression для аналитика
FAQ
MAE vs RMSE?
MAE — robust, MAE — outlier-sensitive. Pick based on use case.
MAE units?
Same as target. $5 MAE on price predictions = avg $5 error.
MAE vs MAPE?
MAE — absolute. MAPE — percentage (divides by actual).
MAE для probability predictions?
Yes но обычно use log-loss / Brier для classification.
MAE 0 — что значит?
Perfect predictions. Suspicious — verify.