How to compute AUC ROC in SQL
Contents:
- Why AUC ROC
- Interpretation
- Basic calculation
- By segment
- PR AUC alternative
- AUC ROC trend
- Common mistakes
- Related topics
- FAQ
Why AUC ROC
AUC ROC = area under the ROC curve. It is threshold-independent: it measures how well the model ranks positives above negatives. AUC 0.5 = random, 1.0 = perfect. Fairly robust to class imbalance (with caveats).
Interpretation
| AUC | Quality |
|---|---|
| 0.5 | Random (no signal) |
| 0.6-0.7 | Weak |
| 0.7-0.8 | Decent |
| 0.8-0.9 | Good |
| 0.9+ | Excellent (verify there is no leakage) |
AUC = the probability that a randomly chosen positive scores higher than a randomly chosen negative.
Basic calculation
Algorithm: for each (positive, negative) pair, check which one got the higher score. AUC = the share of pairs where the positive outscores the negative (ties count as half).
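The pairwise definition and the rank-based formula give the same number. A minimal Python sketch over hypothetical toy scores (all values made up for illustration):

```python
from itertools import product

# Toy data: (actual, score)
rows = [(1, 0.9), (1, 0.7), (1, 0.4), (0, 0.8), (0, 0.3), (0, 0.2)]
pos = [s for a, s in rows if a == 1]
neg = [s for a, s in rows if a == 0]

# Definition: share of (positive, negative) pairs ranked correctly, ties = 0.5
wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
           for p, n in product(pos, neg))
auc_pairwise = wins / (len(pos) * len(neg))

# Mann-Whitney: the same number via 1-based ranks, averaged over ties
scores = sorted(s for _, s in rows)
avg_rank = {s: (scores.index(s) + 1 + scores.index(s) + scores.count(s)) / 2
            for s in set(scores)}
sum_ranks_pos = sum(avg_rank[s] for s in pos)
n_pos, n_neg = len(pos), len(neg)
auc_ranks = (sum_ranks_pos - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)

print(auc_pairwise, auc_ranks)  # the two values should match
```

This is also a cheap way to sanity-check the SQL output on a small sample.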
In SQL via the Mann-Whitney U statistic (PostgreSQL syntax):
```sql
WITH ranked AS (
    SELECT
        actual,
        score,
        -- average rank: tied scores get the mean of their positions
        -- (required by Mann-Whitney; plain RANK() biases the result on ties)
        RANK() OVER (ORDER BY score)
            + (COUNT(*) OVER (PARTITION BY score) - 1) / 2.0 AS rank_score
    FROM predictions
),
stats AS (
    SELECT
        SUM(CASE WHEN actual = 1 THEN rank_score ELSE 0 END) AS sum_ranks_positive,
        COUNT(*) FILTER (WHERE actual = 1) AS n_positive,
        COUNT(*) FILTER (WHERE actual = 0) AS n_negative
    FROM ranked
)
SELECT
    n_positive,
    n_negative,
    sum_ranks_positive,
    -- AUC = (sum_ranks_pos - n_pos*(n_pos+1)/2) / (n_pos*n_neg)
    (sum_ranks_positive::NUMERIC - n_positive * (n_positive + 1.0) / 2)
        / NULLIF(n_positive::NUMERIC * n_negative, 0) AS auc
FROM stats;
```

By segment
```sql
WITH segment_ranks AS (
    SELECT
        segment,
        actual,
        score,
        -- average rank within the segment (ties share the mean rank)
        RANK() OVER (PARTITION BY segment ORDER BY score)
            + (COUNT(*) OVER (PARTITION BY segment, score) - 1) / 2.0 AS rank_score
    FROM predictions
    JOIN entities USING (entity_id)
),
segment_stats AS (
    SELECT
        segment,
        SUM(CASE WHEN actual = 1 THEN rank_score ELSE 0 END) AS sum_ranks_pos,
        COUNT(*) FILTER (WHERE actual = 1) AS n_pos,
        COUNT(*) FILTER (WHERE actual = 0) AS n_neg
    FROM segment_ranks
    GROUP BY segment
)
SELECT
    segment,
    n_pos,
    n_neg,
    (sum_ranks_pos::NUMERIC - n_pos * (n_pos + 1.0) / 2)
        / NULLIF(n_pos::NUMERIC * n_neg, 0) AS auc
FROM segment_stats
WHERE n_pos >= 10 AND n_neg >= 10  -- skip segments too small for a stable estimate
ORDER BY auc DESC;
```

Per-segment AUC catches model bias: a model can rank well overall yet fail on a specific segment.
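As a cross-check, the per-segment numbers can be reproduced in plain Python via the pairwise definition (segment names and scores below are made up):

```python
from collections import defaultdict

# Hypothetical rows: (segment, actual, score)
rows = [
    ("web",    1, 0.9), ("web",    0, 0.2), ("web",    1, 0.8), ("web",    0, 0.7),
    ("mobile", 1, 0.6), ("mobile", 0, 0.5), ("mobile", 1, 0.3), ("mobile", 0, 0.4),
]

def auc(pairs):
    """Pairwise AUC: P(score_pos > score_neg), ties count 0.5."""
    pos = [s for a, s in pairs if a == 1]
    neg = [s for a, s in pairs if a == 0]
    if not pos or not neg:
        return None  # AUC is undefined with only one class present
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

by_segment = defaultdict(list)
for segment, actual, score in rows:
    by_segment[segment].append((actual, score))

for segment, pairs in sorted(by_segment.items()):
    print(segment, auc(pairs))
```

The `None` guard mirrors the `NULLIF(..., 0)` in the SQL: a segment with no positives or no negatives has no defined AUC.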
PR AUC alternative
For heavily imbalanced data, PR (Precision-Recall) AUC is often more informative:
```sql
-- Approximate PR AUC: build the (recall, precision) curve point by point
-- 1) for each distinct score used as a threshold, compute precision and recall
-- 2) sort by recall
-- 3) integrate the area under the curve (e.g., trapezoidal rule)
WITH thresholds AS (
    SELECT DISTINCT score AS threshold FROM predictions
),
pr_points AS (
    SELECT
        t.threshold,
        COUNT(*) FILTER (WHERE p.score >= t.threshold AND p.actual = 1)::NUMERIC
            / NULLIF(COUNT(*) FILTER (WHERE p.score >= t.threshold), 0) AS precision,
        COUNT(*) FILTER (WHERE p.score >= t.threshold AND p.actual = 1)::NUMERIC
            / NULLIF(COUNT(*) FILTER (WHERE p.actual = 1), 0) AS recall
    FROM thresholds t
    CROSS JOIN predictions p  -- O(rows x distinct scores): fine for thousands, slow for millions
    GROUP BY t.threshold
)
SELECT *
FROM pr_points
ORDER BY recall;
```
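The final integration step takes a few lines of Python with the trapezoidal rule; the `(recall, precision)` pairs below are hypothetical stand-ins for rows exported from the `pr_points` query:

```python
def pr_auc(points):
    """Trapezoidal area under the precision-recall curve.

    points: iterable of (recall, precision) pairs, in any order.
    """
    pts = sorted(points)  # sort by recall, as the SQL query does
    area = 0.0
    for (r0, p0), (r1, p1) in zip(pts, pts[1:]):
        area += (r1 - r0) * (p0 + p1) / 2  # trapezoid between neighbouring points
    return area

# Made-up curve points for illustration; include a recall=0 anchor
# so the area covers the full [0, 1] recall range
points = [(0.0, 1.0), (0.5, 0.8), (1.0, 0.5)]
print(pr_auc(points))
```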
Then integrate numerically outside SQL, e.g. in Python.

AUC ROC trend
```sql
-- Window functions cannot be nested inside aggregates,
-- so compute per-week ranks first, then aggregate (Mann-Whitney)
WITH weekly_ranks AS (
    SELECT
        DATE_TRUNC('week', prediction_date) AS week,
        actual,
        RANK() OVER (PARTITION BY DATE_TRUNC('week', prediction_date)
                     ORDER BY score) AS rank_score
    FROM predictions
    WHERE prediction_date >= CURRENT_DATE - INTERVAL '12 weeks'
)
SELECT
    week,
    (SUM(CASE WHEN actual = 1 THEN rank_score ELSE 0 END)::NUMERIC
        - COUNT(*) FILTER (WHERE actual = 1) * (COUNT(*) FILTER (WHERE actual = 1) + 1.0) / 2)
        / NULLIF(COUNT(*) FILTER (WHERE actual = 1)::NUMERIC * COUNT(*) FILTER (WHERE actual = 0), 0) AS auc
FROM weekly_ranks
GROUP BY week
ORDER BY week;
```

Falling AUC → model drift → time to retrain.
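The same trend can be sketched in plain Python for a quick check. ISO-week grouping below is a rough stand-in for `DATE_TRUNC('week', ...)`, and all dates and scores are made up:

```python
from collections import defaultdict
from datetime import date

def auc(pairs):
    """Pairwise AUC over (actual, score) pairs, ties count 0.5."""
    pos = [s for a, s in pairs if a == 1]
    neg = [s for a, s in pairs if a == 0]
    if not pos or not neg:
        return None
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical predictions: (prediction_date, actual, score)
rows = [
    (date(2024, 1, 1), 1, 0.9), (date(2024, 1, 2), 0, 0.1),
    (date(2024, 1, 3), 1, 0.8), (date(2024, 1, 4), 0, 0.6),
    (date(2024, 1, 8), 1, 0.4), (date(2024, 1, 9), 0, 0.7),
    (date(2024, 1, 10), 1, 0.5), (date(2024, 1, 11), 0, 0.3),
]

# Group by (ISO year, ISO week)
by_week = defaultdict(list)
for d, actual, score in rows:
    by_week[d.isocalendar()[:2]].append((actual, score))

trend = {week: auc(pairs) for week, pairs in sorted(by_week.items())}
print(trend)  # a falling sequence signals drift
```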
Common mistakes
Mistake 1. Reading AUC 0.5 as merely "bad". It means no better than random: the model carries no ranking signal at all.
Mistake 2. AUC > 1. Always a bug; the maximum is 1.0. Check the rank formula and NULL handling.
Mistake 3. Equating AUC with Accuracy. They are different metrics: AUC measures ranking quality, Accuracy measures classification correctness at a fixed threshold.
Mistake 4. Ignoring class imbalance. ROC AUC can look deceptively good on extremely imbalanced data; consider PR AUC as an alternative.
Mistake 5. Ignoring ties. When several rows share a score, handle them carefully: use the average rank.
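A toy Python sketch of why tie handling matters: the minimum rank (what plain `RANK()` returns on tied rows) underestimates AUC, while the average rank reproduces the pairwise definition (scores are made up):

```python
# Hypothetical data with heavy ties: (actual, score)
rows = [(1, 0.5), (1, 0.5), (0, 0.5), (0, 0.2)]
pos = [s for a, s in rows if a == 1]
neg = [s for a, s in rows if a == 0]

# Ground truth: pairwise AUC with 0.5 credit for ties
wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
auc_true = wins / (len(pos) * len(neg))

scores = sorted(s for _, s in rows)

def mw_auc(rank_of):
    """Mann-Whitney AUC given a rank function (1-based ranks)."""
    sum_ranks_pos = sum(rank_of(s) for s in pos)
    n_pos, n_neg = len(pos), len(neg)
    return (sum_ranks_pos - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)

min_rank = lambda s: scores.index(s) + 1                          # what RANK() returns on ties
avg_rank = lambda s: scores.index(s) + (scores.count(s) + 1) / 2  # tie-aware average rank

print(auc_true, mw_auc(avg_rank), mw_auc(min_rank))
```

With ties, only the average-rank variant matches the pairwise value; the min-rank variant is biased low.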
Related topics
- How to compute F1 score in SQL
- How to compute precision and recall in SQL
- What precision and recall are
- Classification vs regression
FAQ
What is the AUC ROC range?
In practice, from 0.5 (random) to 1.0 (perfect). Below 0.5 means the score is anti-correlated with the label (flip the predictions).
Which AUC is good enough?
Depends on the domain. Credit scoring: 0.7-0.8. Spam: 0.95+. Cancer detection: 0.9+. Fraud: 0.7-0.85.
AUC vs Accuracy?
AUC is threshold-independent and measures ranking quality; Accuracy measures classification correctness at a fixed threshold. AUC is more robust for comparing models.
PR AUC vs ROC AUC?
PR AUC is more informative for imbalanced data (e.g., 1% positives); ROC AUC can look deceptively optimistic there.
Ties in scores?
Use the average rank, as a rule. Most ML frameworks handle this automatically.