How to Compute AUC ROC in SQL

Why AUC ROC

AUC ROC is the area under the ROC curve. It is threshold-independent: it measures how well the model ranks positives above negatives. AUC 0.5 means random ordering, 1.0 means perfect separation. It is reasonably robust to imbalanced classes, with caveats (see PR AUC below).

Interpretation

AUC        Quality
0.5        Random (no signal)
0.6-0.7    Weak
0.7-0.8    Decent
0.8-0.9    Good
0.9+       Excellent (verify there is no leakage)

AUC equals the probability that a randomly chosen positive is scored higher than a randomly chosen negative.

Basic calculation

Algorithm: for every (positive, negative) pair, check which one received the higher score. AUC is the share of pairs where the positive outranks the negative (a tie counts as half).
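This definition translates directly into SQL: enumerate all (positive, negative) pairs and average the outcomes. A minimal sketch against the same predictions(actual, score) table used below; it is O(n_pos * n_neg), so it only suits small samples, but it is a handy cross-check for the rank-based query:

SELECT
    AVG(CASE
            WHEN p.score > n.score THEN 1.0
            WHEN p.score = n.score THEN 0.5  -- a tie counts as half a win
            ELSE 0.0
        END) AS auc
FROM predictions p                 -- positives
JOIN predictions n                 -- negatives
  ON p.actual = 1 AND n.actual = 0;  -- every (positive, negative) pair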

The same value computed efficiently in SQL via the Mann-Whitney U statistic (average ranks make it exact even with ties):

WITH ranked AS (
    SELECT
        actual,
        score,
        -- average rank = min rank + half the tie count (Mann-Whitney convention)
        RANK() OVER (ORDER BY score)
            + (COUNT(*) OVER (PARTITION BY score) - 1) / 2.0 AS rank_score
    FROM predictions
),
stats AS (
    SELECT
        SUM(CASE WHEN actual = 1 THEN rank_score ELSE 0 END) AS sum_ranks_positive,
        COUNT(*) FILTER (WHERE actual = 1) AS n_positive,
        COUNT(*) FILTER (WHERE actual = 0) AS n_negative
    FROM ranked
)
SELECT
    n_positive,
    n_negative,
    sum_ranks_positive,
    -- AUC = (sum_ranks_pos - n_pos*(n_pos+1)/2) / (n_pos*n_neg)
    (sum_ranks_positive::NUMERIC - n_positive * (n_positive + 1.0) / 2)
    / NULLIF(n_positive::NUMERIC * n_negative, 0) AS auc
FROM stats;
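A quick sanity check on hypothetical data: take positives scored 0.9 and 0.4 and negatives scored 0.7 and 0.3. Of the 4 (positive, negative) pairs, 3 are ordered correctly, so AUC = 0.75. The rank route agrees: ascending ranks are 0.3 -> 1, 0.4 -> 2, 0.7 -> 3, 0.9 -> 4, so sum_ranks_positive = 2 + 4 = 6 and (6 - 2 * 3 / 2) / (2 * 2) = 0.75. To run it, replace the predictions table with an inline CTE:

WITH predictions(actual, score) AS (
    VALUES (1, 0.9), (1, 0.4), (0, 0.7), (0, 0.3)
),
ranked AS (
    -- ...the rest of the query above, unchanged; returns auc = 0.75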

By segment

WITH segment_ranks AS (
    SELECT
        segment,
        actual,
        score,
        -- rank within each segment; average rank handles score ties
        RANK() OVER (PARTITION BY segment ORDER BY score)
            + (COUNT(*) OVER (PARTITION BY segment, score) - 1) / 2.0 AS rank_score
    FROM predictions
    JOIN entities USING (entity_id)
),
segment_stats AS (
    SELECT
        segment,
        SUM(CASE WHEN actual = 1 THEN rank_score ELSE 0 END) AS sum_ranks_pos,
        COUNT(*) FILTER (WHERE actual = 1) AS n_pos,
        COUNT(*) FILTER (WHERE actual = 0) AS n_neg
    FROM segment_ranks
    GROUP BY segment
)
SELECT
    segment,
    n_pos,
    n_neg,
    (sum_ranks_pos::NUMERIC - n_pos * (n_pos + 1.0) / 2) / NULLIF(n_pos::NUMERIC * n_neg, 0) AS auc
FROM segment_stats
WHERE n_pos >= 10 AND n_neg >= 10  -- skip segments too small for a stable estimate
ORDER BY auc DESC;

Per-segment AUC catches model bias: a model can rank well overall while failing for a specific segment.

PR AUC alternative

For heavily imbalanced data, PR (Precision-Recall) AUC is often the better choice:

-- Approximate PR AUC via trapezoidal rule
-- Steps:
-- 1) For each threshold, calculate precision and recall
-- 2) Sort by recall
-- 3) Compute area under curve

WITH thresholds AS (
    SELECT DISTINCT score AS threshold FROM predictions
),
pr_points AS (
    SELECT
        t.threshold,
        COUNT(*) FILTER (WHERE p.score >= t.threshold AND p.actual = 1)::NUMERIC
        / NULLIF(COUNT(*) FILTER (WHERE p.score >= t.threshold), 0) AS precision,
        COUNT(*) FILTER (WHERE p.score >= t.threshold AND p.actual = 1)::NUMERIC
        / NULLIF(COUNT(*) FILTER (WHERE p.actual = 1), 0) AS recall
    FROM thresholds t
    CROSS JOIN predictions p
    GROUP BY t.threshold
)
SELECT *
FROM pr_points
ORDER BY recall;
-- Then integrate numerically (trapezoidal rule) in Python, or stay in SQL (below)
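The integration step can also stay in SQL: the trapezoidal rule is just a LAG over the sorted PR points. A minimal sketch assuming PostgreSQL and the pr_points CTE above (append it to that query in place of the final SELECT):

SELECT
    SUM((recall - prev_recall) * (precision + prev_precision) / 2) AS pr_auc
FROM (
    SELECT
        recall,
        precision,
        LAG(recall)    OVER (ORDER BY recall) AS prev_recall,
        LAG(precision) OVER (ORDER BY recall) AS prev_precision
    FROM pr_points
    WHERE precision IS NOT NULL   -- NULLIF above yields NULL at empty thresholds
) t
WHERE prev_recall IS NOT NULL;    -- the first point has no left neighbor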

AUC ROC trend

-- Window functions cannot be nested inside aggregates, so compute
-- per-week ranks in a CTE first, then aggregate (Mann-Whitney, as above)
WITH weekly_ranked AS (
    SELECT
        DATE_TRUNC('week', prediction_date) AS week,
        actual,
        RANK() OVER (PARTITION BY DATE_TRUNC('week', prediction_date) ORDER BY score)
            + (COUNT(*) OVER (PARTITION BY DATE_TRUNC('week', prediction_date), score) - 1) / 2.0 AS rank_score
    FROM predictions
    WHERE prediction_date >= CURRENT_DATE - INTERVAL '12 weeks'
)
SELECT
    week,
    (SUM(CASE WHEN actual = 1 THEN rank_score ELSE 0 END)::NUMERIC
     - COUNT(*) FILTER (WHERE actual = 1) * (COUNT(*) FILTER (WHERE actual = 1) + 1.0) / 2)
    / NULLIF(COUNT(*) FILTER (WHERE actual = 1)::NUMERIC * COUNT(*) FILTER (WHERE actual = 0), 0) AS auc
FROM weekly_ranked
GROUP BY week
ORDER BY week;

If AUC falls week over week, that signals model drift: time to retrain.
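The "falling" check is easy to automate: wrap the trend query above in a CTE (weekly_auc is a hypothetical name) and compare each week to the previous one. A sketch; the 0.03 drop threshold is an arbitrary assumption to tune:

SELECT
    week,
    auc,
    auc - LAG(auc) OVER (ORDER BY week) AS wow_change,
    CASE
        WHEN auc < LAG(auc) OVER (ORDER BY week) - 0.03 THEN 'possible drift'
    END AS flag
FROM weekly_auc
ORDER BY week;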

Common mistakes

Mistake 1. Treating AUC 0.5 as merely "bad". It is worse than that: 0.5 means no better than random guessing, i.e., no signal at all.

Mistake 2. Getting AUC > 1. That is a bug in the calculation; the maximum is 1.0. Check the rank formula and the denominators.

Mistake 3. Equating AUC with Accuracy. They are different metrics: AUC measures ranking quality, Accuracy measures classification correctness at a fixed threshold.

Mistake 4. Ignoring class imbalance. ROC AUC can be deceiving on extremely imbalanced data; PR AUC is the alternative.

Mistake 5. Mishandling ties. When several rows share the same score, use average ranks, as in the queries above; see the sketch after this list.
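A two-row example (hypothetical data) shows why tie handling matters: one positive and one negative with identical scores have pairwise AUC 0.5, which only the average-rank variant reproduces:

WITH predictions(actual, score) AS (
    VALUES (1, 0.5), (0, 0.5)         -- one positive, one negative, tied score
),
ranked AS (
    SELECT
        actual,
        RANK() OVER (ORDER BY score) AS min_rank,    -- both rows get rank 1
        RANK() OVER (ORDER BY score)
            + (COUNT(*) OVER (PARTITION BY score) - 1) / 2.0 AS avg_rank  -- both get 1.5
    FROM predictions
)
SELECT
    -- here n_pos = n_neg = 1, so AUC = sum_ranks_pos - n_pos*(n_pos+1)/2
    SUM(CASE WHEN actual = 1 THEN min_rank END) - 1.0 AS auc_min_rank,  -- 0.0, biased
    SUM(CASE WHEN actual = 1 THEN avg_rank END) - 1.0 AS auc_avg_rank   -- 0.5, correct
FROM ranked;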


FAQ

AUC ROC range?

From 0.5 (random) to 1.0 (perfect). Below 0.5 means the scores are anti-correlated with the labels; flipping the predictions gives 1 - AUC.

What AUC is acceptable?

Depends on the domain. Typical targets: credit scoring 0.7-0.8, spam filtering 0.95+, cancer detection 0.9+, fraud detection 0.7-0.85.

AUC vs Accuracy?

AUC measures threshold-independent ranking quality; Accuracy measures classification correctness at a fixed threshold. AUC is more robust for comparing models.
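For contrast, accuracy in SQL requires picking a threshold first. A sketch with an arbitrary cutoff of 0.5 (an assumption, not part of the metric):

SELECT
    AVG(CASE WHEN (score >= 0.5) = (actual = 1) THEN 1.0 ELSE 0.0 END) AS accuracy
FROM predictions;

Two models can have identical accuracy at one threshold yet very different AUC.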

PR AUC vs ROC AUC?

PR AUC is better for imbalanced data (e.g., 1% positives); ROC AUC can look overly optimistic there.

Ties in score?

Use average ranks, as in the queries above. Most frameworks handle this automatically.