How to calculate F1 Score in SQL
Why F1
F1 is the harmonic mean of precision and recall, balancing both. It is most useful when classes are imbalanced: accuracy lies on skewed data, while F1 stays honest.
Formula
Precision = TP / (TP + FP)
Recall = TP / (TP + FN)
F1 = 2 × (Precision × Recall) / (Precision + Recall)

F1 ranges from 0 to 1; 1 means a perfect classifier.
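A quick sanity check of the formulas in Python, using made-up confusion-matrix counts (tp/fp/fn below are illustrative, not from any real model):

```python
# Illustrative confusion-matrix counts (not from a real model).
tp, fp, fn = 80, 20, 40

precision = tp / (tp + fp)   # 80 / 100 = 0.8
recall = tp / (tp + fn)      # 80 / 120 ≈ 0.667
f1 = 2 * precision * recall / (precision + recall)

# Algebraically equivalent one-step form, handy in SQL:
# F1 = 2*TP / (2*TP + FP + FN)
f1_direct = 2 * tp / (2 * tp + fp + fn)

print(round(f1, 4), round(f1_direct, 4))  # both print 0.7273
```

The one-step form avoids computing precision and recall separately and sidesteps intermediate division-by-zero; the SQL below uses it.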
Basic calculation
WITH confusion AS (
    SELECT
        COUNT(*) FILTER (WHERE predicted = 1 AND actual = 1) AS tp,
        COUNT(*) FILTER (WHERE predicted = 1 AND actual = 0) AS fp,
        COUNT(*) FILTER (WHERE predicted = 0 AND actual = 1) AS fn,
        COUNT(*) FILTER (WHERE predicted = 0 AND actual = 0) AS tn
    FROM predictions
    WHERE prediction_date >= CURRENT_DATE - INTERVAL '30 days'
)
SELECT
    tp, fp, fn, tn,
    tp::NUMERIC / NULLIF(tp + fp, 0) AS precision,
    tp::NUMERIC / NULLIF(tp + fn, 0) AS recall,
    2.0 * tp / NULLIF(2 * tp + fp + fn, 0) AS f1
FROM confusion;

Macro vs Micro (multi-class)
Macro: average the per-class F1 scores. Micro: aggregate TP/FP/FN across all classes first, then compute F1 once.
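The difference matters in numbers. A Python sketch with illustrative per-class counts (two classes, single-label, class "a" dominating):

```python
# Illustrative two-class, single-label counts: class "a" dominates,
# class "b" is a poorly-predicted minority.
counts = {
    "a": {"tp": 90, "fp": 9, "fn": 10},
    "b": {"tp": 1, "fp": 10, "fn": 9},
}

def f1(tp, fp, fn):
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0

# Macro: average of per-class F1 scores.
macro_f1 = sum(f1(**c) for c in counts.values()) / len(counts)

# Micro: sum TP/FP/FN across classes, then one F1.
total = {k: sum(c[k] for c in counts.values()) for k in ("tp", "fp", "fn")}
micro_f1 = f1(**total)

print(round(macro_f1, 3), round(micro_f1, 3))  # 0.5 0.827
```

Macro collapses because the minority class scores badly; micro, dominated by the majority class, stays high. The SQL below computes both in one pass.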
WITH per_class AS (
    SELECT
        class_name,
        COUNT(*) FILTER (WHERE predicted = class_name AND actual = class_name) AS tp,
        COUNT(*) FILTER (WHERE predicted = class_name AND actual <> class_name) AS fp,
        COUNT(*) FILTER (WHERE predicted <> class_name AND actual = class_name) AS fn
    FROM predictions p
    CROSS JOIN (SELECT DISTINCT actual AS class_name FROM predictions) c
    GROUP BY class_name
),
per_class_f1 AS (
    SELECT
        class_name,
        2.0 * tp / NULLIF(2 * tp + fp + fn, 0) AS f1
    FROM per_class
)
SELECT
    AVG(f1) AS macro_f1,
    (SELECT 2.0 * SUM(tp) / NULLIF(SUM(2 * tp + fp + fn), 0) FROM per_class) AS micro_f1
FROM per_class_f1;

Choosing a threshold
For probabilistic classifiers:
WITH thresholds AS (
    SELECT unnest(ARRAY[0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9]) AS threshold
),
f1_per_threshold AS (
    SELECT
        t.threshold,
        COUNT(*) FILTER (WHERE p.score >= t.threshold AND p.actual = 1) AS tp,
        COUNT(*) FILTER (WHERE p.score >= t.threshold AND p.actual = 0) AS fp,
        COUNT(*) FILTER (WHERE p.score < t.threshold AND p.actual = 1) AS fn
    FROM thresholds t
    CROSS JOIN predictions p
    GROUP BY t.threshold
)
SELECT
    threshold,
    tp, fp, fn,
    tp::NUMERIC / NULLIF(tp + fp, 0) AS precision,
    tp::NUMERIC / NULLIF(tp + fn, 0) AS recall,
    2.0 * tp / NULLIF(2 * tp + fp + fn, 0) AS f1
FROM f1_per_threshold
ORDER BY threshold;

Pick the threshold that maximizes F1.
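The same sweep sketched in Python, on toy scored predictions (data is hypothetical; the threshold grid matches the SQL):

```python
# Toy scored predictions: parallel lists of scores and actual labels.
scores  = [0.95, 0.9, 0.8, 0.7, 0.6, 0.35, 0.32, 0.2, 0.15, 0.1]
actuals = [1,    1,   0,   1,   1,   1,    1,    0,   0,    0]

def f1_at(threshold):
    # Booleans sum as 0/1, mirroring COUNT(*) FILTER in SQL.
    tp = sum(s >= threshold and a == 1 for s, a in zip(scores, actuals))
    fp = sum(s >= threshold and a == 0 for s, a in zip(scores, actuals))
    fn = sum(s < threshold and a == 1 for s, a in zip(scores, actuals))
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0

thresholds = [t / 10 for t in range(1, 10)]   # 0.1 .. 0.9
best = max(thresholds, key=f1_at)
print(best, round(f1_at(best), 3))  # 0.3 0.923
```

On this toy data the optimum is 0.3, not the conventional 0.5, which is exactly why the sweep is worth running.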
F-beta variations
F1 weights precision and recall equally. F-beta lets you tune the balance:
F-beta = (1 + β²) × precision × recall / (β² × precision + recall)

- F0.5: precision matters more (β=0.5)
- F1: balanced
- F2: recall matters more (β=2)
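The general formula as a small Python helper (the precision/recall values here are made up to show the effect of β):

```python
def f_beta(precision, recall, beta):
    """F-beta: beta > 1 favors recall, beta < 1 favors precision."""
    denom = beta**2 * precision + recall
    return (1 + beta**2) * precision * recall / denom if denom else 0.0

# A recall-heavy classifier: precision 0.5, recall 1.0 (illustrative).
p, r = 0.5, 1.0
print(round(f_beta(p, r, 0.5), 3))  # 0.556 - penalizes the low precision
print(round(f_beta(p, r, 1.0), 3))  # 0.667 - plain F1
print(round(f_beta(p, r, 2.0), 3))  # 0.833 - rewards the high recall
```

Note that β² multiplies precision in the denominator, which is why a larger β shifts weight toward recall.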
-- F2 example (recall priority)
SELECT
    5.0 * precision * recall / NULLIF(4 * precision + recall, 0) AS f2
FROM ...

Частые ошибки → Common mistakes
Mistake 1. Assuming F1 equals accuracy. They differ, especially with imbalanced classes.
Mistake 2. Forgetting that F1 ignores TN. True negatives are not in the formula, so negative-class performance stays hidden.
Mistake 3. Defaulting to a 0.5 threshold. Optimize the threshold that maximizes F1; 0.5 is a convention, not an optimum.
Mistake 4. Multi-class confusion. Per-class F1, macro, and micro tell different stories.
Mistake 5. Applying F1 to regression. F1 is a classification metric; use RMSE / MAE / R² for regression.
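Mistake 1 in numbers. A hypothetical imbalanced dataset (counts are made up) where accuracy looks excellent while F1 exposes the problem:

```python
# Hypothetical counts: 1000 samples, only 10 positives.
# The model finds half the positives and raises some false alarms.
tp, fp, fn, tn = 5, 10, 5, 980

accuracy = (tp + tn) / (tp + fp + fn + tn)   # 0.985 - looks great
f1 = 2 * tp / (2 * tp + fp + fn)             # 0.4   - the honest picture
print(accuracy, f1)
```

Accuracy is inflated by the 980 true negatives; F1, built only from TP/FP/FN, is not.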
Related topics
- How to calculate precision and recall in SQL
- How to calculate AUC ROC in SQL
- What are precision and recall
- Classification vs regression
FAQ
What is the F1 range?
0 to 1. 0 is terrible, 1 is perfect.
What F1 is good enough?
It depends on the task. Spam: 0.95+. Fraud detection: 0.6-0.8. Medical diagnosis: 0.8+.
F1 vs Accuracy?
With imbalanced classes, F1 reflects reality; accuracy is misleading.
Macro vs Micro F1?
Macro gives each class equal weight; micro gives each data point equal weight. Macro surfaces hidden per-class bias.
F1 ignores the negative class — what to do?
Use balanced accuracy or AUC if the negative class is important.