Как посчитать Fraud Rate в SQL

Закрепи формулу fraud rate в Карьернике
Запомнить надолго — 5 коротких сессий с задачами на эту тему. Бесплатно
Тренировать fraud rate в Telegram

Зачем Fraud Rate

Fraud Rate = % transactions confirmed as fraudulent. Below 0.1% — excellent. > 0.5% — major losses. Fraud драйверы: stolen cards, account takeover, friendly fraud.

Формула

Fraud Rate (count) = fraud_transactions / total_transactions × 100%
Fraud Rate (value) = fraud_volume / total_volume × 100%

Fraud confirmed после chargeback / investigation. Real fraud — subset of disputes.

Базовый расчёт

SELECT
    DATE_TRUNC('month', transaction_date) AS month,
    COUNT(*) AS transactions,
    SUM(amount) AS volume,
    COUNT(*) FILTER (WHERE fraud_confirmed) AS fraud_count,
    SUM(amount) FILTER (WHERE fraud_confirmed) AS fraud_volume,
    COUNT(*) FILTER (WHERE fraud_confirmed)::NUMERIC * 100 / COUNT(*) AS fraud_rate_pct,
    SUM(amount) FILTER (WHERE fraud_confirmed) * 100.0 / SUM(amount) AS fraud_value_pct
FROM transactions
WHERE transaction_date >= CURRENT_DATE - INTERVAL '12 months'
GROUP BY 1
ORDER BY 1;

False positive / negative

Fraud detection model evaluation:

SELECT
    -- True Positive: flagged AND fraud
    COUNT(*) FILTER (WHERE flagged AND fraud_confirmed) AS true_positive,
    -- False Positive: flagged but NOT fraud
    COUNT(*) FILTER (WHERE flagged AND NOT fraud_confirmed) AS false_positive,
    -- True Negative: not flagged AND NOT fraud
    COUNT(*) FILTER (WHERE NOT flagged AND NOT fraud_confirmed) AS true_negative,
    -- False Negative: not flagged but fraud
    COUNT(*) FILTER (WHERE NOT flagged AND fraud_confirmed) AS false_negative,
    -- Precision = TP / (TP+FP)
    COUNT(*) FILTER (WHERE flagged AND fraud_confirmed)::NUMERIC
    / NULLIF(COUNT(*) FILTER (WHERE flagged), 0) AS precision,
    -- Recall = TP / (TP+FN)
    COUNT(*) FILTER (WHERE flagged AND fraud_confirmed)::NUMERIC
    / NULLIF(COUNT(*) FILTER (WHERE fraud_confirmed), 0) AS recall
FROM transactions
WHERE transaction_date >= CURRENT_DATE - INTERVAL '30 days';

Trade-off с approval rate

Strict fraud rules → reject more legit transactions → lose revenue.

SELECT
    DATE_TRUNC('week', transaction_date) AS week,
    COUNT(*) AS attempts,
    COUNT(*) FILTER (WHERE status = 'approved')::NUMERIC * 100 / COUNT(*) AS approval_pct,
    COUNT(*) FILTER (WHERE status = 'declined' AND decline_reason = 'fraud')::NUMERIC * 100 / COUNT(*) AS fraud_decline_pct,
    COUNT(*) FILTER (WHERE fraud_confirmed)::NUMERIC * 100 / COUNT(*) AS fraud_rate_pct
FROM payment_attempts
WHERE transaction_date >= CURRENT_DATE - INTERVAL '30 days'
GROUP BY 1
ORDER BY 1;

If fraud decline pct ↑ but actual fraud rate same → rules over-rejecting.

Закрепи формулу fraud rate в Карьернике
Запомнить надолго — 5 коротких сессий с задачами на эту тему. Бесплатно
Тренировать fraud rate в Telegram

По segment

SELECT
    customer_country,
    customer_age_days,
    COUNT(*) AS transactions,
    COUNT(*) FILTER (WHERE fraud_confirmed)::NUMERIC * 100 / COUNT(*) AS fraud_rate
FROM transactions
WHERE transaction_date >= CURRENT_DATE - INTERVAL '90 days'
GROUP BY customer_country, customer_age_days
HAVING COUNT(*) >= 100
ORDER BY fraud_rate DESC;

New accounts often higher fraud. New countries — first txn особенно risky.

Net loss after recovery

SELECT
    SUM(amount) FILTER (WHERE fraud_confirmed) AS gross_fraud_loss,
    SUM(amount) FILTER (WHERE fraud_confirmed AND recovered) AS recovered,
    SUM(amount) FILTER (WHERE fraud_confirmed) - SUM(amount) FILTER (WHERE fraud_confirmed AND recovered) AS net_fraud_loss,
    COUNT(*) FILTER (WHERE fraud_confirmed AND recovered)::NUMERIC * 100
    / NULLIF(COUNT(*) FILTER (WHERE fraud_confirmed), 0) AS recovery_pct
FROM transactions
WHERE transaction_date >= CURRENT_DATE - INTERVAL '90 days';

Частые ошибки

Ошибка 1. Confirmed fraud vs disputes. Dispute ≠ fraud. Many disputes — friendly fraud / refund proxy. Confirm carefully.

Ошибка 2. Fraud rate без context. 0.3% absolute fraud rate. But after 3DS / declined attempts → low. Track full funnel.

Ошибка 3. False positives ignored. Reject 5% legit transactions — bigger revenue impact than 0.1% fraud avoided.

Ошибка 4. Old fraud labels. Fraud confirmed weeks later. Reporting current month → fresh data missing.

Ошибка 5. Per-merchant aggregation. Marketplace — fraud per merchant. Some sellers magnets для fraud.

Связанные темы

FAQ

Какой Fraud Rate ok?

< 0.1% — excellent. 0.1-0.3% — average. > 0.5% — major issue.

Fraud vs Chargeback?

Chargeback — dispute filed. Fraud — actual unauthorized. Subset typically.

Recovery rate?

Card recovery — 5-10% typical. Chargeback won = recovered.

Precision vs Recall trade-off?

High precision (low FP) → fewer legit declined но more fraud through. High recall (low FN) → catch fraud but reject more legit.

Tune model how?

ROC curve / Precision-Recall curve. Pick threshold based on cost of FP vs FN.