Как посчитать Conversion Uplift в SQL

Закрепи формулу conversion uplift в Карьернике
Запомнить надолго — 5 коротких сессий с задачами на эту тему. Бесплатно
Тренировать conversion uplift в Telegram

Зачем Conversion Uplift

Uplift = разница в конверсии между treatment и control. Главная output A/B-теста. Без uplift CI / p-value not actionable.

Формула

Absolute Uplift = conv_treatment - conv_control
Relative Uplift = (conv_treatment - conv_control) / conv_control × 100%

Пример: control 5%, treatment 5.5%. Absolute = 0.5 pp. Relative = 10%.

Базовый расчёт

WITH stats AS (
    SELECT
        variant,
        COUNT(DISTINCT user_id) AS users,
        COUNT(DISTINCT user_id) FILTER (WHERE converted) AS converters,
        COUNT(DISTINCT user_id) FILTER (WHERE converted)::NUMERIC * 100 / COUNT(DISTINCT user_id) AS conv_pct
    FROM ab_test
    WHERE test_id = 123
    GROUP BY variant
)
SELECT
    MAX(CASE WHEN variant = 'A' THEN users END) AS users_a,
    MAX(CASE WHEN variant = 'A' THEN conv_pct END) AS conv_a_pct,
    MAX(CASE WHEN variant = 'B' THEN users END) AS users_b,
    MAX(CASE WHEN variant = 'B' THEN conv_pct END) AS conv_b_pct,
    MAX(CASE WHEN variant = 'B' THEN conv_pct END)
    - MAX(CASE WHEN variant = 'A' THEN conv_pct END) AS absolute_uplift_pp,
    (MAX(CASE WHEN variant = 'B' THEN conv_pct END)
     - MAX(CASE WHEN variant = 'A' THEN conv_pct END))
    / NULLIF(MAX(CASE WHEN variant = 'A' THEN conv_pct END), 0) * 100 AS relative_uplift_pct
FROM stats;

Absolute vs Relative

Baseline Treatment Absolute Relative
5% 6% 1 pp 20%
50% 51% 1 pp 2%
1% 1.5% 0.5 pp 50%

Same absolute lift, very different relative. Use both.

Закрепи формулу conversion uplift в Карьернике
Запомнить надолго — 5 коротких сессий с задачами на эту тему. Бесплатно
Тренировать conversion uplift в Telegram

CI для uplift

Standard error для пропорций:

WITH stats AS (
    SELECT
        COUNT(*) FILTER (WHERE variant = 'A') AS n_a,
        COUNT(*) FILTER (WHERE variant = 'A' AND converted) AS x_a,
        COUNT(*) FILTER (WHERE variant = 'B') AS n_b,
        COUNT(*) FILTER (WHERE variant = 'B' AND converted) AS x_b
    FROM ab_test
)
SELECT
    x_a::NUMERIC / n_a AS p_a,
    x_b::NUMERIC / n_b AS p_b,
    x_b::NUMERIC / n_b - x_a::NUMERIC / n_a AS uplift_abs,
    -- SE для diff of proportions
    SQRT(
        (x_a::NUMERIC/n_a) * (1 - x_a::NUMERIC/n_a) / n_a
        + (x_b::NUMERIC/n_b) * (1 - x_b::NUMERIC/n_b) / n_b
    ) AS se,
    -- 95% CI
    (x_b::NUMERIC/n_b - x_a::NUMERIC/n_a) - 1.96 * SQRT(
        (x_a::NUMERIC/n_a) * (1 - x_a::NUMERIC/n_a) / n_a
        + (x_b::NUMERIC/n_b) * (1 - x_b::NUMERIC/n_b) / n_b
    ) AS ci_low,
    (x_b::NUMERIC/n_b - x_a::NUMERIC/n_a) + 1.96 * SQRT(
        (x_a::NUMERIC/n_a) * (1 - x_a::NUMERIC/n_a) / n_a
        + (x_b::NUMERIC/n_b) * (1 - x_b::NUMERIC/n_b) / n_b
    ) AS ci_high
FROM stats;

CI крест ноль → not significant.

Uplift по сегментам

SELECT
    u.country,
    COUNT(*) FILTER (WHERE a.variant = 'A' AND a.converted)::NUMERIC * 100
    / NULLIF(COUNT(*) FILTER (WHERE a.variant = 'A'), 0) AS conv_a,
    COUNT(*) FILTER (WHERE a.variant = 'B' AND a.converted)::NUMERIC * 100
    / NULLIF(COUNT(*) FILTER (WHERE a.variant = 'B'), 0) AS conv_b,
    -- Uplift
    COUNT(*) FILTER (WHERE a.variant = 'B' AND a.converted)::NUMERIC * 100
    / NULLIF(COUNT(*) FILTER (WHERE a.variant = 'B'), 0)
    - COUNT(*) FILTER (WHERE a.variant = 'A' AND a.converted)::NUMERIC * 100
    / NULLIF(COUNT(*) FILTER (WHERE a.variant = 'A'), 0) AS abs_uplift_pp
FROM ab_test a
JOIN users u ON u.user_id = a.user_id
WHERE a.test_id = 123
GROUP BY u.country
ORDER BY abs_uplift_pp DESC;

Частые ошибки

Ошибка 1. Reporting only relative. 20% uplift звучит больше. Но на baseline 1% — practically meaningless.

Ошибка 2. Pooling всех segments. Overall +5%, но negative в key segment. Drill down.

Ошибка 3. CI ignored. Point estimate без CI = false certainty.

Ошибка 4. Novelty effect. Initial spike, then decay. Run long enough.

Ошибка 5. Different denominators. Conv_a denominator users, conv_b denominator events. Apples-oranges.

Связанные темы

FAQ

Reporting абсолют or relative?

Both. Relative для PR. Absolute для PM decisions.

CI пересекает 0?

Not significant at conventional α. But effect size может быть meaningful.

Negative uplift?

Treatment hurt. Investigate / kill feature.

Uplift падает в long-run?

Novelty effect. Or treatment effect decays. Multi-week tracking важен.

Uplift per segment противоречивый?

Simpson's paradox. Always check segments.