Как посчитать Conversion Uplift в SQL
Содержание:
Зачем Conversion Uplift
Uplift = разница в конверсии между treatment и control. Главная output A/B-теста. Без uplift CI / p-value not actionable.
Формула
Absolute Uplift = conv_treatment - conv_control
Relative Uplift = (conv_treatment - conv_control) / conv_control × 100%Пример: control 5%, treatment 5.5%. Absolute = 0.5 pp. Relative = 10%.
Базовый расчёт
WITH stats AS (
SELECT
variant,
COUNT(DISTINCT user_id) AS users,
COUNT(DISTINCT user_id) FILTER (WHERE converted) AS converters,
COUNT(DISTINCT user_id) FILTER (WHERE converted)::NUMERIC * 100 / COUNT(DISTINCT user_id) AS conv_pct
FROM ab_test
WHERE test_id = 123
GROUP BY variant
)
SELECT
MAX(CASE WHEN variant = 'A' THEN users END) AS users_a,
MAX(CASE WHEN variant = 'A' THEN conv_pct END) AS conv_a_pct,
MAX(CASE WHEN variant = 'B' THEN users END) AS users_b,
MAX(CASE WHEN variant = 'B' THEN conv_pct END) AS conv_b_pct,
MAX(CASE WHEN variant = 'B' THEN conv_pct END)
- MAX(CASE WHEN variant = 'A' THEN conv_pct END) AS absolute_uplift_pp,
(MAX(CASE WHEN variant = 'B' THEN conv_pct END)
- MAX(CASE WHEN variant = 'A' THEN conv_pct END))
/ NULLIF(MAX(CASE WHEN variant = 'A' THEN conv_pct END), 0) * 100 AS relative_uplift_pct
FROM stats;Absolute vs Relative
| Baseline | Treatment | Absolute | Relative |
|---|---|---|---|
| 5% | 6% | 1 pp | 20% |
| 50% | 51% | 1 pp | 2% |
| 1% | 1.5% | 0.5 pp | 50% |
Same absolute lift, very different relative. Use both.
CI для uplift
Standard error для пропорций:
WITH stats AS (
SELECT
COUNT(*) FILTER (WHERE variant = 'A') AS n_a,
COUNT(*) FILTER (WHERE variant = 'A' AND converted) AS x_a,
COUNT(*) FILTER (WHERE variant = 'B') AS n_b,
COUNT(*) FILTER (WHERE variant = 'B' AND converted) AS x_b
FROM ab_test
)
SELECT
x_a::NUMERIC / n_a AS p_a,
x_b::NUMERIC / n_b AS p_b,
x_b::NUMERIC / n_b - x_a::NUMERIC / n_a AS uplift_abs,
-- SE для diff of proportions
SQRT(
(x_a::NUMERIC/n_a) * (1 - x_a::NUMERIC/n_a) / n_a
+ (x_b::NUMERIC/n_b) * (1 - x_b::NUMERIC/n_b) / n_b
) AS se,
-- 95% CI
(x_b::NUMERIC/n_b - x_a::NUMERIC/n_a) - 1.96 * SQRT(
(x_a::NUMERIC/n_a) * (1 - x_a::NUMERIC/n_a) / n_a
+ (x_b::NUMERIC/n_b) * (1 - x_b::NUMERIC/n_b) / n_b
) AS ci_low,
(x_b::NUMERIC/n_b - x_a::NUMERIC/n_a) + 1.96 * SQRT(
(x_a::NUMERIC/n_a) * (1 - x_a::NUMERIC/n_a) / n_a
+ (x_b::NUMERIC/n_b) * (1 - x_b::NUMERIC/n_b) / n_b
) AS ci_high
FROM stats;CI крест ноль → not significant.
Uplift по сегментам
SELECT
u.country,
COUNT(*) FILTER (WHERE a.variant = 'A' AND a.converted)::NUMERIC * 100
/ NULLIF(COUNT(*) FILTER (WHERE a.variant = 'A'), 0) AS conv_a,
COUNT(*) FILTER (WHERE a.variant = 'B' AND a.converted)::NUMERIC * 100
/ NULLIF(COUNT(*) FILTER (WHERE a.variant = 'B'), 0) AS conv_b,
-- Uplift
COUNT(*) FILTER (WHERE a.variant = 'B' AND a.converted)::NUMERIC * 100
/ NULLIF(COUNT(*) FILTER (WHERE a.variant = 'B'), 0)
- COUNT(*) FILTER (WHERE a.variant = 'A' AND a.converted)::NUMERIC * 100
/ NULLIF(COUNT(*) FILTER (WHERE a.variant = 'A'), 0) AS abs_uplift_pp
FROM ab_test a
JOIN users u ON u.user_id = a.user_id
WHERE a.test_id = 123
GROUP BY u.country
ORDER BY abs_uplift_pp DESC;Частые ошибки
Ошибка 1. Reporting only relative. 20% uplift звучит больше. Но на baseline 1% — practically meaningless.
Ошибка 2. Pooling всех segments. Overall +5%, но negative в key segment. Drill down.
Ошибка 3. CI ignored. Point estimate без CI = false certainty.
Ошибка 4. Novelty effect. Initial spike, then decay. Run long enough.
Ошибка 5. Different denominators. Conv_a denominator users, conv_b denominator events. Apples-oranges.
Связанные темы
- Как посчитать confidence interval в SQL
- Как посчитать MDE в SQL
- Как посчитать p-value в SQL
- Как посчитать sample size в SQL
FAQ
Reporting абсолют or relative?
Both. Relative для PR. Absolute для PM decisions.
CI пересекает 0?
Not significant at conventional α. But effect size может быть meaningful.
Negative uplift?
Treatment hurt. Investigate / kill feature.
Uplift падает в long-run?
Novelty effect. Or treatment effect decays. Multi-week tracking важен.
Uplift per segment противоречивый?
Simpson's paradox. Always check segments.