Как посчитать MDE в SQL
Содержание:
Зачем MDE
MDE = минимальный effect, который A/B test способен detect (статистически значимо). До запуска теста — power analysis. После теста — confidence что null result не false negative.
Формула
Для conversion rate (z-test):
MDE = (z_α/2 + z_β) × sqrt(2 × p × (1-p) / n)где:
- z_α/2 ≈ 1.96 (α = 0.05)
- z_β ≈ 0.84 (power = 0.80)
- p = baseline conversion
- n = sample per variant
Базовый расчёт
Допустим baseline 5%, sample 10,000 per variant:
WITH params AS (
SELECT
0.05 AS baseline_conv,
10000 AS sample_per_variant,
1.96 AS z_alpha_2,
0.84 AS z_beta
)
SELECT
baseline_conv,
sample_per_variant,
(z_alpha_2 + z_beta) * SQRT(2.0 * baseline_conv * (1 - baseline_conv) / sample_per_variant) AS mde_absolute,
(z_alpha_2 + z_beta) * SQRT(2.0 * baseline_conv * (1 - baseline_conv) / sample_per_variant) / baseline_conv * 100 AS mde_relative_pct
FROM params;MDE absolute ≈ 0.65 percentage points. MDE relative ≈ 13% lift on baseline.
Для conversion rate
Текущая конверсия из БД:
WITH baseline AS (
SELECT
COUNT(DISTINCT CASE WHEN event = 'purchase' THEN user_id END)::NUMERIC
/ COUNT(DISTINCT user_id) AS conv_rate,
COUNT(DISTINCT user_id) AS total_users
FROM events
WHERE DATE >= CURRENT_DATE - INTERVAL '30 days'
AND DATE < CURRENT_DATE
),
n_per_variant AS (
SELECT total_users / 2 AS n -- 50/50 split
FROM baseline
)
SELECT
b.conv_rate,
n.n AS sample_per_variant,
(1.96 + 0.84) * SQRT(2.0 * b.conv_rate * (1 - b.conv_rate) / n.n) AS mde_abs,
(1.96 + 0.84) * SQRT(2.0 * b.conv_rate * (1 - b.conv_rate) / n.n) / b.conv_rate * 100 AS mde_rel_pct
FROM baseline b, n_per_variant n;Для continuous metric
Для ARPU / AOV (continuous):
MDE = (z_α/2 + z_β) × σ × sqrt(2 / n)WITH stats AS (
SELECT
AVG(amount) AS mean_arpu,
STDDEV(amount) AS std_arpu,
COUNT(*) AS n_users
FROM transactions
WHERE DATE >= CURRENT_DATE - INTERVAL '30 days'
)
SELECT
mean_arpu,
std_arpu,
n_users / 2 AS sample_per_variant,
(1.96 + 0.84) * std_arpu * SQRT(2.0 / (n_users / 2)) AS mde_abs,
(1.96 + 0.84) * std_arpu * SQRT(2.0 / (n_users / 2)) / mean_arpu * 100 AS mde_rel_pct
FROM stats;Частые ошибки
Ошибка 1. Power == significance level. α = 0.05, β = 0.20 (power = 0.80). Different things.
Ошибка 2. Absolute vs relative MDE. 2 pp lift on 5% base = 40% relative lift. Communicate which type.
Ошибка 3. Sample size не для variant. Sample per variant ≠ total. Confused в formula.
Ошибка 4. Outliers in continuous. SD inflated by outliers. Winsorize / trim → smaller MDE.
Ошибка 5. One-tailed vs two-tailed. z_α/2 для two-tailed. Most tests two-tailed. Confirm.
Связанные темы
- Как посчитать sample size в SQL
- Как посчитать confidence interval в SQL
- Как посчитать p-value в SQL
- Как посчитать statistical power в SQL
FAQ
MDE vs effect size?
MDE — pre-test (theoretical detectable). Effect size — observed.
Какой MDE acceptable?
Зависит от business. 1% absolute уровень мощно — высокая sample. 5% — easier, weaker.
Power 0.80 vs 0.95?
0.80 — стандарт. 0.95 — больше sample (≈30% больше).
MDE для AOV?
Continuous formula. SD critical. Higher SD = higher MDE.
MDE rel pct — что это?
MDE_abs / baseline. Lift на проценты. Easier interpret.