Как посчитать MDE в SQL

Закрепи формулу mde в Карьернике
Запомнить надолго — 5 коротких сессий с задачами на эту тему. Бесплатно
Тренировать mde в Telegram

Зачем MDE

MDE = минимальный effect, который A/B test способен detect (статистически значимо). До запуска теста — power analysis. После теста — confidence что null result не false negative.

Формула

Для conversion rate (z-test):

MDE = (z_α/2 + z_β) × sqrt(2 × p × (1-p) / n)

где:

  • z_α/2 ≈ 1.96 (α = 0.05)
  • z_β ≈ 0.84 (power = 0.80)
  • p = baseline conversion
  • n = sample per variant

Базовый расчёт

Допустим baseline 5%, sample 10,000 per variant:

WITH params AS (
    SELECT
        0.05 AS baseline_conv,
        10000 AS sample_per_variant,
        1.96 AS z_alpha_2,
        0.84 AS z_beta
)
SELECT
    baseline_conv,
    sample_per_variant,
    (z_alpha_2 + z_beta) * SQRT(2.0 * baseline_conv * (1 - baseline_conv) / sample_per_variant) AS mde_absolute,
    (z_alpha_2 + z_beta) * SQRT(2.0 * baseline_conv * (1 - baseline_conv) / sample_per_variant) / baseline_conv * 100 AS mde_relative_pct
FROM params;

MDE absolute ≈ 0.65 percentage points. MDE relative ≈ 13% lift on baseline.

Для conversion rate

Текущая конверсия из БД:

WITH baseline AS (
    SELECT
        COUNT(DISTINCT CASE WHEN event = 'purchase' THEN user_id END)::NUMERIC
        / COUNT(DISTINCT user_id) AS conv_rate,
        COUNT(DISTINCT user_id) AS total_users
    FROM events
    WHERE DATE >= CURRENT_DATE - INTERVAL '30 days'
      AND DATE < CURRENT_DATE
),
n_per_variant AS (
    SELECT total_users / 2 AS n  -- 50/50 split
    FROM baseline
)
SELECT
    b.conv_rate,
    n.n AS sample_per_variant,
    (1.96 + 0.84) * SQRT(2.0 * b.conv_rate * (1 - b.conv_rate) / n.n) AS mde_abs,
    (1.96 + 0.84) * SQRT(2.0 * b.conv_rate * (1 - b.conv_rate) / n.n) / b.conv_rate * 100 AS mde_rel_pct
FROM baseline b, n_per_variant n;
Закрепи формулу mde в Карьернике
Запомнить надолго — 5 коротких сессий с задачами на эту тему. Бесплатно
Тренировать mde в Telegram

Для continuous metric

Для ARPU / AOV (continuous):

MDE = (z_α/2 + z_β) × σ × sqrt(2 / n)
WITH stats AS (
    SELECT
        AVG(amount) AS mean_arpu,
        STDDEV(amount) AS std_arpu,
        COUNT(*) AS n_users
    FROM transactions
    WHERE DATE >= CURRENT_DATE - INTERVAL '30 days'
)
SELECT
    mean_arpu,
    std_arpu,
    n_users / 2 AS sample_per_variant,
    (1.96 + 0.84) * std_arpu * SQRT(2.0 / (n_users / 2)) AS mde_abs,
    (1.96 + 0.84) * std_arpu * SQRT(2.0 / (n_users / 2)) / mean_arpu * 100 AS mde_rel_pct
FROM stats;

Частые ошибки

Ошибка 1. Power == significance level. α = 0.05, β = 0.20 (power = 0.80). Different things.

Ошибка 2. Absolute vs relative MDE. 2 pp lift on 5% base = 40% relative lift. Communicate which type.

Ошибка 3. Sample size не для variant. Sample per variant ≠ total. Confused в formula.

Ошибка 4. Outliers in continuous. SD inflated by outliers. Winsorize / trim → smaller MDE.

Ошибка 5. One-tailed vs two-tailed. z_α/2 для two-tailed. Most tests two-tailed. Confirm.

Связанные темы

FAQ

MDE vs effect size?

MDE — pre-test (theoretical detectable). Effect size — observed.

Какой MDE acceptable?

Зависит от business. 1% absolute уровень мощно — высокая sample. 5% — easier, weaker.

Power 0.80 vs 0.95?

0.80 — стандарт. 0.95 — больше sample (≈30% больше).

MDE для AOV?

Continuous formula. SD critical. Higher SD = higher MDE.

MDE rel pct — что это?

MDE_abs / baseline. Lift на проценты. Easier interpret.