Как посчитать Statistical Power в SQL

Закрепи формулу statistical power в Карьернике
Запомнить надолго — 5 коротких сессий с задачами на эту тему. Бесплатно
Тренировать statistical power в Telegram

Зачем Power

Power = 1 - β = вероятность detect real effect (true positive rate). Convention: 80%. Без enough power null result inconclusive.

Формула

Для proportions:

Power = Φ(z_α/2 - δ × sqrt(n / (2 × p × (1-p)))) + extra_corrections

Simplification:

Power ≈ Φ(δ × sqrt(n/(2pq)) - z_α/2)

Базовый расчёт

Допустим baseline 5%, expected lift 10% relative (effect = 0.005), n = 10,000 per variant:

WITH params AS (
    SELECT
        0.05 AS baseline,
        0.055 AS treatment,
        10000 AS n_per_variant,
        1.96 AS z_alpha_2
)
SELECT
    baseline,
    treatment,
    treatment - baseline AS delta,
    n_per_variant,
    (treatment - baseline) * SQRT(n_per_variant / (2.0 * baseline * (1-baseline))) - z_alpha_2 AS z_power
    -- Power = Φ(z_power)
FROM params;

z_power ≈ 0.5. Φ(0.5) ≈ 0.69 → 69% power. Underpowered (need 80%).

Power curve

Как power растёт с sample:

WITH ns AS (
    SELECT unnest(ARRAY[1000, 5000, 10000, 20000, 50000, 100000]) AS n
),
calc AS (
    SELECT
        n,
        0.05 AS baseline,
        0.055 AS treatment,
        1.96 AS z_alpha
    FROM ns
)
SELECT
    n,
    (treatment - baseline) * SQRT(n / (2.0 * baseline * (1-baseline))) - z_alpha AS z_power
FROM calc;

Sample doubles → z_power increases by sqrt(2).

Закрепи формулу statistical power в Карьернике
Запомнить надолго — 5 коротких сессий с задачами на эту тему. Бесплатно
Тренировать statistical power в Telegram

Post-hoc power

Power после теста (controversial — many statisticians не recommend):

WITH observed AS (
    SELECT
        COUNT(DISTINCT user_id) FILTER (WHERE variant = 'A') AS n_a,
        COUNT(DISTINCT user_id) FILTER (WHERE variant = 'A' AND converted) AS x_a,
        COUNT(DISTINCT user_id) FILTER (WHERE variant = 'B') AS n_b,
        COUNT(DISTINCT user_id) FILTER (WHERE variant = 'B' AND converted) AS x_b
    FROM ab_test
)
SELECT
    x_a::NUMERIC / n_a AS p_a,
    x_b::NUMERIC / n_b AS p_b,
    n_a, n_b,
    -- Post-hoc power calc
    ABS((x_b::NUMERIC/n_b) - (x_a::NUMERIC/n_a)) * SQRT(n_a / (2.0 * (x_a::NUMERIC/n_a) * (1 - (x_a::NUMERIC/n_a)))) - 1.96 AS z_power
FROM observed;

Частые ошибки

Ошибка 1. Underpowered tests. Power 50% — coin flip. Wasteful.

Ошибка 2. Post-hoc power не actionable. Если result negative, post-hoc power = function of p-value. Tautology.

Ошибка 3. Power для wrong metric. Powered для primary metric, не для guardrails / secondary.

Ошибка 4. Variance overlooked. Continuous: power depends on σ. Underestimate σ → underpowered.

Ошибка 5. Multiple variants. 4-arm test → divide sample by 4 → power halves.

Связанные темы

FAQ

Какой Power ok?

80% convention. 90-95% для high-stakes decisions.

Power vs significance?

Power = 1-β = true positive rate. Significance = 1-α = true negative rate.

Increase power как?

  1. Larger sample. 2) Bigger expected effect. 3) Lower σ (cleaner metric). 4) Stronger α (но trade-off).

Power calculator vs SQL?

SQL для быстрых проверок. Real planning — calculator (statsig, evan miller).

Power decay в multi-armed test?

Yes — each comparison gets fewer users. Sample carefully.