SQL для воронок конверсии

Карьерник — квиз-тренажёр в Telegram с 1500+ вопросами для собесов аналитика. SQL, Python, A/B, метрики. Бесплатно.

Зачем это знать

«Постройте воронку signup → activation → purchase» — classic task на собесе. Middle-analyst должен писать без помощи.

Funnels — основа product analytics. Показывают bottlenecks, опасные drops.

Что такое воронка

Sequence steps:

Landing → Signup → Activation → Purchase
100%     → 30%    → 12%        → 5%

Conversion rate каждого step → identify weak points.

Подходы к SQL funnel

1. Simple count (no sequence)

Count users, кто did X, Y, Z:

SELECT
    COUNT(DISTINCT CASE WHEN event = 'landing' THEN user_id END) AS landed,
    COUNT(DISTINCT CASE WHEN event = 'signup' THEN user_id END) AS signed_up,
    COUNT(DISTINCT CASE WHEN event = 'purchase' THEN user_id END) AS purchased
FROM events;

Плохо: не учитывает sequence (signup без landing?) или time.

2. Sequential с event_time

WITH funnel AS (
    SELECT
        user_id,
        MAX(CASE WHEN event = 'landing' THEN created_at END) AS t_landing,
        MAX(CASE WHEN event = 'signup' THEN created_at END) AS t_signup,
        MAX(CASE WHEN event = 'purchase' THEN created_at END) AS t_purchase
    FROM events
    GROUP BY user_id
)
SELECT
    COUNT(*) FILTER (WHERE t_landing IS NOT NULL) AS landed,
    COUNT(*) FILTER (WHERE t_landing IS NOT NULL
                      AND t_signup > t_landing) AS signed_up,
    COUNT(*) FILTER (WHERE t_landing IS NOT NULL
                      AND t_signup > t_landing
                      AND t_purchase > t_signup) AS purchased
FROM funnel;

Respects sequence.

3. Оконные функции

WITH ranked AS (
    SELECT
        user_id,
        event,
        created_at,
        LAG(event) OVER (PARTITION BY user_id ORDER BY created_at) AS prev_event
    FROM events
    WHERE event IN ('landing', 'signup', 'purchase')
)
-- process dependent on prev

Flexible но сложный.

Conversion rates

WITH funnel AS (
    SELECT
        user_id,
        MAX(CASE WHEN event = 'landing' THEN 1 ELSE 0 END) AS landed,
        MAX(CASE WHEN event = 'signup' THEN 1 ELSE 0 END) AS signed,
        MAX(CASE WHEN event = 'purchase' THEN 1 ELSE 0 END) AS purchased
    FROM events
    GROUP BY user_id
)
SELECT
    SUM(landed) AS step1,
    SUM(signed) AS step2,
    SUM(purchased) AS step3,
    SUM(signed) * 100.0 / SUM(landed) AS step1_to_2,
    SUM(purchased) * 100.0 / SUM(signed) AS step2_to_3,
    SUM(purchased) * 100.0 / SUM(landed) AS overall
FROM funnel;

Time-constrained funnel

«Activation в 7 дней после signup»:

WITH funnel AS (
    SELECT
        user_id,
        MAX(CASE WHEN event = 'signup' THEN created_at END) AS signup_at,
        MAX(CASE WHEN event = 'activate' AND
                 created_at <= signup_at + INTERVAL '7 days'
            THEN 1 ELSE 0 END) AS activated_7d
    FROM events
    GROUP BY user_id
)
SELECT AVG(activated_7d) AS d7_activation_rate
FROM funnel WHERE signup_at IS NOT NULL;

Funnel по сегментам

SELECT
    country,
    COUNT(*) FILTER (WHERE landed) AS landed,
    COUNT(*) FILTER (WHERE signed) AS signed,
    COUNT(*) FILTER (WHERE purchased) AS purchased
FROM funnel
GROUP BY country
ORDER BY purchased DESC;

Identify underperforming segments.

A/B tests funnel

Comparison между variants:

SELECT
    variant,
    COUNT(DISTINCT CASE WHEN signed THEN user_id END) * 1.0 /
    COUNT(DISTINCT CASE WHEN landed THEN user_id END) AS signup_cr,
    COUNT(DISTINCT CASE WHEN purchased THEN user_id END) * 1.0 /
    COUNT(DISTINCT CASE WHEN signed THEN user_id END) AS purchase_cr
FROM funnel
GROUP BY variant;

Drop-off analysis

Where users exit:

WITH steps AS (
    SELECT
        user_id,
        CASE
            WHEN MAX(CASE WHEN event = 'purchase' THEN 1 END) = 1 THEN 'purchased'
            WHEN MAX(CASE WHEN event = 'checkout' THEN 1 END) = 1 THEN 'dropped_at_checkout'
            WHEN MAX(CASE WHEN event = 'cart' THEN 1 END) = 1 THEN 'dropped_at_cart'
            WHEN MAX(CASE WHEN event = 'product' THEN 1 END) = 1 THEN 'dropped_at_product'
            ELSE 'no_product_view'
        END AS exit_step
    FROM events
    GROUP BY user_id
)
SELECT exit_step, COUNT(*) FROM steps GROUP BY 1;

Where are users exiting most?

Time to convert

SELECT
    AVG(EXTRACT(EPOCH FROM (t_purchase - t_signup)) / 3600) AS avg_hours_to_purchase
FROM funnel
WHERE t_purchase IS NOT NULL;

Median — better для skewed.

Funnel по cohort

Funnel для users who signed в each month:

SELECT
    DATE_TRUNC('month', signup_at) AS cohort,
    COUNT(*) AS signed,
    COUNT(*) FILTER (WHERE activated_d7) AS activated,
    COUNT(*) FILTER (WHERE purchased_d30) AS purchased
FROM funnel
GROUP BY 1;

Tracking improvement monthly.

Частые ошибки

Not respecting sequence

User may purchase without signup if event tracking imperfect. Filter properly.

Double counting

Multiple purchases per user — если funnel tracks events, inflate. Use DISTINCT user_id.

Wrong denominator

«CR from A to B» — denominator = количество A, не total.

Slow queries

Big event tables — aggregate первый, потом funnel logic.

На собесе

Задача: «signup → activation → payment. Conversion rates каждого step».

Walk through:

  1. CTE with user-level flags (signed, activated, paid)
  2. SUM / COUNT aggregation
  3. Ratios для CR

Express SQL clearly.

Connected concepts

Path analysis

«What paths users take?» — more general than funnel.

Retention

Different: «who returns later?», not «who converts in funnel?».

Segmentation

Funnels by segment — insight.

Связанные темы

FAQ

Amplitude vs SQL funnel?

Amplitude quick. SQL — flexible, custom logic.

Time window обязателен?

Best practice: yes. Иначе «signup 2020 → purchase 2024» inflates CR.

Non-linear funnels?

Sankey diagram, path analysis lib (Amplitude, PostHog).


Тренируйте SQL — откройте тренажёр с 1500+ вопросами для собесов.