Как построить funnel в SQL и Python

Карьерник — квиз-тренажёр в Telegram с 1500+ вопросами для собесов аналитика. SQL, Python, A/B, метрики. Бесплатно.

Зачем это знать

Funnel — ultimate product analytics pattern. Understanding conversion step-by-step — foundation.

На собесах expected.

SQL approach

Step 1: Event data

SELECT
    user_id,
    event_name,
    created_at
FROM events
WHERE created_at BETWEEN '2026-04-01' AND '2026-04-30';

Step 2: User-level flags

Users reached каждый step:

WITH user_funnel AS (
    SELECT
        user_id,
        MAX(CASE WHEN event_name = 'landing' THEN created_at END) AS t_landing,
        MAX(CASE WHEN event_name = 'signup' THEN created_at END) AS t_signup,
        MAX(CASE WHEN event_name = 'activation' THEN created_at END) AS t_activation,
        MAX(CASE WHEN event_name = 'purchase' THEN created_at END) AS t_purchase
    FROM events
    GROUP BY user_id
)
SELECT * FROM user_funnel;

Step 3: Sequential filter

Only count if reached в order:

SELECT
    COUNT(*) AS landed,
    COUNT(CASE WHEN t_signup > t_landing THEN 1 END) AS signed,
    COUNT(CASE WHEN t_activation > t_signup THEN 1 END) AS activated,
    COUNT(CASE WHEN t_purchase > t_activation THEN 1 END) AS purchased
FROM user_funnel
WHERE t_landing IS NOT NULL;

Step 4: Conversion rates

SELECT
    landed,
    signed,
    activated,
    purchased,
    signed * 100.0 / landed AS landing_to_signup_pct,
    activated * 100.0 / signed AS signup_to_activation_pct,
    purchased * 100.0 / activated AS activation_to_purchase_pct,
    purchased * 100.0 / landed AS overall_cr_pct
FROM funnel_counts;

Time-constrained funnel

«Activation в 7 days после signup»:

WITH user_funnel AS (
    SELECT
        user_id,
        MAX(CASE WHEN event = 'signup' THEN created_at END) AS signup_at,
        MAX(CASE WHEN event = 'activation'
                 AND created_at <= signup_at + INTERVAL '7 days'
            THEN 1 ELSE 0 END) AS activated_7d
    FROM events
    GROUP BY user_id
)
SELECT
    AVG(activated_7d::float) AS d7_activation_rate
FROM user_funnel
WHERE signup_at IS NOT NULL;

Python approach

With pandas

import pandas as pd

# Load events
df = pd.read_sql(query, conn)

# Pivot к user × event
funnel = df.pivot_table(
    index='user_id',
    columns='event_name',
    values='created_at',
    aggfunc='min'
)

# Check sequence
funnel['landed'] = funnel['landing'].notna()
funnel['signed'] = (funnel['signup'] > funnel['landing'])
funnel['activated'] = (funnel['activation'] > funnel['signup'])
funnel['purchased'] = (funnel['purchase'] > funnel['activation'])

# Aggregate
result = funnel[['landed', 'signed', 'activated', 'purchased']].sum()
print(result)

Conversion rates

result['signup_cr'] = result['signed'] / result['landed']
result['activation_cr'] = result['activated'] / result['signed']
# etc.

Visualization

matplotlib funnel chart

import matplotlib.pyplot as plt

stages = ['Landing', 'Signup', 'Activation', 'Purchase']
values = [10000, 3000, 1500, 450]

fig, ax = plt.subplots(figsize=(10, 6))

# Width proportional value, centered
for i, (stage, value) in enumerate(zip(stages, values)):
    width = value / max(values)
    x = (1 - width) / 2
    ax.barh(i, width, left=x, height=0.8, color='#3498db')
    ax.text(0.5, i, f'{stage}: {value:,} ({value/values[0]*100:.1f}%)',
            ha='center', va='center')

ax.set_xlim(0, 1)
ax.set_ylim(-0.5, len(stages) - 0.5)
ax.invert_yaxis()
ax.axis('off')
plt.title('Conversion Funnel')
plt.show()

plotly

import plotly.graph_objects as go

fig = go.Figure(go.Funnel(
    y=['Landing', 'Signup', 'Activation', 'Purchase'],
    x=[10000, 3000, 1500, 450]
))
fig.show()

Interactive.

Advanced analyses

Funnel по segments

for segment in df['segment'].unique():
    seg_df = df[df['segment'] == segment]
    cr = compute_cr(seg_df)
    print(f"{segment}: {cr}")

Compare platforms, countries.

Drop-off analysis

Where biggest drop?

result = compute_funnel()
drops = result.diff() / result.shift()

# Biggest drop-off — optimize priority
worst = drops.idxmin()
print(f"Worst drop: {worst}")

Time-to-convert

funnel['time_to_purchase'] = (funnel['purchase'] - funnel['landing']).dt.total_seconds() / 3600
print(funnel['time_to_purchase'].describe())

Hours to convert. Median useful.

Cohort funnel

Funnel per signup month.

df['cohort'] = df['signup_at'].dt.to_period('M')
result = df.groupby('cohort').apply(compute_funnel)

Conversion over time

Daily funnel metrics:

df['day'] = df['created_at'].dt.date
daily_cr = df.groupby('day').apply(compute_cr)
daily_cr.plot()

Watch trends.

Pitfalls

Not respecting sequence

User may purchase без signup в data (bug).

Filter: only valid sequences.

Double counting

Per-event counts ≠ per-user.

Use COUNT(DISTINCT user_id).

Time window

No time limit → inflated CRs.

Always constrain.

Multiple sessions

User reached step в different sessions. Count correctly.

Events ordering

Same timestamp multiple events — which first?

Use sub-second timestamps или order carefully.

Amplitude / Mixpanel

Built-in funnel

Specify steps, time window. Visual output.

Easier but less flexible.

SQL / Python

More custom logic possible.

Performance

Index events table

On user_id, event_name, created_at.

Pre-aggregate

Daily funnel values precomputed.

Sample

For exploration — 10% sample.

На собесе

«Build funnel в SQL».

Walk through:

  1. Event-level data
  2. User-level aggregation (pivot)
  3. Sequential filter
  4. Count per step
  5. Compute CRs

10 minutes для middle.

«Report funnel, что обнаружили?»

  • Biggest drops
  • Segments wider differences
  • Recommendations

Связанные темы

FAQ

Amplitude vs SQL funnel?

Amplitude quick. SQL custom.

Как handle non-linear funnels?

Path analysis > traditional funnel. Sankey diagram.

Events order определяют?

Timestamps. Если одинаковый — convention (e.g., event type priority).


Тренируйте — откройте тренажёр с 1500+ вопросами для собесов.