Как сделать Customer Segments в SQL
Содержание:
Зачем Customer Segments
Не все клиенты одинаковые. Segmentation — основа personalization, pricing, marketing. «One size fits all» = subpar.
Value segmentation
WITH user_value AS (
SELECT
user_id,
SUM(amount) AS lifetime_value
FROM transactions
WHERE status = 'paid'
GROUP BY user_id
)
SELECT
user_id,
lifetime_value,
CASE
WHEN NTILE(10) OVER (ORDER BY lifetime_value) = 10 THEN 'whale (top 10%)'
WHEN NTILE(10) OVER (ORDER BY lifetime_value) >= 8 THEN 'high-value'
WHEN NTILE(10) OVER (ORDER BY lifetime_value) >= 5 THEN 'mid-value'
ELSE 'low-value'
END AS value_segment
FROM user_value;Behavior segmentation
WITH user_behavior AS (
SELECT
user_id,
COUNT(DISTINCT DATE(activity_time)) AS active_days_30d,
AVG(session_minutes) AS avg_session_min,
COUNT(*) FILTER (WHERE event = 'purchase') AS purchases_30d
FROM events
WHERE activity_time >= CURRENT_DATE - INTERVAL '30 days'
GROUP BY user_id
)
SELECT
user_id,
CASE
WHEN active_days_30d >= 20 AND purchases_30d >= 3 THEN 'power user'
WHEN active_days_30d >= 10 THEN 'engaged'
WHEN active_days_30d >= 3 THEN 'casual'
ELSE 'dormant'
END AS behavior_segment
FROM user_behavior;Lifecycle segmentation
WITH user_lifecycle AS (
SELECT
u.user_id,
u.created_at,
CURRENT_DATE - u.created_at::DATE AS days_since_signup,
EXISTS (SELECT 1 FROM events WHERE user_id = u.user_id AND DATE::DATE >= CURRENT_DATE - 7) AS active_7d,
EXISTS (SELECT 1 FROM transactions WHERE user_id = u.user_id AND status = 'paid') AS ever_paid
FROM users u
)
SELECT
user_id,
CASE
WHEN days_since_signup < 7 THEN 'new (week 1)'
WHEN days_since_signup < 30 AND NOT active_7d THEN 'churning new'
WHEN active_7d AND ever_paid THEN 'active paying'
WHEN active_7d AND NOT ever_paid THEN 'active non-paying'
WHEN NOT active_7d AND ever_paid THEN 'lapsed paying'
ELSE 'dormant'
END AS lifecycle_segment
FROM user_lifecycle;RFM
Recency, Frequency, Monetary:
WITH rfm_stats AS (
SELECT
user_id,
CURRENT_DATE - MAX(transaction_date)::DATE AS recency_days,
COUNT(*) AS frequency,
SUM(amount) AS monetary
FROM transactions
WHERE status = 'paid'
AND transaction_date >= CURRENT_DATE - INTERVAL '12 months'
GROUP BY user_id
),
rfm_scored AS (
SELECT
user_id,
recency_days,
frequency,
monetary,
6 - NTILE(5) OVER (ORDER BY recency_days) AS r_score, -- lower recency = higher score
NTILE(5) OVER (ORDER BY frequency) AS f_score,
NTILE(5) OVER (ORDER BY monetary) AS m_score
FROM rfm_stats
)
SELECT
user_id,
r_score, f_score, m_score,
r_score * 100 + f_score * 10 + m_score AS rfm_code,
CASE
WHEN r_score = 5 AND f_score >= 4 AND m_score >= 4 THEN 'champions'
WHEN r_score >= 3 AND f_score >= 3 THEN 'loyal'
WHEN r_score <= 2 AND f_score >= 3 THEN 'at risk'
WHEN r_score <= 2 AND f_score <= 2 THEN 'lost'
ELSE 'other'
END AS rfm_segment
FROM rfm_scored;Cross-segment funnel
SELECT
value_segment,
behavior_segment,
COUNT(*) AS users,
AVG(lifetime_value) AS avg_ltv
FROM user_segments
GROUP BY value_segment, behavior_segment
ORDER BY users DESC;Частые ошибки
Ошибка 1. One segmentation universal. Different segments для different decisions. Value vs behavior — both useful.
Ошибка 2. Static segments. Users move between segments. Re-segment regularly.
Ошибка 3. Too many segments. 20+ segments → cannot act on. Aim 4-8 actionable.
Ошибка 4. Defining boundaries. Whale threshold $1000? $5000? Data-driven (percentiles) лучше arbitrary.
Ошибка 5. Action divorced from segmentation. Segments без actions — vanity. Each segment → specific campaign/tactic.
Связанные темы
- Как посчитать RFM segmentation в SQL
- Как посчитать LTV в SQL
- Как посчитать retention в SQL
- Как посчитать propensity score в SQL
FAQ
Какая segmentation важнее?
Lifecycle для retention. Value для prioritization. Use многосторонне.
Static vs dynamic?
Dynamic (refresh daily). Users move segments.
How many segments?
4-8 actionable. >10 — too many.
Segment boundaries arbitrary?
Use percentiles (deciles, quartiles) — adaptive.
Personalization across segments?
Email content, pricing, support tier — все adjust by segment.