Как посчитать cumulative sum в SQL

Закрепи формулу cumulative sum в Карьернике
Запомнить надолго — 5 коротких сессий с задачами на эту тему. Бесплатно
Тренировать cumulative sum в Telegram

Зачем cumulative sum

Daily revenue показывает spikes. Cumulative — total накопленный over time. Хорошо для growth charts, milestone reporting, comparison with goals.

Базовый расчёт

SELECT
    DATE,
    daily_revenue,
    SUM(daily_revenue) OVER (
        ORDER BY DATE
        ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
    ) AS cumulative_revenue
FROM daily_revenue
ORDER BY DATE;

Simpler form:

SUM(daily_revenue) OVER (ORDER BY DATE)

(default frame для ORDER BY = UNBOUNDED PRECEDING to CURRENT ROW.)

По группам

Per-channel cumulative:

SELECT
    DATE,
    channel,
    daily_revenue,
    SUM(daily_revenue) OVER (
        PARTITION BY channel
        ORDER BY DATE
    ) AS cumulative_revenue_channel
FROM channel_revenue
ORDER BY channel, DATE;

PARTITION BY isolates cumulative per channel.

Cumulative MRR

Track total revenue accumulated:

WITH daily AS (
    SELECT
        DATE_TRUNC('day', created_at) AS day,
        SUM(amount) AS daily_revenue
    FROM transactions
    WHERE status = 'paid'
    GROUP BY 1
)
SELECT
    day,
    daily_revenue,
    SUM(daily_revenue) OVER (ORDER BY day) AS lifetime_revenue,
    -- Milestones
    CASE
        WHEN SUM(daily_revenue) OVER (ORDER BY day) >= 1000000 THEN 'past $1M'
        WHEN SUM(daily_revenue) OVER (ORDER BY day) >= 100000 THEN 'past $100k'
        WHEN SUM(daily_revenue) OVER (ORDER BY day) >= 10000 THEN 'past $10k'
        ELSE 'starting'
    END AS milestone
FROM daily;
Закрепи формулу cumulative sum в Карьернике
Запомнить надолго — 5 коротких сессий с задачами на эту тему. Бесплатно
Тренировать cumulative sum в Telegram

Cumulative distinct users

Tricky — COUNT(DISTINCT) не работает с window functions в большинстве БД. Workaround:

WITH first_seen AS (
    SELECT
        user_id,
        MIN(created_at::DATE) AS first_date
    FROM events
    GROUP BY user_id
)
SELECT
    DATE,
    COUNT(*) FILTER (WHERE first_date = DATE) AS new_users,
    COUNT(*) AS cumulative_users
FROM (
    SELECT
        d.DATE,
        u.user_id,
        u.first_date
    FROM (
        SELECT generate_series(
            (SELECT MIN(first_date) FROM first_seen),
            CURRENT_DATE,
            '1 day'::INTERVAL
        )::DATE AS DATE
    ) d
    JOIN first_seen u ON u.first_date <= d.DATE
) joined
GROUP BY DATE
ORDER BY DATE;

Cumulative percent (Pareto)

SELECT
    customer_id,
    revenue,
    SUM(revenue) OVER () AS total,
    SUM(revenue) OVER (ORDER BY revenue DESC) AS cum_revenue,
    SUM(revenue) OVER (ORDER BY revenue DESC) * 100.0 / SUM(revenue) OVER () AS cum_pct
FROM customers
ORDER BY revenue DESC;

Pareto: top 20% customers = 80% revenue (check cum_pct ≈ 80% at row #N).

Частые ошибки

Ошибка 1. ORDER BY забыли. SUM OVER без ORDER BY = grand total на every row. Wrong.

Ошибка 2. Cumulative DISTINCT. Большинство БД не поддерживают. Workaround нужен.

Ошибка 3. Missing dates. Daily revenue с gaps → cumulative тоже broken. Generate_series для full series.

Ошибка 4. Reset на partition? PARTITION BY automatically resets. Если ожидал continuous → no partition.

Ошибка 5. RANGE vs ROWS. RANGE BETWEEN UNBOUNDED PRECEDING может merge ties. ROWS — strict row count.

Связанные темы

FAQ

SUM OVER без ORDER BY — что считает?

Grand total (same value во all rows). Not cumulative.

Cumulative DISTINCT работает?

В PostgreSQL — нет. Workaround через self-join или first_seen pattern.

UNBOUNDED PRECEDING зачем?

Explicit frame. Default same, но explicit clearer для readers.

Cumulative resets по group?

С PARTITION BY — yes. Без — global.

Cumulative для daily активных?

Cumulative DAU = total unique ever active. Different from running total of DAU.