Как посчитать Seasonality в SQL

Закрепи формулу seasonality в Карьернике
Запомнить надолго — 5 коротких сессий с задачами на эту тему. Бесплатно
Тренировать seasonality в Telegram

Зачем Seasonality

Many businesses имеют seasonal patterns: e-com → Q4 peak, gyms → January, ice cream → summer. Без seasonal adjustment trend interpretation broken. Compare seasons + extract «true» trend.

Seasonal Index

Seasonal Index = avg_value_in_period / overall_avg

1 = above-average period. <1 = below.

Базовый расчёт

WITH monthly AS (
    SELECT
        DATE_TRUNC('month', DATE) AS month,
        EXTRACT(MONTH FROM DATE) AS month_num,
        SUM(revenue) AS revenue
    FROM daily_revenue
    WHERE DATE >= CURRENT_DATE - INTERVAL '24 months'
    GROUP BY 1, 2
),
overall AS (
    SELECT AVG(revenue) AS overall_avg
    FROM monthly
)
SELECT
    month_num,
    AVG(revenue) AS avg_revenue,
    AVG(revenue) / overall_avg AS seasonal_index,
    CASE
        WHEN AVG(revenue) / overall_avg > 1.2 THEN 'high season'
        WHEN AVG(revenue) / overall_avg < 0.8 THEN 'low season'
        ELSE 'normal'
    END AS season_type
FROM monthly, overall
GROUP BY month_num, overall_avg
ORDER BY month_num;

YoY comparison

SELECT
    EXTRACT(YEAR FROM DATE) AS year,
    EXTRACT(MONTH FROM DATE) AS month,
    SUM(revenue) AS revenue,
    LAG(SUM(revenue), 12) OVER (ORDER BY DATE_TRUNC('month', DATE)) AS revenue_year_ago,
    SUM(revenue) - LAG(SUM(revenue), 12) OVER (ORDER BY DATE_TRUNC('month', DATE)) AS yoy_change,
    (SUM(revenue) - LAG(SUM(revenue), 12) OVER (ORDER BY DATE_TRUNC('month', DATE)))::NUMERIC * 100
    / NULLIF(LAG(SUM(revenue), 12) OVER (ORDER BY DATE_TRUNC('month', DATE)), 0) AS yoy_pct
FROM daily_revenue
WHERE DATE >= CURRENT_DATE - INTERVAL '24 months'
GROUP BY 1, 2, DATE_TRUNC('month', DATE)
ORDER BY 1, 2;

YoY compares same period — strips seasonality.

Закрепи формулу seasonality в Карьернике
Запомнить надолго — 5 коротких сессий с задачами на эту тему. Бесплатно
Тренировать seasonality в Telegram

Deseasonalized data

Adjust for seasonality:

WITH seasonal_indices AS (
    SELECT
        EXTRACT(MONTH FROM DATE) AS month_num,
        AVG(revenue) / (SELECT AVG(revenue) FROM daily_revenue) AS sidx
    FROM daily_revenue
    GROUP BY month_num
)
SELECT
    d.DATE,
    d.revenue,
    s.sidx,
    d.revenue / s.sidx AS deseasonalized_revenue
FROM daily_revenue d
JOIN seasonal_indices s ON s.month_num = EXTRACT(MONTH FROM d.DATE)
ORDER BY d.DATE;

Deseasonalized трend cleaner — easier to spot real growth.

Day-of-week seasonality

SELECT
    EXTRACT(DOW FROM DATE) AS day_of_week,
    CASE EXTRACT(DOW FROM DATE)
        WHEN 0 THEN 'Sun' WHEN 1 THEN 'Mon' WHEN 2 THEN 'Tue'
        WHEN 3 THEN 'Wed' WHEN 4 THEN 'Thu' WHEN 5 THEN 'Fri' WHEN 6 THEN 'Sat'
    END AS day_name,
    AVG(revenue) AS avg_revenue,
    AVG(revenue) / (SELECT AVG(revenue) FROM daily_revenue) AS sidx
FROM daily_revenue
WHERE DATE >= CURRENT_DATE - INTERVAL '90 days'
GROUP BY day_of_week
ORDER BY day_of_week;

E-com: weekends often higher. B2B: weekdays.

Частые ошибки

Ошибка 1. MoM compare через seasonal. Compare Dec to Jan — always drops (post-holiday). Not «decline».

Ошибка 2. Single year data. Need 2+ years для reliable seasonal pattern.

Ошибка 3. Trend mixed. Year-over-year actually shows trend + seasonal. Deseasonalize for pure trend.

Ошибка 4. Outlier years. COVID 2020 — anomalous. Exclude or treat separately.

Ошибка 5. Multiple seasonalities. Day-of-week + monthly + holidays. Decompose carefully (STL).

Связанные темы

FAQ

Как detect seasonality?

Plot data. Or compute autocorrelation. R / Python seasonal_decompose helpful.

Seasonal Index range?

Typically 0.5 - 2.0. > 2.0 — extreme spike. < 0.3 — extreme low.

YoY adequate для seasonality?

Yes если pattern stable. Но не captures sub-month seasonality.

Multi-year seasonality?

Some products 3-year cycles (semiconductor industry). Beyond standard yearly.

Holiday adjustment?

Day-of-week + holiday flag (Christmas, Easter) — explicit terms в model.