Как посчитать Seasonality в SQL
Содержание:
Зачем Seasonality
Many businesses имеют seasonal patterns: e-com → Q4 peak, gyms → January, ice cream → summer. Без seasonal adjustment trend interpretation broken. Compare seasons + extract «true» trend.
Seasonal Index
Seasonal Index = avg_value_in_period / overall_avg1 = above-average period. <1 = below.
Базовый расчёт
WITH monthly AS (
SELECT
DATE_TRUNC('month', DATE) AS month,
EXTRACT(MONTH FROM DATE) AS month_num,
SUM(revenue) AS revenue
FROM daily_revenue
WHERE DATE >= CURRENT_DATE - INTERVAL '24 months'
GROUP BY 1, 2
),
overall AS (
SELECT AVG(revenue) AS overall_avg
FROM monthly
)
SELECT
month_num,
AVG(revenue) AS avg_revenue,
AVG(revenue) / overall_avg AS seasonal_index,
CASE
WHEN AVG(revenue) / overall_avg > 1.2 THEN 'high season'
WHEN AVG(revenue) / overall_avg < 0.8 THEN 'low season'
ELSE 'normal'
END AS season_type
FROM monthly, overall
GROUP BY month_num, overall_avg
ORDER BY month_num;YoY comparison
SELECT
EXTRACT(YEAR FROM DATE) AS year,
EXTRACT(MONTH FROM DATE) AS month,
SUM(revenue) AS revenue,
LAG(SUM(revenue), 12) OVER (ORDER BY DATE_TRUNC('month', DATE)) AS revenue_year_ago,
SUM(revenue) - LAG(SUM(revenue), 12) OVER (ORDER BY DATE_TRUNC('month', DATE)) AS yoy_change,
(SUM(revenue) - LAG(SUM(revenue), 12) OVER (ORDER BY DATE_TRUNC('month', DATE)))::NUMERIC * 100
/ NULLIF(LAG(SUM(revenue), 12) OVER (ORDER BY DATE_TRUNC('month', DATE)), 0) AS yoy_pct
FROM daily_revenue
WHERE DATE >= CURRENT_DATE - INTERVAL '24 months'
GROUP BY 1, 2, DATE_TRUNC('month', DATE)
ORDER BY 1, 2;YoY compares same period — strips seasonality.
Deseasonalized data
Adjust for seasonality:
WITH seasonal_indices AS (
SELECT
EXTRACT(MONTH FROM DATE) AS month_num,
AVG(revenue) / (SELECT AVG(revenue) FROM daily_revenue) AS sidx
FROM daily_revenue
GROUP BY month_num
)
SELECT
d.DATE,
d.revenue,
s.sidx,
d.revenue / s.sidx AS deseasonalized_revenue
FROM daily_revenue d
JOIN seasonal_indices s ON s.month_num = EXTRACT(MONTH FROM d.DATE)
ORDER BY d.DATE;Deseasonalized трend cleaner — easier to spot real growth.
Day-of-week seasonality
SELECT
EXTRACT(DOW FROM DATE) AS day_of_week,
CASE EXTRACT(DOW FROM DATE)
WHEN 0 THEN 'Sun' WHEN 1 THEN 'Mon' WHEN 2 THEN 'Tue'
WHEN 3 THEN 'Wed' WHEN 4 THEN 'Thu' WHEN 5 THEN 'Fri' WHEN 6 THEN 'Sat'
END AS day_name,
AVG(revenue) AS avg_revenue,
AVG(revenue) / (SELECT AVG(revenue) FROM daily_revenue) AS sidx
FROM daily_revenue
WHERE DATE >= CURRENT_DATE - INTERVAL '90 days'
GROUP BY day_of_week
ORDER BY day_of_week;E-com: weekends often higher. B2B: weekdays.
Частые ошибки
Ошибка 1. MoM compare через seasonal. Compare Dec to Jan — always drops (post-holiday). Not «decline».
Ошибка 2. Single year data. Need 2+ years для reliable seasonal pattern.
Ошибка 3. Trend mixed. Year-over-year actually shows trend + seasonal. Deseasonalize for pure trend.
Ошибка 4. Outlier years. COVID 2020 — anomalous. Exclude or treat separately.
Ошибка 5. Multiple seasonalities. Day-of-week + monthly + holidays. Decompose carefully (STL).
Связанные темы
- Как посчитать YoY в SQL
- Как посчитать moving average в SQL
- Как посчитать anomaly detection в SQL
- Window functions advanced
FAQ
Как detect seasonality?
Plot data. Or compute autocorrelation. R / Python seasonal_decompose helpful.
Seasonal Index range?
Typically 0.5 - 2.0. > 2.0 — extreme spike. < 0.3 — extreme low.
YoY adequate для seasonality?
Yes если pattern stable. Но не captures sub-month seasonality.
Multi-year seasonality?
Some products 3-year cycles (semiconductor industry). Beyond standard yearly.
Holiday adjustment?
Day-of-week + holiday flag (Christmas, Easter) — explicit terms в model.