Apache Superset for Analysts
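Why it matters
Superset is an open-source BI tool maintained under the Apache Software Foundation (it was created at Airbnb). It offers more chart types and scales further than Metabase, and it costs less than Tableau because self-hosting is free. Many Russian companies (Yandex, Ozon) use it, and it shows up often in enterprises and scale-ups.
What Superset is
Apache Superset is an open-source BI tool for data exploration and visualization. Key features: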
- Dashboard creation
- SQL Lab (a query editor)
- A rich set of chart types (40+)
- Alerting and scheduled reports
- Embedding in applications
It was originally built by Maxime Beauchemin at Airbnb; he also created Airflow.
Architecture
- Frontend: React
- Backend: Python (Flask)
- Database support: any database that has a SQLAlchemy dialect and a Python driver
- Caching: Redis
Pros
- Large chart library
- SQL Lab, a powerful query editor
- Interactive dashboards
- Free to self-host
- Active development under Apache
Cons
- Complex production setup
- Steeper learning curve than Metabase
- Maintenance overhead when self-hosting
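Setup
# Via Docker
docker pull apache/superset
# --name lets later commands address the container; the image expects a secret key
docker run -d -p 8088:8088 --name superset -e "SUPERSET_SECRET_KEY=<your-secret-key>" apache/superset
# Create an admin user (prompts for credentials)
docker exec -it superset superset fab create-admin
# Apply database migrations, then initialize roles and permissions
docker exec -it superset superset db upgrade
docker exec -it superset superset init
The official docker-compose setup is available on GitHub.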
Core concepts
Databases
Connections to data sources: Postgres, BigQuery, Snowflake, ClickHouse, and others.
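Each connection is described by a SQLAlchemy URI. Below is a minimal sketch of assembling one with the standard library. `build_sqlalchemy_uri` is a hypothetical helper, and the host, credentials, and database name are placeholders. The point it illustrates is that special characters in a password need percent-encoding before they go into the URI.

```python
from urllib.parse import quote_plus


def build_sqlalchemy_uri(dialect, user, password, host, port, database):
    """Assemble a SQLAlchemy-style URI, percent-encoding the credentials."""
    return (
        f"{dialect}://{quote_plus(user)}:{quote_plus(password)}"
        f"@{host}:{port}/{database}"
    )


# Placeholder values for illustration only.
uri = build_sqlalchemy_uri(
    "postgresql+psycopg2", "analyst", "p@ss/word", "db.example.com", 5432, "analytics"
)
print(uri)
```

The resulting string is what would be pasted into the SQLAlchemy URI field when adding a database.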
Datasets
Definitions of tables or views, plus their columns and metrics (measures).
Charts
An individual visualization. There are 40+ types, including:
- Line, bar, area
- Pivot
- Heatmap
- Sankey (flow)
- Country map
- Custom visualizations written in JavaScript
Dashboards
A collection of charts, filters, and Markdown blocks.
SQL Lab
Interactive SQL editor:
- Multiple query tabs
- Results history
- Saved queries, and charts built from query results for use on dashboards
Predefined metrics
In the dataset definition:
metric: daily_orders_count
expression: COUNT(DISTINCT order_id)
The metric is computed at query time and can be reused across charts.
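To show why a saved metric is convenient, here is a rough sketch of the idea: one aggregate expression dropped into queries with different groupings. It uses an in-memory SQLite table and is not Superset's internal query builder. The `orders` table and the `query_metric` helper are made up for the example.

```python
import sqlite3

# The metric as defined on the dataset: a name plus an aggregate SQL expression.
METRIC = {"name": "daily_orders_count", "expression": "COUNT(DISTINCT order_id)"}


def query_metric(conn, metric, table, group_by):
    """Run the same metric expression grouped by an arbitrary column."""
    sql = (
        f"SELECT {group_by}, {metric['expression']} AS {metric['name']} "
        f"FROM {table} GROUP BY {group_by} ORDER BY {group_by}"
    )
    return conn.execute(sql).fetchall()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (order_id INTEGER, day TEXT, city TEXT)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [
        (1, "2024-01-01", "Moscow"),
        (2, "2024-01-01", "Kazan"),
        (2, "2024-01-01", "Kazan"),  # duplicate row: DISTINCT counts it once
        (3, "2024-01-02", "Moscow"),
    ],
)

by_day = query_metric(conn, METRIC, "orders", "day")
by_city = query_metric(conn, METRIC, "orders", "city")
print(by_day)
print(by_city)
```

Both results come from the same expression, so a change to the metric definition would propagate to every chart that uses it.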
Filters
Dashboard-level filters apply to all charts that share the filtered column.
Cross-filtering: clicking a value on one chart filters the other charts.
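The scoping rule can be pictured with a simplified model: a filter on a column reaches every chart whose dataset has that column. The chart names and columns below are hypothetical, and real dashboards also let you adjust filter scope by hand, so treat this as a sketch of the default idea only.

```python
# Hypothetical dashboard: each chart lists the columns of its dataset.
charts = {
    "DAU line": {"day", "user_id", "platform"},
    "Revenue bar": {"day", "revenue", "country"},
    "Signups by platform": {"signup_date", "platform"},
}


def charts_in_scope(charts, filter_column):
    """Return the charts whose dataset contains the filtered column."""
    return sorted(name for name, cols in charts.items() if filter_column in cols)


print(charts_in_scope(charts, "platform"))
print(charts_in_scope(charts, "day"))
```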
Alerts and reports
Scheduled reports can:
- Be sent by email
- Attach chart screenshots
- Include CSV exports
Alerting: when a metric crosses a threshold, a notification is sent.
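The threshold logic behind an alert can be sketched as follows: run a query that returns a single number, compare it with a threshold, and notify when the condition holds. This is an illustration on SQLite, not Superset's alerting code. The `payments` table, the 20% rule, and the `alert_triggered` helper are assumptions.

```python
import operator
import sqlite3

# Comparison operators an alert condition might use.
OPERATORS = {"<": operator.lt, "<=": operator.le, ">": operator.gt, ">=": operator.ge}


def alert_triggered(conn, sql, op, threshold):
    """Run a query that returns a single number and compare it to the threshold."""
    value = conn.execute(sql).fetchone()[0]
    return OPERATORS[op](value, threshold), value


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE payments (status TEXT)")
conn.executemany(
    "INSERT INTO payments VALUES (?)",
    [("ok",), ("ok",), ("failed",), ("ok",)],
)

# Hypothetical rule: notify when the share of failed payments exceeds 20%.
sql = "SELECT AVG(status = 'failed') FROM payments"
triggered, value = alert_triggered(conn, sql, ">", 0.2)
if triggered:
    print(f"ALERT: failure rate is {value:.0%}")
```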
Access control
- Roles (Admin, Gamma, Alpha)
- Database-level permissions
- Row-level security (RLS), typically needed in enterprise setups
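Conceptually, row-level security attaches a filter clause to a role, and that clause is added to every query the role runs against the protected table. A minimal sketch of that idea follows. `apply_rls`, the role names, and the clauses are hypothetical, and Superset's actual implementation differs in detail.

```python
def apply_rls(base_sql, user_roles, rls_rules):
    """Wrap a query and AND together the clauses that match the user's roles."""
    clauses = [rule["clause"] for rule in rls_rules if rule["role"] in user_roles]
    if not clauses:
        return base_sql
    where = " AND ".join(f"({c})" for c in clauses)
    return f"SELECT * FROM ({base_sql}) AS t WHERE {where}"


# Hypothetical rules: each sales role sees only its own region.
rules = [
    {"role": "sales_emea", "clause": "region = 'EMEA'"},
    {"role": "sales_apac", "clause": "region = 'APAC'"},
]

secured = apply_rls("SELECT region, revenue FROM sales", {"Gamma", "sales_emea"}, rules)
print(secured)
```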
vs Metabase
Superset
- More chart types
- Enterprise features
- Larger scale
Metabase
- Easier setup / UX
- Smaller teams
- Quick adoption
In capability, Superset is closer to Tableau, but it is open source.
vs Tableau
Superset
- Free
- Self-hosted
- Integration with the modern data stack
Tableau
- Richer UX
- Expensive
- More polished
Use cases
Internal BI
Dashboards for teams and business stakeholders.
Data exploration
SQL Lab is a powerful tool for analysts.
Embedded analytics
Superset charts embedded in customer-facing products.
Cross-DB
A single tool for multiple data sources.
Common tasks
DAU chart
In SQL Lab:
SELECT day, COUNT(DISTINCT user_id) AS dau
FROM events
GROUP BY 1
ORDER BY 1;
Then save the result as a chart and choose the line visualization.
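To check the DAU query outside Superset, the same SQL can be run against a small in-memory SQLite table. The sample rows are invented. The example shows that a user with several events on one day is counted once.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (day TEXT, user_id INTEGER)")
conn.executemany(
    "INSERT INTO events VALUES (?, ?)",
    [
        ("2024-01-01", 1),
        ("2024-01-01", 1),  # same user twice in one day counts once
        ("2024-01-01", 2),
        ("2024-01-02", 2),
    ],
)

dau = conn.execute(
    """
    SELECT day, COUNT(DISTINCT user_id) AS dau
    FROM events
    GROUP BY 1
    ORDER BY 1
    """
).fetchall()
print(dau)
```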
Cohort retention
A dataset with cohort_month, days_since_signup, and retention_rate columns, shown with the heatmap visualization.
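One way such a dataset could be produced is sketched below on SQLite. The `users` and `events` tables and their rows are assumptions, and the date functions (`strftime`, `julianday`) are SQLite-specific, so another database would need its own equivalents.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (user_id INTEGER, signup_date TEXT)")
conn.execute("CREATE TABLE events (user_id INTEGER, day TEXT)")
conn.executemany(
    "INSERT INTO users VALUES (?, ?)",
    [(1, "2024-01-01"), (2, "2024-01-01"), (3, "2024-02-01")],
)
conn.executemany(
    "INSERT INTO events VALUES (?, ?)",
    [
        (1, "2024-01-01"), (2, "2024-01-01"), (1, "2024-01-02"),
        (3, "2024-02-01"), (3, "2024-02-02"),
    ],
)

rows = conn.execute(
    """
    WITH cohort_size AS (
        -- number of users who signed up in each month
        SELECT strftime('%Y-%m', signup_date) AS cohort_month, COUNT(*) AS users
        FROM users
        GROUP BY 1
    ),
    activity AS (
        -- distinct active users per cohort and day offset from signup
        SELECT strftime('%Y-%m', u.signup_date) AS cohort_month,
               CAST(julianday(e.day) - julianday(u.signup_date) AS INTEGER)
                   AS days_since_signup,
               COUNT(DISTINCT e.user_id) AS active
        FROM users u
        JOIN events e ON e.user_id = u.user_id
        GROUP BY 1, 2
    )
    SELECT a.cohort_month,
           a.days_since_signup,
           ROUND(1.0 * a.active / c.users, 2) AS retention_rate
    FROM activity a
    JOIN cohort_size c USING (cohort_month)
    ORDER BY 1, 2
    """
).fetchall()
print(rows)
```

In the heatmap, cohort_month and days_since_signup would serve as the two axes and retention_rate as the cell value.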
Funnel
Conditional aggregation in SQL to count users at each step, combined with the funnel chart type.
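Conditional aggregation here means counting distinct users per step with CASE expressions in a single pass. A small sketch on SQLite, with an invented `events` table and step names. Note that this simple version does not check that the steps happened in order.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (user_id INTEGER, event TEXT)")
conn.executemany(
    "INSERT INTO events VALUES (?, ?)",
    [
        (1, "view"), (1, "add_to_cart"), (1, "purchase"),
        (2, "view"), (2, "add_to_cart"),
        (3, "view"),
    ],
)

# One pass over the table: each CASE keeps user_id only for the matching step.
row = conn.execute(
    """
    SELECT
        COUNT(DISTINCT CASE WHEN event = 'view' THEN user_id END) AS viewed,
        COUNT(DISTINCT CASE WHEN event = 'add_to_cart' THEN user_id END) AS added,
        COUNT(DISTINCT CASE WHEN event = 'purchase' THEN user_id END) AS purchased
    FROM events
    """
).fetchone()

funnel = dict(zip(("viewed", "added", "purchased"), row))
print(funnel)
```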
Integration with other tools
Airflow
Schedule refreshes of the tables behind Superset datasets via Airflow.
dbt
dbt models are exposed as Superset datasets.
Jupyter
Reuse SQL Lab queries in a notebook, or export query results for further analysis there.
Russian context
- Yandex uses Superset internally.
- Ozon: internal BI is often built on Superset.
- Russian-language tutorials are limited; most material is in English.
Performance
Caching
Query results are cached, typically in Redis.
Dashboards load
Cache expiration is configurable. Balance data freshness against load speed.
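As an illustration of where the expiration is set, here is a sketch of a `superset_config.py` fragment that uses Flask-Caching style keys. The Redis URL, key prefixes, and timeouts are placeholder assumptions to adjust for a real deployment.

```python
# superset_config.py (fragment)

ONE_HOUR = 60 * 60  # seconds

# Cache for chart query results, backed by Redis.
DATA_CACHE_CONFIG = {
    "CACHE_TYPE": "RedisCache",
    "CACHE_DEFAULT_TIMEOUT": ONE_HOUR,  # shorter = fresher data, more DB load
    "CACHE_KEY_PREFIX": "superset_data_",
    "CACHE_REDIS_URL": "redis://localhost:6379/1",  # assumed local Redis
}

# Cache for Superset's own metadata; a longer timeout is usually acceptable.
CACHE_CONFIG = {
    "CACHE_TYPE": "RedisCache",
    "CACHE_DEFAULT_TIMEOUT": 24 * ONE_HOUR,
    "CACHE_KEY_PREFIX": "superset_meta_",
    "CACHE_REDIS_URL": "redis://localhost:6379/0",  # assumed local Redis
}
```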
Queries
Superset is a thin layer over the database, so query performance depends on the underlying database.
In an interview
"Superset experience?" Describe the dashboards and charts you built and how you used SQL Lab.
"Superset vs Metabase?" Superset is enterprise-grade; Metabase is simpler and easier to adopt.
"Self-host complexity?" Docker is simple enough for development; production requires DevOps support.
FAQ
Is the full feature set free?
Yes. Superset is open source under the Apache License 2.0.
Is it production-ready?
Yes, with a proper deployment.
Is there a SaaS alternative?
Preset, a managed Superset service from the company founded by Superset's creator.