SQL вопросы на собесе в Яндекс
Карьерник — квиз-тренажёр в Telegram с 1500+ вопросами для собесов аналитика. SQL, Python, A/B, метрики. Бесплатно.
Зачем это знать
Yandex SQL bar — один из высоких в Russia. Подготовка — different от average.
Задачи real-world, оптимизация матерiтся, advanced patterns.
Процесс
Usually 60-90 min technical round:
- Live coding (shared whiteboard / editor)
- 2-4 задачи уровня middle-senior
- Think aloud требование
Типы задач
1. Window functions
Expected fluent:
- ROW_NUMBER, RANK, DENSE_RANK
- LAG, LEAD
- SUM OVER, AVG OVER
- Frame specification (ROWS / RANGE)
2. Complex JOINs
- Self-joins
- LATERAL
- Multiple conditions
- Anti-joins
3. Optimization
- EXPLAIN interpretation
- Index strategy
- Query rewriting
4. CTE / recursive
- Hierarchies
- Date sequences
- Graph traversal
5. ClickHouse specific
- Arrays
- quantile() functions
- sumIf / countIf
- materialized views
Пример задача 1
«Таблица events(user_id, event_type, created_at). Найти сессии пользователей — session = events without > 30 min gap».
Solution
WITH events_gap AS (
SELECT
user_id,
event_type,
created_at,
EXTRACT(EPOCH FROM (created_at - LAG(created_at) OVER (PARTITION BY user_id ORDER BY created_at))) / 60 AS gap_minutes
FROM events
),
session_id AS (
SELECT
user_id,
event_type,
created_at,
SUM(CASE WHEN gap_minutes > 30 OR gap_minutes IS NULL THEN 1 ELSE 0 END)
OVER (PARTITION BY user_id ORDER BY created_at) AS session_num
FROM events_gap
)
SELECT
user_id,
session_num,
MIN(created_at) AS session_start,
MAX(created_at) AS session_end,
COUNT(*) AS events_count
FROM session_id
GROUP BY user_id, session_num;Sessionization classic.
Пример задача 2
«Top-3 purchases каждого user за последний год».
Solution
WITH ranked AS (
SELECT
user_id,
purchase_id,
amount,
DENSE_RANK() OVER (PARTITION BY user_id ORDER BY amount DESC) AS rnk
FROM purchases
WHERE created_at >= CURRENT_DATE - 365
)
SELECT user_id, purchase_id, amount
FROM ranked
WHERE rnk <= 3;Windowed rank.
Пример задача 3
«Retention D7 для cohort апреля 2026».
Solution
WITH cohort AS (
SELECT user_id, DATE(MIN(created_at)) AS signup_date
FROM users
WHERE MIN(created_at) BETWEEN '2026-04-01' AND '2026-04-30'
GROUP BY user_id
),
active_d7 AS (
SELECT DISTINCT user_id
FROM events
WHERE DATE(created_at) = signup_date + 7
)
SELECT
COUNT(*) AS cohort_size,
COUNT(DISTINCT a.user_id) AS retained,
COUNT(DISTINCT a.user_id) * 1.0 / COUNT(*) AS d7_retention
FROM cohort c
LEFT JOIN active_d7 a USING (user_id);Пример задача 4
«Running 7-day active users».
Solution
SELECT
day,
COUNT(DISTINCT user_id) AS dau,
-- Rolling 7-day (need generative approach)
(SELECT COUNT(DISTINCT user_id)
FROM events
WHERE DATE(created_at) BETWEEN t.day - 6 AND t.day) AS wau
FROM (
SELECT DISTINCT DATE(created_at) AS day FROM events
) t
ORDER BY day;Или с window (tricky для distinct).
Пример задача 5
«Funnel: signup → activation → purchase в sequence».
Solution
WITH user_events AS (
SELECT
user_id,
MAX(CASE WHEN event = 'signup' THEN created_at END) AS signup_at,
MAX(CASE WHEN event = 'activation' THEN created_at END) AS activation_at,
MAX(CASE WHEN event = 'purchase' THEN created_at END) AS purchase_at
FROM events
GROUP BY user_id
)
SELECT
COUNT(*) FILTER (WHERE signup_at IS NOT NULL) AS signed,
COUNT(*) FILTER (WHERE signup_at IS NOT NULL AND activation_at > signup_at) AS activated,
COUNT(*) FILTER (WHERE signup_at IS NOT NULL AND activation_at > signup_at AND purchase_at > activation_at) AS purchased
FROM user_events;Sequential filter.
Пример задача 6
«Оптимизировать query, running 10 min».
Process:
- Run EXPLAIN ANALYZE
- Identify:
- Seq Scan → need index
- Nested Loop → could hash join
- Sort → ORDER BY на indexed col?
- Add indexes
- Consider partitioning
- Rewrite если возможно
- Materialize if repeated
Narrate thinking.
Пример задача 7
«Recursive CTE — organization hierarchy».
WITH RECURSIVE hierarchy AS (
SELECT id, name, manager_id, 1 AS level
FROM employees
WHERE manager_id IS NULL
UNION ALL
SELECT e.id, e.name, e.manager_id, h.level + 1
FROM employees e
JOIN hierarchy h ON e.manager_id = h.id
)
SELECT * FROM hierarchy ORDER BY level, name;ClickHouse-specific
Yandex heavy ClickHouse. Expected:
Arrays
SELECT user_id, arrayJoin(tags) FROM users_with_tags;Quantiles
SELECT quantile(0.95)(response_time) FROM events;sumIf
SELECT
sumIf(revenue, country = 'RU') AS ru_rev,
sumIf(revenue, country = 'US') AS us_rev
FROM transactions;Materialized views
CREATE MATERIALIZED VIEW daily_summary
ENGINE = SummingMergeTree()
ORDER BY day
AS SELECT toStartOfDay(ts) AS day, count() AS events
FROM raw_events
GROUP BY day;Communication
Think aloud
Verbalize thought process.
«I need use LAG для sessionization. Let me first understand gap».
Show structured thinking.
Clarify
«Sessions defined как? > 30 min gap?».
Before coding.
Edge cases
«What if user has one event?» «NULL handling?»
Show thoroughness.
Optimization
«This works, но could be faster».
Demonstrate awareness.
Red flags
- Jump to code без thinking
- Miss edge cases
- Не use windows when appropriate
- Forget NULL handling
- No query optimization knowledge
Green flags
- Clarify requirements
- Multiple approaches
- Optimization thoughts
- Clean syntax
- Edge cases considered
Prep resources
- LeetCode Database Medium-Hard
- StrataScratch real company questions
- ClickHouse documentation tutorial
- Карьерник Russian SQL задачи
Specific к Yandex teams
Yandex.Search
Ranking, session analysis.
Yandex.Ads
Campaign metrics, budget.
Yandex.Market
Marketplace.
Yandex.Taksi
Two-sided, geo.
Yandex.Cloud
Resource usage, billing.
Different focus per team.
На собесе
Before
Сверьте basics. Memory refresh syntax.
During
- Greet friendly
- Clarify
- Walk approach
- Code carefully
- Test
- Explain trade-offs
После
Ask about team, role.
Зарплаты
Yandex top:
- Middle: 200-300k
- Senior: 300-450k
- Lead: 450-700k
Plus bonus, RSU.
Подготовка
Plus Yandex-specific:
- Learn ClickHouse basics
- Practice optimization questions
- Sessionization patterns
Связанные темы
FAQ
English or Russian?
Russian usually. English sometimes.
Calculator allowed?
Usually. But mental math preferred для simple.
Hints given?
Sometimes if stuck. Don't wait too long.
Готовьтесь — откройте тренажёр с 1500+ вопросами для собесов.