SQL для логистики и доставки
Карьерник — квиз-тренажёр в Telegram с 1500+ вопросами для собесов аналитика. SQL, Python, A/B, метрики. Бесплатно.
Зачем это знать
Delivery companies (Yandex Еда, Delivery, Ozon Fresh, CDEK, Сбермаркет) — huge analytics needs. Time-sensitive data. SLA compliance critical.
На собесах в delivery / logistics — specific SQL questions ожидаются.
Основные таблицы
-- orders
(id, user_id, placed_at, delivered_at, status)
-- couriers
(id, assigned_at, vehicle_type)
-- deliveries
(order_id, courier_id, picked_up_at, delivered_at, distance_km)
-- routes
(delivery_id, lat, lon, TIMESTAMP)Time-related metrics
Average delivery time
SELECT
AVG(EXTRACT(EPOCH FROM (delivered_at - placed_at)) / 60) AS avg_minutes
FROM orders
WHERE status = 'delivered'
AND placed_at >= CURRENT_DATE - 30;P95 delivery time
SELECT
PERCENTILE_CONT(0.95) WITHIN GROUP (
ORDER BY EXTRACT(EPOCH FROM (delivered_at - placed_at)) / 60
) AS p95_minutes
FROM orders
WHERE status = 'delivered';Tail matters в delivery.
Delivery time by hour
SELECT
EXTRACT(HOUR FROM placed_at) AS hour,
AVG(EXTRACT(EPOCH FROM (delivered_at - placed_at)) / 60) AS avg_min,
COUNT(*) AS orders
FROM orders
WHERE status = 'delivered'
GROUP BY hour
ORDER BY hour;Peak hours — usually slow.
SLA compliance
«Orders delivered в < 30 min»:
SELECT
SUM(CASE WHEN EXTRACT(EPOCH FROM (delivered_at - placed_at)) / 60 < 30 THEN 1 ELSE 0 END) * 1.0 /
COUNT(*) AS sla_compliance
FROM orders
WHERE status = 'delivered';Courier metrics
Utilization
WITH courier_hours AS (
SELECT
courier_id,
SUM(EXTRACT(EPOCH FROM (delivered_at - picked_up_at)) / 3600) AS active_hours,
shift_hours
FROM deliveries d
JOIN courier_shifts s USING (courier_id, shift_date)
GROUP BY courier_id, shift_hours
)
SELECT
courier_id,
active_hours,
shift_hours,
active_hours / shift_hours AS utilization
FROM courier_hours;Deliveries per shift
SELECT
courier_id,
COUNT(*) AS deliveries,
AVG(distance_km) AS avg_distance
FROM deliveries
WHERE delivered_at >= CURRENT_DATE - 7
GROUP BY courier_id;Order funnel
SELECT
COUNT(*) AS placed,
SUM(CASE WHEN status != 'cancelled' THEN 1 ELSE 0 END) AS accepted,
SUM(CASE WHEN status = 'delivered' THEN 1 ELSE 0 END) AS delivered,
SUM(CASE WHEN status = 'cancelled' THEN 1 ELSE 0 END) AS cancelled
FROM orders
WHERE placed_at >= CURRENT_DATE - 1;Geographic analysis
Orders per zone
SELECT
zone_id,
COUNT(*) AS orders,
AVG(delivery_time_min) AS avg_time
FROM orders
JOIN zones USING (zone_id)
WHERE placed_at >= CURRENT_DATE - 30
GROUP BY zone_id
ORDER BY orders DESC;Distance distribution
SELECT
CASE
WHEN distance_km < 2 THEN '< 2 km'
WHEN distance_km < 5 THEN '2-5 km'
WHEN distance_km < 10 THEN '5-10 km'
ELSE '> 10 km'
END AS distance_bucket,
COUNT(*) AS deliveries,
AVG(delivery_time_min) AS avg_time
FROM deliveries
GROUP BY 1;Operations insights
Late deliveries
SELECT
o.*,
EXTRACT(EPOCH FROM (o.delivered_at - o.promised_at)) / 60 AS late_minutes
FROM orders o
WHERE o.delivered_at > o.promised_at
ORDER BY late_minutes DESC;Bottleneck phase
Average each step:
SELECT
AVG(EXTRACT(EPOCH FROM (accepted_at - placed_at)) / 60) AS acceptance_min,
AVG(EXTRACT(EPOCH FROM (picked_up_at - accepted_at)) / 60) AS pickup_min,
AVG(EXTRACT(EPOCH FROM (delivered_at - picked_up_at)) / 60) AS transit_min
FROM orders
WHERE status = 'delivered';Where time lost? Optimize worst phase.
Fraud / quality
Cancellation analysis
SELECT
cancellation_reason,
COUNT(*) AS count,
COUNT(DISTINCT courier_id) AS couriers_involved
FROM cancelled_orders
GROUP BY 1
ORDER BY count DESC;Courier performance anomalies
Couriers с много cancellations, returns.
SELECT
courier_id,
COUNT(*) AS deliveries,
SUM(CASE WHEN status = 'cancelled_by_courier' THEN 1 ELSE 0 END) AS cancellations,
SUM(CASE WHEN status = 'cancelled_by_courier' THEN 1 ELSE 0 END) * 1.0 / COUNT(*) AS cancel_rate
FROM deliveries
WHERE delivered_at >= CURRENT_DATE - 7
GROUP BY courier_id
HAVING cancel_rate > 0.10
ORDER BY cancel_rate DESC;Weather / external
Correlate weather с delivery time
External API data integrated.
SELECT
w.weather,
AVG(d.delivery_time_min) AS avg_time,
COUNT(*) AS orders
FROM deliveries d
JOIN weather w ON DATE(d.delivered_at) = w.DATE
GROUP BY 1;Rain / snow = slower. Plan.
Peak load
Daily / weekly patterns
SELECT
EXTRACT(DOW FROM placed_at) AS day_of_week,
EXTRACT(HOUR FROM placed_at) AS hour,
COUNT(*) AS orders
FROM orders
WHERE placed_at >= CURRENT_DATE - 30
GROUP BY 1, 2;For courier scheduling / infrastructure.
ETA accuracy
SELECT
AVG(EXTRACT(EPOCH FROM (actual_delivery_at - estimated_delivery_at)) / 60) AS avg_error_min,
PERCENTILE_CONT(0.95) WITHIN GROUP (
ORDER BY ABS(EXTRACT(EPOCH FROM (actual_delivery_at - estimated_delivery_at)) / 60)
) AS p95_error
FROM deliveries;Close to 0 — good ETAs.
Return / refund
SELECT
COUNT(*) * 1.0 /
(SELECT COUNT(*) FROM orders WHERE placed_at BETWEEN ... AND ...) AS return_rate
FROM returns
WHERE created_at BETWEEN ...;Target < 2% обычно.
На собесе
«Analyze delivery metrics»:
- Time metrics (avg, P95, SLA)
- Courier utilization
- Geographic / peak patterns
- Quality (cancellations, returns)
Business context: matters для customer satisfaction + profitability.
«Optimize pickup time»:
- Which stores slow?
- Which timings?
- Infrastructure issues?
- Courier behavior?
Companies
Yandex Еда / Лавка
Food delivery analytics.
Самокат
Quick commerce.
Ozon Fresh
Online grocery.
СДЭК
Long-distance delivery.
Delivery
Multi-merchant.
Different scale / complexities.
Связанные темы
FAQ
Real-time dashboard?
Yes. Operations need live metrics.
ClickHouse?
Common для delivery platforms — high volume events.
Geo computations?
PostGIS (Postgres), or H3 hexagons.
Тренируйте — откройте тренажёр с 1500+ вопросами для собесов.