SQL для логистики и доставки

Карьерник — квиз-тренажёр в Telegram с 1500+ вопросами для собесов аналитика. SQL, Python, A/B, метрики. Бесплатно.

Зачем это знать

Delivery companies (Yandex Еда, Delivery, Ozon Fresh, CDEK, Сбермаркет) — huge analytics needs. Time-sensitive data. SLA compliance critical.

На собесах в delivery / logistics — specific SQL questions ожидаются.

Основные таблицы

-- orders
(id, user_id, placed_at, delivered_at, status)

-- couriers
(id, assigned_at, vehicle_type)

-- deliveries
(order_id, courier_id, picked_up_at, delivered_at, distance_km)

-- routes
(delivery_id, lat, lon, TIMESTAMP)

Time-related metrics

Average delivery time

SELECT
    AVG(EXTRACT(EPOCH FROM (delivered_at - placed_at)) / 60) AS avg_minutes
FROM orders
WHERE status = 'delivered'
    AND placed_at >= CURRENT_DATE - 30;

P95 delivery time

SELECT
    PERCENTILE_CONT(0.95) WITHIN GROUP (
        ORDER BY EXTRACT(EPOCH FROM (delivered_at - placed_at)) / 60
    ) AS p95_minutes
FROM orders
WHERE status = 'delivered';

Tail matters в delivery.

Delivery time by hour

SELECT
    EXTRACT(HOUR FROM placed_at) AS hour,
    AVG(EXTRACT(EPOCH FROM (delivered_at - placed_at)) / 60) AS avg_min,
    COUNT(*) AS orders
FROM orders
WHERE status = 'delivered'
GROUP BY hour
ORDER BY hour;

Peak hours — usually slow.

SLA compliance

«Orders delivered в < 30 min»:

SELECT
    SUM(CASE WHEN EXTRACT(EPOCH FROM (delivered_at - placed_at)) / 60 < 30 THEN 1 ELSE 0 END) * 1.0 /
        COUNT(*) AS sla_compliance
FROM orders
WHERE status = 'delivered';

Courier metrics

Utilization

WITH courier_hours AS (
    SELECT
        courier_id,
        SUM(EXTRACT(EPOCH FROM (delivered_at - picked_up_at)) / 3600) AS active_hours,
        shift_hours
    FROM deliveries d
    JOIN courier_shifts s USING (courier_id, shift_date)
    GROUP BY courier_id, shift_hours
)
SELECT
    courier_id,
    active_hours,
    shift_hours,
    active_hours / shift_hours AS utilization
FROM courier_hours;

Deliveries per shift

SELECT
    courier_id,
    COUNT(*) AS deliveries,
    AVG(distance_km) AS avg_distance
FROM deliveries
WHERE delivered_at >= CURRENT_DATE - 7
GROUP BY courier_id;

Order funnel

SELECT
    COUNT(*) AS placed,
    SUM(CASE WHEN status != 'cancelled' THEN 1 ELSE 0 END) AS accepted,
    SUM(CASE WHEN status = 'delivered' THEN 1 ELSE 0 END) AS delivered,
    SUM(CASE WHEN status = 'cancelled' THEN 1 ELSE 0 END) AS cancelled
FROM orders
WHERE placed_at >= CURRENT_DATE - 1;

Geographic analysis

Orders per zone

SELECT
    zone_id,
    COUNT(*) AS orders,
    AVG(delivery_time_min) AS avg_time
FROM orders
JOIN zones USING (zone_id)
WHERE placed_at >= CURRENT_DATE - 30
GROUP BY zone_id
ORDER BY orders DESC;

Distance distribution

SELECT
    CASE
        WHEN distance_km < 2 THEN '< 2 km'
        WHEN distance_km < 5 THEN '2-5 km'
        WHEN distance_km < 10 THEN '5-10 km'
        ELSE '> 10 km'
    END AS distance_bucket,
    COUNT(*) AS deliveries,
    AVG(delivery_time_min) AS avg_time
FROM deliveries
GROUP BY 1;

Operations insights

Late deliveries

SELECT
    o.*,
    EXTRACT(EPOCH FROM (o.delivered_at - o.promised_at)) / 60 AS late_minutes
FROM orders o
WHERE o.delivered_at > o.promised_at
ORDER BY late_minutes DESC;

Bottleneck phase

Average each step:

SELECT
    AVG(EXTRACT(EPOCH FROM (accepted_at - placed_at)) / 60) AS acceptance_min,
    AVG(EXTRACT(EPOCH FROM (picked_up_at - accepted_at)) / 60) AS pickup_min,
    AVG(EXTRACT(EPOCH FROM (delivered_at - picked_up_at)) / 60) AS transit_min
FROM orders
WHERE status = 'delivered';

Where time lost? Optimize worst phase.

Fraud / quality

Cancellation analysis

SELECT
    cancellation_reason,
    COUNT(*) AS count,
    COUNT(DISTINCT courier_id) AS couriers_involved
FROM cancelled_orders
GROUP BY 1
ORDER BY count DESC;

Courier performance anomalies

Couriers с много cancellations, returns.

SELECT
    courier_id,
    COUNT(*) AS deliveries,
    SUM(CASE WHEN status = 'cancelled_by_courier' THEN 1 ELSE 0 END) AS cancellations,
    SUM(CASE WHEN status = 'cancelled_by_courier' THEN 1 ELSE 0 END) * 1.0 / COUNT(*) AS cancel_rate
FROM deliveries
WHERE delivered_at >= CURRENT_DATE - 7
GROUP BY courier_id
HAVING cancel_rate > 0.10
ORDER BY cancel_rate DESC;

Weather / external

Correlate weather с delivery time

External API data integrated.

SELECT
    w.weather,
    AVG(d.delivery_time_min) AS avg_time,
    COUNT(*) AS orders
FROM deliveries d
JOIN weather w ON DATE(d.delivered_at) = w.DATE
GROUP BY 1;

Rain / snow = slower. Plan.

Peak load

Daily / weekly patterns

SELECT
    EXTRACT(DOW FROM placed_at) AS day_of_week,
    EXTRACT(HOUR FROM placed_at) AS hour,
    COUNT(*) AS orders
FROM orders
WHERE placed_at >= CURRENT_DATE - 30
GROUP BY 1, 2;

For courier scheduling / infrastructure.

ETA accuracy

SELECT
    AVG(EXTRACT(EPOCH FROM (actual_delivery_at - estimated_delivery_at)) / 60) AS avg_error_min,
    PERCENTILE_CONT(0.95) WITHIN GROUP (
        ORDER BY ABS(EXTRACT(EPOCH FROM (actual_delivery_at - estimated_delivery_at)) / 60)
    ) AS p95_error
FROM deliveries;

Close to 0 — good ETAs.

Return / refund

SELECT
    COUNT(*) * 1.0 /
        (SELECT COUNT(*) FROM orders WHERE placed_at BETWEEN ... AND ...) AS return_rate
FROM returns
WHERE created_at BETWEEN ...;

Target < 2% обычно.

На собесе

«Analyze delivery metrics»:

  • Time metrics (avg, P95, SLA)
  • Courier utilization
  • Geographic / peak patterns
  • Quality (cancellations, returns)

Business context: matters для customer satisfaction + profitability.

«Optimize pickup time»:

  • Which stores slow?
  • Which timings?
  • Infrastructure issues?
  • Courier behavior?

Companies

Yandex Еда / Лавка

Food delivery analytics.

Самокат

Quick commerce.

Ozon Fresh

Online grocery.

СДЭК

Long-distance delivery.

Delivery

Multi-merchant.

Different scale / complexities.

Связанные темы

FAQ

Real-time dashboard?

Yes. Operations need live metrics.

ClickHouse?

Common для delivery platforms — high volume events.

Geo computations?

PostGIS (Postgres), or H3 hexagons.


Тренируйте — откройте тренажёр с 1500+ вопросами для собесов.