May 7, 2026 · 2 min read

Curriculum learning in the Data Scientist interview

The idea of curriculum learning

As with human learning, the model sees easier examples first and harder ones later. This can improve convergence and generalization.

Bengio et al. (2009) formalized the idea and showed that the order of training data matters.

Sort the training data by difficulty. Start with the easy examples and gradually add the hard ones.

Difficulty measures:

Epoch 1: top 30% easiest.
Epoch 5: top 60%.
Epoch 10: all.
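
The schedule above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the names `SCHEDULE`, `fraction_for_epoch`, and `curriculum_subset` are hypothetical, the per-sample difficulty scores are assumed to be given already, and the fraction is assumed to stay constant between the listed epochs (a step schedule), which the text does not specify.

```python
from typing import List, Sequence

# Milestones from the text: (first epoch, fraction of easiest samples used).
SCHEDULE = [(1, 0.30), (5, 0.60), (10, 1.00)]


def fraction_for_epoch(epoch: int) -> float:
    """Return the fraction of easiest samples available at a 1-indexed epoch.

    Assumption: the fraction is held constant until the next milestone.
    """
    fraction = SCHEDULE[0][1]
    for start_epoch, value in SCHEDULE:
        if epoch >= start_epoch:
            fraction = value
    return fraction


def curriculum_subset(difficulty: Sequence[float], epoch: int) -> List[int]:
    """Return the indices of the samples to train on, easiest first.

    `difficulty` holds one precomputed score per sample (lower = easier).
    """
    order = sorted(range(len(difficulty)), key=lambda i: difficulty[i])
    keep = max(1, round(fraction_for_epoch(epoch) * len(order)))
    return order[:keep]
```

In a training loop, the returned indices would select the samples for that epoch, for example by indexing into the dataset before batching.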

Anti-curriculum. Hard examples first; this sometimes works for robustness.

Self-paced learning. The model itself decides which samples are easy and which are hard.

loss = main_loss + λ · regularizer(weights, sample_difficulty)

Samples with low loss get a high weight (used); samples with high loss get a low weight (skipped).

λ is decreased over training, so the model gradually accepts harder samples.
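
A minimal sketch of this weighting, assuming the simplest hard 0/1 variant: a sample is used when its current loss falls below a cutoff and skipped otherwise. The cutoff is exposed here as a hypothetical `threshold` that grows over training, which plays the role of the relaxing λ-weighted penalty in the formula above. The function names and the geometric growth rule are illustrative assumptions, not something the source prescribes.

```python
from typing import List, Sequence


def self_paced_weights(losses: Sequence[float], threshold: float) -> List[float]:
    """Hard 0/1 weights: 1.0 for samples whose loss is below the threshold."""
    return [1.0 if loss < threshold else 0.0 for loss in losses]


def weighted_mean_loss(losses: Sequence[float], weights: Sequence[float]) -> float:
    """Average the loss over the samples that are currently selected."""
    total = sum(weights)
    if total == 0:
        return 0.0  # nothing selected yet at this threshold
    return sum(w * loss for w, loss in zip(weights, losses)) / total


def threshold_at(step: int, start: float = 1.0, growth: float = 1.5) -> float:
    """Hypothetical pacing rule: the loss cutoff grows geometrically per step."""
    return start * growth ** step
```

With per-sample losses `[0.2, 1.5, 0.7, 3.0]`, a threshold of 1.0 selects only the first and third samples; once the threshold has grown past 3.0, every sample contributes.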

LLM training. Order data by quality or complexity; this is common practice at large labs.

RL. Curriculum environments: start with easy levels.

Speech / NLP. Short utterances first.

Imitation learning. Demonstrations with increasing complexity.

Robotics. Easy tasks → general → specific.

No. This article is based on the work of Bengio 2009 and Kumar 2010 (self-paced learning).
