Temperature and sampling in LLMs at a Data Scientist interview
Greedy decoding
Pick the highest-probability token at each step.
Pros: deterministic, fast.
Cons: can get trapped in local optima; output tends to be repetitive.
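A minimal sketch of greedy decoding over a toy logit vector (the scores and the function name are made up for illustration; a real decoder takes the argmax of the model's next-token logits at every step):

```python
def greedy_pick(logits):
    """Return the index of the highest-logit token (argmax)."""
    return max(range(len(logits)), key=lambda i: logits[i])

# Toy scores for a 4-token vocabulary: always picks index 0 here,
# and given the same logits it always picks the same token.
logits = [2.0, 0.5, -1.0, 1.5]
best = greedy_pick(logits)  # deterministic: no randomness involved
```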
Temperature
Adjusts the sharpness of the softmax distribution.
P(token) ∝ exp(logit / T)
T = 1: the model's natural distribution.
T → 0: greedy (one token dominates).
T > 1: flatter distribution, more diversity.
Practical values:
- T = 0: code generation, factual answers.
- T = 0.7: creative writing, brainstorming.
- T = 1.0: varied generation.
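The formula above can be sketched as follows (toy logits; `softmax_with_temperature` is an illustrative name, not a library function):

```python
import math

def softmax_with_temperature(logits, T):
    """P(token) ∝ exp(logit / T); lower T sharpens the distribution,
    higher T flattens it."""
    scaled = [l / T for l in logits]
    m = max(scaled)                       # subtract max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    z = sum(exps)
    return [e / z for e in exps]

logits = [2.0, 1.0, 0.1]
# T = 0.5 concentrates mass on the top token, T = 1.0 keeps the model's
# natural distribution, T = 2.0 spreads mass across tokens.
sharp = softmax_with_temperature(logits, 0.5)
natural = softmax_with_temperature(logits, 1.0)
flat = softmax_with_temperature(logits, 2.0)
```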
Top-k
Sample only from the k highest-probability tokens; the rest are zeroed out.
top_k = 50  # consider the 50 most likely tokens and sample from those
This filters out very unlikely tokens and avoids incoherent output.
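A sketch of top-k sampling on a toy distribution (`top_k_sample` and the logits are invented for illustration; real stacks do this on the model's logit tensor):

```python
import math
import random

def top_k_sample(logits, k, rng=random):
    """Keep only the k highest-logit tokens, renormalize their
    probabilities, and sample one token index from them."""
    order = sorted(range(len(logits)), key=lambda i: -logits[i])[:k]
    exps = [math.exp(logits[i]) for i in order]   # softmax over the kept tokens
    z = sum(exps)
    probs = [e / z for e in exps]
    return rng.choices(order, weights=probs, k=1)[0]

rng = random.Random(0)
# With k=2 only the two most likely tokens (indices 0 and 1) can ever appear.
draws = {top_k_sample([3.0, 2.5, 0.1, -1.0], k=2, rng=rng) for _ in range(200)}
```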
Top-p (nucleus)
Sample from the smallest set of tokens covering cumulative probability p.
top_p = 0.9
Sort tokens by probability, descending.
Take tokens while the cumulative probability ≤ 0.9.
Sample from these.
Adaptive: the number of kept tokens varies. Works better than a fixed top-k when the model's confidence varies.
Common combo: temp=0.7, top_p=0.9.
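The combo can be sketched roughly like this (toy logits; note the nucleus here also includes the token that crosses `top_p`, as common implementations do):

```python
import math
import random

def sample_top_p(logits, temperature=0.7, top_p=0.9, rng=random):
    """Temperature-scale the logits, then nucleus-sample: keep the
    smallest set of tokens whose cumulative probability reaches top_p
    and draw one token index from it."""
    scaled = [l / temperature for l in logits]
    m = max(scaled)                           # for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    z = sum(exps)
    probs = [e / z for e in exps]
    order = sorted(range(len(probs)), key=lambda i: -probs[i])
    nucleus, cum = [], 0.0
    for i in order:
        nucleus.append(i)
        cum += probs[i]
        if cum >= top_p:                      # nucleus now covers top_p
            break
    weights = [probs[i] for i in nucleus]
    return rng.choices(nucleus, weights=weights, k=1)[0]

rng = random.Random(1)
# Token 0 is so dominant here that the nucleus collapses to just {0}.
confident_draws = [sample_top_p([5.0, 1.0, 0.5, -2.0], rng=rng) for _ in range(50)]
```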
Beam search
Track top-N candidate sequences. Expand each, keep top-N.
Pros: can find higher-probability sequences than greedy decoding.
Cons:
- Computationally expensive.
- Tends to produce bland / repetitive output.
- Less common in modern LLMs.
Used in machine translation and summarization.
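A toy beam-search sketch. The two-step model below is invented to show the point above: greedy decoding commits to the locally best first token ("A", p=0.6) and ends with sequence probability 0.3, while the beam keeps "B" alive and finds the higher-probability sequence B→x (0.4 × 0.9 = 0.36):

```python
import math

def beam_search(step_logprobs, beam_width, length):
    """Toy beam search: step_logprobs(seq) returns (token, logprob)
    pairs for the next position. Expand every beam, keep the top-N."""
    beams = [([], 0.0)]                       # (sequence, total logprob)
    for _ in range(length):
        candidates = []
        for seq, score in beams:
            for tok, lp in step_logprobs(seq):
                candidates.append((seq + [tok], score + lp))
        candidates.sort(key=lambda c: -c[1])  # keep the best beam_width
        beams = candidates[:beam_width]
    return beams

def step(seq):
    """Invented two-step distribution where the greedy path is suboptimal."""
    if not seq:
        return [("A", math.log(0.6)), ("B", math.log(0.4))]
    if seq[-1] == "A":                        # after A: 0.5 / 0.5 split
        return [("x", math.log(0.5)), ("y", math.log(0.5))]
    return [("x", math.log(0.9)), ("y", math.log(0.1))]  # after B: 0.9 / 0.1

beams = beam_search(step, beam_width=2, length=2)
best_seq, best_logprob = beams[0]             # B -> x, probability 0.36
```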
Repetition penalty
Penalize tokens that were generated recently.
P(token) /= repetition_penalty ^ (recent count of that token)
Typical values: repetition_penalty = 1.1-1.5.
Prevents "repetition loops" where the model keeps saying the same thing.
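A minimal sketch of the formula above, applied to a toy probability vector and renormalized (function name and values are illustrative; real implementations usually apply the penalty to logits before softmax):

```python
from collections import Counter

def apply_repetition_penalty(probs, recent_tokens, penalty=1.2):
    """Divide each token's probability by penalty ** (its count among
    the recently generated tokens), then renormalize to sum to 1."""
    counts = Counter(recent_tokens)
    penalized = [p / (penalty ** counts[i]) for i, p in enumerate(probs)]
    z = sum(penalized)
    return [p / z for p in penalized]

# Token 0 was emitted twice recently, so its probability is divided
# by 1.5 ** 2 = 2.25 and mass shifts to the other tokens.
adjusted = apply_repetition_penalty([0.5, 0.3, 0.2],
                                    recent_tokens=[0, 0],
                                    penalty=1.5)
```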
Related topics
- BERT vs GPT for DS
- Transformer for DS
- Hallucinations and LLM evals for DS
- Prompt engineering for DS
- Preparing for a Data Scientist interview
FAQ
Is this official information?
No. The article is based on Holtzman et al., 2020 (top-p) and the OpenAI / HuggingFace documentation.