Lead scoring for analysts

Why this matters

In B2B, SaaS, and fintech, analysts build lead scoring: they rank potential customers by their probability of closing. Good scoring means the sales team focuses on the right leads, which means more revenue.

The topic comes up in interviews at B2B companies (Контур, SkyEng, B2B-focused teams), where lead scoring is part of the daily work.

What lead scoring is

Lead = a potential customer who has shown interest.

Lead scoring = assigning each lead a score that indicates its conversion probability.

Sales uses the score to set priority. 100/100 = hot lead, call immediately. 20/100 = nurture, no immediate action.

Types

Rule-based

Manual rules:

  • Downloaded whitepaper: +10
  • Visited pricing page: +15
  • Enterprise company: +20
  • No budget info: -10

Total = sum of points.

Plus: simple, interpretable. Minus: inflexible, can miss patterns.
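
A minimal sketch of the rule-based approach. The point values are the ones listed above; the field names and the `rule_based_score` helper are hypothetical.

```python
# Points per rule, taken from the list above. The flag names are
# hypothetical and stand in for whatever fields the CRM exposes.
RULES = {
    "downloaded_whitepaper": 10,
    "visited_pricing_page": 15,
    "enterprise_company": 20,
    "no_budget_info": -10,
}


def rule_based_score(lead: dict) -> int:
    """Sum the points of every rule whose flag is set on the lead."""
    return sum(points for flag, points in RULES.items() if lead.get(flag))


lead = {
    "downloaded_whitepaper": True,
    "visited_pricing_page": True,
    "no_budget_info": True,
}
print(rule_based_score(lead))  # 10 + 15 - 10 = 15
```

Keeping the rules in one dictionary makes them easy to review with the sales team, which is the main appeal of this approach.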

Model-based

ML model predicts conversion probability.

Features:

  • Behavioral (pages visited, emails opened)
  • Demographic (company size, industry, role)
  • Engagement (time on site, frequency)
  • Explicit (budget, timeline, need)

Target: did lead convert within N days?

Model: logistic regression, random forest, XGBoost.

Hybrid

Rules for guardrails + a model for fine-tuning.
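
One possible reading of the hybrid setup, as a sketch only: the model supplies the base score and a few hard rules override it. The flags `is_competitor` and `requested_demo` and the floor of 70 are hypothetical, not taken from the text.

```python
def hybrid_score(model_probability: float, lead: dict) -> float:
    """Model probability scaled to 0-100, adjusted by hard guardrail rules."""
    score = model_probability * 100
    if lead.get("is_competitor"):    # hypothetical disqualifying rule
        return 0.0
    if lead.get("requested_demo"):   # hypothetical floor for explicit intent
        score = max(score, 70.0)
    return score


print(hybrid_score(0.5, {}))
print(hybrid_score(0.25, {"requested_demo": True}))
print(hybrid_score(0.9, {"is_competitor": True}))
```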

Build process

1. Define "conversion"

  • Closed-won deal?
  • Trial started?
  • First payment?

Different targets → different models.
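
Whichever event is chosen, the label usually needs a fixed window ("converted within N days"). A sketch under assumptions: the 90-day default, the function name, and the rule of leaving leads unlabeled while their window is still open are illustrative choices, not part of the text.

```python
from datetime import date, timedelta
from typing import Optional


def conversion_label(
    created: date,
    converted_on: Optional[date],
    as_of: date,
    window_days: int = 90,  # assumed window; pick N to match the sales cycle
) -> Optional[int]:
    """Return 1/0 for a resolved lead, or None if the outcome is not known yet."""
    deadline = created + timedelta(days=window_days)
    if converted_on is not None and created <= converted_on <= deadline:
        return 1
    if as_of < deadline:
        return None  # window still open: exclude from training for now
    return 0


print(conversion_label(date(2024, 1, 1), date(2024, 2, 1), date(2024, 6, 1)))
print(conversion_label(date(2024, 1, 1), None, date(2024, 2, 1)))
```

Leads that return None are too recent to count as non-converters; labeling them 0 would bias the model against fresh leads.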

2. Historical data

Leads who converted vs those who didn't.

Timeframe matters: leads from 3 years ago came from a different market.

3. Features

Brainstorm + experiment.

Behavioral

  • Website pages visited
  • Content downloaded
  • Emails opened / clicked
  • Product demo watched
  • Session count
  • Days since last visit

Demographic

  • Industry
  • Company size
  • Seniority (C-level, manager, IC)
  • Geography

Explicit

  • Budget
  • Timeline
  • Need articulated

Engagement score

Combines multiple behaviors.

4. Train model

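from sklearn.linear_model import LogisticRegression

# `leads` is assumed to be a pandas DataFrame with one row per historical lead.
feature_cols = ['page_views', 'demos_watched', 'company_size']  # ... plus further features
X = leads[feature_cols]
y = leads['converted']  # 1 if the lead converted, else 0

model = LogisticRegression()
model.fit(X, y)

# Score = predicted conversion probability, scaled to 0-100
leads['score'] = model.predict_proba(X)[:, 1] * 100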

5. Validate

  • AUC on holdout
  • Calibration (predicted vs actual rates)
  • Business uplift (sales close rate)
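
The first two checks can be sketched in plain Python. This is a hand-rolled illustration on made-up toy data. In practice `sklearn.metrics.roc_auc_score` and `sklearn.calibration.calibration_curve` cover the same ground. The function names and the bin count are assumptions.

```python
def auc(y_true, y_score):
    """Probability that a random converter outscores a random non-converter."""
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def calibration_table(y_true, y_score, n_bins=5):
    """Per non-empty bin: (mean predicted probability, observed rate, count)."""
    bins = [[] for _ in range(n_bins)]
    for y, s in zip(y_true, y_score):
        # Scores are assumed to be probabilities in [0, 1].
        bins[min(int(s * n_bins), n_bins - 1)].append((y, s))
    rows = []
    for items in bins:
        if items:
            mean_pred = sum(s for _, s in items) / len(items)
            observed = sum(y for y, _ in items) / len(items)
            rows.append((mean_pred, observed, len(items)))
    return rows


# Toy holdout set (made up for illustration).
y_true = [0, 0, 1, 0, 1, 1, 0, 1]
y_score = [0.1, 0.2, 0.3, 0.4, 0.6, 0.7, 0.8, 0.9]

print(auc(y_true, y_score))  # 12 of 16 pairs ranked correctly -> 0.75
for mean_pred, observed, count in calibration_table(y_true, y_score, n_bins=2):
    print(f"predicted {mean_pred:.2f}  observed {observed:.2f}  n={count}")
```

A well-calibrated model shows predicted and observed rates close to each other in every bin. A model can rank well (high AUC) and still be poorly calibrated, which matters when the score is read as a probability.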

6. Deploy

The score is recalculated daily or in real time.

Integration with the CRM (Salesforce, HubSpot).

Metrics

Offline

  • AUC / Gini
  • Precision@top-N (e.g., conversion rate among the top 20% of leads)
  • Lift (conversion rate in the top decile vs. the average)
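
The last two offline metrics can be computed directly from ranked scores. A sketch on made-up toy data; the function names are hypothetical.

```python
def top_fraction_rate(y_true, y_score, fraction=0.2):
    """Conversion rate among the top `fraction` of leads ranked by score."""
    ranked = sorted(zip(y_score, y_true), key=lambda pair: pair[0], reverse=True)
    k = max(1, round(len(ranked) * fraction))
    return sum(y for _, y in ranked[:k]) / k


def lift(y_true, y_score, fraction=0.1):
    """Top-fraction conversion rate divided by the overall conversion rate."""
    base_rate = sum(y_true) / len(y_true)
    if base_rate == 0:
        raise ValueError("lift is undefined when nothing converted")
    return top_fraction_rate(y_true, y_score, fraction) / base_rate


# Ten toy leads, three of which converted (made up for illustration).
y_true = [1, 1, 0, 1, 0, 0, 0, 0, 0, 0]
y_score = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.05]

print(top_fraction_rate(y_true, y_score, fraction=0.2))  # top 2 leads both converted
print(lift(y_true, y_score, fraction=0.1))               # 1.0 / 0.3
```

A lift of about 3.3 in the top decile reads as: the best-scored leads convert roughly 3.3 times more often than an average lead.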

Business

  • Close rate of scored leads
  • Sales time to close
  • Win rate by score bucket
  • Revenue per lead

Score interpretation

Bucket leads:

  • 90-100: hot. Immediate call.
  • 70-89: warm. Call this week.
  • 50-69: nurture. Email cadence.
  • 0-49: cold. Marketing content only.

Sales team prioritizes top buckets.
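
The buckets above map onto a small function, and the same function gives the "win rate by score bucket" business metric. A sketch: the thresholds come from the list above, while the toy data and the choice to send fractional scores such as 89.5 to the lower bucket are assumptions.

```python
from collections import defaultdict


def bucket(score: float) -> str:
    """Map a 0-100 score to the action bucket from the list above."""
    if score >= 90:
        return "hot"
    if score >= 70:
        return "warm"
    if score >= 50:
        return "nurture"
    return "cold"


def win_rate_by_bucket(scores, won):
    """Share of won deals per bucket; `won` holds 1/0 outcomes."""
    totals = defaultdict(lambda: [0, 0])  # bucket -> [wins, leads]
    for score, outcome in zip(scores, won):
        totals[bucket(score)][0] += outcome
        totals[bucket(score)][1] += 1
    return {name: wins / count for name, (wins, count) in totals.items()}


scores = [95, 92, 75, 60, 30, 10]  # toy data
won = [1, 1, 1, 0, 0, 0]
print(win_rate_by_bucket(scores, won))
```

If win rates do not fall from "hot" to "cold", the score is not separating leads the way the buckets assume.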

BANT

Classic qualification framework:

  • Budget
  • Authority
  • Need
  • Timeline

Features for the score: "has budget", "decision-maker", "active need", "has timeline".

Common mistakes

Survivorship bias

Training only on "converted" leads loses the information about those that "never converted".

Include both.

Feature leakage

The "demo watched" feature correlates with conversion because sales is already engaged by that point. Use only upstream features, i.e. signals recorded before sales steps in.
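
One way to enforce this is to cut each lead's event history at the first sales touch before building features. A sketch with hypothetical event fields (`type`, `ts`) and made-up timestamps.

```python
from datetime import datetime


def upstream_events(events, first_sales_touch):
    """Keep only events that happened before sales first engaged with the lead."""
    if first_sales_touch is None:
        return list(events)  # sales never engaged: every event is upstream
    return [event for event in events if event["ts"] < first_sales_touch]


events = [
    {"type": "pricing_page_view", "ts": datetime(2024, 3, 1, 10, 0)},
    {"type": "whitepaper_download", "ts": datetime(2024, 3, 2, 9, 30)},
    {"type": "demo_watched", "ts": datetime(2024, 3, 5, 15, 0)},
]
first_touch = datetime(2024, 3, 4, 12, 0)

kept = upstream_events(events, first_touch)
print([event["type"] for event in kept])
```

Here the demo view is dropped because it happened after sales got involved, so it cannot leak the outcome into the features.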

Wrong label

"Converted" needs a consistent definition. The timeframe matters.

Model drift

Market changes → model stale. Retrain quarterly.

MQL vs SQL

MQL (Marketing Qualified Lead)

Has shown interest and is passed to sales.

SQL (Sales Qualified Lead)

Sales has confirmed the potential. Ready for a real conversation.

Lead scoring often defines the MQL threshold.

Lead scoring for B2C

Less common, but it can apply to some products (e.g., subscription upgrades).

"Which users will upgrade in the next 30 days?" → scoring.

Integration

CRM

Salesforce, HubSpot: score fields.

Marketing automation

HubSpot, Marketo: trigger campaigns based on score.

Analytics

Amplitude / Mixpanel: segment by score.

In the interview

"How do you build lead scoring?" Define target → features → train model → validate → deploy.

"Rules vs. model?" Rules for simple or small setups, a model for scale and complexity.

"Metrics?" Business (close rate by bucket) + model (AUC).

"Common mistakes?" Feature leakage, label definition, model drift.

FAQ

Does a startup need it?

Sales volume has to justify the effort. Below 100 leads/month, manual review is fine.

Python vs ML tools?

Python gives flexibility. CRM tools have built-in scoring, but it is basic.

Update frequency?

Daily or real-time is ideal. Weekly at minimum.

