How an analyst should write commit messages
Why this matters
In 2026 an analyst works with git: dbt projects, shared notebooks, version-controlled queries. Poor git habits lead to team friction and lost work.
Interviews may include basic git questions. Good commit messages are a signal of professionalism.
Git basics
Repository
A project with tracked files.
Commit
A snapshot of changes.
Branch
A parallel line of work.
Merge
Combine branches.
Pull / Push
Sync with the remote.
Daily commands
# Status
git status
# Add / stage
git add file.sql
git add . # all changes
# Commit
git commit -m "message"
# Push
git push origin branch-name
# Pull
git pull origin main
# Branch
git checkout -b new-feature
git checkout existing-branch
# Merge / update
git merge main
Commit messages
Structure
<type>(<optional scope>): <short summary>

<optional body explaining what and why>

<optional footer: references, breaking changes>
Types
- feat: new feature
- fix: bug fix
- refactor: code change that adds no feature and fixes no bug
- docs: documentation
- test: adding tests
- chore: maintenance
- style: formatting
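One way to enforce this structure is a check on the first line of the message. The sketch below is a hypothetical helper, not something git ships: the function name is_valid_commit_subject and its regular expression are assumptions built from the type list above, with an optional lowercase scope in parentheses.

```shell
# Hypothetical helper: succeeds when the first line of a commit message
# looks like "<type>(<optional scope>): <summary>" with a type from the list above.
is_valid_commit_subject() {
    printf '%s\n' "$1" | head -n 1 |
        grep -Eq '^(feat|fix|refactor|docs|test|chore|style)(\([a-z0-9_-]+\))?: .+'
}

# In a real commit-msg hook (.git/hooks/commit-msg) git passes the path of the
# message file as the first argument, so the call could look like:
#   is_valid_commit_subject "$(cat "$1")" || { echo "bad commit message" >&2; exit 1; }

is_valid_commit_subject "feat(dashboard): add revenue trend chart" && echo "accepted"
is_valid_commit_subject "fix things" || echo "rejected"
```

A team may prefer a stricter or looser pattern; the point is only that the convention is simple enough to check mechanically.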
Good examples
feat(dashboard): add revenue trend chart to executive dashboard
fix(retention_model): correct cohort offset calculation
refactor(dim_customer): simplify SCD logic
Bad examples
update
fix things
wip
asdf
commit
Too vague. Later nobody can tell what changed.
Granularity
Per logical change
Each commit = one thought.
Good:
- feat(metrics): add new_visitor_rate metric
- fix(metrics): handle null session duration
- docs(metrics): document new_visitor_rate definition
Bad:
- refactor everything
Atomic
Each commit can be reverted individually.
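To illustrate why atomic commits pay off, here is a minimal sketch in a throwaway repository. The file names, commit messages, and user identity are made up for the demo. Because each metric lives in its own commit, one of them can be undone without touching the other.

```shell
# Throwaway repository in a temp directory; all names are illustrative.
demo=$(mktemp -d)
cd "$demo" || exit 1
git init -q
git config user.email "analyst@example.com"
git config user.name "Demo Analyst"
git commit -q --allow-empty -m "chore: initial commit"

# Two logical changes, two commits.
echo "-- new_visitor_rate definition" > new_visitor_rate.sql
git add new_visitor_rate.sql
git commit -q -m "feat(metrics): add new_visitor_rate metric"

echo "-- session_duration definition" > session_duration.sql
git add session_duration.sql
git commit -q -m "feat(metrics): add session_duration metric"

# Undo only the first metric; the later commit stays in place.
git revert --no-edit HEAD~1 >/dev/null
ls
git log --format='%s'
```

Had both metrics been added in a single commit, the revert would have removed both.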
Frequency
Commit often. Push regularly.
Don't commit hundreds of files at once.
Branches
Naming
Convention:
feature/add-revenue-dashboard
fix/cohort-offset-bug
refactor/metric-definitions
analysis/q2-retention-deep-dive
Lifecycle
- Create from main
- Work, commit
- Push
- Pull request / merge request
- Review
- Merge
- Delete branch
Don't
- Work directly on main
- Long-lived feature branches (they go stale)
- Unmerged work sitting locally for weeks
Pull requests
Title
Clear summary.
Description
- What changed
- Why
- Related ticket / issue
- Screenshots (if UI)
Review
Ask peers to check. Analysts review each other's SQL and dbt code.
Pre-submit
- Self-review diff
- Run tests
- Clean up
dbt git workflow
Typical
# New model
git checkout -b feature/add-weekly-revenue-model
# Edit dbt model
# Build and test the model
dbt run --select weekly_revenue
dbt test --select weekly_revenue
# Commit
git add models/marts/weekly_revenue.sql
git commit -m "feat(marts): add weekly_revenue aggregation"
# Push
git push origin feature/add-weekly-revenue-model
# Open PR for review
Jupyter notebooks in git
Challenges
- Binary-ish (JSON with outputs)
- Diffs are painful
- Outputs are not important
Solutions
- Clear outputs before commit
- nbstripout tool
- Text-based alternative (Jupytext)
Pattern
# Before commit
jupyter nbconvert --clear-output --inplace notebook.ipynb
git add notebook.ipynb
git commit -m "analysis: Q2 retention deep-dive"
Merge conflicts
Occur when
Two branches modify the same line.
Resolve
- Pull latest main
- Merge / rebase
- Resolve conflict markers (<<<<<<<, =======, >>>>>>>)
- Test
- Commit
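The steps above can be tried safely in a throwaway repository. The sketch below is an illustration under assumptions: the file name, branch name, queries, and user identity are invented, and the default branch name is read from git rather than assumed to be main.

```shell
# Throwaway repository in a temp directory; all names are illustrative.
demo=$(mktemp -d)
cd "$demo" || exit 1
git init -q
git config user.email "analyst@example.com"
git config user.name "Demo Analyst"
base=$(git symbolic-ref --short HEAD)   # default branch name on this machine

echo "SELECT 1 AS revenue" > query.sql
git add query.sql
git commit -q -m "feat(queries): add revenue query"

# A branch edits the line one way...
git checkout -q -b fix/revenue
echo "SELECT 2 AS revenue" > query.sql
git commit -q -am "fix(queries): correct revenue value"

# ...while the base branch edits the same line another way.
git checkout -q "$base"
echo "SELECT 3 AS revenue" > query.sql
git commit -q -am "fix(queries): adjust revenue value"

# The merge stops and git writes conflict markers into the file.
git merge fix/revenue >/dev/null 2>&1 || echo "merge stopped on a conflict"
cat query.sql

# Resolve: write the intended content, stage the file, conclude the merge.
echo "SELECT 2 AS revenue" > query.sql
git add query.sql
git commit -q -m "chore(queries): resolve merge conflict in query.sql"
```

In real work the resolution step is an edit in the editor followed by rerunning the query or the dbt tests before committing.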
Avoid
- Pull frequently
- Small branches
- Team communication
Reverting
Undo commits
git revert <commit-hash> # Creates a new commit that undoes the change
git reset --hard <commit-hash> # Rewinds history (destructive)
Undo uncommitted changes
git checkout -- file.sql # Discards uncommitted changes in the file
git stash # Saves in-progress work for later
Tools
CLI
Built-in. Main interface.
GUI
- VS Code integrated
- GitKraken
- SourceTree
- GitHub Desktop
Helpful for visual thinkers.
Platforms
- GitHub
- GitLab
- Bitbucket
Usually dictated by the company's choice.
Security
Secrets
Never commit:
- Passwords
- API keys
- Customer data
If a secret is committed by accident: rotate the key, then clean up the git history.
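A lightweight safeguard is to scan added lines before they are committed. The function below is a hypothetical sketch: its name and patterns are assumptions chosen for illustration, and it is far from a complete secret scanner.

```shell
# Hypothetical check: prints input lines that look like credential assignments.
# The keyword list is an illustrative assumption, not an exhaustive rule set.
find_secret_like_lines() {
    grep -Ei '(password|passwd|api[_-]?key|secret|token)[[:space:]]*[:=]'
}

# In a pre-commit hook the input could be the lines added in the staged diff:
#   git diff --cached -U0 | grep '^+' | find_secret_like_lines && exit 1

printf '%s\n' '+API_KEY=abc123' '+SELECT * FROM orders' | find_secret_like_lines
```

A match here only means "look closer"; it does not replace rotating a key that has already been pushed.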
.gitignore
Exclude:
.env
*.pyc
__pycache__
.DS_Store
credentials.json
notebook_outputs/
Collaboration
1. Clone
git clone <url>
2. Sync
git pull # Before starting work
3. Work
Feature branch. Commit often.
4. Share
git push
5. Review
PR. Discuss. Iterate.
6. Merge
Approved → merge.
7. Cleanup
Delete branch.
dbt + git + CI
Modern setup:
- dbt models in git
- PR triggers CI
- CI runs dbt tests
- Block the merge if tests fail
- Merge → deploy
Best practice for reliable pipelines.
Common mistakes
Not committing often enough
Lose work if laptop fails.
Vague messages
"fix" × 20. Useless.
Force push
Overwrites other people's work. Don't do it on shared branches.
Committing generated files
Output CSVs, cache files. Bloats repo.
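Whether a generated file is actually covered by .gitignore can be checked with git check-ignore. The sketch below recreates the ignore list from the Security section in a throwaway repository; the sample file paths are invented for the demo.

```shell
# Throwaway repository in a temp directory; sample paths are illustrative.
demo=$(mktemp -d)
cd "$demo" || exit 1
git init -q
printf '%s\n' '.env' '*.pyc' '__pycache__' '.DS_Store' 'credentials.json' 'notebook_outputs/' > .gitignore

mkdir -p notebook_outputs models
: > .env
: > notebook_outputs/chart.png
: > models/weekly_revenue.sql

# -v prints the .gitignore rule that matches each ignored path.
git check-ignore -v .env credentials.json notebook_outputs/chart.png

# A file that should be tracked matches no rule, so the command exits non-zero.
git check-ignore -q models/weekly_revenue.sql || echo "not ignored"
```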
Not pulling
Conflicts multiply.
Analyst-specific
Query versioning
Keep SQL files in git, not just in the BI tool.
Analysis notebooks
Commit even EDA. It serves as future reference.
Tracking changes
"Why did the metric definition change?" Git log has the answer.
Time travel
git checkout <old-commit> brings back an archived state.
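Both ideas can be tried in a throwaway repository. This sketch makes two assumptions beyond the text: the metric file and its two definitions are invented, and it uses git show <commit>:<path> instead of git checkout, which prints an old version of one file without switching the working tree.

```shell
# Throwaway repository in a temp directory; the metric is illustrative.
demo=$(mktemp -d)
cd "$demo" || exit 1
git init -q
git config user.email "analyst@example.com"
git config user.name "Demo Analyst"

echo "SELECT COUNT(DISTINCT user_id) AS active_users FROM events" > active_users.sql
git add active_users.sql
git commit -q -m "feat(metrics): add active_users metric"

echo "SELECT COUNT(DISTINCT user_id) AS active_users FROM events WHERE is_bot = FALSE" > active_users.sql
git commit -q -am "fix(metrics): exclude bots from active_users"

# Tracking changes: list the commits that touched the file, newest first.
git log --format='%s' -- active_users.sql

# Time travel: print the file as it was in the first commit.
first=$(git rev-list --max-parents=0 HEAD)
git show "$first:active_users.sql"
```

With descriptive commit messages, the log output alone often answers the question before any diff is opened.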
Workflow examples
Simple analyst
- Main branch: working queries
- Ad-hoc: feature branches
- PR to main
dbt team
- Dev environment (personal)
- Staging (PR builds)
- Production (main merge)
Notebooks
- Per-analyst directory
- Explore freely
- Promote to shared when done
Learning
Resources
- GitHub tutorials
- Atlassian Git Tutorials
- Oh Shit, Git!?! (fun troubleshooting)
- Pro Git book (free online)
Practice
- Personal projects
- Contribute to OSS
- Learn while working
At the interview
"Git experience?"
Show:
- Branching strategy
- PR process
- Commit style
"How do you handle conflicts?"
Resolve: understand, edit, test, commit.
".gitignore?"
Exclude secrets, generated files.
Basic fluency expected.
FAQ
Git Flow vs GitHub Flow?
GitHub Flow is simpler (main + feature branches). Git Flow is more formal (develop, release, hotfix branches).
Rebase vs merge?
Merge preserves history as it happened. Rebase gives a cleaner, linear history. Follow the team's preference.
Large files?
Git LFS (Large File Storage). For datasets.