# Fine-Tuning Foundational Models
When fine-tuning is the right tool (and when it isn't): data prep, LoRA vs full fine-tune, evaluating a fine-tune, and the alternatives ladder (prompt → RAG → fine-tune → distil).
## Interview talking points
- When would you fine-tune? Only after better prompts and RAG have been tried and their results measured. Fine-tune for: domain vocabulary, a strict output format, or latency (a smaller model plus a tune).
- LoRA vs full fine-tune. LoRA typically captures most of the quality gains while training only a small fraction of the parameters (often under 1%), and adapters are easy to roll back.
- How do you eval a fine-tune? Held-out set + human eval on edge cases + behavioural eval (does it still refuse appropriately?).
- RLHF / DPO? Mention briefly — out of scope for most platform roles unless you've shipped one.
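The LoRA cost claim above can be made concrete with a back-of-the-envelope parameter count. This is a sketch for a single weight matrix: the hidden size `d = 4096` and rank `r = 8` are hypothetical values chosen for illustration, not taken from any model in this folder.

```python
# Compare trainable parameters: full fine-tune vs a LoRA adapter,
# for one d_in x d_out weight matrix W.

def full_ft_params(d_in: int, d_out: int) -> int:
    # Full fine-tuning updates every entry of W.
    return d_in * d_out

def lora_params(d_in: int, d_out: int, r: int) -> int:
    # LoRA freezes W and trains two low-rank factors,
    # B (d_in x r) and A (r x d_out), with W' = W + (alpha / r) * B @ A.
    return d_in * r + r * d_out

d, r = 4096, 8          # hypothetical hidden size and LoRA rank
full = full_ft_params(d, d)
lora = lora_params(d, d, r)
print(full, lora, f"{lora / full:.2%}")  # 16777216 65536 0.39%
```

At rank 8 on a 4096-wide layer the adapter trains roughly 0.4% of the weights, which is where the "~1% of the cost" talking point comes from; the ratio scales linearly with `r`.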
## Files in this folder
| File | Title |
|---|---|
| 00-mangaassist_fine_tuning_topic_scenario_map.md | MangaAssist Fine-Tuning Topic Scenario Map |
| 01_fine_tuning_dry_run_mangaassist.md | Fine-Tuning Dry Run Document — Intent Classifier (DistilBERT) for MangaAssist |
| mangaassist_document_validation_report_v2.md | Validation Report for the MangaAssist Fine-Tuning Documents — Updated |
| README.md | Fine-Tuning Foundational Models for MangaAssist |
| SCENARIO_TEMPLATE.md | Scenario Template — Research-Grade 8-File Pattern for MangaAssist Fine-Tuning Topics |