Local Model Fine-Tuning with Unsloth
1. Executive Summary
- Target: Fine-tune qwen2.5-coder:14b on personal AsciiDoc conventions, STEM syntax, and documentation patterns
- Hardware: RTX 5090 Mobile (24GB VRAM) + 64GB RAM
- Method: QLoRA (4-bit quantized LoRA) via Unsloth
- Investment: ~10-15 hours (data curation + training + evaluation)
Foundation Assets:
- Working Aider + Ollama pipeline (tested, A- quality with prompting)
- 15+ domus-* repos with thousands of AsciiDoc files as training source
- 16 document templates with established patterns
- 106-problem math curriculum demonstrating target STEM syntax
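One way the AsciiDoc assets above could feed the data-curation step is to mine each file into an instruction/output pair. A minimal sketch; the instruction wording and the worklog example are hypothetical placeholders, not a settled prompt format:

```python
import json

def adoc_to_pair(adoc_text: str) -> dict:
    """Turn one AsciiDoc document into an instruction/output training pair.

    The instruction template here is an invented example; the real wording
    would be decided during curation.
    """
    lines = adoc_text.strip().splitlines()
    # AsciiDoc level-0 document titles start with "= "
    title = lines[0].lstrip("= ").strip()
    return {
        "instruction": f"Write an AsciiDoc page titled '{title}' "
                       "following my documentation conventions.",
        "output": adoc_text.strip(),
    }

sample = """= 2026-04-04 Worklog

== Summary

Tested the Aider + Ollama pipeline.
"""

pair = adoc_to_pair(sample)
print(json.dumps(pair)[:80])
```

Running this over every file in the domus-* repos and writing one JSON object per line would yield a JSONL dataset ready for the formatting step.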
2. Strategic Alignment
Fine-tuning is the next step after prompting hits a ceiling. Current state:
| Approach | Quality | Limitation |
|---|---|---|
| Raw model (no config) | C+ — wrong syntax, ignored templates | No awareness of conventions |
| Prompting + CONVENTIONS.md | A- — correct syntax, follows templates | Still needs explicit instructions per session |
| Fine-tuned model | Target: A — natively knows patterns | Requires data curation upfront |
Fine-tuning makes sense when:
- The model makes the same mistake repeatedly despite prompting
- You have a repetitive workflow (daily worklogs, math pages, case studies)
- You want to learn ML fundamentals (career investment)
4. Improvement Proposals
> Proposals from ecosystem audit, 2026-04-04. For team review and prioritization.
| Priority | Proposal | Rationale | Effort |
|---|---|---|---|
| P2 | Model comparison table (benchmarks, sizes, use cases) | Document tested models with parameter count, VRAM requirement, inference speed, and quality score per task. Prevents re-evaluating the same models. | M |
| P2 | Hardware requirements reference | Map model sizes to GPU requirements: 7B models on 8GB VRAM, 13B on 16GB, 70B on 2x24GB, etc. Include CPU fallback performance. | S |
| P3 | Training data preparation guide | Document the data pipeline: collection, cleaning, formatting (Alpaca, ShareGPT, ChatML), tokenization, and validation steps. | M |
| P3 | Evaluation metrics documentation | Define how to measure fine-tuned model quality: perplexity, BLEU, task-specific benchmarks, human evaluation rubrics. | M |
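To make the formatting step in the data-preparation proposal concrete, here is a minimal sketch rendering the same record in two of the formats named above: a flat Alpaca JSON record and a ChatML string (ChatML's `<|im_start|>`/`<|im_end|>` role tokens are the format Qwen models use; the example text is invented):

```python
def to_alpaca(instruction: str, output: str, inp: str = "") -> dict:
    # Alpaca format: one flat JSON record per training example
    return {"instruction": instruction, "input": inp, "output": output}

def to_chatml(instruction: str, output: str) -> str:
    # ChatML format: role-delimited turns with special tokens
    return (f"<|im_start|>user\n{instruction}<|im_end|>\n"
            f"<|im_start|>assistant\n{output}<|im_end|>\n")

record = to_alpaca("Write a daily worklog entry.", "= Worklog\n\nDone.")
chat = to_chatml("Write a daily worklog entry.", "= Worklog\n\nDone.")
```

Whichever format is chosen, the key constraint is consistency: the trainer's chat template must match the format the dataset was written in.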
4.1. Resources
| Resource | Type | Notes |
|---|---|---|
| Unsloth | Tool | 2-5x faster QLoRA, designed for consumer GPUs |
| qwen2.5-coder:14b | Model | Base model for fine-tuning |
| Unsloth documentation | Docs | Examples, notebooks, guides |
| TRL SFTTrainer | Docs | Supervised fine-tuning trainer API |
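As a concrete anchor for the evaluation-metrics proposal, perplexity is just the exponential of the mean per-token negative log-likelihood. A minimal sketch, assuming the per-token NLLs have already been extracted from the model (the values below are synthetic, not real measurements):

```python
import math

def perplexity(token_nlls: list[float]) -> float:
    """Perplexity = exp(mean negative log-likelihood per token).

    Lower is better; a model assigning probability 1/2 to every
    token has perplexity exactly 2.
    """
    return math.exp(sum(token_nlls) / len(token_nlls))

# A model assigning p = 0.5 to each token: NLL = ln 2 per token,
# so perplexity comes out to about 2.0.
nlls = [math.log(2)] * 4
print(perplexity(nlls))
```

Tracking perplexity on a held-out slice of the AsciiDoc dataset before and after fine-tuning gives a cheap first signal, though the human rubric remains the real quality gate.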