Running a 26B MoE model on an 8GB GPU
A practical note from a real homelab experiment: what mattered when running a 26B MoE model on an RTX 2070 SUPER with 8GB of VRAM.
Topic
Domestic infrastructure, local models, and systems with too many doors.
A short story about what happens when a personal homelab grows enough doors that even the owner can end up knocking on the wrong one.