Running a 32B quantised Llama model on a Mac Mini takes 34 seconds to write 400 words. Claude Sonnet in the cloud does it in 7 seconds. And the local model is meaningfully dumber. Unless you've fully systematised your workflow around dedicated hardware, local AI inference is slower, weaker, and barely more private.