Running a Local LLM on Kubernetes — A Home Lab Setup
In Part 1 I ran Ollama directly on a Linux machine and wired it up through an MCP layer to a small web app. It worked. But bare metal has friction: if the process crashes, it stays down. Adding Open-WebUI means managing another process. Resource limits are manual. There's no clean internal networking between services.
This post moves the whole thing into Kubernetes. The goal isn't enterprise-grade infrastructure — it's a home lab setup that's reliable, easy to extend, and honest about its limitations.
Manifests are in the ollama-mcp-starter repo under backend/k8s-deployment/.
