Ollama SLOW performance and how to troubleshoot it
I am running a recent (2/2026) umbrelOS (1.5?) on a mini PC with 16 GB of memory. I have a basic BTC node and a few supporting apps (LNGg, mempool, …) but have hardly ever used more than 6 GB of memory.
I have installed the Ollama and Open WebUI apps. I am trying to learn about LLMs, tweak models, and host them locally so my questions stay private instead of being a free-for-all.
In theory this should work well: with 1 TB of free space on a 2 TB SSD and 9-10 GB of free RAM, it should be a sweet spot.
But it’s painfully slow. We are talking 34 minutes of “thinking” on a trivial test question such as “tell me a fun fact about the Roman Empire” with qwen3.5:4b, with CPU load at 80-100%.
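For scale, here is my back-of-envelope estimate of what a 4B-parameter model should need in RAM (the ~5 bits per weight is my assumption for a typical 4-bit quant plus overhead; the KV cache comes on top of this):

```shell
# Rough RAM estimate for a 4B-parameter model
PARAMS_B=4          # billions of parameters
BITS_PER_WEIGHT=5   # assumed: ~4-bit quantization plus overhead
echo "$(( PARAMS_B * BITS_PER_WEIGHT * 1000 / 8 )) MB for weights (plus KV cache)"
# prints: 2500 MB for weights (plus KV cache)
```

So a single 4b model should fit comfortably into my 9-10 GB of free RAM — unless several models are resident at once, in which case that math changes quickly.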
What could possibly be causing this, and how do I best address it?
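My guess is that the box is either swapping or running everything on CPU with several models resident at once. I assume the first checks from an SSH session on the umbrelOS host would look something like this (the container name at the end is a placeholder — `docker ps` should show the real one):

```shell
# Is the host swapping? Heavy swap use can turn seconds-per-token into minutes-per-token.
free -m | awk '/^Mem:/ {print "RAM used/total MB:", $3, "/", $2} /^Swap:/ {print "Swap used MB:", $3}'
# How many cores does CPU-only inference get?
nproc
# Which models are actually resident right now? (run against the app container;
# the container name is an assumption -- check `docker ps` for the real one)
#   docker exec <ollama-container> ollama ps
```

Is this the right direction, or is there an Umbrel-specific place to look?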
Context: I have installed a few (4-5) models from the Open WebUI interface. I don’t know whether they are all in memory at once, or even whether they are all available. I have not found any way to modify settings for the Ollama app: where do I configure it, set memory limits, load or unload models, or test that it works? Can someone share their experience running, configuring, and troubleshooting the Ollama app on Umbrel?
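From what I’ve read, Ollama itself is configured through environment variables rather than a config file, so on Umbrel I assume that means editing the app’s docker-compose file. The path and service name below are my guesses (and I expect such edits may be overwritten on app updates), but the environment variables themselves are standard Ollama ones:

```yaml
# Assumed location: ~/umbrel/app-data/ollama/docker-compose.yml (path may differ)
# Service name "server" is a guess -- check the actual compose file.
services:
  server:
    environment:
      OLLAMA_KEEP_ALIVE: "5m"        # unload idle models after 5 minutes
      OLLAMA_MAX_LOADED_MODELS: "1"  # keep at most one model in RAM
      OLLAMA_NUM_PARALLEL: "1"       # handle one request at a time
```

If someone knows whether Umbrel exposes these settings in a supported way, that would be even better.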