AI on umbrel - ollama SLOW performance and troubleshooting


I am running a recent (2/2026) Umbrel (1.5?) on a mini PC with 16 GB of memory. I have a basic BTC node and a few supporting apps (LNGg, mempool, …), but it has hardly ever used more than 6 GB of memory.

I have installed the Ollama and Open WebUI apps. I am trying to learn about them, tweak models, and host them locally so my questions aren't free for all to see.

In theory, this should work excellently: with 1 TB of free SSD space on a 2 TB disk and 9-10 GB of free RAM, it should be a sweet spot.

But it's painfully slow. We are talking 34 minutes of thinking on a trivial test question such as "tell me a fun fact about the Roman Empire" with qwen3.5:4b, with CPU load at 80-100%.
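To put a number on "slow", the last chunk of an Ollama `/api/generate` response includes `eval_count` (tokens generated) and `eval_duration` (nanoseconds), from which tokens per second can be computed. A minimal sketch with a made-up sample response instead of a live call (on a real node you would POST to the Ollama API, typically on port 11434):

```python
# Sketch: compute tokens/sec from the stats Ollama reports at the end of a
# /api/generate call. The numbers below are invented for illustration only.
sample_response = {
    "model": "qwen3.5:4b",            # model tag from my setup
    "eval_count": 512,                # tokens generated
    "eval_duration": 120_000_000_000, # time spent generating, in nanoseconds
}

def tokens_per_second(resp: dict) -> float:
    """Ollama reports eval_duration in nanoseconds, so convert to seconds."""
    return resp["eval_count"] / (resp["eval_duration"] / 1e9)

print(f"{tokens_per_second(sample_response):.2f} tok/s")  # ~4.27 on this sample
```

A 4B model on a CPU-only mini PC should still manage a few tokens per second; well under 1 tok/s would point at swapping or misconfiguration rather than just CPU limits.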

What could possibly be causing this, and how do I best address it?

Context: I have installed a few (4-5) models from the Open WebUI interface. I don't know if they are all in memory at once, or even whether they are all available. I have not seen any way to modify settings for the Ollama app: where do I configure it, set memory limits, load models, or test its functionality? Can someone share their experience with running, configuring, and troubleshooting the Ollama app on Umbrel?
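For what it's worth, from the Ollama docs I gather it is normally inspected and tuned roughly like this; whether and how this applies inside the Umbrel app (presumably a Docker container) is exactly what I'm unsure about. A sketch, assuming shell access to wherever Ollama actually runs:

```shell
# Show which models are currently loaded into memory (and for how long):
ollama ps

# List all installed models and their sizes on disk:
ollama list

# Environment variables Ollama reads at startup (names from the Ollama FAQ;
# how to set them for the Umbrel app is part of my question):
export OLLAMA_MAX_LOADED_MODELS=1   # keep only one model resident in RAM
export OLLAMA_NUM_PARALLEL=1        # serve one request at a time
export OLLAMA_KEEP_ALIVE=5m        # unload an idle model after 5 minutes
```

If several 4-5 GB models were kept loaded at once on 9-10 GB of free RAM, the machine could be swapping, which would explain CPU pegged at 80-100% for half an hour.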