I’m considering a hardware upgrade for my Umbrel setup, as it currently feels somewhat slow, especially when processing Lightning transactions and when starting LND after a reboot (even though I’m using a quad-core Intel Core i7-3720QM, 16GB of DDR3 memory, and a 2TB Samsung SATA SSD). I hope an upgrade would improve this.
However, I’m curious to know which component I should prioritize. Which part is the most critical for a smooth experience with UmbrelOS, and especially with LND? Is it the RAM, the CPU, or the SSD? Which component is worth investing more in?
I’m currently considering a setup with an AMD Ryzen 7 3700X, 32GB of DDR4 memory, and the same 2TB SSD. What are your thoughts on this setup? Or would it be worth going further and investing in DDR5 RAM? I’m also wondering how much an upgrade to an NVMe SSD would help (not sure about the migration process though).
First off, I’d say your current hardware is more than sufficient on CPU and RAM. The highest-performing nodes get along fine with hardware equivalent to yours, even with hundreds of channels.
The bottleneck for LND, particularly at reboot, is channel.db handling. It’s bolt-based (bbolt) and:
- needs to be loaded into RAM on every restart
- can’t do on-the-fly compaction (the way Postgres can), so a restart that includes compacting takes time
A faster CPU and more RAM won’t accelerate this process, which you can observe with htop or other system monitoring software during a reboot. The likely bottleneck instead is the I/O operations of your SSD.
So an über-fast NVMe will likely improve things, since its throughput is better than a SATA SSD’s. And LND does roughly 10–50x more reads than writes in normal running mode, which an NVMe also handles faster.
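If you want to verify this before buying anything, watching disk I/O while LND opens its database is more telling than CPU load. A minimal sketch, assuming the sysstat and fio packages are installed (nothing here is specific to Umbrel):

```bash
# Watch per-device I/O while LND is (re)starting: if the SSD sits near 100% %util
# while the CPU cores are mostly idle, the disk is the bottleneck.
iostat -x 2

# Optionally measure random-read performance directly. Run this on the data disk
# you want to test; fio creates a 1 GiB test file in the current directory.
fio --name=randread --rw=randread --bs=4k --size=1G --runtime=30 --group_reporting
```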
But with bolt you will always have a longer restart, and you need to budget extra downtime for compacting. Mitigation: restart as rarely as possible. I compact roughly once per quarter and don’t restart much more often than that.
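If you’d rather not think about compacting manually, LND can also compact the bolt database automatically on startup once it’s older than a configurable age. A sketch of the relevant lnd.conf options (the values are just examples, not recommendations):

```
[bolt]
; Compact channel.db on startup, but only if the last compaction is older than min-age.
db.bolt.auto-compact=true
; ~90 days, roughly the "once per quarter" cadence mentioned above.
db.bolt.auto-compact-min-age=2160h
```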
To speed up a restart without compacting, try the following in another terminal window while LND is restarting:
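A minimal sketch of that trick, pre-warming the page cache with cat (the paths are assumptions for a stock LND install; on Umbrel the files live under the app’s data directory, so adjust them):

```bash
# Read the databases once so the kernel keeps them in the page cache (RAM);
# LND's subsequent open then hits RAM instead of the disk.
cat ~/.lnd/data/graph/mainnet/channel.db > /dev/null
cat ~/.lnd/data/chain/bitcoin/mainnet/sphinxreplay.db > /dev/null
```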
([INF] LTND: Database(s) now open (time_to_open=5m3.790720607s))
What cat does is load channel.db into RAM (the page cache), which is usually still a lot faster than reading it from even an NVMe.
For migration, that’s something that needs careful planning. But the gist is: stop everything and disable LND auto-restart => clone the SSD to the NVMe while offline => come back online and double-check everything is in place before starting LND.
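As a rough sketch of the “stop everything” part, assuming a bare-metal node where bitcoind and LND run as systemd services (Umbrel manages its services differently, so adapt accordingly):

```bash
# Stop LND first so channel.db is left in a consistent state, then stop bitcoind.
sudo systemctl stop lnd.service
sudo systemctl stop bitcoind.service
# Make sure LND cannot come back up on its own during the migration.
sudo systemctl disable lnd.service
# ...clone the SSD to the NVMe offline, verify, then re-enable and start LND:
# sudo systemctl enable --now lnd.service
```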
OK, so I am buying a new M.2 NVMe SSD.
I will also try that trickery with loading channel.db and sphinxreplay.db into RAM.
Do you maybe know of any good guide for the migration to the new SSD? I really don’t want to mess this up; a force-close with penalty is the last thing I’d wish for.
I am also considering running umbrelOS in Proxmox on a ZFS mirror. Is that a good idea?
(Proxmox running on a small SATA SSD + the Umbrel VM running on a 2x2TB NVMe ZFS mirror)
Yeah, I understand that this is a goosebumps kind of process. I did mine after meticulous preparation, though with probably much more change involved, moving from SSD + small NVMe to SSD + large NVMe with LVM & RAID.
FWIW my current setup looks like this:
NVMe and SSD:
- /boot: 1GB, type “primary,” mount point /boot (to be replicated)
- /: 100GB, type “primary,” mount point /
- /home: 120GB, type “primary,” mount point /home
- /data/lnd: 120GB, type “primary,” mount point /data/lnd (to be replicated)

SSD only:
- /data/bitcoin: 1.5TB, type “primary,” no mount point (not replicated)

SSD only (for swap):
- /swap: 20GB, type “primary,” no mount point (not replicated)
I did all this in no rush because I was able to move my main node to other hardware until everything was set and ready.
If you just want to swap the SSD for the NVMe, I’d probably do the following:
- clone the SSD to the NVMe (e.g. with Clonezilla)
- if the target is larger, either resize while cloning or afterwards with gparted
- shut down after successful completion, take the SSD out, boot from the NVMe
- check that everything starts properly (bitcoind) and that the sizes are right (lsblk and df -h), and only then start LND, either manually or via sudo systemctl restart lnd.service, while watching its logs in a second terminal (see the sketch below)
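Roughly what those final checks could look like on a systemd-based node (the service names are assumptions; adjust to your setup):

```bash
lsblk                                  # does the NVMe show up with the expected partitions?
df -h                                  # are the filesystems mounted with the expected sizes?
systemctl status bitcoind.service      # is bitcoind up and syncing?
# Only then start LND, watching its log in a second terminal:
sudo systemctl restart lnd.service
sudo journalctl -fu lnd.service
```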
That’s the best I can provide as a “guide,” but as far as I know plenty of folks have followed the Clonezilla approach.
I’m not familiar with Proxmox or ZFS, but I know a few people running their nodes this way. ZFS is not for the faint-hearted; only go down this road if you know what you’re doing. At least two node runners I know toasted their nodes due to troubleshooting mishaps with ZFS.
I can only speak from my own experience: my board supports RAID1 too, but I went for LVM + mdadm software RAID, since it’s more flexible to manage. If you care, I can share my setup notes via DM.
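In case it helps, here’s a rough sketch of what an mdadm RAID1 + LVM setup can look like (device names, volume names and sizes are made up; adapt them to your disks and partition layout):

```bash
# Mirror two NVMe partitions with mdadm, then put LVM on top of the mirror.
sudo mdadm --create /dev/md0 --level=1 --raid-devices=2 /dev/nvme0n1p1 /dev/nvme1n1p1
sudo pvcreate /dev/md0
sudo vgcreate vg_node /dev/md0
sudo lvcreate -L 120G -n lnd vg_node     # e.g. a dedicated volume for /data/lnd
sudo mkfs.ext4 /dev/vg_node/lnd
# Persist the array so it assembles automatically at boot (Debian/Ubuntu paths):
sudo mdadm --detail --scan | sudo tee -a /etc/mdadm/mdadm.conf
sudo update-initramfs -u
```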