LND randomly becomes unavailable (RPC services)

Hi, I am running LND 0.18.0 on umbrelOS 1.2.0.
The issue is that LND randomly goes offline and stays offline until I restart it.
When I troubleshoot, I see “RPC services not available”.

More specifically I get many of these errors (port numbers change):

lightning_lnd_1        | 2024/07/01 09:48:09 http: TLS handshake error from 10.21.0.37:xxxxx: EOF
lightning_lnd_1        | 2024-07-01 09:48:10.176 [ERR] RPCS: [/lnrpc.Lightning/GetInfo]: waiting to start, RPC services not available

Any idea how to get rid of this? This is really annoying.


Hey @Django

This relates to a problem with the secure connection between your Lightning node and the IP address shown in the log. It could be a problem with one of the certificates, a mismatched TLS version between the two ends, or possibly just corrupted data.

As for the Lightning Node going offline, it could be a networking issue or perhaps a problem with the configuration.

Hey, thanks for the reply.

I am running a second instance of LND on a different device but on the same network.
Could that be causing problems?

Possibly. Try isolating the Umbrel from the network or putting it on a different subnet if you can, or stop the second LND instance if you're able to, and see if that helps.

I'm personally not very familiar with Lightning, or Bitcoin for that matter, so I'm limited in the help I can give. I do recommend having a look at the Umbrel troubleshooting guide if you haven't already:

(Official Umbrel Troubleshooting Guide and FAQ)

Additionally, you might find this GitHub issue helpful in resolving the problem.

(RPC services not available · Issue #267 · lightninglabs/lightning-terminal · GitHub)


Mine started doing this today. I’m on umbrelOS 1.1 but haven’t upgraded to LND 0.18.0.

I can see this error in the log:

22:39:06.615645657      30 ssl_transport_security.cc:1245] Handshake failed with fatal error SSL_ERROR_SSL: error:1416F086:SSL routines:tls_process_server_certificate:certificate verify failed.
lightning_app_1        | Waiting for LND...

Nothing on my network has changed nor do I have issues with any of my other devices. Bitcoin is synced. I even restarted (probably not a good idea) without any luck.

This is not related to RPC services.

I think you could maybe try to solve this by deleting the certificate and restarting LND. I’ve come across this advice on other forums, but I’ve never tried it myself. I would definitely back up the certificate file though.

It looks like I was actually able to solve this; it seems to have been a performance issue. I turned on Database Sync in LND's configuration (sync-freelist = true) and the problem no longer happens. I also turned on Delete Canceled Invoices in Real-time (gc-canceled-invoices-on-the-fly = true).
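
For anyone who prefers a config file over the UI toggles, the two settings would look roughly like this in an lnd.conf. This is only a sketch: the option names are the ones mentioned above, but the [Application Options] section header is an assumption based on LND's sample config, so check it against the sample config for your LND version.

```ini
[Application Options]
; "Database Sync" in the UI: sync the database freelist to disk
sync-freelist=true
; "Delete Canceled Invoices in Real-time" in the UI
gc-canceled-invoices-on-the-fly=true
```

Note that in a config file the options are written as name=value, without the leading dashes used on the command line.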

Ah OK. That is what RTL and the logs were showing as well. Anyway, it resolved itself while I was asleep and all is well again!


May I ask how you did that? The file umbrel-lnd.conf is overwritten once I restart lnd, so I can’t make any changes to it that will persist. I have the same problem as the OP with lnd not starting, hoping this solution will fix my problem…

You can either change this in the web UI or create a new file called lnd.conf in the same directory as umbrel-lnd.conf.

I tried the file method but it doesn’t work. I’m trying to add this configuration:

--db.batch-commit-interval=1m

I also tried giving the lnd.conf file full read-write permissions. Still no dice.