Blockchain download issues on 0.5.0

Hello all!

I’m pretty new to the Umbrel community but not new to *nix, the command line, or Docker, so I know enough to break anything. :laughing:

I’ve installed Umbrel 0.5.0 on an Ubuntu 20.04 VM in ESXi. My ESXi server has ~4TB of SSD (not HDD) storage and plenty of CPU and memory headroom. I’ve been playing with Umbrel for a few days trying to get the blockchain to sync, and I’m currently on my third build from scratch to troubleshoot my node. On my first build I went through all the troubleshooting steps, like resetting user data, the bad-shutdown steps, etc. None of that worked. On my third build I’ve been extremely careful to run the node very slim and not “play” with it while it’s syncing. It only has the “Bitcoin Node” app installed. The problem is the same on all three build attempts, so at least I’ve been able to pinpoint the problem area.

When the node is first installed everything works great, and it syncs to ~40%-60% in the first 48 hours before it fails. Again, this has been verified three times. On failure, the bitcoin_bitcoind_1 container goes into a reboot loop. Once it fails, waiting 4+ days shows no change in sync progress, used drive space, or the block number being synced. This is due to corrupted blockchain data, but I’m not sure why it keeps getting corrupted. Removing ~/umbrel/app-data/bitcoin/data/bitcoin/* (after shutting down the containers) and restarting the containers gets the bitcoin container working again, and the sync starts over from scratch. Any thoughts on how to fix a corrupted blockchain without having to restart the sync from scratch?
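For anyone trying to confirm the same reboot-loop symptom, here is a minimal Python sketch that reads a NAMES/STATUS table (like the "Docker containers" section of the debug output) and flags containers that came up very recently while the rest of the stack has been up for a long time. The function names and both thresholds are my own assumptions, not part of Umbrel or Docker, and a single snapshot only shows a recent restart, so it is worth running a few times in a row.

```python
import re

# Seconds per unit for the human-readable durations in the STATUS column.
UNIT_SECONDS = {
    "second": 1, "minute": 60, "hour": 3600,
    "day": 86400, "week": 604800, "month": 2592000,
}

# Matches rows such as "bitcoin_bitcoind_1 Up 41 seconds" or "tor Up About an hour".
STATUS_RE = re.compile(
    r"^(?P<name>\S+)\s+Up\s+(?P<count>\d+|About an?|Less than a)\s+"
    r"(?P<unit>second|minute|hour|day|week|month)s?\b"
)

def parse_uptimes(table: str) -> dict[str, int]:
    """Map container name -> approximate uptime in seconds."""
    uptimes = {}
    for line in table.splitlines():
        m = STATUS_RE.match(line.strip())
        if not m:
            continue  # header row, or a status that is not "Up ..."
        count = int(m["count"]) if m["count"].isdigit() else 1
        uptimes[m["name"]] = count * UNIT_SECONDS[m["unit"]]
    return uptimes

def likely_restart_looping(table: str, max_uptime: int = 300,
                           min_peer_uptime: int = 3600) -> list[str]:
    """Names of containers that are very young while some peer is long-running.

    Both thresholds (in seconds) are arbitrary illustration values.
    """
    uptimes = parse_uptimes(table)
    if not any(s >= min_peer_uptime for s in uptimes.values()):
        return []  # the whole stack is fresh, e.g. right after a host reboot
    return sorted(name for name, s in uptimes.items() if s <= max_uptime)

if __name__ == "__main__":
    sample = """NAMES STATUS
manager Up 7 days
bitcoin_server_1 Up 7 days
bitcoin_bitcoind_1 Up 41 seconds
nginx Up 7 days"""
    print(likely_restart_looping(sample))
```

Run against the container list in the debug output, this singles out bitcoin_bitcoind_1, the only container measured in seconds while everything else has been up for days.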

=====================
= Umbrel debug info =
=====================

Umbrel version
--------------
0.5.0

Memory usage
------------

              total        used        free      shared  buff/cache   available
Mem:           8.0G        874M        184M        1.0M        6.9G        6.8G
Swap:          4.1G        225M        3.9G

total: 11.0%
bitcoin: 8.4%
system: 2%
tor: 0.6%
lnd: 0%
electrs: 0%
bitcoin: 0%

Memory monitor logs
-------------------
2022-06-10 14:26:21 Memory monitor running!

12101 tty1 S+ 0:00 bash ./scripts/memory-monitor
Memory monitor is already running
2022-06-10 14:57:28 Memory monitor running!

Filesystem information
----------------------
Filesystem Size Used Avail Use% Mounted on
/dev/mapper/ubuntu--vg-ubuntu--lv 785G 225G 528G 30% /
/dev/mapper/ubuntu--vg-ubuntu--lv 785G 225G 528G 30% /

Karen logs
----------

{"message":"Successfully uploaded backup 1655480293845.tar.gz.pgp for backup ID 40015d557ab9dc08a2ced85c945c14e5d3b668265e3227d0e37e0913b6de72c1"}
=============================
====== Backup success =======
=============================
Got signal: backup
karen is getting triggered!
Deriving keys…
Creating backup…
Adding random padding…
1+0 records in
1+0 records out
7571 bytes (7.6 kB, 7.4 KiB) copied, 0.000260492 s, 29.1 MB/s
Creating encrypted tarball…
backup/
backup/.padding
Uploading backup…

  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0
  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0
100  8077    0     0  100  8077      0   5141  0:00:01  0:00:01 --:--:--  5138
100  8223  100   146  100  8077     72   4002  0:00:02  0:00:02 --:--:--  4074
{"message":"Successfully uploaded backup 1655485669077.tar.gz.pgp for backup ID 40015d557ab9dc08a2ced85c945c14e5d3b668265e3227d0e37e0913b6de72c1"}
=============================
====== Backup success =======
=============================
Got signal: change-password
karen is getting triggered!
This script must only be run on Umbrel OS
Got signal: backup
karen is getting triggered!
Deriving keys…
Creating backup…
Adding random padding…
1+0 records in
1+0 records out
6699 bytes (6.7 kB, 6.5 KiB) copied, 0.000246273 s, 27.2 MB/s
Creating encrypted tarball…
backup/
backup/.padding
Uploading backup…
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0
  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0
  0     0    0     0    0     0      0      0 --:--:--  0:00:01 --:--:--     0
100  7216    0     0  100  7216      0   2910  0:00:02  0:00:02 --:--:--  2909
100  7362  100   146  100  7216     52   2618  0:00:02  0:00:02 --:--:--  2670
{"message":"Successfully uploaded backup 1655496190528.tar.gz.pgp for backup ID 40015d557ab9dc08a2ced85c945c14e5d3b668265e3227d0e37e0913b6de72c1"}
=============================
====== Backup success =======
=============================
Got signal: change-password
karen is getting triggered!
This script must only be run on Umbrel OS
Got signal: debug
karen is getting triggered!

Docker containers
-----------------
NAMES                  STATUS
manager                Up 7 days
dashboard              Up 7 days
tor                    Up 7 days
auth                   Up 7 days
bitcoin_server_1       Up 7 days
bitcoin_tor_server_1   Up 7 days
bitcoin_app_proxy_1    Up 7 days
bitcoin_bitcoind_1     Up 41 seconds
middleware             Up 7 days
nginx                  Up 7 days

Umbrel logs
-----------

Attaching to manager
manager | ::ffff:10.21.21.2 - - [Fri, 17 Jun 2022 21:12:42 GMT] "GET /v1/apps?installed=1 HTTP/1.0" 304 - "-" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/15.5 Safari/605.1.15"
manager |
manager | umbrel-manager
manager | ::ffff:10.21.21.2 - - [Fri, 17 Jun 2022 21:12:42 GMT] "GET /v1/apps HTTP/1.0" 304 - "-" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/15.5 Safari/605.1.15"
manager |
manager | umbrel-manager
manager | ::ffff:10.21.21.2 - - [Fri, 17 Jun 2022 21:12:42 GMT] "GET /v1/system/get-update HTTP/1.0" 304 - "-" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/15.5 Safari/605.1.15"
manager |
manager | umbrel-manager
manager | ::ffff:10.21.0.2 - - [Fri, 17 Jun 2022 21:12:42 GMT] "GET /v1/account/token?token=6c433e9465b385323a1cb7eeca17b61bb47f34897552d0be64bd5f80f127eb08 HTTP/1.1" 200 16 "-" "app-proxy/0.0.1"
manager |
manager | umbrel-manager
manager | ::ffff:10.21.0.2 - - [Fri, 17 Jun 2022 21:12:43 GMT] "GET /v1/account/token?token=6c433e9465b385323a1cb7eeca17b61bb47f34897552d0be64bd5f80f127eb08 HTTP/1.1" 200 16 "-" "app-proxy/0.0.1"
manager |
manager | umbrel-manager
manager | ::ffff:10.21.21.2 - - [Fri, 17 Jun 2022 21:12:45 GMT] "POST /v1/system/debug HTTP/1.0" 200 17 "-" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/15.5 Safari/605.1.15"
manager |
manager | umbrel-manager
manager | ::ffff:10.21.0.2 - - [Fri, 17 Jun 2022 21:12:45 GMT] "GET /v1/account/token?token=6c433e9465b385323a1cb7eeca17b61bb47f34897552d0be64bd5f80f127eb08 HTTP/1.1" 200 16 "-" "app-proxy/0.0.1"
manager |
manager | umbrel-manager
manager | ::ffff:10.21.21.2 - - [Fri, 17 Jun 2022 21:12:46 GMT] "GET /v1/system/debug-result HTTP/1.0" 200 23 "-" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/15.5 Safari/605.1.15"
manager |
manager | umbrel-manager
manager | ::ffff:10.21.21.2 - - [Fri, 17 Jun 2022 21:12:47 GMT] "GET /v1/system/debug-result HTTP/1.0" 304 - "-" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/15.5 Safari/605.1.15"
manager |
manager | umbrel-manager
manager | ::ffff:10.21.21.2 - - [Fri, 17 Jun 2022 21:12:48 GMT] "GET /v1/system/debug-result HTTP/1.0" 304 - "-" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/15.5 Safari/605.1.15"
manager |
manager | umbrel-manager

Tor logs
--------

Attaching to tor
tor | Jun 17 20:57:14.000 [notice] Heartbeat: Tor's uptime is 7 days 6:00 hours, with 89 circuits open. I've sent 8.20 GB and received 224.48 GB. I've received 42284 connections on IPv4 and 0 on IPv6. I've made 1793 connections with IPv4 and 0 with IPv6.
tor | Jun 17 20:57:14.000 [notice] While bootstrapping, fetched this many bytes: 640998 (consensus network-status fetch); 14097 (authority cert fetch); 4995457 (microdescriptor fetch)
tor | Jun 17 20:57:14.000 [notice] While not bootstrapping, fetched this many bytes: 7333727 (consensus network-status fetch); 1775 (authority cert fetch); 6280288 (microdescriptor fetch)
tor | Jun 17 20:57:14.000 [notice] Average packaged cell fullness: 28.875%. TLS write overhead: 3%
tor | Jun 17 21:00:47.000 [notice] Have tried resolving or connecting to address '[scrubbed]' at 3 different places. Giving up.
tor | Jun 17 21:07:47.000 [notice] Have tried resolving or connecting to address '[scrubbed]' at 3 different places. Giving up.
tor | Jun 17 21:11:06.000 [notice] Have tried resolving or connecting to address '[scrubbed]' at 3 different places. Giving up.
tor | Jun 17 21:11:33.000 [notice] Have tried resolving or connecting to address '[scrubbed]' at 3 different places. Giving up.
tor | Jun 17 21:12:47.000 [notice] Have tried resolving or connecting to address '[scrubbed]' at 3 different places. Giving up.

App logs
--------

bitcoin

tor_server_1 | Jun 17 03:10:58.000 [notice] No circuits are opened. Relaxed timeout for circuit 27666 (a Hidden service: Uploading HS descriptor 4-hop circuit in state doing handshakes with channel state open) to 60000ms. However, it appears the circuit has timed out anyway. [1 similar message(s) suppressed in last 6900 seconds]
tor_server_1 | Jun 17 04:14:57.000 [notice] Your network connection speed appears to have changed. Resetting timeout to 60000ms after 18 timeouts and 1000 buildtimes.
tor_server_1 | Jun 17 04:16:41.000 [notice] No circuits are opened. Relaxed timeout for circuit 27772 (a Hidden service: Uploading HS descriptor 4-hop circuit in state doing handshakes with channel state open) to 60000ms. However, it appears the circuit has timed out anyway. [1 similar message(s) suppressed in last 3960 seconds]
tor_server_1 | Jun 17 05:31:58.000 [notice] No circuits are opened. Relaxed timeout for circuit 27891 (a Hidden service: Uploading HS descriptor 4-hop circuit in state doing handshakes with channel state open) to 60000ms. However, it appears the circuit has timed out anyway. [1 similar message(s) suppressed in last 4560 seconds]
tor_server_1 | Jun 17 06:01:42.000 [warn] Failed to find node for hop #1 of our path. Discarding this circuit.
tor_server_1 | Jun 17 06:01:42.000 [notice] Our circuit 0 (id: 28103) died due to an invalid selected path, purpose Hidden service: Pre-built vanguard circuit. This may be a torrc configuration issue, or a bug.
tor_server_1 | Jun 17 06:01:49.000 [notice] Your network connection speed appears to have changed. Resetting timeout to 60000ms after 18 timeouts and 273 buildtimes.
tor_server_1 | Jun 17 07:55:17.000 [notice] Your network connection speed appears to have changed. Resetting timeout to 60000ms after 18 timeouts and 267 buildtimes.
tor_server_1 | Jun 17 08:36:55.000 [notice] No circuits are opened. Relaxed timeout for circuit 28486 (a Hidden service: Establishing introduction point 4-hop circuit in state doing handshakes with channel state open) to 60000ms. However, it appears the circuit has timed out anyway. [1 similar message(s) suppressed in last 11100 seconds]
tor_server_1 | Jun 17 08:57:19.000 [notice] Heartbeat: Tor's uptime is 6 days 18:00 hours, with 29 circuits open. I've sent 286.31 MB and received 130.40 MB. I've received 0 connections on IPv4 and 0 on IPv6. I've made 656 connections with IPv4 and 0 with IPv6.
tor_server_1 | Jun 17 08:57:19.000 [notice] While bootstrapping, fetched this many bytes: 640998 (consensus network-status fetch); 14097 (authority cert fetch); 4994624 (microdescriptor fetch)
tor_server_1 | Jun 17 08:57:19.000 [notice] While not bootstrapping, fetched this many bytes: 4674176 (consensus network-status fetch); 1775 (authority cert fetch); 6066406 (microdescriptor fetch)
tor_server_1 | Jun 17 10:14:01.000 [notice] No circuits are opened. Relaxed timeout for circuit 28678 (a Hidden service: Uploading HS descriptor 4-hop circuit in state doing handshakes with channel state open) to 60000ms. However, it appears the circuit has timed out anyway. [7 similar message(s) suppressed in last 5880 seconds]
tor_server_1 | Jun 17 13:12:54.000 [notice] Your network connection speed appears to have changed. Resetting timeout to 60000ms after 18 timeouts and 531 buildtimes.
tor_server_1 | Jun 17 14:19:14.000 [notice] No circuits are opened. Relaxed timeout for circuit 29071 (a Measuring circuit timeout 4-hop circuit in state doing handshakes with channel state open) to 60000ms. However, it appears the circuit has timed out anyway. [3 similar message(s) suppressed in last 14760 seconds]
tor_server_1 | Jun 17 14:57:19.000 [notice] Heartbeat: Tor's uptime is 7 days 0:00 hours, with 29 circuits open. I've sent 293.16 MB and received 133.33 MB. I've received 0 connections on IPv4 and 0 on IPv6. I've made 662 connections with IPv4 and 0 with IPv6.
tor_server_1 | Jun 17 14:57:19.000 [notice] While bootstrapping, fetched this many bytes: 640998 (consensus network-status fetch); 14097 (authority cert fetch); 4994624 (microdescriptor fetch)
tor_server_1 | Jun 17 14:57:19.000 [notice] While not bootstrapping, fetched this many bytes: 4853668 (consensus network-status fetch); 1775 (authority cert fetch); 6159063 (microdescriptor fetch)
tor_server_1 | Jun 17 15:25:53.000 [notice] Your network connection speed appears to have changed. Resetting timeout to 60000ms after 18 timeouts and 294 buildtimes.
tor_server_1 | Jun 17 16:35:12.000 [notice] No circuits are opened. Relaxed timeout for circuit 29371 (a Hidden service: Establishing introduction point 4-hop circuit in state doing handshakes with channel state open) to 75393ms. However, it appears the circuit has timed out anyway. [4 similar message(s) suppressed in last 7440 seconds]
tor_server_1 | Jun 17 19:00:48.000 [notice] No circuits are opened. Relaxed timeout for circuit 29558 (a Hidden service: Establishing introduction point 4-hop circuit in state doing handshakes with channel state open) to 60000ms. However, it appears the circuit has timed out anyway. [1 similar message(s) suppressed in last 8760 seconds]
tor_server_1 | Jun 17 19:02:00.000 [notice] Your network connection speed appears to have changed. Resetting timeout to 60000ms after 18 timeouts and 360 buildtimes.
tor_server_1 | Jun 17 20:24:31.000 [notice] No circuits are opened. Relaxed timeout for circuit 29791 (a Hidden service: Establishing introduction point 4-hop circuit in state doing handshakes with channel state open) to 91896ms. However, it appears the circuit has timed out anyway.
tor_server_1 | Jun 17 20:57:19.000 [notice] Heartbeat: Tor's uptime is 7 days 6:00 hours, with 21 circuits open. I've sent 299.89 MB and received 136.34 MB. I've received 0 connections on IPv4 and 0 on IPv6. I've made 670 connections with IPv4 and 0 with IPv6.
tor_server_1 | Jun 17 20:57:19.000 [notice] While bootstrapping, fetched this many bytes: 640998 (consensus network-status fetch); 14097 (authority cert fetch); 4994624 (microdescriptor fetch)
tor_server_1 | Jun 17 20:57:19.000 [notice] While not bootstrapping, fetched this many bytes: 5007684 (consensus network-status fetch); 1775 (authority cert fetch); 6287173 (microdescriptor fetch)
================
==== Result ====
================
The debug script did not automatically detect any issues with your Umbrel.

If it helps, here is the Docker container log for bitcoin_bitcoind_1:

2022-06-18T18:03:17Z Fatal LevelDB error: Corruption: not an sstable (bad magic number)
2022-06-18T18:03:17Z You can use -debug=leveldb to get more complete diagnostic messages
2022-06-18T18:03:17Z


EXCEPTION: 15dbwrapper_error
Fatal LevelDB error: Corruption: not an sstable (bad magic number)
bitcoin in scheduler
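
A "bad magic number" on an sstable means LevelDB opened a table file whose contents on disk are not what it expects. If you want to see how often and in what form the error recurs across restarts, a small Python sketch like this can summarize the fatal LevelDB lines from a saved container log. The function name and the idea of saving the log to a file first are assumptions for illustration; the only thing taken from the log itself is the "Fatal LevelDB error: <kind>: <detail>" line shape shown above.

```python
import re
from collections import Counter

# Matches lines shaped like the ones in the bitcoind log above:
#   "... Fatal LevelDB error: Corruption: not an sstable (bad magic number)"
LEVELDB_FATAL_RE = re.compile(
    r"Fatal LevelDB error: (?P<kind>[^:]+): (?P<detail>.+)$"
)

def summarize_leveldb_errors(log_text: str) -> Counter:
    """Count fatal LevelDB errors, keyed by (kind, detail)."""
    counts = Counter()
    for line in log_text.splitlines():
        m = LEVELDB_FATAL_RE.search(line)
        if m:
            counts[(m["kind"].strip(), m["detail"].strip())] += 1
    return counts

if __name__ == "__main__":
    import sys
    # Usage (hypothetical file name): python leveldb_errors.py bitcoind.log
    with open(sys.argv[1], encoding="utf-8", errors="replace") as fh:
        for (kind, detail), n in summarize_leveldb_errors(fh.read()).items():
            print(f"{n:4d}  {kind}: {detail}")
```

If the same corruption message shows up again after each fresh sync, that points away from a one-off bad shutdown and toward something underneath the node.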

Just an update on my own post: my issue ended up being a bad drive. I started seeing errors directly from the drive and the filesystem. I swapped out the drive and all the issues went away. I even ran through the whole process twice as a test.
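
For anyone who lands here with the same symptoms, a rough Python sketch for spotting drive or filesystem trouble in saved kernel messages (for example, the output of `dmesg` or `journalctl -k` redirected to a file) could look like this. The pattern list is an assumption: these are substrings that commonly appear in Linux kernel messages for failing disks, not the exact errors from this system, so adjust it to whatever your host actually logs.

```python
import re

# Hypothetical pattern list: substrings that often appear in Linux kernel
# messages when a disk or filesystem is in trouble. Not exhaustive.
SUSPECT_PATTERNS = [
    re.compile(r"I/O error", re.IGNORECASE),
    re.compile(r"EXT4-fs error"),
    re.compile(r"Medium Error"),
]

def find_storage_errors(kernel_log: str) -> list[str]:
    """Return the kernel log lines that match any suspect pattern."""
    return [
        line
        for line in kernel_log.splitlines()
        if any(p.search(line) for p in SUSPECT_PATTERNS)
    ]

if __name__ == "__main__":
    import sys
    # Usage (hypothetical file name): python storage_errors.py kernel.log
    with open(sys.argv[1], encoding="utf-8", errors="replace") as fh:
        for hit in find_storage_errors(fh.read()):
            print(hit)
```

An empty result does not prove the drive is healthy; it only means none of the listed patterns appeared in that log.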

The new system now runs from an iSCSI RAID10 VM and an SMB share. :smiley: