Stuck verifying backend data (at start of backup)

Hi,
Duplicati has been backing up my files twice a week. Last week I noticed that the last successful backup was on 2023-06-19 (it took 40 minutes). I tried starting the backup manually, rebooted my system (Windows 8.1 Pro) a few times, updated to 2.0.7.1_beta_2023-05-25, waited many hours, … No success.
About 5 minutes after starting a backup, the UI shows that it is verifying backend data, and this message never disappears…
The verbose live log in the UI shows the following entries (it's been 15 hours since the last log message, and it still says it is verifying backend data):

    1. Aug. 2023 19:27: Backend event: List - Started: ()
    1. Aug. 2023 19:21: The operation Backup was started
    1. Aug. 2023 19:20: C:\ProgramData\Duplicati\updates\2.0.7.1\win-tools\gpg.exe
    1. Aug. 2023 19:19: Server started and listening on 127.0.0.1, port 8200

My backup destination is FTP. I tried accessing the FTP server manually and noticed that listing all files in the folder takes around 30-60 seconds, as there are 45,000 .zip.aes files. Maybe that's enough time to run into issues?
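For reference, a quick way to reproduce that timing outside Duplicati is a few lines of Python with the standard ftplib module (the host, login, and folder below are placeholders for my setup):

    # Time a plain directory listing over FTP with explicit TLS (AUTH TLS).
    # Host, credentials and folder are placeholders.
    import time
    from ftplib import FTP_TLS

    ftp = FTP_TLS("fritz.box")
    ftp.login("ftpuser", "secret")
    ftp.prot_p()                      # also encrypt the data channel
    ftp.cwd("duplicati-backup")

    start = time.time()
    names = ftp.nlst()                # lists all ~45,000 .zip.aes files
    print(f"{len(names)} entries in {time.time() - start:.1f} s")
    ftp.quit()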
In the backup configuration I pressed the button to check the FTP connection. The verbose live log then showed:

    1. Aug. 2023 09:52: Cannot open WMI provider \localhost\root\virtualization\v2. Hyper-V is probably not installed.
    1. Aug. 2023 09:52: Cannot find any MS SQL Server instance. MS SQL Server is probably not installed.

Thanks
Michael

You can try switching to alternative FTP; maybe that behaves better with long access times.

I think those messages are not related to the problem.

The pre-backup verify does the following:

  • List backend files (this is started in your logs)
  • Check that all found files are recorded in the database
  • If there are any temporary files, or files that need to be deleted, delete them from the backend
  • If the backend supports it, check the quota (FTP doesn’t)

To me, this does not seem like it should take long, unless the list operation somehow hangs.
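As a rough illustration (this is only a Python-style sketch of those four steps, not Duplicati's actual code, and the names and states are made up):

    # Sketch of the pre-backup verify; "backend" and "known_files" stand in for
    # Duplicati's backend handler and local database (illustration only).
    def pre_backup_verify(backend, known_files):
        remote = backend.list()                    # step 1: the List your log shows as started
        for name in remote:
            if name not in known_files:            # step 2: every remote file must be in the database
                raise RuntimeError(f"Unknown remote file: {name}")
            if known_files[name] in ("Temporary", "Deleting"):
                backend.delete(name)               # step 3: clean up leftovers from interrupted runs
        if backend.supports_quota:                 # step 4: FTP backends don't report a quota
            backend.check_quota()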

Did the test work? Internally, this also runs list, but it only reads a single file entry from the result.

Dear @Jojo-1000

Did the test work? Internally, this also runs list, but it only reads a single file entry from the result.

Yes, the connection test using FTP works fine.

You can try switching to alternative FTP; maybe that behaves better with long access times.

Unfortunately, I cannot get alternative FTP working…
The connection test using alternative FTP shows the error "Failed to connect: Must use AUTH TLS". I think this message comes from the FTP server (which is a FritzBox with an external hard drive connected)? Maybe the problem is the missing "Use SSL" checkbox when alternative FTP is selected? See the attached screenshots. It's also not available as the additional parameter --use-ssl in the GUI below.


Yes, the GUI is not really polished for alternative FTP. You need to use the advanced options at the bottom:

--aftp-encryption-mode=Explicit
--aftp-ssl-protocols=Tls12
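For reference, if you were to run the same backup from the command line, the options go at the end of the command, roughly like this (the aftp:// URL, user, and paths are placeholders):

    Duplicati.CommandLine.exe backup "aftp://fritz.box/backup?auth-username=ftpuser&auth-password=..." "E:\Data" --aftp-encryption-mode=Explicit --aftp-ssl-protocols=Tls12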

Dear @Jojo-1000,
news from my last backup: with alternative FTP the backup actually started, but it failed after a few hours. If I am interpreting the log correctly, the backup fails at the end, when deleting old backups.

Verbose log shows:

    1. Aug. 2023 02:10: The operation Backup failed with the following error: Timed out trying to read data from the socket stream!
    1. Aug. 2023 02:10: E-mail successfully sent via the server: smtp://mail.gmx.net:587/?starttls=when-available
    1. Aug. 2023 02:09: Fatal error
    1. Aug. 2023 02:09: Backend event: List - Failed: ()
    1. Aug. 2023 02:09: Operation List with file attempt 5 of 5 failed with message: Timed out trying to read data from the socket stream!
    1. Aug. 2023 02:03: Backend event: List - Started: ()
    1. Aug. 2023 02:03: Backend event: List - Retrying: ()
    1. Aug. 2023 02:03: Operation List with file attempt 4 of 5 failed with message: Timed out trying to read data from the socket stream!
    1. Aug. 2023 01:57: Backend event: List - Started: ()
    1. Aug. 2023 01:57: Backend event: List - Retrying: ()
    1. Aug. 2023 01:57: Operation List with file attempt 3 of 5 failed with message: Timed out trying to read data from the socket stream!
    1. Aug. 2023 01:52: Backend event: List - Started: ()
    1. Aug. 2023 01:52: Backend event: List - Retrying: ()
    1. Aug. 2023 01:52: Operation List with file attempt 2 of 5 failed with message: Timed out trying to read data from the socket stream!
    1. Aug. 2023 01:46: Backend event: List - Started: ()
    1. Aug. 2023 01:46: Backend event: List - Retrying: ()
    1. Aug. 2023 01:46: Operation List with file attempt 1 of 5 failed with message: Timed out trying to read data from the socket stream!
    1. Aug. 2023 01:40: Backend event: List - Started: ()
    1. Aug. 2023 01:40: Deleted 2 files, which reduced storage by 1,98 KB
    1. Aug. 2023 01:40: Backend event: Delete - Completed: duplicati-ibc3fe9ee6d754f2087ad695c1d05d1b3.dindex.zip.aes (925 Bytes)
    1. Aug. 2023 01:40: Backend event: Delete - Started: duplicati-ibc3fe9ee6d754f2087ad695c1d05d1b3.dindex.zip.aes (925 Bytes)
    1. Aug. 2023 01:40: Backend event: Delete - Completed: duplicati-bcf3303e06e8a4e8e83d869a0961f2777.dblock.zip.aes (1,08 KB)
    1. Aug. 2023 01:40: Backend event: Delete - Started: duplicati-bcf3303e06e8a4e8e83d869a0961f2777.dblock.zip.aes (1,08 KB)
    1. Aug. 2023 01:40: Compacting because there are 1 fully deletable volume(s)
    1. Aug. 2023 01:40: Found 7 volume(s) with a total of 0,00% wasted space (24,13 MB of 1,20 TB)
    1. Aug. 2023 01:40: Found 16 small volumes(s) with a total size of 33,20 MB
    1. Aug. 2023 01:40: Found 1 fully deletable volume(s)
    1. Aug. 2023 01:38: Deleted 2 remote fileset(s)
    1. Aug. 2023 01:38: Backend event: Delete - Completed: duplicati-20230611T230000Z.dlist.zip.aes (107,26 MB)
    1. Aug. 2023 01:38: Backend event: Delete - Started: duplicati-20230611T230000Z.dlist.zip.aes (107,26 MB)
    1. Aug. 2023 01:38: Backend event: Delete - Completed: duplicati-20230604T230000Z.dlist.zip.aes (107,25 MB)
    1. Aug. 2023 01:38: Backend event: Delete - Started: duplicati-20230604T230000Z.dlist.zip.aes (107,25 MB)
    1. Aug. 2023 01:36: Deleting 2 remote fileset(s) …
    1. Aug. 2023 01:36: All backups to delete: 12.06.2023 01:00:00, 05.06.2023 01:00:00
    1. Aug. 2023 01:36: Backups outside of all time frames and thus getting deleted:
    1. Aug. 2023 01:36: Backups to consider: 30.07.2023 05:10:53, 02.07.2023 05:10:55, 25.06.2023 05:10:50, 12.06.2023 01:00:00, 05.06.2023 01:00:00, 21.05.2023 05:10:50, 07.05.2023 05:10:57, 10.04.2023 01:00:00, 06.02.2023 00:00:00, 26.01.2023 00:00:01, 01.01.2023 05:10:55, 19.12.2022 00:00:01, 27.11.2022 23:10:52, 20.11.2022 23:10:49, 02.06.2022 18:08:59
    1. Aug. 2023 01:36: Time frames and intervals pairs: 14.00:00:00 / 1.00:00:00, 56.00:00:00 / 7.00:00:00, 730.00:00:00 / 31.00:00:00
    1. Aug. 2023 01:36: Start checking if backups can be removed
    1. Aug. 2023 01:36: Backend event: Put - Completed: duplicati-20230810T144052Z.dlist.zip.aes (104,73 MB)
    1. Aug. 2023 01:34: Backend event: Put - Started: duplicati-20230810T144052Z.dlist.zip.aes (104,73 MB)
    1. Aug. 2023 01:34: Backend event: Put - Completed: duplicati-i6f0b958367d54a0eaa230347acd463c9.dindex.zip.aes (1,29 MB)
    1. Aug. 2023 01:34: Backend event: Put - Started: duplicati-i6f0b958367d54a0eaa230347acd463c9.dindex.zip.aes (1,29 MB)
    1. Aug. 2023 01:33: Backend event: Put - Completed: duplicati-b16c351b4caa64410a47b9bba1d538923.dblock.zip.aes (7,88 MB)
    1. Aug. 2023 01:33: Backend event: Put - Started: duplicati-b16c351b4caa64410a47b9bba1d538923.dblock.zip.aes (7,88 MB)
    1. Aug. 2023 01:32: Skipped checking file, because no metadata was updated E:\Archiv (bis 2020)\Arbeit Schule\Bewerbungen\Bewerbung (nach Uni)\Bewerbungsfotos\S11_5440.JPG
    1. Aug. 2023 01:32: Skipped checking file, because no metadata was updated E:\Archiv (bis 2020)\Arbeit Schule\Bewerbungen\Bewerbung (nach Uni)\Bewerbungsfotos\S11_5435.JPG
    1. Aug. 2023 01:32: Including path as no filters matched: E:\Archiv (bis 2020)\Arbeit Schule\Bewerbungen\Bewerbung (nach Uni)\Bewerbungsfotos\S11_5440.JPG
  • …many other files …

Do you have ideas how to fix this?

Hello

I'm afraid that alternative FTP may not be fully compatible out of the box with your FTP server. FTP is a complex protocol with many quirks, unfortunately: when it works it's fine, but sometimes it doesn't, and it can fail with one piece of software while working fine with another. If you want to plough on with it, I'd advise trying to set aftp-connection-type to EPSV. I had success with this fix for an alternative FTP connection to a certain FTP server (ProFTPD, by the way; I don't know if the FritzBox uses it, as this kind of hardware is rather thin on the ground in my country, but I wouldn't be surprised, since it's common enough to be the preferred choice of people offering FTP service in a professional context, that is, on Linux boxes).
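In the advanced options box, together with the TLS settings from above, that would be:

    --aftp-encryption-mode=Explicit
    --aftp-ssl-protocols=Tls12
    --aftp-connection-type=EPSV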

I'd advise trying it first with a test backup. If it works, try to fix your existing backend. Unfortunately, the default Duplicati setup implies automatic compact and repair; this is fine if your backend driver is known to work and to be reliable, as is usually the case, but if the new driver is not working correctly it may have damaged your backend data :frowning:

It looks like the delete worked, but it is again the list operation that timed out. So the backup should be good, but the final verify step did not run.
The timeout is probably set to 60 seconds; by chance the first verify ran faster, but the second one took longer. Unfortunately, there is currently no way to increase the timeout in either of the FTP backends.

You can try the Test option of the alternative FTP backend; it does a full list of all the files and also writes a test file. If that fails sometimes but not always, the timeout is probably the issue.

One benefit of the alternative backend seems to be that at least it does not get stuck, so you will see when an error occurs.

In my test (only a few MB over local network), the FritzBox worked fine with the alternative backend. I don’t think there is a protocol issue.

This was precisely the case with the problem I had seen: it was working, but only with a few MB.

Edit: By the way, since you are interested in this backend/server combination, there is a design problem in Duplicati with FTP. Everywhere, Duplicati uses the same scheme: Put is done asynchronously, since for Put it fires a send and then waits for the resulting status, while Get just calls and waits for the result. That looks equivalent, but it breaks for FTP, because FTP has two channels: one to transfer the data, and the command channel to get the result.
For 'normal' FTP this is never a problem, since the details of the FTP protocol are handled by Microsoft; for alternative FTP, however, this is not the case. What can happen is that, from time to time, Duplicati gets the 'get data' result while waiting for the result of the next command.
This is timing dependent and doesn't seem to happen with FtpZilla.
I have seen it with a ProFTPD server in the test connection function, clear as day, since I have FTP tracing enabled in my test tree. It's mostly hidden by the retries Duplicati does at the higher level.
I have posted my change to my private Github repo:

I don't know if this is the problem that can be worked around with EPSV. I did not have time to do many experiments with this.
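To make the two-channel issue concrete outside of Duplicati, here is a small Python sketch with the standard ftplib (host, credentials and file name are placeholders). The final status of a download arrives on the command channel only after the data channel is finished, and if a client does not consume that reply before sending the next command, the replies get out of step:

    # Manual FTP download showing the separate data and command channels.
    # Host, credentials and file name are placeholders.
    from ftplib import FTP

    ftp = FTP("ftp.example.com")
    ftp.login("user", "secret")

    # RETR opens the data channel; the command channel has only sent a preliminary reply so far.
    data_conn = ftp.transfercmd("RETR duplicati-example.dblock.zip.aes")
    payload = bytearray()
    while chunk := data_conn.recv(8192):
        payload.extend(chunk)
    data_conn.close()

    # Only now does "226 Transfer complete" arrive on the command channel.
    # Skipping this and sending the next command instead is exactly the kind of
    # out-of-sync reply described above.
    ftp.voidresp()
    ftp.quit()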

On the Net5 branch, get is also done with async, so hopefully this will change at some point.

I see that we use OpenRead(), which is low-level; maybe we should use DownloadStream() instead, so the library can handle that (and also chunking)? Can you check whether that also fixes the bug you experienced? I hope we won't need to pay so much attention to the FTP specifics then.

If the FTP library is too old, there is a branch with an updated version on the main GitHub.

EPSV isn't working for me. The connection check shows "Failed to connect: Timed out trying to read data from the socket stream!".

I found some hints that hardware acceleration on my FritzBox can slow down FTP. I have now disabled it (the setting resets on FritzBox reboot…) and am currently running a Duplicati test backup with normal (non-alternative) FTP.

However, this unstable backup procedure is really inconvenient… Debugging takes so much time, and I want to be able to rely on my backups.
Can you recommend a cheap cloud storage provider?
I have roughly 1.25 TB of "archive" data (needed 1-2 times a year), 150 GB of "sometimes" data (needed maybe 5-10 times a year) and 75 GB of "daily" data (used daily, synced on all devices; even here only a few MB change per day).
I am currently syncing all my devices with Syncthing and creating backups from my 24/7 Windows "server" (Windows 8.1) in location 1 to a USB hard drive connected to the FritzBox in location 2.
Maybe I should move all my files to a cloud service with some nice features like collaboration (office files), sharing of folders, and file history (does this replace a backup?). Any recommendations on that? Sounds like the simplest solution to me.

It depends a bit on how file history is done at whatever service you have in mind, but generally the goals conflict: easy access for users, but no access for malware that gets onto your systems.

Some cloud drives, such as Microsoft's paid plans, allow rolling everything back to a pre-malware date; however, this is a little broad, whereas picking a date for every individual file would likely be overwhelming.

For more opinions on the topic, you can search for something like does cloud drive replace a backup.

Pure sync without any versioning is the worst, as any file deletion or malware damage propagates everywhere.

It might be best to separate the uses if that's possible. In some cases, convenience may be important. Other times you might be willing to restore from backup. That storage should be less accessible (less risk), so malware cannot destroy the backup as easily as it destroyed the original source.

For storage that doesn’t pretend to be a drive, Backblaze B2 and Wasabi might be reasonable options. Large providers such as Google Cloud (not the same as Google Drive) and Amazon have options with prices aimed more at backup or archival use. Don’t use true cold storage that needs cloud-side restore before Duplicati can get the file. Hot storage (access any time) that just plays pricing games will be OK.

There might be other options, including keeping archive data as a single version outside of the backups, perhaps even as raw files in cold storage, accessed once or twice a year and with some plan for updating it.

One plan doesn’t fit all uses well, but you’ll have to decide your needs, maybe based on the above hints.

You might be at the default 50 MB remote volume size on the Options screen. If so, a painful possible workaround is to raise the remote volume size and let compact repack the files into larger ones, in a small enough number that the file listing might complete. You probably need lots of local storage to do that.
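For reference, "Remote volume size" on the Options screen corresponds to the dblock-size setting, so raising it to, say, 250 MB would look like this as an advanced option (the value is just an example):

    --dblock-size=250MB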

Another awkward FTP workaround might be to see whether rclone's FTP backend can do a list (a quick check is sketched below). If it can, you could use Duplicati's Rclone storage type to have Duplicati do the listing (and the file transfers) through rclone. That adds another link that could go wrong, but the built-in backends don't seem to be working with the FritzBox, so it's another idea.
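A quick way to check the first part, assuming rclone is installed (the remote name and folder below are placeholders), would be something like:

    rclone config                           # interactively create an "ftp" remote for the FritzBox, e.g. named fritzbox
    rclone lsf fritzbox:duplicati-backup    # if this lists the 45,000 files, rclone can talk to the box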

I think a web search shows people using SSH based transfers on FritzBox. Would that be possible?

Thanks for sharing all the ideas.

I have now removed all my old backups and started a new run with a remote volume size of 250 MB. Hopefully this solves my FTP problem… I will have an answer in about 1 day^^

SSH-based transfers to the FritzBox aren't possible. I called their support yesterday to ask :wink: