Various positive + negative feedback

No WebDAV is available for Office 365 Germany. So I will now pull the Duplicati Canary via Docker on the NAS instead.

I managed to install the D2 Docker container and start it. For some reason port 8200 is claimed to be already in use, so I used port 18200. I cannot access Duplicati in a web browser on that port on the NAS, though, so something seems to be missing in my configuration.
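For reference, something along these lines should work for the port mapping; this is only a sketch (the image name, volume path, and port choice are assumptions, not necessarily what my NAS UI generated), and the key point is that only the host-side port changes while the container keeps listening on 8200 internally:

```
# sketch only: publish the web UI on host port 18200 because 8200 is taken on the host;
# image name and config volume path are placeholders
docker run -d --name duplicati \
  -p 18200:8200 \
  -v /volume1/docker/duplicati:/data \
  duplicati/duplicati
```

If the container is started like this, the web UI should be reachable at http://<NAS-IP>:18200; if it still is not, the NAS firewall or a web UI bound only to localhost inside the container would be my next suspects.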

Started another thread about the Docker installation. Going back to feedback:

  • Repair seems to start single-threaded only, bottle-necking the process on a single CPU core.

For some reason my NAS keeps freezing its SMB server, likely because of the WD Mybook external USB drive. This just happened while my Windows PC backed up files from the NAS to Sharepoint. The backup job then stated that over 5000 files were missing and that a repair had to be done.

The repair is running right now, and it is completely bottlenecked by 100% CPU load on a single core. There is no disk I/O at the time of writing (local database), and network I/O is only around 1.2 mbit (32 kb/s send, 2.5 kb/s receive).

D2’s progress bar keeps displaying “Starting Backup” while this is happening.

Before I hit Repair, I switched the Source back and forth: from the local Pictures folder to the NAS folder (via \\IP\folder) and then back to the local Pictures folder that was originally backed up. The repair failed with “Got 2514 error(s)”.

I then tried to delete and repair, but was told that the 5000+ files were missing, even though they were still present. I then deleted the database and started a new backup before all the old files had been deleted by the Onedrive app; Duplicati then told me that there were still old files present. Since I had already started the deletion, I did not pursue this further, but will test it again later.

Now I am in the middle of another backup, bottlenecked by my upload bandwidth of 3.8 mb/s. There were some small stalls at the beginning of the backup, but currently it keeps uploading without interruptions, so maybe the Canary fixed that part. We will see a few more hours into the backup.

I just did the following with the Canary: I started a new backup job on a Windows PC, backing up a NAS SMB share to Sharepoint. The job ran halfway through the upload, then I cancelled it (“after current file”). Then I restarted the job without looking too closely at what happened, but it did start uploading again at some point.

Next my NAS decided to freeze the SMB share due to some incompatibility with its external USB drive, thus the job was cancelled again due to connection loss with the SMB source.

Duplicati did not ask for a Repair, so I just reconnected the SMB shares and then restarted the job.

Unfortunately, it seems that Duplicati reads (hashes?) through all source files again instead of just checking for date changes. In practice my NAS SMB connection can deliver around 110 mb/s for single large files, but Duplicati only reads at about 80 mb/s while running 16 (+) threads of light CPU load (9900K at 5 GHz). The total CPU load of those threads sums up to what a single CPU core would achieve (around 6.25% total CPU load), but they are still multiple threads running in parallel on different “Ideal Processors”, and for very short periods Duplicati’s total CPU load increases towards 10%.

Only once it has gone through the content of all the already backed-up files does Duplicati start uploading the other half of the formerly cancelled backup again.

Is this supposed to work like this?

Normally it will only reprocess file contents if it detects metadata changes (such as a timestamp change).

Not sure if the fact that the backup was interrupted and restarted has anything to do with your experience, but I’m guessing that must be what it was. You can confirm more easily on a smaller backup: let it finish, then modify or add one file, then run the backup again and it should only process that one new/changed file.
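If you want to check it outside the UI, here is a rough command-line sketch of that test (the storage URL, source path, and database path are placeholders, and the exact names in the run summary may differ slightly):

```
# sketch only: storage URL and paths are placeholders
Duplicati.CommandLine.exe backup "<storage-url>" "C:\DuplicatiTest" --dbpath="C:\test-job.sqlite"

# change exactly one file in the source folder
echo change >> C:\DuplicatiTest\changed.txt

# run the same backup again; the summary should report only that one file as
# opened/modified, with the rest skipped based on the timestamp check
Duplicati.CommandLine.exe backup "<storage-url>" "C:\DuplicatiTest" --dbpath="C:\test-job.sqlite"
```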

Once D2 started re-uploading I cancelled it again manually (Stop after current file), then restarted. Indeed, it reads through all file content again despite the files not having changed a bit since the second I stopped it last.

For a manual (i.e. error-free) cancellation this seems a bit ungraceful and unnecessarily resource-intensive, especially considering that this is not a Repair but only a restarted backup.

Edit: It just finished going through all files and refused to start uploading afterwards (Got 2674 error(s)). So it seems that now a real repair is in order.

Overall I am not entirely happy with how cancelled and/or interrupted backups are handled. More tests are in order before I can trust this.

Yeah it’s a known weak point and there are efforts underway to fix it.

  • Deleting a backup job + data from Sharepoint left the data on the server (over 5000 files).

I will delete those manually now. The fastest way to delete so many files is to just delete the folder from the web interface. Every other mechanism tries to delete one file at a time. Ouch.

Yeah I would manually delete on the back end. I think one reason Duplicati doesn’t delete the folder itself is because it may not be safe to do so. There may be other contents unrelated to this job’s backup data.

I did not mean for Duplicati to delete the folder, but it should have deleted the (5000+) backup files. Taking ages to do so was expected (and what I was testing for), but the files stayed on the server when D2 claimed to be finished.

Ah, interesting. The few times I have used the feature where Duplicati deletes the back end data, it worked properly. Maybe it’s an issue with its SharePoint back end implementation only (which I have never tested or used).

Deleting many files is where various operating systems and other tools struggle. Sending several thousand deletes over an internet connection is a challenge, and I believe I have read that Onedrive (and likely Sharepoint) has more trouble with many small files.

Even the Onedrive client takes a very long time to delete so many files from my end. Deleting the files via the web front-end is faster, but you can only mark 30 files or so at once. That is why I usually try to delete the whole folder, and why I keep different backup destinations in different folders with Duplicati, since it does not create sub-folders for its many files by itself.

  • After over 17 hours of uninterrupted upload, the backup on my Windows PC finished, stating: “Found 8521 files that are missing from the remote storage, please run repair.”

The last two Remote log entries list:

Oct 10, 2019 6:50 PM: list
[]
Oct 10, 2019 6:50 PM: list
[]

At the same time there are 8523 files in the destination Sharepoint folder.

What to make of this?

Did a repair, took 3 hours since my last post here. Result:

“Got 4260 error(s)”.

All the Remote job log tells me is those two empty list entries again.

The ā€œverboseā€ log tells me:

" * Oct 10, 2019 10:56 PM: The operation Repair has completed

  • Oct 10, 2019 10:56 PM: Backend event: Put - Completed: duplicati-i664a0393d4494976a666897d8eb492af.dindex.zip.aes (10,47 KB)

  • Oct 10, 2019 10:56 PM: Backend event: Put - Started: duplicati-i664a0393d4494976a666897d8eb492af.dindex.zip.aes (10,47 KB)

  • Oct 10, 2019 10:56 PM: Failed to perform cleanup for missing file: duplicati-b995eb2af964f45d596b7827c212dffef.dblock.zip.aes, message: Repair not possible, missing 89 blocks. If you want to continue working with the database, you can use the ā€œlist-broken-filesā€ and ā€œpurge-broken-filesā€ commands to purge the missing data from the database and the remote storage.

  • Oct 10, 2019 10:56 PM: This may be fixed by deleting the filesets and running repair again

  • Oct 10, 2019 10:56 PM: duplicati-20191009T233246Z.dlist.zip.aes

  • Oct 10, 2019 10:56 PM: Repair cannot acquire 89 required blocks for volume duplicati-b995eb2af964f45d596b7827c212dffef.dblock.zip.aes, which are required by the following filesets:"

The source data did not change or move, so I wonder what’s with those 89 missing blocks? “rebuild-missing-dblock-files” is set in the global settings.
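In case I end up having to purge, the commands mentioned in the log can be run against the job database from the command line. This is only a sketch, with the storage URL and database path as placeholders (the real values can be taken from the job’s Export → As Command-line, along with the passphrase and other options):

```
# sketch only: <storage-url> and <path-to-job-db> are placeholders
Duplicati.CommandLine.exe list-broken-files "<storage-url>" --dbpath="<path-to-job-db>.sqlite"

# if the affected files are expendable, remove the data that depends on the missing blocks
Duplicati.CommandLine.exe purge-broken-files "<storage-url>" --dbpath="<path-to-job-db>.sqlite"
```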

My goodness… I think I’d give up on SharePoint for the back end at this point :frowning:

1 TB of space on Sharepoint (aka Onedrive Business) is part of Office 365, so it’s available for “free”. Additionally, Office 365 “Germany” uses German data centers and follows stricter data privacy rules. The only German-based cloud service coming even close to the US-based ones in function and price is Strato “Hidrive” (supported by Synology HyperBackup).

Tried a “Delete and Recreate” on the database. Result:

“No files were found at the remote location, perhaps the target url is incorrect?”

The connection works (Test Connection), the files are there, and Duplicati’s log shows an empty list again. Restarting Duplicati didn’t help. Curious case.

This is probably hitting the SharePoint 5000 item list view limit. I’m not that familiar with the internals of the Microsoft product offerings, but if you’re not using Microsoft SharePoint v2, you could try that instead.

‘No files were found at the remote location’ using Onedrive for business has some discussion, and in that case switching to the “v2” version helped. I’m not sure it always will. Logic and usecase for remote subfolders would be the long-term solution, but isn’t done yet. Meanwhile, a large remote volume size is a workaround.
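To make the volume-size workaround concrete (back-of-the-envelope only): each remote volume produces one dblock plus a small dindex file, so roughly 8500 files at the default 50MB remote volume size corresponds to around 200GB of backup data. Raising the Remote volume size (the --dblock-size option) to something like 500MB would keep a backup of that size under about 1000 files, well below the 5000-item limit. A sketch, with the URL and paths as placeholders:

```
# sketch only: larger remote volumes mean far fewer files in the SharePoint library
Duplicati.CommandLine.exe backup "<storage-url>" "D:\Pictures" --dblock-size=500MB --dbpath="<path-to-job-db>.sqlite"
```

As far as I know the new size only applies to volumes written after the change; existing files stay at their old size until a compact rewrites them.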

I tried to restore the backup directly from the Sharepoint source. Same result, Duplicati claims to find no files. Tried to access the files from a Docker installation, same result.

Thing is, I was messing around with a Docker installation while the Windows PC uploaded. At one point I pointed the Docker installation to the same Sharepoint folder as the ongoing Windows installation. It may well be possible that I deleted some file that was needed for Duplicati to identify the backup.

I am now downloading the whole Sharepoint (Onedrive Business) folder via the Onedrive application and will try to access the backup from the local Onedrive folder instead.

I remember reading about that just a few days ago, so you are likely onto something. Still strange, though, that Duplicati’s “List” record is empty and it thus claims to find no files. There are 8523 files present, so well above 5000.

Switching to “v2” is not an option, because the idea here is to test how well Duplicati plays with the “free” (as in part of Office 365) Onedrive Business Germany (even though this service won’t be around for much longer).

A large remote volume size can bring its own problems with Duplicati, but it would get around the 5000-file limit.