Backups take very long with large data sets

romanodog · August 7, 2019, 3:14am

I’m not sure whether I have something configured incorrectly or whether this is normal. I have two backup sets, one for movies and one for TV shows, that are 3 TB and 4.5 TB respectively. I am running Duplicati in a Docker container on Unraid, with Google Drive as my destination. The initial backups took a long time, as expected, but now each backup run can take up to 2 days to complete. It looks like it is scanning/hash-checking every single file in the backup set even if the file has not been modified. Is that right? Can I have it skip over these files and only touch new ones? From what I’ve been able to research, it should skip over files that it doesn’t see as modified, but it’s doing something to each one: movie files of around 15 GB are being processed for many minutes. I’m using basically the default settings; I only adjusted the schedule and switched retention to smart backup retention. I just want to know whether what I’m seeing is normal and, if it is, whether there is a way to speed this up.

ts678 · August 9, 2019, 2:19pm

Welcome to the forum @romanodog

I think the intent is that a file gets scanned *if* it “changes”. An example of the testing process is below:

2019-06-22 11:39:53 -04 - [Verbose-Duplicati.Library.Main.Operation.Backup.FilePreFilterProcess.FileEntry-CheckFileForChanges]: Checking file for changes C:\PortableApps\Notepad++Portable\App\Notepad++\backup\webpages.txt@2019-06-19_213642, new: False, timestamp changed: True, size changed: True, metadatachanged: True, 6/22/2019 3:20:41 PM vs 6/20/2019 1:42:25 AM
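If the log has too many of these lines to read by eye, a small script can tally them. Below is a minimal sketch in Python, assuming every line follows the format of the sample above (other Duplicati versions may word it differently). The names `parse_check_line` and `summarize` are made up for illustration and are not part of Duplicati.

```python
import re

# Pattern derived only from the sample log line above (an assumption);
# adjust it if your Duplicati version formats the line differently.
CHECK_RE = re.compile(
    r"CheckFileForChanges\]: Checking file for changes (?P<path>.+?), "
    r"new: (?P<new>True|False), "
    r"timestamp changed: (?P<timestamp>True|False), "
    r"size changed: (?P<size>True|False), "
    r"metadatachanged: (?P<metadata>True|False)"
)

FLAGS = ("new", "timestamp", "size", "metadata")


def parse_check_line(line):
    """Return the path and change flags from one log line, or None if it does not match."""
    match = CHECK_RE.search(line)
    if match is None:
        return None
    entry = {"path": match.group("path")}
    for flag in FLAGS:
        entry[flag] = match.group(flag) == "True"
    return entry


def summarize(lines):
    """Count how many checked files had each change flag set."""
    counts = {"checked": 0}
    counts.update({flag: 0 for flag in FLAGS})
    for line in lines:
        entry = parse_check_line(line)
        if entry is None:
            continue  # not a CheckFileForChanges line
        counts["checked"] += 1
        for flag in FLAGS:
            if entry[flag]:
                counts[flag] += 1
    return counts
```

Passing the lines of a verbose log file to `summarize` gives a quick count of how many files had a changed timestamp, size, or metadata. If nearly every file shows "timestamp changed: True" on every run, that would point at the timestamps rather than at Duplicati's settings.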

Viewing the Duplicati server logs at Verbose level might show you what it finds, but the view could be swamped with output if you have many files. The advanced options --log-file and --log-file-log-level=verbose, which write the log to a file instead, might be better.

Viewing the log files of a backup job will show you lots of statistics, such as those below. Can you post some of yours?

ModifiedFiles: 3
ExaminedFiles: 422
OpenedFiles: 3
AddedFiles: 0
SizeOfModifiedFiles: 29902239
SizeOfAddedFiles: 0
SizeOfExaminedFiles: 402112849
SizeOfOpenedFiles: 29902239

You can see above that in this backup only a few of the examined files (3 of 422) were seen as modified, and only those were opened and scanned. Your results may be different, but the statistics are a good first way to see what other factors to look into.
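To put a number on it, the statistics can be reduced to the share of examined files that were actually opened. This is a rough sketch in Python, assuming the statistics are pasted as plain "Name: value" lines like the ones above. The helper names `parse_stats` and `opened_fraction` are hypothetical.

```python
def parse_stats(text):
    """Parse 'Name: value' lines into a dict, keeping only whole-number values."""
    stats = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        value = value.strip()
        if sep and value.isdigit():
            stats[key.strip()] = int(value)
    return stats


def opened_fraction(stats):
    """Fraction of examined files that were opened for scanning (0.0 if none were examined)."""
    examined = stats.get("ExaminedFiles", 0)
    if examined == 0:
        return 0.0
    return stats.get("OpenedFiles", 0) / examined
```

For the numbers above this works out to 3 / 422, well under 1% of the files. A value close to 1.0 on a backup where almost nothing was edited would suggest that nearly every file is being treated as changed.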