Duplicati downloading huge amounts of data during backup

Hi all,

Using Duplicati 2.0.3.3_beta to back up a ~360GB dataset of Hyper-V VMs to Wasabi, daily.

It’s been working fine for a few weeks, but in the last few days it has started downloading huge amounts of data during the backup process (e.g. 184GB in the last 24 hours, about half the dataset size!).

The backup logs show many ‘get’ and ‘delete’ commands throughout the backup, outnumbering the ‘put’ commands. Volume size is set to 150MB.

The final result in the log shows the following:

DeletedFiles: 0
DeletedFolders: 0
ModifiedFiles: 0
ExaminedFiles: 40
OpenedFiles: 0
AddedFiles: 0
SizeOfModifiedFiles: 0
SizeOfAddedFiles: 0
SizeOfExaminedFiles: 388798312976
SizeOfOpenedFiles: 0
NotProcessedFiles: 0
AddedFolders: 0
TooLargeFiles: 0
FilesWithError: 0
ModifiedFolders: 0
ModifiedSymlinks: 0
AddedSymlinks: 0
DeletedSymlinks: 0
PartialBackup: False
Dryrun: False
MainOperation: Backup
CompactResults: null
DeleteResults:
    DeletedSets: []
    Dryrun: False
    MainOperation: Delete
    CompactResults: null
    ParsedResult: Success
    EndTime: 2/07/2018 7:14:13 AM (1530479653)
    BeginTime: 2/07/2018 7:12:10 AM (1530479530)
    Duration: 00:02:03.3996250
    BackendStatistics:
        RemoteCalls: 5
        BytesUploaded: 0
        BytesDownloaded: 157432231
        FilesUploaded: 0
        FilesDownloaded: 3
        FilesDeleted: 0
        FoldersCreated: 0
        RetryAttempts: 0
        UnknownFileSize: 0
        UnknownFileCount: 0
        KnownFileCount: 5651
        KnownFileSize: 443727193319
        LastBackupDate: 2/07/2018 4:30:08 AM (1530469808)
        BackupListCount: 9
        TotalQuotaSpace: 0
        FreeQuotaSpace: 0
        AssignedQuotaSpace: -1
        ReportedQuotaError: False
        ReportedQuotaWarning: False
        ParsedResult: Success
RepairResults: null
TestResults:
    MainOperation: Test
    Verifications: [
        Key: duplicati-20180628T120000Z.dlist.zip.aes
        Value: [],
        Key: duplicati-i6f59d4cf428f44c69b685f22008e7221.dindex.zip.aes
        Value: [],
        Key: duplicati-b9bcfb46274414e8b8a600f58e488bc03.dblock.zip.aes
        Value: []
    ]
    ParsedResult: Success
    EndTime: 2/07/2018 7:15:11 AM (1530479711)
    BeginTime: 2/07/2018 7:14:19 AM (1530479659)
    Duration: 00:00:51.9211435
ParsedResult: Success
EndTime: 2/07/2018 7:15:11 AM (1530479711)
BeginTime: 2/07/2018 7:11:07 AM (1530479467)
Duration: 00:04:03.5063981
Messages: [
    No remote filesets were deleted,
    removing file listed as Temporary: duplicati-b77e579c2664d4865be47679469918998.dblock.zip.aes,
    removing file listed as Temporary: duplicati-if9582dfea88f4d55a9de841a755bdc6f.dindex.zip.aes
]
Warnings: []
Errors: []

Any ideas why this would be happening?

Cheers
Tony

Hi @tbish, welcome to the forum!

I can think of a few things that could be causing this.

  1. You are using a very large custom --dblock-size (Upload volume size) and/or a high verification sample count. Since Duplicati downloads a sample set of volumes after each backup for verification, this could be the source.

  2. Your --retention-policy (Keep versions) setting is flagging many blocks as “expired”, so as part of the compacting process Duplicati is downloading volumes, re-compressing them without the expired blocks, and re-uploading them.

We can try disabling the testing step and/or the compacting process to narrow down whether either of them is the source.

The parameters for that are --backup-test-samples=0 (if the issue goes away, it’s the testing step) and --no-auto-compact=true (if the issue goes away, it’s the compacting / space-reclamation step).
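
If you run the job from the command line (the same switches can be added as Advanced options in the GUI job), a rough sketch of the test might look like the lines below, with <storage-url> and <source-path> as placeholders for your Wasabi URL and source folder, alongside your usual options such as --passphrase:

    Duplicati.CommandLine.exe backup <storage-url> <source-path> --backup-test-samples=0
    Duplicati.CommandLine.exe backup <storage-url> <source-path> --no-auto-compact=true

Try one switch at a time so you can tell which step is responsible for the downloads.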

Note that it could be a combination of issues.

Thanks @JonMikelV.

Block size and verification count should be just the defaults.

I was using the default smart retention policy, but your explanation makes a lot of sense. I don’t actually need many previous versions, so I’ve changed to just keep the last 7 daily backups. I’m quietly confident this will fix it, will keep you posted.
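
(For reference, I believe keeping just the last 7 backups corresponds to a retention switch roughly like the one below on the command line, with <storage-url> and <source-path> as placeholders and assuming one backup per day:)

    Duplicati.CommandLine.exe backup <storage-url> <source-path> --keep-versions=7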

Thanks for your help.
Tony

That’s a good start, but I fear it likely won’t resolve the issue.

Because of how Duplicati backs up files as blocks (50KB each, by default), you could have a backup archive volume that contains parts of many different files. Unless every single block in an archive volume is “expired” (meaning the whole volume file can be deleted outright), the volume will eventually be downloaded, re-compressed, and re-uploaded.

I suspect the fact that you’re backing up VM disk images (OK, I’m assuming they’re disk images) compounds this, because every little thing the VM saves to disk is a change that Duplicati needs to back up. (Think of all the temp files, cache folders, etc. that get written to every day.)

So Duplicati is likely seeing a large number of blocks change every day, which means your archive volumes accumulate “wasted space” very quickly.
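
As a rough back-of-the-envelope illustration (assuming your 150MB volume size, the 50KB blocks mentioned above, and the default 25% --threshold):

    150 MB volume ÷ 50 KB blocks   ≈ 3,000 blocks per dblock file
    25% of those blocks expired    ≈ 750 blocks, or roughly 37 MB of waste
    result: the whole 150 MB volume gets downloaded, repacked, and re-uploaded
            just to reclaim that ~37 MB

Multiply that across the many volumes touched by daily VM changes and the downloads add up very quickly.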


If your current 7-day test doesn’t solve the issue long-term, I’d suggest some testing with parameters like the ones below (there’s a rough command-line sketch after the list):

--backup-test-samples (try setting it to 0)
After a backup is completed, some files are selected for verification on the remote backend. Use this option to change how many. If this value is set to 0 or the option --no-backend-verification is set, no remote files are verified
Default value: “1”

--threshold (try setting it very high)
As files are changed, some data stored at the remote destination may not be required. This option controls how much wasted space the destination can contain before being reclaimed. This value is a percentage used on each volume and the total storage.
Default value: “25”

--no-auto-compact (set it to true, at which point you’ll need to manually run the Compact command to reclaim destination space for expired blocks)
If a large number of small files are detected during a backup, or wasted space is found after deleting backups, the remote data will be compacted. Use this option to disable such automatic compacting and only compact when running the compact command.
Default value: “false”

--log-file (set this to a local path so you can read more details of what Duplicati is doing)
Logs information to the file specified
Default value: “”

--log-level (try Verbose or Profiling)
Specifies the amount of log information to write into the file specified by --log-file
Default value: “Warning”
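
For example (sketch only; <storage-url>, <source-path>, and <log-path> are placeholders for your own values, alongside your usual options):

    Duplicati.CommandLine.exe backup <storage-url> <source-path> --no-auto-compact=true --log-file=<log-path> --log-level=Verbose

and then, whenever you want to reclaim destination space manually:

    Duplicati.CommandLine.exe compact <storage-url> --log-file=<log-path> --log-level=Verbose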


Thank you @JonMikelV, the support here is awesome!

Makes perfect sense. I’ll disable auto compacting for now and see how we go.