Slow backup -especially deleting unwanted files and creating of bug report


#1

I have installed Duplicati 2.0.2.20 on an Synology 214play. It is using minio as target located on wan.
I think that Duplicati is very promising, but have some issues with performance.
I have a job conatining 467GB. The backup itself is running OK, but even when there is almost no change in the source, the job spends a lot of time “Deleting unwanted files” - approx 4 hours.
I have tried to generate a bug report, but it seems that Duplicai runs forever using almost 100%CPU, and never delivers the report.
Below I have pasted output from the result of the backup from the job log.

Do you have any suggestions?

DeletedFiles: 0
DeletedFolders: 0
ModifiedFiles: 1
ExaminedFiles: 270360
OpenedFiles: 1
AddedFiles: 0
SizeOfModifiedFiles: 380928
SizeOfAddedFiles: 0
SizeOfExaminedFiles: 501491408758
SizeOfOpenedFiles: 380928
NotProcessedFiles: 0
AddedFolders: 0
TooLargeFiles: 0
FilesWithError: 0
ModifiedFolders: 0
ModifiedSymlinks: 0
AddedSymlinks: 0
DeletedSymlinks: 0
PartialBackup: False
Dryrun: False
MainOperation: Backup
CompactResults:
    DeletedFileCount: 2
    DownloadedFileCount: 0
    UploadedFileCount: 0
    DeletedFileSize: 45765
    DownloadedFileSize: 0
    UploadedFileSize: 0
    Dryrun: False
    MainOperation: Compact
    ParsedResult: Success
    EndTime: 03/04/2018 22:58:30 (1520200710)
    BeginTime: 03/04/2018 22:54:35 (1520200475)
    Duration: 00:03:55.4859330
    BackendStatistics:
        RemoteCalls: 11
        BytesUploaded: 24479956
        BytesDownloaded: 76901372
        FilesUploaded: 3
        FilesDownloaded: 3
        FilesDeleted: 3
        FoldersCreated: 0
        RetryAttempts: 0
        UnknownFileSize: 0
        UnknownFileCount: 0
        KnownFileCount: 15620
        KnownFileSize: 409143409489
        LastBackupDate: 03/04/2018 18:30:14 (1520184614)
        BackupListCount: 4
        TotalQuotaSpace: 0
        FreeQuotaSpace: 0
        AssignedQuotaSpace: -1
        ReportedQuotaError: False
        ReportedQuotaWarning: False
        ParsedResult: Success
DeleteResults:
    DeletedSets: [
        Item1: 1
        Item2: 03/03/2018 19:47:56 (1520102876)
    ]
    Dryrun: False
    MainOperation: Delete
    ParsedResult: Success
    EndTime: 03/04/2018 22:58:30 (1520200710)
    BeginTime: 03/04/2018 21:10:09 (1520194209)
    Duration: 01:48:21.0982980
RepairResults: null
TestResults:
    MainOperation: Test
    Verifications: [
        Key: duplicati-20180304T173014Z.dlist.zip
        Value: [],
        Key: duplicati-i966f7ed58a6543d7b6776736bf12e66f.dindex.zip
        Value: [],
        Key: duplicati-b63f6a7c386ec4fa2bcac777ac6669142.dblock.zip
        Value: []
    ]
    ParsedResult: Success
    EndTime: 03/04/2018 23:00:06 (1520200806)
    BeginTime: 03/04/2018 22:59:28 (1520200768)
    Duration: 00:00:37.5544860
ParsedResult: Success
EndTime: 03/04/2018 23:00:06 (1520200806)
BeginTime: 03/04/2018 18:30:05 (1520184605)
Duration: 04:30:01.0574130
Messages: [
    [Retention Policy]: Start checking if backups can be removed,
    [Retention Policy]: Time frames and intervals pairs: 7.00:00:00 / 1.00:00:00, 28.00:00:00 / 7.00:00:00, 365.00:00:00 / 31.00:00:00,
    [Retention Policy]: Backups to consider: 03/03/2018 19:47:56, 03/03/2018 16:29:27, 03/01/2018 14:01:00, 02/27/2018 21:29:50,
    [Retention Policy]: Backups outside of all time frames and thus getting deleted: ,
    [Retention Policy]: All backups to delete: 03/03/2018 19:47:56,
...
]
Warnings: []
Errors: []

#2

Hello @Thomas_Boge, welcome to the forum!

I believe, among other things, the bug report scrubs a copy of your job database to remove secure info then compresses it to be included with the report. This is a fairly CPU intensive task and with 467G of source data I could see it taking quite a while…

Thanks for posting the log (oh, I edited your post by putting “~~~” before and after the log to make it easier to read). It shows only 3 files were deleted from the destination and doesn’t include any Deleting unwanted files messages.

Could you find any of those messages in your logs, click on them to open the details, then paste one or two of them here?

My guess is it’s just doing cleanup deletes based on your retention policy (daily for a week, weekly for a month, monthly for a year?) but they shouldn’t take very long to be processed.


#3

Thank you for spending time answering my post. I have made a new run, this time it finished after 2h50m. I have copied from Livelog, profiling. Some sql-statements are using a lot of time, e.g.

Mar 7, 2018 11:55 PM: ExecuteScalarInt64: SELECT COUNT(*) FROM "Block" WHERE "Size" > ? took 00:32:39.989
UNION SELECT DISTINCT "ID" FROM "Block", "BlocklistHash" WHERE "Block"."Hash" = "BlocklistHash"."Hash")

Mar 8, 2018 1:25 AM: ExecuteNonQuery: INSERT INTO "DeletedBlock" ("Hash", "Size", "VolumeID") SELECT "Hash", "Size", "VolumeID" FROM "Block" WHERE "ID" NOT IN (SELECT DISTINCT "BlockID" AS "BlockID" FROM "BlocksetEntry" UNION SELECT DISTINCT "ID" FROM "Block", "BlocklistHash" WHERE "Block"."Hash" = "BlocklistHash"."Hash") took 00:41:11.000

The rest of the log is here:
https://pastebin.com/06BWh3Hw


Long action logging
#4

I noticed my reporting taking a long time too after the deleting unwanted files step. Maybe it’s possible to increase the efficiency of the queries?

Another strange thing that I found was that even if I disabled reporting it still generated the report, although I don’t know if it sent it.


#5

When you say “disabled reporting” do you mean the “Usage statistics” in the main menu settings? If so, that’s a different beast.

Usage statistics are automatic (hence the need to be able to disable) while the bug report is only created with user interaction so it should always work when requested, even if usage statistics are set to “None”.


Thanks for catching those! With a 467G source and the standard block size of 100kb your database probably has almost 4.9 million records keeping track of blocks.

I expect this is why the ExecuteScalarInt64 call took over 32 minutes to run. Looking at the code, it appears that whole step is solely to determine if a user has changed the block size.

While an important thing to check, that much time spent checking it makes no sense. I’m not big into SQLite so I could be wrong but I’m guessing that Select * could be replaced with something like SELECT EXISTS(SELECT "Size" FROM "Block" WHERE "Size" > ? LIMIT 1).

Not selecting “*” might speed things up right off the bat, but if not then this probably only improves performance when a block that is NOT size ? exists. In that case, it might make sense to instead add an index of Size (ordered by size).

Ugh - I just realized something… @kenkendk, the GetBlocksLargerThan() method is being used from Utility.cs to check if the block size has been changed - but it only checks for an INCREASE in size! What happens if I change to a smaller block size?

It looks to hopefully improve performance AND fix this loophole we might need something like this:
SELECT EXISTS(SELECT "Size" FROM "Block" WHERE "Size" != ? LIMIT 1)


The ExecuteNonQuery command is recording blocks that are flagged for deletion. Again, I’d have to poke around but I think this SQL could be optimized in a few ways including:

  • remove the DISTINCT (it’s likely more expensive then just returning all records) for the NOT IN check
  • rewriting the NOT IN bit as a NOT EXISTS (I think meaning sqlite won’t be trying to generate a temp table full of IDs to then be checked against NOT IN)
  • rewriting the whole INSERT as an INSERT OR IGNORE INTO "DeletedBlock" without the NOT IN check & joins (note that this assumes the joins are there solely to check if the record has already been inserted, assumes “Size” is the same if “Hash” is the same, and it doesn’t matter if “VolumeID” is duplicated)

Backup states/phases - information reported to users about what their backup is doing is not helpful
#6

Yep, usage statistics. After I disabled it I still got log entries about SQL statements generating usage-reports, suggesting that it was still spending a significant amount of time on this step despite being configured to not send them.

Edit: Here’s a screenshot of the log during the Deleting Unwanted Files step.


UsageReport SQL running even if disabled?
#7

New UsageReporting topic started: :slight_smile:


#8

It’s what I should have done, but I kind of found out by accident and then I was too lazy that day to create the topic. I didn’t even have the exact log entry until I went back to find it today :slight_smile:

Thanks!


#9

Thank you for digging into this. I may add, that just before the ExecuteNonQuery mentioned above, theres is another very timeconsuming statement, that shared the same subquery. Maybe the total execution time on those to ExecuteNonQuery can be reduced by creating a temporary table with indexes from that specific subquery. The subquery is

SELECT DISTINCT "BlockID" AS "BlockID" FROM "BlocksetEntry" UNION SELECT DISTINCT "ID" FROM "Block", "BlocklistHash" WHERE "Block"."Hash" = "BlocklistHash"."Hash"

And the timesonsuming statement

Mar 8, 2018 1:25 AM: ExecuteNonQuery: INSERT INTO "DeletedBlock" ("Hash", "Size", "VolumeID") SELECT "Hash", "Size", "VolumeID" FROM "Block" WHERE "ID" NOT IN (SELECT DISTINCT "BlockID" AS "BlockID" FROM "BlocksetEntry" UNION SELECT DISTINCT "ID" FROM "Block", "BlocklistHash" WHERE "Block"."Hash" = "BlocklistHash"."Hash") took 00:41:11.000

In total, the 3 statements mentioned in this issue are respondible for approx 110 minutes out of the total runtime of the job of 170 minutes.


#10

I like that idea as a fallback if the subuqery optimization doesn’t return spectacular results. :slight_smile: