Yet another "Deleting unwanted files …" stuck for days

This is not the first time this has happened, but this time I was charged a fee because I exceeded my storage provider's plan quota of 1TB (IDrive e2).

As you can see, for some reason there are 5 versions, even though Backup retention is set to 1.

image

Currently, the backup job has been stuck at “Deleting unwanted files” since 07/15 (3 days).

Duplicati is running as a Docker container on top of OMV 7.4.

Any help will be much appreciated!

How fast is your internet connection? A remote volume size of 2GB is quite large, so if Duplicati is trying to do a compaction it may have to download/repackage some of those remote volumes.

Normally I wouldn’t change that option from the default of 50MB unless you have a back end that has a limit on the number of files that can be stored.
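For reference, a sketch of where this setting lives if you ever use the command line instead of the GUI (the storage URL and source path are placeholders; this assumes the duplicati-cli wrapper that most Linux/Docker installs ship):

# --dblock-size is the GUI’s “Remote volume size”; 50MB is Duplicati’s default
duplicati-cli backup "<storage-url>" /srv/data --dblock-size=50MB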

Regarding the version count, that does seem odd. Normally Duplicati can keep many versions and store them efficiently due to its deduplication feature. Only keeping 1 version seems risky to me. Did you recently deselect a large amount of data for Duplicati to protect? Or delete data it was protecting?

Or recently change the retention setting from its default value of Keep all backups?
You can also look at your log files. The retention setting should be applied after a backup.
Basically, at the top of the screenshot, seeing that it wants to do 6 right now seems strange.

image

Your job logs show what your backups did, and any deletions that were done. Example:

image

That sounds like a storage size charge, although they can also charge if egress gets too high. The natural reaction to too much storage would be to lower the retention policy. Is that the sequence?

Or are you saying they charged you (but for what?) right in the middle of the current slow run?
What OS is this? You can sometimes distinguish between stuck and slow by resource usage.
Even CPU use would be a clue.
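If it helps, a quick container-level snapshot is enough to tell busy from idle (a sketch; the container name duplicati is an assumption):

# one-shot CPU / memory / block I/O figures for the container
docker stats --no-stream duplicati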

Upload speed is usually around 90/100Mbps

Interesting… Will try that.

I’m using Duplicati for off-site backup, I also have local backups.

No, but now that I think about it: since backup jobs take so long to complete, and OMV installs security updates automatically, Docker most likely gets restarted while the backup is still running, leaving “unfinished jobs” behind?

Sorry, I didn’t choose the right words. I’m pretty sure it is not stuck, based on the CPU usage (the job started 07/13 at 12:00am and it looks like it is still running). Usually the CPU usage is around 10%.

image

Correct, they charged me because I overused my 1TB quota last month. That’s why I have retention set to 1 version.

OK, so it’s no surprise that Duplicati is doing a lot of cleanup on the first job after the change.
My suspicion is that SQLite is running a slow query because the blocksize was left at the default of 100 KB.
This should be more like 1 MB for a 1 TB backup. The default will be set to that next release.

Can you use ps or top or something that sees per-process stats? Read stats would be best, but CPU usage at one core’s worth would suggest that SQLite is in some slow query.

I’m not sure if it sees queries retroactively, but there’s About → Show log → Live → Profiling.

EDIT:

iotop, if you have it or can get it (what OS?), might show I/O.
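A rough sketch of those checks from the host (the process-name match, the <pid> placeholder, and having iotop installed are all assumptions):

# per-process CPU: one core pegged near 100% hints at a long-running SQLite query
top -b -n 1 | grep -i duplicati
# cumulative read/write bytes for one process
cat /proc/<pid>/io
# iotop needs root; -o shows only processes currently doing I/O, -a accumulates totals
sudo iotop -o -a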

Installed on Debian 12

I noticed there’s also a BTRFS scrub job (OMV) running at the same time :man_facepalming:

Well, I guess that’s where it’s hard at work… I don’t know why the live log can sometimes look back. Somebody who knows the logging code might have an idea. Regardless, it’s nice to see some clue.

I’m not sure what to make of the iotop output. On Windows, Duplicati’s SQLite read rates can get extremely high at the request-to-OS level, but Windows can answer most of those requests from cache rather than from the drive.

These were generally SQLite temporary files, with etilqs (sqlite reversed) in the file name, and probably a result of large tables (big backup, small blocksize) and limited SQLite memory cache.

I’m not sure how to see the low-level action better, short of running strace on mono. I suppose it might leave some etilqs files visible in /tmp or somewhere, or maybe they’re deleted and therefore invisible.

The lsof command can see files that are deleted but still in use.
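Something along these lines, for example (the <pid> is a placeholder, and you may need to run it inside the container depending on where lsof can see the process):

# SQLite temp files carry etilqs in the name; deleted-but-open files show “(deleted)”
sudo lsof -nP | grep -i etilqs
sudo lsof -p <pid> | grep deleted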

Short of waiting it out or giving up and trying to add cache (there’s an environment variable), I’m unsure what the best way is to look deeper, but there are ideas.


Aaaand, it finished

@ts678 do you suggest trying with a lower remote volume size? Size is about 300GB

The 50 MB default would be reasonable, though the link explains why you might want to go higher. I think “higher” would usually mean somewhat higher, but not 6000 (without some reason).

Choosing Sizes in Duplicati

Remote volume size is the relevant section for the above setting, but I think the more relevant section overall is The block size, because the slow SQL in the log is dealing with blocks, not volumes.

DELETE
FROM "Block"
WHERE "ID" NOT IN (
        SELECT DISTINCT "BlockID"
        FROM "BlocksetEntry"
        
        UNION
        
        SELECT DISTINCT "ID"
        FROM "Block"
            ,"BlocklistHash"
        WHERE "Block"."Hash" = "BlocklistHash"."Hash"
        )

so at the default 100 KB blocksize, that means a couple of 10-million-row tables in the SQL above.
As mentioned, this tends not to work well with the default 2 MB memory cache (which we can maybe increase if you can spare memory). Using a 1 MB blocksize cuts DB rows by 10 times, but it can’t be changed on an existing backup. Remote volume size can be changed, but that probably won’t make any difference to the known slow spot. It might help some others though.
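Rough arithmetic behind those row counts (approximate, and ignoring deduplication):

# ~1 TB of source data divided by the block size gives the order of the Block table
echo $(( 1000000000000 / 102400 ))     # 100 KB blocksize -> ~9.8 million rows
echo $(( 1000000000000 / 1048576 ))    # 1 MB blocksize   -> ~950 thousand rows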

300 GB remote volumes (as mentioned in the article) can make for slow restores, as an entire volume must be downloaded to extract even a single byte needed for a file restore, and a restore could potentially need multiple remote volumes to gather all the data it needs.

Some of this is usage dependent, and you have some really unusual-looking source data stats.

image

so I guess these are few but big files?

Compacting files at the backend explains the general plan, as some old backups age out.

image

is interesting because this is the version of compact that doesn’t repackage remaining data.
Maybe your files tend to be extremely unique, so block-level deduplication isn’t doing much.

If you have roughly 300 GB files with unique data, then compacts might always look like the screenshots above, where one source file goes into its own destination volumes and then gets deleted later, so the damage from being way oversize might not be that much, but we don’t know that for certain.

Back to SQL, and don’t worry if the topic is too deep, but this delete area came up in the topic below:

Just looking at the title, you can see that blocks are the concern, and they’re my concern here… A smaller blocksize (like the default) saves more space, but only if there’s actually duplicated data.

This actually might be a good way to test expanding memory for SQLite cache to gain speed.
I don’t use Docker (or OMV), but isn’t setting environment variables for containers very easy?

That could be a quick test without having to rearrange both of the other settings, or start fresh.

Thanks for the thorough explanation!

The 1st thing I’ll try is setting the remote volume size back to 50MB.

Then:

It is easy, but I’m not sure how the SQLite cache can be modified for Duplicati as a container

Assuming that the environment variable passes as far as Duplicati, you could set
CUSTOMSQLITEOPTIONS_DUPLICATI to cache_size=-200000 to get 200 MB.

SQLite documentation explains the number as the negative of the kibibytes cache.
Default is -2000, so 2 MB. Expanding by less than 100 times might also speed up.
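A minimal sketch of passing that in with docker run (the container name and image are placeholders for whatever your setup already uses; with docker-compose it would be one more entry under environment:):

# add the variable alongside the container’s existing options
docker run -d --name duplicati \
  -e CUSTOMSQLITEOPTIONS_DUPLICATI=cache_size=-200000 \
  <your-existing-duplicati-image>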

You can tell if the value gets in by watching About → Show log → Live → Verbose

The message on the bottom line should come out at database opening at backup start.

Actually, there’s only one file of about 300GB, generated by a Windows backup app (EaseUS), that I need to back up offsite.

But now that I think about it, that’s probably not the proper use of Duplicati (one single file that will most likely differ in size from what was previously backed up every time the scheduled job runs).

Being a different size is no problem for deduplication, but if the file also looks changed inside from a fixed-length-block point of view, then deduplication is defeated, so any little change might cause a lot to upload. This works, but is just far more wasteful of bandwidth and limited storage space.
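A small illustration of that fixed-offset effect, using plain shell tools rather than Duplicati’s actual hashing:

# split a file into 100 KB chunks and hash them
head -c 300K /dev/urandom > sample.bin
split -b 100K sample.bin blk_ && md5sum blk_*
# prepend a single byte: every fixed-offset chunk now hashes differently
{ printf 'X'; cat sample.bin; } > shifted.bin
split -b 100K shifted.bin sblk_ && md5sum sblk_*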


Then we might also need some tool to update it once the files are uploaded (which the documentation currently says is impossible), because even small backups can grow over time, and those TBs are already uploaded at the remote site (it’s not always feasible to delete and re-upload). Or perhaps something that can update it dynamically.

But it would explain why my database folder takes up 16GB for my backup sets, with the biggest one accounting for half of that :slight_smile:

It’s extremely awkward to do, as blocks are identified by their mathematical hash, and hashes don’t combine mathematically when you combine blocks. You need to have all the data around to read, but it’s on the destination.

If you want an extremely experimental tool with limitations based on above limit, you can test this:


Thanks, will check it out.

I run backups, so I do have all original data on local disks if that helps :slight_smile:

One of the disks plays the role of an archive, but from a backup perspective it’s the same as the other disks; everything is backed up online.

I noticed a huge difference in backup times when running them directly from my laptop (Duplicati Windows client) compared to having a different app (EaseUS) create the backup on my NAS (OMV), then having Duplicati do its job from the NAS to the cloud storage.

What would be the best approach for large file uploads?

Presumably fast is better? Which way is fast? Is this just a comment, or is there a question?

Are there large files there? If so, please describe the usage. Or do you mean large backups?