How to diagnose why files were not backed up?

Hi folks,

I’ve been using Duplicati for over a year now, and seemingly everything has been going well. I even needed, and successfully completed, my first real, necessary restore yesterday. My setup is Windows, running as a service, with backups every 3 hours (usn-policy = on).

However, today, after adding a few more “Exclude expression” items via the “source data” configuration screen, my next backup included a large number of removed, modified, and added files. I’d removed some apps with Revo Uninstaller a couple of days ago; I don’t know why those only registered as removals in Duplicati today, but the “modified” list seems OK.

What I can’t explain is the “added” files. Most of them were files from November or December of this year; some went back to February. Some were application installations, so large groups of files in localized areas (“Program Files”), but there were also random individual files that I would rarely touch that landed on the “added” list. Of course, I went back to verify, and those files really are not in the backups from just after they would have been created.

I don’t expect an answer to what happened (although that would be nice). What I’m hoping is that someone might have ideas on how to figure out what happened. As you can imagine, I’m a bit freaked out wondering what else might be missing.

Regards,
L

I’m not actually sure what you’re saying, lol. It sounds like the files weren’t backed up before but are now?

I’d like to note that the filters include/exclude dropdown has code that will change the selected value to something it thinks is correct, but it isn’t always. I have to change it back myself after it changes it, to get the value I want and that works in the way I use it. If you’re not paying attention, the meaning of a filter will change. It might do it after you click Next and the screen changes, so you’ll never realize it if that’s what happens.

I never finished gathering the data on that for a bug report. Oh well :) It was well intentioned by Duplicati, but it’s micro-managing and excessive. It also adds bugs and more work to a project without enough people to work on it, and it’s overly complicated. They should dumb that down by all means.

Some other things may be figured out via the Duplicati logs, and others via the remote logs, e.g. FTP server logs, assuming you can get access to the remote logs.

Of course, there are also permissions to consider: some files may not be accessible while an application has them open.

But, I think the first one might have got you here.

Sorry, I struggle with “concise vs thorough”! Basically, a set of files that had existed for a while (days or months) and that, in theory, were part of the source file set did not actually get backed up until today. What changed was that I happened to add a couple of new filters to the filter list. These new filters were not related, at all, to the files that had previously not been backed up.

The best I could come up with is that some filter was somehow being interpreted wrong until adding new ones somehow “fixed” it. Except that the disparate locations of those “not backed up until today” files seem to make that unlikely.

It could have been an access-denied thing previously. Normally, such as when a file is locked, Duplicati should post a warning afterwards. It’s difficult to say whether it should for all access issues.

If you relax the filter use, the backup size and crawl speed will get worse, but it should at least cut down on the possibility of files going missing because of filters.

You could try removing the filters you just added, then place a new file into a location where you feel files were not backed up, run a backup with that file there, and check the restore file list to see whether it’s there. Then add the filters back in and see whether it’s still there after another backup. If it shows up without the new filters but not with them, you have a filter issue for sure.
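
If you want a second check besides the restore screen, the command-line FIND command can list what a backup version actually contains. This is only a sketch, and only if I have the syntax right; the storage URL, passphrase, and “canary” file name below are placeholders for whatever your job actually uses:

    Duplicati.CommandLine.exe find "file://D:\DuplicatiBackup" "*canary.txt*" --passphrase=<your-passphrase>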

You can also help deal with that by using multiple backup applications. I use a minimum of two at all times. Then I also make backup images and occasional manual backups. Never put all your eggs in one basket if you can help it.

Do you have historical records, configuration exports, or anything besides the current database?
A running log-file at a log-file-log-level such as Verbose or beyond (those get big) says a lot.
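
For example, set these as advanced options on the job (the path here is just an example):

    --log-file=C:\ProgramData\Duplicati\backup-log.txt
    --log-file-log-level=Verbose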

DB Browser for SQLite on a copy (for extra safety) of the database could also reveal things.
Searching the Path column in the File view is simpler (in a way) than using find and compare to examine things.
It also probably shows more directly the order and the neighboring files that the backup processed.
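
If you’d rather script that than click around, something like this works on the copy. It’s only a sketch: I’m assuming the local database exposes a “File” view/table with a Path column (the schema varies between Duplicati versions), and the database path and search fragment are placeholders.

    import sqlite3

    # Work on a COPY of the Duplicati local database, never the live one.
    db_path = r"C:\Temp\duplicati-copy.sqlite"  # placeholder path to the copied database

    con = sqlite3.connect(db_path)
    cur = con.cursor()

    # Assumed schema: a "File" view/table with a Path column (varies by Duplicati version).
    fragment = "%Program Files%"  # placeholder search fragment
    for (path,) in cur.execute('SELECT Path FROM "File" WHERE Path LIKE ? ORDER BY Path', (fragment,)):
        print(path)

    con.close()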

This is far worse than a nice log, which shows include and exclude decisions as well as notes on USN.
The change journal can avoid full file scan, but its optimization might preserve old problems.
USN has its own limits, and sometimes must do a full scan anyway. A nice log would say so.
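
As an aside, a standard Windows command (nothing Duplicati-specific) will show the journal’s size and how far back it currently reaches, which helps judge whether it could have wrapped between backups:

    fsutil usn queryjournal C: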

https://github.com/duplicati/duplicati/blob/master/Duplicati/Library/Main/Operation/BackupHandler.cs
https://github.com/duplicati/duplicati/blob/master/Duplicati/Library/Snapshots/USNJournal.cs
are some of the main files for USN, and below is one issue I found (plus its fix), but it’s been fixed for a while:

With --usn-policy=On, source deselection retains files and folders from removed items #4071
Fix issue 4071 - with USN enabled, source deselection incorrectly brought forward #4079

Or maybe USN has nothing to do with it and it’s really a Filters misunderstanding or filter bug.
The TEST-FILTERS command is a safe way to test out previous filters if you remember them.
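
If I remember the syntax right, it’s roughly like this; it prints which files would be included or excluded without touching the backup, and the source folder and filters below are just placeholders for whatever the job actually had:

    Duplicati.CommandLine.exe test-filters "C:\Users\L\Documents" --exclude="*.tmp" --exclude="*\Cache\*"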

Thanks for the help, folks. I didn’t have any good historical information, nor could I recreate the behavior (I know exactly which filters were added) with test backups or via test-filters. I can only guess that it was unrelated to the new filters, but who knows!