Understand backup behavior

Hi guys! Recently I tried a restore and noticed something very strange. I'll try to explain with an example:

Suppose you have a backup job with a retention policy of “keep last x versions”, with x = 4, and the backup runs once a day. In the beginning the data to back up are two directories, “FOLDER_A” and “FOLDER_B”, and each directory has contents which are modified every day.

Day 1: the job backs up both directories with their files.
Day 2: the job backs up both directories with their files - some files have been modified or deleted.

Now suppose you no longer want to back up “FOLDER_B” and you modify the backup configuration accordingly.

Day 3: the job backs up only “FOLDER_A”
Day 4: the job backs up only “FOLDER_A”
Meanwhile FOLDER_B could have been renamed, or its files modified…

Now we have four versions of the backup. When I select the version to restore files from, I expect to see the following in the file explorer box at step one of the restore procedure (in the web UI):

day 1: I see the “FOLDER_A” and “FOLDER_B” with their files
day 2: I see the “FOLDER_A” and “FOLDER_B” with their files - if one or more files were deleted since day 1, they shouldn’t appear
day 3: I see only “FOLDER_A”
day 4: I see only “FOLDER_A”

Instead I have the following situation

day 1: I see the “FOLDER_A” and “FOLDER_B” with their files
day 2: I see the “FOLDER_A” and “FOLDER_B” with their files
day 3: I see the “FOLDER_A” and “FOLDER_B” with their files
day 4: I see the “FOLDER_A” and “FOLDER_B” with their files
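
To make the expected listing concrete, here is a minimal Python sketch of the model described above: each version should list only the files under the folders that were selected when that backup ran. The `snapshot` helper and the file names are made up for illustration; this is not how Duplicati is implemented.

```python
# Hypothetical model: each backup version lists only files under the
# folders that were selected when that backup ran.

def snapshot(selected_folders, files_on_disk):
    """Return the sorted paths one backup version is expected to list."""
    return sorted(
        path
        for folder in selected_folders
        for path in files_on_disk.get(folder, [])
    )

# Made-up disk contents per day (b2.txt is deleted before day 2).
disk_by_day = {
    1: {"FOLDER_A": ["FOLDER_A/a.txt"], "FOLDER_B": ["FOLDER_B/b1.txt", "FOLDER_B/b2.txt"]},
    2: {"FOLDER_A": ["FOLDER_A/a.txt"], "FOLDER_B": ["FOLDER_B/b1.txt"]},
    3: {"FOLDER_A": ["FOLDER_A/a.txt"], "FOLDER_B": ["FOLDER_B/b1.txt"]},
    4: {"FOLDER_A": ["FOLDER_A/a.txt"], "FOLDER_B": ["FOLDER_B/b1.txt"]},
}

# FOLDER_B is deselected from day 3 onward.
selection_by_day = {
    1: ["FOLDER_A", "FOLDER_B"],
    2: ["FOLDER_A", "FOLDER_B"],
    3: ["FOLDER_A"],
    4: ["FOLDER_A"],
}

expected = {
    day: snapshot(selection_by_day[day], disk_by_day[day])
    for day in disk_by_day
}

for day, paths in expected.items():
    print(f"day {day}: {paths}")
```

In this model FOLDER_B still exists on disk on days 3 and 4, but it does not show up in those versions because it is no longer selected.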

What is happening? This problem affects my backup job for the local backup databases: at first I stored them in a directory under windows\system32, but later I moved them to another directory. Of course I modified the backup job to point to the correct directory.

Please note: according to my settings the backup keeps the last 15 versions, but I am sure I have backed up the data more than 15 times since the date I changed the source directory. So I see a directory which I shouldn’t see, and if I restore the files inside it I don’t know which date they belong to.

This is very frustrating :frowning:

Update: I found the same problem in another backup, but this time it is related to a change in how the drive is named. This backup is for a USB drive, and in this scenario too I modified the source path: at first I used the drive letter and later the GUID of the hard disk.

I cannot reproduce this. I set up a new backup and selected “Folder A” and “Folder B”. I backed up the test data and confirmed that both folders appear when I go to Restore files.

I then edited the backup job and removed “Folder B” from the backup source list and ran another backup. If I go to restore files, “Folder B” is no longer shown for backup version 0 (most recent backup). But I do see it if I look at the older backup (version 1).

Ok, it sounds like a problem in my specific configuration. Unfortunately I am far from home (where the backups are stored) and I can’t access them; moreover, I can’t spend much time investigating it now.
Anyway, I see this problem even if I try a direct restore from the backup files, which means it isn’t a problem in the local database.

If I try the purge command, it results in an “error 56” and refuses to purge the files because that would be equivalent to clearing the entire dataset.
If I try the purge command on the USB drive backup, it clears only 3 dlist files (which makes sense: the data are the same, only the path is different) but the problem persists.

This is the configuration of one “broken” backup… do you see something strange?

I don’t know if it’s important but in global options I have use-block-cache=true

Does this mean you can’t look at the web UI right now? If you can, I was going to ask you to take a screen shot of the “Restore files” dialog with version 0 selected. You should only see “C:\Program Files\Duplicati 2\Data\” and its files listed.

I don’t think you need to mess around with purge. The older backups and older paths will prune off automatically at some point since you are only keeping 15 backup versions.
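
As a rough illustration of the “keep last 15 versions” retention mentioned above, here is a minimal Python sketch. The `apply_retention` helper and the version labels are hypothetical and this is not Duplicati's implementation; it only models which versions survive, not what each surviving version lists.

```python
def apply_retention(versions, keep_last):
    """Keep only the newest `keep_last` versions; `versions` is oldest-first."""
    if keep_last <= 0:
        return []
    return versions[-keep_last:]

# Hypothetical history: 5 backups made with the old source path,
# followed by 15 backups made with the new source path.
history = [f"old-path-{i}" for i in range(5)] + [f"new-path-{i}" for i in range(15)]

kept = apply_retention(history, keep_last=15)
print(kept[0], "...", kept[-1])
```

Under this model, once 15 backups have run with the new path, no version made with the old path is left.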

Fortunately I have my laptop with a similar configuration (the same history and the same problem), which means I have the Duplicati web UI =) and the local database directory. I don’t have the .dblock and .dlist files with me.

And last but not least, the actual C:\WINDOWS[…]\Duplicati folder (even if it’s irrelevant, because the directory is unselected).

If I remember correctly, this backup was created with a canary released between the latest 2 beta versions and was then updated with canaries. Now I use the current beta and I plan to stay on the beta channel.

Actually, if the problem is related only to the database backup I can simply start a new job (but I’m not sure - I need to investigate), but if it affects other backups it could be a problem :-/

Does changing --usn-policy from On to Off fix this? I can get an error like this with USN On, testing with two files or folders. While USN change journal does record file delete, Duplicati might be applying USN analysis only to the selected files and folders to detect candidate changes in the selected source areas, maybe assuming that anything that didn’t change is unchanged – bad assumption if selection changed.
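
The guess above can be illustrated with a small Python sketch. It is only a model of the suspected behavior, not Duplicati's code: `incremental_backup`, `is_under`, and the paths are all made up. The previous version's file list is carried forward, and journal changes are applied only for paths under the currently selected sources.

```python
def is_under(path, roots):
    """True if `path` is one of `roots` or lies below one of them."""
    return any(path == root or path.startswith(root + "/") for root in roots)

def incremental_backup(previous_files, selected_roots, journal_changes):
    """Carry the previous file list forward, applying only changes inside the selection.

    journal_changes maps a path to "modified" or "deleted".
    """
    current = set(previous_files)
    for path, change in journal_changes.items():
        if not is_under(path, selected_roots):
            continue  # changes outside the selection are never looked at
        if change == "deleted":
            current.discard(path)
        else:
            current.add(path)
    return current

# Version 1 was taken with both folders selected.
version_1 = {"FOLDER_A/a.txt", "FOLDER_B/b.txt"}

# FOLDER_B is then deselected; the journal reports one change in FOLDER_A.
version_0 = incremental_backup(
    previous_files=version_1,
    selected_roots=["FOLDER_A"],
    journal_changes={"FOLDER_A/a.txt": "modified"},
)

print(sorted(version_0))
```

In this model nothing ever removes the FOLDER_B entry: it is no longer selected, so no journal change for it is considered, yet it is still inherited from the previous version.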

Thanks for that screen shot. Looks like backup version 0 was performed 4 days ago? Can you double check and confirm C:\Windows… is not in the source selection list and run one more backup?

@ts678 good guess! I created a dummy backup job (on my local HDD) and followed your suggestion. I reproduced the “bug”, then turned usn-policy off, and everything worked as expected. In addition, I then switched usn-policy back on, and everything still seems to work correctly. Could a workaround (for my backups) be to perform one backup with usn-policy=off and then switch it back to on?

Can you confirm that this mess happens only if the source areas changed?
I also noticed this problem in a backup of an external USB HDD. I specify the source data by HDD-UUID, but the drive letter could change. How can I handle changes of drive letter? Only by forcing usn-policy to off, or is there some trick?

@drwtsn32 thank you for your reply. Anyway, the answer to all your questions is yes.

If you mean the folder (or file) is left when source area is removed, how else could that happen?

If you mean the purge mess, I haven’t been following that and don’t have any comment on it yet.

EDIT:

Are any of you able to come up with easy steps to reproduce, to file an Issue? If not, I can file…
All I had were two folders with one file each, and I’m not even sure the file-in-folder was needed.

EDIT 2:

What it’s supposed to do (I think) is notice when something has changed, and revert to full scan.
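
A minimal Python sketch of that idea, assuming the fix is simply "compare the source selection with the previous run and rescan everything if it differs". The function names and data shapes are hypothetical, not taken from Duplicati.

```python
def is_under(path, roots):
    """True if `path` is one of `roots` or lies below one of them."""
    return any(path == root or path.startswith(root + "/") for root in roots)

def list_files(previous_files, previous_roots, current_roots, disk_files, changed_paths):
    """Return the file list for the new version.

    If the source selection changed, ignore the previous list and do a full scan.
    Otherwise carry the previous list forward and re-check only the changed paths.
    """
    if set(previous_roots) != set(current_roots):
        # Selection changed: full scan of everything currently selected.
        return {p for p in disk_files if is_under(p, current_roots)}

    current = set(previous_files)
    for path in changed_paths:
        if not is_under(path, current_roots):
            continue
        if path in disk_files:
            current.add(path)      # new or modified file
        else:
            current.discard(path)  # file was deleted
    return current

disk = {"FOLDER_A/a.txt", "FOLDER_B/b.txt"}

# FOLDER_B was deselected since the previous run, so a full scan is used.
after_deselect = list_files(
    previous_files={"FOLDER_A/a.txt", "FOLDER_B/b.txt"},
    previous_roots=["FOLDER_A", "FOLDER_B"],
    current_roots=["FOLDER_A"],
    disk_files=disk,
    changed_paths=[],
)

print(sorted(after_deselect))
```

With the selection check in place, the deselected folder drops out of the new version even though the change journal reports nothing for it.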

@ts678 I had in mind a situation where a subdirectory of the data source is moved into another subdirectory (but within the same data source path)…

In my first post I explained the situation, and I confirm that those are the steps necessary to reproduce the bug. Please feel free to open an issue on my behalf.

Issue filed, with more generic steps and some version test. The bug might be from May 2018.

With --usn-policy=On, source deselection retains files and folders from removed items #4071

This seems to run fine if “data source path” means a checkmarked folder, and subfolder under it gets renamed, whereas in original post, I assume original data source path got unchecked (it’s now gone).