Changed file paths after migration to new PC

I replaced my Win10 PC with a new one by transferring all files and programs 1:1 to the new PC.
On the new PC all paths have changed, because the user name changed and I had to move one of the libraries from the C drive to the D drive.

C:\users\Michael… changed to D:\users\buent… and C:\users\buent

Now Duplicati recognises all files as new. Do I have to back up everything again (650MB to OneDrive) and delete my old backups? Or do I just have to recreate the database?

Hi @MichaelB

Duplicati will recognize the files as new because the paths are different; however, the actual data in the files should not need to be uploaded again, because Duplicati deduplicates on a file-block basis. Since your seemingly new files haven’t actually changed, Duplicati “should” just have its new file list reference the old blocks.

Duplicate files are detected regardless of their names or locations, but relatively small dlist and dindex files are still needed in order to list the paths and locate the dblock files that contain the file blocks.

If you want to confirm this, you can use the dry-run option to see whether it wants to upload many dblock files. The manual file transfer might have changed something (file times? permissions?), and I hope Duplicati still deduplicates the file data itself (which is generally the larger part). You could test with one large file and see whether the dblock it would put in OneDrive is small, or roughly the size that large file would be after compression.
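
If it helps, here’s a rough sketch of such a dry run from a command prompt in the Duplicati install folder; the target URL, passphrase, and source folder are placeholders, not values from your actual job:

rem --dry-run lists what would be uploaded without actually sending anything.
rem On older versions, use --log-level=Information instead of --console-log-level.
Duplicati.CommandLine.exe backup "<target-url>" "<source-folder>" --passphrase=<passphrase> --dry-run --console-log-level=Information

If deduplication works, the resulting “Would upload volume” lines should list only small dblock, dindex and dlist volumes, nothing near the full data size.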

If you care about older versions, note that you’ll need to restore them using the old paths. If you limited version retention, the old paths (which look like deletions) will eventually age away, and you might even get a little OneDrive space back…

Edit: I should also add that I’m assuming you migrated Duplicati to the new computer in a way that keeps the old pathnames visible. Here is an example of that. If you just moved an exported job configuration, that’s different. Doing a move that includes the actual local database (the cache of remote data) is probably faster than regenerating it. Somewhat surprisingly, I couldn’t find a nice article on the best ways to do common moves. I might have missed it.

Yes, I migrated the Duplicati database.

I used Laplink Pro to transfer applications and data: time stamps stayed the same, but permissions probably did change.

I checked the (aborted) backup job. It created a large number of dblock files, so it does not look like Duplicati deduplicated the files.

I was hoping that Duplicati worked with relative/virtual paths (%MY_DOCUMENTS%, %MY_PICTURES%, %APPDATA%\XYZ), which stayed the same. But probably permissions changed anyway.

I don’t think I have enough OneDrive space (1TB) to store both backups and wait for the old files to fade away. It looks like I have to delete the old backup and start over. I don’t care about the older versions…

Thanks for your detailed explanation!

Changed permissions would be stored in dblock files too (but fewer). The trick is to determine what’s in yours.

https://www.duplicati.com/articles/Backup-Process/#further-processing

talks about what you should see in terms of dblock output, but if the dblock for the one-large-file test is maybe hundreds of bytes (and there are no other new dblocks), then the deduplication of file content was successful.

Another option might be to see whether there are new dblock files on the backend storage from the aborted backup. You can then decrypt the files with either SharpAESCrypt.exe, included with Duplicati, or a tool from https://www.aescrypt.com/
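
If you try the decryption route, a rough sketch with the bundled tool, if I recall the syntax correctly (run from the Duplicati install folder; the file names and passphrase are placeholders):

rem 'd' = decrypt; use your backup's encryption passphrase.
SharpAESCrypt.exe d <passphrase> <downloaded-file>.dblock.zip.aes <downloaded-file>.dblock.zip

The decrypted .zip can then be opened with any archive tool to get a feel for how much new block data the aborted backup actually created.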

Or you can delete and start over, but I wanted to give some other options, get deduplication fixed if it really does fail in this case, and perhaps save you from some lack-of-backup exposure. That big backup could take a while to upload; however, one plan (which actually applies to any new backup) would be to back up in a specific order of importance. If the source data configuration is too customized to touch, a temporary filter on it might offer a coarser option.

Thanks for all this insight. I had a look at the documentation and even at one of the index and dblock files.

Then I ran the backup again and it completed successfully after 2.5h. I’m sure now that dedup worked as intended. Excellent!

Wow - I didn’t realize that tool was still around! 🙂

I have the same problem, only on Linux. All names changed from /home/media/joe/data… to /home/media/mark/data… I found where to set the dry-run option, but after a run the log results don’t seem to list anything relating to dblock files. How do I see those? Do I need to add another logging option?

Thanks!

The normal job log doesn’t appear to be extensible, but the “Commandline” job option can do this.

Set option --log-level to Information, keep --dry-run set, and you should get “Would upload volume” lines.

On newer Duplicati versions, --log-level would be --console-log-level for this (or else a warning is issued).

Example output from Windows of a small update:

[Dryrun]: Would upload volume: duplicati-ba3c3f3632d7b41f491724fc486b9c7bc.dblock.zip, size: 510 bytes
[Dryrun]: Would upload volume: duplicati-i7626c71eab79420f9efc9a6a20c5b973.dindex.zip, size: 525 bytes
[Dryrun]: Would upload volume: duplicati-20180905T232724Z.dlist.zip, size: 795 bytes
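
If you’d rather run it from a terminal than the Commandline page, a rough equivalent might look like this; the target URL, passphrase, and source folder are placeholders, and on Linux the launcher is typically duplicati-cli (or mono Duplicati.CommandLine.exe):

# Nothing is uploaded with --dry-run; grep narrows the output to just the volume lines.
duplicati-cli backup "<target-url>" "<source-folder>" --passphrase=<passphrase> --dry-run --console-log-level=Information | grep "Would upload volume"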

Thanks for your reply! This process of changing source file names feels scary … it looks like it’s about to delete thousands of files and start over. I’m surprised there isn’t a more direct way to change source file paths.

But – it seemed to work.

Thanks again!
– Mark

Well, that’s exactly how Duplicati sees it.

It will basically say “Oh, /joe/ and all its contents have been deleted and /mark/ has appeared. I’ll flag all /joe/ contents for deletion, but I won’t actually delete any of it because I see the exact same content being used by /mark/”.

Things to keep in mind include:

  • pre-mark versions will still exist in /joe/ and will want to be restored there unless you tell Duplicati otherwise (see the sketch after this list)

  • pre-mark versions exist in the /joe/ path, so if you look at the version history for a /mark/ path, it will START after the /joe/ history ends

  • because the history has been split, some retention policy settings might act a bit strangely; for example, even if you said to keep monthly versions forever, because /joe/ is seen as deleted, that history will end up being purged
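
For the first two points, a rough way to confirm the old history is still reachable might be the find command; a sketch, with the target URL, passphrase, and path as placeholders for the job’s real settings:

# Lists which backup versions still contain files under the old path.
duplicati-cli find "<target-url>" "/home/media/joe/*" --passphrase=<passphrase> --all-versions=true

Restoring from one of those versions then uses the old /joe/ paths.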

In other words, assume your retention policy is starting over. @kenkendk, is this something that should be addressed in some way?

That would be really tricky to get right. Since we do not integrate with the OS or filesystem, we can only look at the contents. When we see data has disappeared, we can only assume it has been deleted, and when we see new data we see it as that: new data.

In the simple case of a rename, all the data stays the same, so we could detect it, but we don’t know if it was a copy+delete, and we could find multiple copies. And then, what if some data changed before/after being renamed but in between backups? I don’t want to go down that path.

But I see the issue with the retention policy removing stuff. Previously the deletion strategy was very simple: always delete old versions. With that simple approach, you would simply have a switching point.
But now that we delete versions in between, it could get a little messy, but it will eventually make sense.

Not sure what we can do about it?

I completely agree about the first part; I was mostly asking about the retention policy stuff.

To a smaller extent this also applies to simple renames (such as a file or folder name change) or file moves, as they’ll also reset history since the item looks new to Duplicati.

I don’t know how many people would care about it, but the few that do probably REALLY care. For them, I’m thinking a note about this behavior would probably be enough.

Maybe a short note like “Note that file or folder moves/renames will start a fresh history at the new location”. That should be enough to catch the eye of those who would care and hopefully prompt a visit here for more detail.