Can I clone my entire backup to a new location?

Unfortunately, this is a very small team and most of the effort is still being put towards development of the next stable release. Working so closely with a product for a long time can make it difficult to gauge where new, or even intermediate, users might need help - so questions like yours posed in this forum can help us identify what areas of existing functionality need more documentation.

If you are trying to move your backed-up files from one destination to another (say from Google Drive to Backblaze) then yes, cloning the files and pointing your existing config to the new destination should work just fine. People have also used this to “seed” a local USB drive then move the seed to the ultimate destination (usually an offsite SSH or FTP server).
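For example, to move a USB seed to a network share you could do something like this (the paths are just placeholders, so adjust them for your setup):

robocopy "E:\DuplicatiSeed" "\\nas\backups\job1" /E

Once the files are copied, point the job’s destination (in the UI, or the target URL in an exported config) at the new location and run the backup to confirm everything still lines up.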

If you’re trying to make a second backup to a second destination then the cloning will work for the initial “split” but when you import the cloned configuration you’ll have to do a database rebuild to get the cloned configuration to recognize the archive files from the original job.

If you mean in the sense of code version control (such as Git or SVN) it’s technically doable, but not very straightforward. In theory you could make a script that checks out two versions of a file (or backup) into two locations then run your diff tool (I like BeyondCompare, myself) against those two checkouts. However, the way Duplicati is written you’re not likely to find a tool that can “plug in” to it like many do with Git/SVN.

If you mean not having to use local disk space to store your file parts until they fill up an entire dblock (archive) file, then sort of - kenkendk is looking to add a parameter to canary versions that will allow this step to be done in memory instead of on physical disk. In the meantime, some users have achieved a similar effect by creating a RAM drive and pointing their Duplicati temp folder at it.
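If you want to try the RAM drive approach, the option to look at is --tempdir (here R: stands in for whatever RAM drive you’ve created, for example with a tool like ImDisk); in the GUI you’d add it as an advanced option for the job, or on the command line something like:

.\Duplicati.CommandLine.exe backup <storage-url> "C:\Users\me\Documents" --tempdir="R:\duplicati-temp"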

The dangers are relatively minimal as long as you don’t try to mix multiple backups into a single destination folder (it’s doable, but can get messy).

Adjusting sources also shouldn’t cause problems, but it should be noted that a change in path will be seen by Duplicati as a delete of the original files and a create of the new ones. If all you did was rename a folder, the deduplication process will see the guts of the files haven’t changed, so it won’t re-upload anything; however, HISTORY (versions) won’t “transfer”. This means the history of the old folder is now done and the history of the new folder starts at version 1. Depending on your retention settings (keep-versions and the like) the history of the old folder will eventually be removed from the backups.

Sorry, we’re still working on that feature. :wink:

Yeah, makes sense.

If you’re trying to make a second backup to a second destination then the cloning will work for the initial “split” but when you import the cloned configuration you’ll have to do a database rebuild to get the cloned configuration to recognize the archive files from the original job.

Was referring to this. That would be kind of sucky, to be honest. Assuming one wants an exact clone of the backup, they ought to be able to just copy the raw backed-up (.aes) files, and Duplicati should work the same way on another device or from another location, no?

In the meantime, some users have achieved a similar effect by creating a RAM drive and pointing their Duplicati temp folder at it.

That’s pretty smart. I’ve used RAM drives before, they’re pretty easy. Of course, both of those options you mentioned would require a device with lots of RAM if it’s low on disk space.

Well you had darn better get on it right now!

The issue is that exporting the job configuration does not include the local .sqlite file which tracks all the versions, blocks, and hashes(?). When you import the saved configuration the new job is created and a .sqlite database NAME is assigned, but the database itself isn’t created until a backup is run - at which point Duplicati realizes its (empty) local database doesn’t align with the (pre-existing) destination files.

I don’t know if it would work, but in theory once you’ve imported the job you could copy the .sqlite from the old job to the database name of the new job and you might be able to skip the rebuild…
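Just to sketch what I mean (the file names below are made up - check the “Local database path” field for each job to get the real ones, and do this while Duplicati isn’t running):

Copy-Item "C:\Users\me\AppData\Local\Duplicati\ABCDEFGHIJ.sqlite" "C:\Users\me\AppData\Local\Duplicati\KLMNOPQRST.sqlite"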

In fact, I think I’ll try that right now…hang on…

Heh - thanks for the confirmation @Wim_Jansen. I got stuck testing a potential failure scenario. :open_mouth:

@kenkendk, it looks like when an exported job that contains an explicit --dbpath is imported there’s no check to see if any other jobs happen to be pointing to the same dbpath.

I’m assuming that if a job (and destination) is duplicated but the original and duplicate jobs still use the same local .sqlite file, the user will end up with extra or missing file errors when flipping between the two jobs. Is that a scenario we can detect / handle?

That works. Done it in the past.

The amount of space that would be needed depends on your --dblock-size and --asynchronous-upload-limit settings.
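For example, with the default 50MB --dblock-size and the default --asynchronous-upload-limit of 4 (I believe those are the current defaults, but check your own settings) you’d be looking at roughly 4-5 dblock files worth of temp space, so somewhere in the neighborhood of 200-250MB.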

Oh. I was under the impression that Duplicati would have to be looking at some type of actual file in order to commit anything.

…? Wait a minute here… Are you telling me that if I don’t keep the local .sqlite file around, or it gets lost or something, all of my data becomes useless? And are you also telling me that if I use Duplicati on an external drive on one computer and try to read from it in Duplicati on another computer, it won’t work? This better not be the case… Local files should be temporary, and the data itself should be fully, permanently self-sufficient, even in isolation. (And if it is, then copying the raw data from one place to another should clone the database completely.)

I’m assuming when you say “commit” you mean adding a file or change to the backup, in which case, yes - normally Duplicati needs to access the actual file to get the file contents and determine what has actually changed.

Depending on your parameters Duplicati can be told to check date/time stamps for the initial “is this file changed” step which would then be followed by the “OK, we know this file has changed so let’s process the contents to determine what parts are different” step.
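If I’m remembering the option names correctly, --check-filetime-only and --disable-filetime-check are the two knobs that control that first “is this file changed” check - but double-check the advanced options list, as I may have the exact names slightly off.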

I believe there has been some preliminary thought about using COW (Copy On Write) or versioning-type file system metadata to identify changed files and simplify file content processing, but I don’t think it’s anywhere near complete.

Not at all. Duplicati is designed to be resilient when parts of its normal functional data have gone missing (such as a failed local drive or partially lost destination contents). The local database is kept for performance reasons, for doing things like looking up the timestamp of the last time a file was backed up or comparing already backed-up block hashes to ones derived from the “live” file to determine what parts of the file have changed and need to be backed up.

If the local database were to disappear it could still be rebuilt from the archive files in the destination. And I think one can even do a restore directly from the destination files even if the local backup job has disappeared. But in cases such as those, things will take quite a bit longer because so much of the destination content has to be downloaded and re-indexed.
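For what it’s worth, a database recreate can be kicked off from the job’s “Database …” menu (“Recreate”) or with the command line repair command - something along these lines, with placeholders for the URL, passphrase, and database path:

.\Duplicati.CommandLine.exe repair <storage-url> --passphrase="<your passphrase>" --dbpath="<path to .sqlite file>"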

The issue of how long restores take for very large backup sets when the local database is missing has been brought up, and I agree it takes longer than we’d like - but it does restore. Methods for potentially backing up the local database to the destination have been discussed, but I don’t know that any have been deemed workable in the near future.

Copying the raw destination data from one place to another is indeed enough to restore from and/or rebuild the local database.
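As a rough example, a direct restore from a copied destination folder (with no local database at all) would look something like this - the paths and passphrase are placeholders:

.\Duplicati.CommandLine.exe restore "file://D:\copied-backup" "*" --restore-path="C:\RestoreHere" --passphrase="<your passphrase>" --no-local-db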

Great!

I mean… My .sqlite for about 150 GB is only 56 KB big… It just needs to be copied to the location where the data is stored? Since it’s temporary, you don’t even need to back it up on the remote, just replace it after each backup or on manual request (let the user decide).

You have a .sqlite that holds the configuration, and a .sqlite for each backup job. The 2nd one is probably a lot bigger :-), depending on the #files, blocksize, #versions, etc…

Wim_Jansen is correct, there is more than one .sqlite file in play here. If you want to check the size of your data file (rather than your settings file) you can identify its location by using the “Database…” link for your backup, where you should see a “Local database path” field that should help you identify exactly which file is storing your paths, versions, hashes, etc.

Hmmm…I wonder how hard it would be to add the .sqlite file size to this GUI…

Ah, okay. I think both .sqlite files are adjacent to each other. I now have 3 backups, and these are the sizes of their data .sqlite files:

150 GB backup => just over 1 GB
150 GB backup, 2nd copy => just over 1 GB (though, oddly, slightly different in size, even though they have the exact same data and setup)
61 GB backup => 268 MB

So for my data size, it seems the ratio is between ~0.44-0.67% for these data files. Yeah, that’s big enough that I would not want it to constantly recreate it.

Most quite excellent! Can you explain why this worked for me in PowerShell to find files, but specifying the actual file path of the backup did not?

C:\Program Files\Duplicati 2> .\Duplicati.CommandLine.exe find file://dummy "file name" --dbpath="<path to .sqlite file>"

Looks fine to me, though I don’t have a lot of spare time to read everything just yet.

That would be excellent. I do know of PDF converters out there, so there probably exists something for this use case. Ideally this HTML version would eventually have a comment section, like with the PHP and some MSDN documentation, or at least a feedback button that captures the URL/page. That would be your best means of getting reasonable feedback, and I would likely end up being a major contributor to it. :slight_smile:

I do strongly request that the ability to use my external hard drive’s backup on any computer be implemented, if not already, and if it is, that this be documented.

Thanks for all the help/info!

The --dbpath that you can set in advanced settings is actually ignored. Duplicati uses another variable for storing and managing the path to the local database. You can edit this one via the “Database …” menu.

When importing a backup configuration, Duplicati also ignores the dbpath from the configuration file and creates a new (random) filename. This is done to ensure that there are not two different backups that point to the same database. But it can cause confusion/problems:
