I notice that the backup SQLite database grows to, say, a gigabyte. From incidental information, I gather that this database is vital for restoring. Please correct me if I am wrong. But if so, IMO the GUI should guard against failing to back up the SQLite database along with the other files.
A GUI is obliged not to leave its less familiar users out in the cold. My feeling is that the Duplicati GUI currently does. Since I do think Duplicati is worthwhile, I submit the following.
I have needed (my own fault?) more than a day to get things set up reasonably.
Pitfalls include circular backups in the cloud (including the synchronized user folders), following symlinks by default (when that should be made explicit), and finally the issue above.
Next, the default settings seem to me to need improvement to reach reasonable performance, especially on the first (voluminous) backup.
Thank you for taking this into consideration,
Jan
That is a common misunderstanding. I have written a post to clarify this:
Explaining the local database.
Earlier versions of Duplicati had a bug that sometimes caused the index files to be incorrect, leading to a very long-running database recreate. For that reason, some users prefer to make a backup of the local database, but Duplicati does not encourage this, as it should not be needed.
I think we should strive to get the best possible defaults.
If you can explain the issue with symlinks being followed by default, we can consider changing the defaults.
Regarding the circular backup, I assume you are making a backup of a folder that syncs the backup data back to the local disk. I am not sure what a good way to detect or prevent this would be.
That depends on: network speed, storage solution, source size, and restore expectations.
If you store on something that can handle large numbers of files (S3, B2, etc.), I would argue that 50 MB is a reasonable size, especially since it makes restoring partial sets less time-consuming.
I would love to hear arguments for having a different default size though.
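To make the trade-off concrete, here is a small back-of-the-envelope sketch; the 1 TB source size and the no-deduplication assumption are illustrative examples, not Duplicati defaults:

```python
import math

def remote_volume_count(source_bytes: int, volume_bytes: int,
                        dedup_ratio: float = 1.0) -> int:
    """Estimate how many remote dblock volumes a backup produces.

    dedup_ratio is the fraction of source data actually uploaded after
    deduplication/compression (1.0 = no reduction at all).
    """
    uploaded = source_bytes * dedup_ratio
    return math.ceil(uploaded / volume_bytes)

# Example: 1 TB of source data in 50 MB volumes, no deduplication,
# yields 10**12 / (50 * 10**6) = 20,000 remote volumes.
print(remote_volume_count(10**12, 50 * 10**6))   # 20000
```

So a larger volume size shrinks the remote file count (helpful on storage that struggles with many files), while a smaller one keeps partial restores from downloading much unrelated data.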
Local database backup: thank you for the clear explanation. Storing the (compressed) database in the cloud alongside the backup would also make things a bit less opaque.
Symlinks are, IMO, a method to avoid duplication, which aligns with a default action of "store" during backup. E.g., an LLVM build contains loads of symlinks for exactly this purpose, or really, to satisfy the expectations of its own or other client programs. Other useful aspects are avoiding divergent versions and enabling easy, quick updates/upgrades.
When a symlink points to an object outside the (current) tree, it may be argued that following the link better guarantees a full backup, but I would back up everything in question anyway.
The performance points were not about the dblock size (50 MB), which is reasonable and, I believe, not particularly sensitive, but about the hash block size, which is perhaps more sensitive.
At first (and still, though less so) my resources were/are underutilized: the upload speed is very spiky with a low average, CPU utilization seldom exceeds 2 threads (out of 24), and disk utilization stays around 10%.
Circular backup: I back up to a Dropbox destination and have a Dropbox-synchronized folder tree.
So when I back up the volume containing the synchronized folders, I need to take care that the backup destination is cloud-only.
I am not sure whether Duplicati can ascertain that, but perhaps it is possible? As a quick fix I also excluded the complete synchronized folder tree, but I still would not have liked the backup to be pushed back onto the local volume.
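A check of this kind seems feasible in principle for the plain local-folder case; here is a minimal sketch (a hypothetical helper, not part of Duplicati) that flags a destination folder lying inside one of the source paths. The cloud sync-back case would additionally require knowing the sync client's local folder, which this sketch does not attempt:

```python
from pathlib import Path

def is_circular(source_paths, destination) -> bool:
    """Return True if the destination folder equals, or is contained in,
    any source path -- i.e. the backup would write into its own source."""
    dest = Path(destination).resolve()
    for src in source_paths:
        src = Path(src).resolve()
        if dest == src or src in dest.parents:
            return True
    return False

# Example: backing up the whole home folder to a Dropbox subfolder
# that is synchronized back to the local disk.
print(is_circular(["/home/jan"], "/home/jan/Dropbox/backups"))  # True
print(is_circular(["/home/jan/photos"], "/mnt/nas/backups"))    # False
```

A GUI could run such a check when a backup job is saved and warn the user, while still letting them proceed, e.g. when the overlapping subtree is covered by an exclude filter.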
A suggestion for likely, but not overly safe, defaults: keep the default at the conservative value, but have a new backup pre-populate an advanced option that does the likely thing. In the case of symlinks, that is store rather than follow.
The user then sees this option and can delete it with one click if desired. If the global settings already give it a value, of course respect that value and do not inject an advanced option at the backup level.