What Happens When You Use the Files Being Backed Up?

Something has been on my mind for a while: What happens when you are still using the files that a currently running backup job is backing up?

Example 1: Listening to music files while they are being backed up
Example 2: Editing photos while they are being backed up

Thanks!

It seems to me that neither of those cases is likely to mess with a running backup. Listening to music shouldn’t interfere at all with a file’s ability to be backed up by Duplicati, as any good music player would open the music file you’re listening to in read-only mode. And a photo editor will typically keep an independent copy of the photo in its own cache (plus any undo / meta info it needs), so the only way there’d even be a chance of it messing with a backup would be if you save the edited photo over the old file right when Duplicati is accessing it. Even then, I’d guess the only error would be that your photo editor temporarily fails to save the new one, and you can retry a few seconds later once Duplicati is done backing up that photo.

Enable VSS snapshots and this will be a non-issue.

Duplicati will create a snapshot at the start of the backup and read the files in that snapshot copy. Actual files on your disk will be readable/writable without interfering with the snapshot being used for backups.

I am guessing that if you are NOT using snapshots, you will not be able to modify a file that Duplicati is in the middle of backing up. And likewise if you have a file open and locked at the same time Duplicati tries reading it for backup, it will fail and skip that file.

In your example 1 I doubt the music program will lock the file for exclusive access while it’s just being read by the player, so Duplicati would also be able to read the file for backups.

I would just enable snapshots to avoid any problems.

What about those of us using Linux? Is there something that we can enable to protect against having a file modified mid-backup?

Thanks, this makes sense. For the photo editing part, do I just need to make sure I am not accidentally creating or deleting any files in the directory being backed up?

I am using Linux too and saw this in the manual:

Duplicati can make backups of files that are opened by other processes. For Windows, a snapshot of the file system is created using Volume Shadowcopy Services (VSS), LVM is used on Linux Systems. To be able to create a VSS snapshot, Duplicati needs C++ run-time components for Visual Studio 2015 to be installed and must be run with administrator privileges.

Is LVM automatically used?

@sylerner heads up too

You can create new files in the directory - Duplicati will either see the new file (if it hasn’t yet scanned that directory) or it won’t; either way I don’t believe there’s any risk to the original file (or the new one). And the only issue I know of with deleting a file would be if you delete it after Duplicati has scanned it but before it’s been packed into the backup files, which AFAIK is a pretty short time window (i.e. you’d probably have to be TRYING to mess it up, and even then I’m not sure how easy it would be).

Thanks! So it seems for bigger datasets, the possibility of deleted/modified files affecting the success of the backup is much higher due to longer scan times?

Sounds like another reason to have smaller backup jobs (or to simply not touch anything during the backup process…)

Still no answer on whether LVM snapshots are enabled by default, or if not, how to enable them?

Also, and this really isn’t a Duplicati question but a Linux one, what does one have to do to migrate from having regular ext4 partitions to having everything managed by LVM?

Thanks!

You enable LVM the same way you turn on VSS: by changing the snapshot-policy setting.

https://duplicati.readthedocs.io/en/latest/06-advanced-options/#snapshot-policy
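As a rough illustration of what changing that setting looks like on the command line, here is a small Python sketch that only assembles the arguments and prints them. It assumes the Linux wrapper is called duplicati-cli and that the accepted values are off, auto, on and required, as I remember them from the manual page linked above; please check that page for your version. The storage URL and source folder are made-up placeholders.

```python
import shlex

# Values as I recall them from the manual page linked above (an assumption;
# verify against your Duplicati version).
VALID_POLICIES = ("off", "auto", "on", "required")


def build_backup_command(storage_url, source_dir, snapshot_policy="auto"):
    """Return the argument list for a backup run with a given snapshot policy.

    The executable name "duplicati-cli" is an assumption about the Linux
    packaging; nothing is executed here.
    """
    if snapshot_policy not in VALID_POLICIES:
        raise ValueError(f"unknown snapshot policy: {snapshot_policy!r}")
    return [
        "duplicati-cli",
        "backup",
        storage_url,
        source_dir,
        f"--snapshot-policy={snapshot_policy}",
    ]


if __name__ == "__main__":
    # Hypothetical target and source, for illustration only.
    cmd = build_backup_command(
        "file:///mnt/backup-target", "/home/user/Photos", "required"
    )
    print(shlex.join(cmd))
```

With "required" the idea is that the backup should fail rather than silently continue without a snapshot, while "auto" falls back to reading the live files. On Linux the LVM route also needs root, as noted elsewhere in this thread.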

The original situation doesn’t sound like much of a risk, with or without snapshots. Also, if you somehow back up a halfway-saved photo, the next backup will likely get the full photo. Constant changes are harder to deal with. For example, virtual machines and databases sometimes require special tools and methods (e.g. database dumps) to be application-consistent. LVM snapshots are only crash-consistent. VSS can use application help, if available.

Some Duplicati forum articles on consistency:

Backing up Hyper-V guests / vhdx

Backing Up Server 2012 R@ Access Denied

Linux distro LVM support varies. It adds capabilities and complexity (which some hide until it trips Duplicati).

Basically, I’m just saying that knowing the exact needs is important. Demanding needs are demanding. :disappointed:

As far as I am concerned, the examples I listed are my exact needs (for VMs I would use different solutions).

And it sounds like in this aspect, Windows is better supported than Linux. Am I right to make this assumption? (I am considering temporarily merging my NAS and workstation until at least mid-year)

LVM snapshots require you to have saved your files in a logical volume. Without logical volumes, or a file system that supports snapshots (like ZFS or Btrfs), there is no way to make a snapshot on Linux.

Can Duplicati use ZFS snapshots or only LVM ones?

@sylerner: LVM is not used automatically (as it requires root access + LVM). You can enable it if you have LVM on your system.

But it is less of an issue on Linux, because Linux uses advisory file locking (i.e. there is no hard way to lock a file). This allows Duplicati to read files that are in use, which Windows prevents.
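To make the advisory locking point concrete, here is a minimal Python sketch (Linux/macOS only, since it uses fcntl). One handle plays the application and takes an exclusive lock; a second handle plays the backup tool and simply reads without asking for the lock. The function name is made up for the illustration, and this is not Duplicati code.

```python
import fcntl
import os
import tempfile


def read_while_locked(data: bytes) -> bytes:
    """Write data to a temp file, lock it exclusively, then read it anyway."""
    fd, path = tempfile.mkstemp()
    try:
        with os.fdopen(fd, "wb") as writer:
            writer.write(data)
            writer.flush()
            # The "application" takes an exclusive advisory lock on the file.
            fcntl.flock(writer, fcntl.LOCK_EX)
            # The "backup tool" opens the same path independently and never
            # asks for the lock, so the kernel does not stop it from reading.
            with open(path, "rb") as reader:
                return reader.read()
    finally:
        os.remove(path)


if __name__ == "__main__":
    print(read_while_locked(b"still readable"))
```

The read succeeds because the lock only matters to processes that also try to take it. The flip side is that nothing stops the application from writing while the backup tool is reading.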

Theoretically you could have a partial write, but it is not likely to happen, except for databases. However, most databases use transaction logs so you can usually recover even if you have a copy taken during updates.
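For anyone curious what a partial write looks like from the reader’s side, here is a small Python sketch that compares a file’s size and modification time before and after reading it. This is only an illustration of the idea, not a description of what Duplicati does internally; the function name and the after_first_chunk hook (used only to simulate a concurrent writer) are invented for the example.

```python
import os


def read_if_stable(path, chunk_size=64 * 1024, after_first_chunk=None):
    """Read a file and report whether it appeared unchanged during the read.

    Returns (data, stable). 'stable' is False when the size or mtime differs
    between the stat taken before the read and the one taken after it.
    """
    before = os.stat(path)
    chunks = []
    first = True
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            chunks.append(chunk)
            # Hook called once, after the first chunk, so a demo can modify
            # the file "mid-read". A real reader would not have this.
            if first and after_first_chunk is not None:
                after_first_chunk()
            first = False
    after = os.stat(path)
    stable = (before.st_size, before.st_mtime_ns) == (
        after.st_size,
        after.st_mtime_ns,
    )
    return b"".join(chunks), stable
```

A tool that sees stable come back False could retry the file or flag it. A snapshot avoids the question entirely, because the snapshot copy cannot change underneath the reader.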

@davidrnewman: Not directly, but since it is just invoking a bunch of scripts, it should be possible to adapt it to ZFS:

“Any plans to support snapshots on the modern Filesystems (ZFS et. al)?” is a feature request that goes further into the scripts, but it looks like it might be waiting on help to finish it, perhaps via a pull request. There are some ZFS scripts that perhaps @davidrnewman or any interested and capable party may try.

Sounds like for this aspect, Linux is better than Windows for Duplicati?

Or is this a case of “different sounding solution but same results”?