How to backup the backup?

Welcome to the forum @emederios

I’m not sure I follow these two thoughts. The latter complete copy would be made incrementally, as the first paragraph suggests. Duplicati makes new files and doesn’t overwrite existing ones, but it will delete files when it runs a compact.

Keeping the original copy made by rsync (or rclone, or whatever one-way file-based sync tool) would allow a restore to use that copy directly, without restoring a backup-of-backup, unless ransomware etc. destroyed it…

Propagating issues, whether from a Duplicati bug or from ransomware, is a concern with cascaded backups. Best practice for backup safety calls for multiple backups made in very different ways. The best protection against ransomware relies on backend security, as malware on the host may have full control of the machine.

SFTP/SSH backups to a Linux server with added security was one approach to protection against that.

Financial services industry regulations may require records in non-erasable and non-rewritable stores which some cloud providers (Amazon S3 for example) can provide. WORM also stops ransomware…
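As an illustration of the WORM idea (bucket name, region, and the 30-day retention below are placeholders, not from this thread), S3 Object Lock must be enabled when the bucket is created, after which a default retention policy can be applied:

```shell
# Hypothetical sketch: create an S3 bucket with Object Lock enabled,
# then enforce a default 30-day COMPLIANCE retention (WORM).
aws s3api create-bucket --bucket my-duplicati-worm \
    --object-lock-enabled-for-bucket --region us-east-1
aws s3api put-object-lock-configuration --bucket my-duplicati-worm \
    --object-lock-configuration \
    '{"ObjectLockEnabled":"Enabled","Rule":{"DefaultRetention":{"Mode":"COMPLIANCE","Days":30}}}'
```

In COMPLIANCE mode not even the bucket owner can delete or overwrite locked objects until the retention period expires, which is what stops ransomware holding the uploader’s credentials.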

Well, I decided to deal with the possibility of intelligent ransomware that searches for the profiles of popular backup programs, and maybe even downloads new definitions periodically (although that could raise a flag). Not to mention that it could also allow remote access, and then we would have a person acting there. And since Duplicati doesn’t encrypt credentials or settings (does it?), not only mapped shares or other obvious external locations could be reached.

By the way: I use SFTP, and the SFTP server is a container/jail without actual SSH, with the FTP server running an SFTP module, with “fake” users, each with their own credentials and base directories.

In my case it’s about 200 GB, still a lot. Especially being in Brazil…

Are you talking about the possibility of the NAS being attacked? And snapshots of only the NAS system?

Those are two different ways of doing it (incrementally or complete every day), maybe I didn’t phrase it well.

When I think of incremental copies of files using rsync or something like that (the files created by Duplicati being backed up) I imagine the files ending up in different folders with timestamps or some sequential order. Would it be OK to have all the incremental backups (backups of Duplicati files) being copied to the same folder (along with the first complete one)? Or could it create some mess when I tried to restore from those files using Duplicati without the original database (like, using the main, generic “Restore” function)?

Yes, I agree. Like I said in another post, I consider the hypothesis of someone acting on the infected computer and retrieving the credentials of whatever backup destination, not only obvious mapped SMB shares.

I use SFTP to FreeNAS, actually to a jail/container without actual SSH, with the FTP server running an SFTP module, with “fake” users, and every user/computer with their own credentials and their own chrooted directories.

But would it be OK to use Duplicati with a destination where data couldn’t be edited or deleted later?

In my case that is where my backups are stored, and then the NAS synchronizes near realtime to B2. So if the backups were encrypted by ransomware on my NAS, it would also affect my B2 copy. I could roll back to a previous snapshot on the NAS. Ransomware running on a separate PC would not be able to affect snapshots on the NAS, so I’m not worried about that. I could also consider delaying the B2 sync - I’m going to think about it…

And aren’t snapshots time or space consuming?

I’m testing this alternative backup model: a Nextcloud server running in a container on the NAS, with its client running on the users’ computers. Their Desktop, Documents, etc. folders were moved to their “Nextcloud” folder and are synced, the client also permits adding other folders to be synced, and the actual backup job runs on the NAS (Duplicati in a container). So the users’ files are synced to the NAS in real time, and Duplicati on the NAS makes the actual backup at night.

Nextcloud would sync ransomware-encrypted files automatically to the server, although the server has a recycle bin and versioning (up to 50% of the user quota) that is just not accessible to the users (they can’t log in to the web interface). There is also an app for NC (on the server) that would somehow prevent or remediate ransomware attacks, but I haven’t looked at it yet.

At this same client I also have another model working, with Duplicati on each computer and an SFTP server on the same NAS as the destination. I’m testing both; both have pros and cons that I’m aware of.

At another client, SyncBackPro is being used to sync files from the computers to the NAS instead of Nextcloud, and Duplicati runs at night as well.

I also started making this secondary backup (from the NAS) to AWS S3 for a couple of customers.

I think it’s too much fun!

Duplicati currently keeps all its backup files from all versions of all source files in one destination folder.
If you keep the structure intact and don’t add non-Duplicati files, you should be able to move the backup to whatever destination you like, so the easiest way to have a single-version backup of the backup is to clone it.

Emergency restore can be done directly from clone files, assuming timing was such that ransomware didn’t clobber that too somehow, including possibility of sync/clone software propagating the damage… Keeping configuration information somewhere safe will help, because the goal is to restore the backup.

is answered above. An unmodified clone of the backup file area should do “direct restore” just as well as original did. When it comes time to get back into the backup business, full DB Recreate should work too.

It’s awkward because recycling old space is impossible (perhaps a maintenance window for compact is a tolerable risk; if an attacker is camped on your system waiting for that, you’ve got a very severe problem).

Defense against trojans gets into a discussion of an append-only approach to file changes as a defense…

EDIT:

The REPAIR command (which is being rewritten so I’m not sure what it will do when/if the rewrite finishes) can alter remote files to try to line up the local database records and the remote. One unfortunate accident that can happen is if an old DB is restored from a backup, repair will consider later backup files as extras and delete them. So in that case, the inability to delete the newer backup files would become a “feature”…


Yes, they consume space based on the amount of data that changes from the time the snapshot is created. In my case I’m keeping them only about a week. That should be plenty of time to discover that I’ve been hit by ransomware and that it has affected the NAS.

But to have unmodified clones of the backup files I would have to clone the entire set of backups every day, right? If I used some tool to make incremental backups of the backup files, with all the complete and incremental files ending up on the same folder, wouldn’t it be a “modified clone”?

To maintain an existing clone, you only need to copy changes, which for Duplicati means either new files or the occasional deletion (a deletion “copies” basically instantly). Existing unchanged files in the clone just remain there…
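As a concrete sketch of that (paths and hostname are hypothetical), a one-way rsync copies new files and propagates the occasional deletion, leaving everything else untouched:

```shell
# -a preserves structure, permissions and times; --delete removes files
# from the clone only when they were deleted at the source (e.g. by
# Duplicati's compact). Unchanged files are skipped entirely.
rsync -a --delete /srv/duplicati-backup/ serverC:/srv/duplicati-clone/
```

Run after each backup, this keeps the clone an exact, byte-for-byte copy of the Duplicati destination folder.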

Something’s still not clicking. Duplicati’s backup is one folder. The initial clone copies all files; after that, new files get copied. The resulting clone would be identical to the original, so I wouldn’t think of it as a “modified clone” but as an exact clone.

Perhaps you’ve never looked at a Duplicati backup folder? Its files don’t look anything like the source files.

How the backup process works

Computer A makes its first Duplicati backup to server B in the morning; at night some tool on server B copies these newly created files to server C. The next day computer A makes its second Duplicati backup to server B, which takes considerably less time than the day before, and later server B makes an incremental backup to server C, but not using different folders with timestamps or anything like that; instead, everything goes to the same single folder. So the folder on server C won’t be an unmodified clone of the folder on server B; I mean, I think it would be at first, but not after the specified retention period for the task kicks in (that’s my point, my doubt here). And I don’t want anything that would actually clone or mirror the two folders, since an unwanted deletion would be mirrored as well, right? Couldn’t it mess things up if someday I needed to restore files from server C without using Duplicati on computer A (that is, without the original database)?

Still not connecting here.

What unwanted deletion? Ransomware wouldn’t delete (I think). If Duplicati means to delete, you mirror it.

Does “backup” mean Duplicati backup? I’m talking about mirroring as the imperfect solution you can get now. Why do different folders and timestamps come up? Mirroring is mirroring: structure and times stay the same.
Server B would make a copy of the newly added files by transferring them to server C. If a file is deleted, it’s deleted.

The database is expendable and can be recreated from the backup. It’s a cache of information about that. Database management shows the Recreate button. It can be slow, sometimes finds issues, but it exists.

Say the initial backup makes 101 files: 50 dblock, 50 dindex for the dblocks, and 1 dlist. Mirror all of those.

The next day’s backup makes 5 files with the changed blocks: 2 dblock, 2 dindex for the dblocks, and 1 dlist. Mirror them.

Sometime later Duplicati decides there are too many files or too much space is wasted, so compact packs old dblocks into new ones. Old dblocks whose content blocks were repackaged have no value, so delete them and mirror the deletion.

Having to deal with append-only updates stops deletions, so you set --no-auto-compact and waste space. There are several wishes mixed together here. Maybe having a backup of the backup is enough by itself? Arrange access methods and credentials to make it harder for malware to destroy both of the backups…
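With rclone, for example, the two stances map onto different subcommands (paths and remote name are hypothetical):

```shell
# "sync" makes an exact mirror: deletions at the source (e.g. from
# Duplicati's compact) are propagated to the target.
rclone sync /srv/duplicati-backup remote:duplicati-clone

# "copy" never deletes on the target: an append-only style that pairs
# with --no-auto-compact, at the cost of space wasted by stale dblocks.
rclone copy /srv/duplicati-backup remote:duplicati-clone
```

The append-only variant protects the clone from a compromised source that starts deleting files, but the target grows forever unless something with broader permissions prunes it.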

Agreed (except there aren’t generally any changed backup files). :wink:

If you want more than a single-copy mirror, you can still have Duplicati back up a Duplicati backup, but the setup is up to you. You’d also need to restore from the second-level backup to somewhere else before you could restore the original files, and that restore can be slower than a simple file copy because dblocks need to be gathered, disassembled into blocks, and the blocks then put into the final destination. It’s slower than file copy-back, and file copy-back is slower than just pointing Duplicati at the second-level destination and doing a direct restore or whatever.

The main point of MY observations and questions here is based on the possibility of advanced ransomware, or someone with remote access, attacking the destination of the backups of the infected computer.

In those particular sentences, no (because Duplicati doesn’t have such a feature, at least not yet).

A solution that automatically mirrors would also mirror backup files affected by an attacker on an infected computer (or their sudden absence). The solution you give is awesome if we think only about computers that stop working, fires that burn entire offices, etc.

Backup programs that keep files in their original form on the destination, and do complete, differential or incremental backups, usually create a folder for each task on the destination, typically adding a timestamp to the folder names. Cobian Backup is a good example.

So, if I take a set of Duplicati backup files from one computer and, just as a test, dump some Duplicati files from some other computer (or old Duplicati backup files from that same computer) in with them, including some index files, can I be sure that I can restore that backup using Duplicati on a third computer without any problems? Because that would be the scenario after some incremental or differential rsync (or whatever) backups, depending on the retention settings of the Duplicati tasks that are being backed up.

Don’t ever mix backups from different computers together in the same folder unless you use the --prefix option to make them distinguishable. Mixing old backup files from the same computer is totally standard provided Duplicati does the mix. In the example I gave, the 106 files after the second backup sit side by side.

Losing the terminology trail again. rsync is not a backup, so incremental and differential in the backup sense don’t exist; a mirror is a mirror. If retention settings delete a backup version, its blocks are now wasted space (EDIT: provided nothing that remains uses a given block), and that waste is reduced by periodic compact, if available. Regardless, if the original backup is OK, then a mirror of the original backup is OK. As noted, this leaves open the possibility of a ransomware attack on the first backup’s files, but a more sophisticated second-level backup would still back up the damage, provided the timing landed so that could happen. The difference is that (assuming the second backup wasn’t itself wiped out by ransomware) an earlier version of the first-level backup could be restored, to get back the original source files. As mentioned, this is cumbersome and much slower, but you can do it if you wish.

I’m not following the retention settings concern at all. Can you explain the processing you have in mind, assuming you disagree with the processing I just described (all Duplicati process + mirror everything)?

What is the operating system on your second backup target (computer C)? If it supports snapshots that may be your best bet. Ransomware running remotely cannot affect snapshots.

Alternatively, if it’s an operating system that supports filesystem level deduplication (Windows Server, for instance), then maybe you can even get clever and do something more like your original idea where you’d sync the entire Duplicati backup folder on computer “B” to a different folder each day on computer “C”.

This is probably only workable on a LAN unless you have a super fast internet connection. Each daily folder will be quite large, BUT with filesystem level dedupe you will mitigate that issue.

Either way, if your computer “B” backup gets corrupt and it gets copied to computer “C”, you can either roll back to a previous snapshot or just look at an earlier dated folder.
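If computer “C” runs ZFS (a FreeNAS box, for example), that rollback is a couple of commands; dataset and snapshot names here are made up for illustration:

```shell
# Take a read-only, point-in-time snapshot of the dataset holding the clone.
zfs snapshot tank/duplicati-clone@daily-2020-01-15

# List snapshots; each one only consumes space for blocks changed since
# it was taken, so a week of daily snapshots of mostly-static backup
# files stays cheap.
zfs list -t snapshot

# If damaged files were later mirrored in, roll the dataset back.
zfs rollback tank/duplicati-clone@daily-2020-01-15
```

Since ransomware on a remote client has no way to touch snapshots, this gives versioning on the target side even though the mirror itself is single-copy.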

You can copy and paste using Windows’ File Explorer and have a backup. You can also use rsync with the proper options to create incremental or differential backups on the destination, comparing against the files from previous runs (you actually need a script for that).

But the only single point for me is that original backup may not be OK. I’m not talking about the original backup containing versions of files that were encrypted by ransomware on the source computer (or lacking files that were just deleted maliciously), but about the original Duplicati backup files being wiped out or even encrypted by ransomware from the remote computer.

I understand that you are trying to help me (and you are helping me indeed), but the point is not doing something cumbersome and slow, and that would increase the cost.

In two environments I’m trying this model: mirroring the files on the users’ computers to a local NAS and doing the actual daily backups with Duplicati at night, inside the NAS, inside the same pool, plus a secondary daily backup, also with Duplicati, to a remote location. In one case I’m using Nextcloud (the server is on the NAS), in the other I’m using SyncBackPro (to an SFTP server on the NAS). Nextcloud is good at syncing changes automatically to its server, but I’m still working on a way of checking whether clients are connecting to the server properly (I will use the “lastseen” command in a script) and whether they are tampering with the settings somehow.
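For that liveness check, Nextcloud’s occ tool has a user:lastseen command; a sketch of such a script (the install path and user names are assumptions):

```shell
# Run on the Nextcloud server as the web-server user. Prints the last
# login recorded for each listed user; a real script would parse the
# date and alert when it is older than some threshold.
for user in alice bob; do
    php /var/www/nextcloud/occ user:lastseen "$user"
done
```

Note that occ must usually be invoked as the same system user that owns the Nextcloud files (often www-data), e.g. via sudo -u www-data.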

P.S.: I forgot to say: in that same environment with Nextcloud plus Duplicati being tested there are also backups from each computer to an SFTP server on the NAS using Duplicati, and that’s the scenario I’m focusing on here (forwarding the latter backups somewhere else). I started using Duplicati recently, and it’s so neat, I even have HTTPS on every single computer’s web interface, and it would be great if I could stick to it only.

I’m not sure where this is heading. The main backup may not be OK. The secondary may not be OK because it may pick up a bad main backup (this can be covered by versioning), or it may get a direct attack.

Orchestration of the two-stage backup is an issue. If security were not a concern, you might consider having the third-party duplicati-client drive two copies of Duplicati, but giving Duplicati remote control exposes it to attacks on its web server (which is not hardened), or simply to an attacker stealing credentials. Risk could be reduced at the Duplicati level using its IP controls, but external firewalling is safer.

Another thing you can do is use --upload-verification-file and utility-scripts/DuplicatiVerify.py to check backup integrity before doing the secondary backup. This might also find non-security integrity issues.
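As a sketch of how those two pieces fit together (the path below is a placeholder): the option makes the backup job keep a duplicati-verification.json file on the backend, and the script then checks the backend files against it:

```shell
# On the backup job, keep a verification file on the backend:
#   --upload-verification-file=true

# Then, on a machine that can see the backend folder directly
# (the secondary server, for instance), before mirroring:
python DuplicatiVerify.py /path/to/backend/folder
```

If the hashes don’t match, skip the mirror run and investigate, so a corrupted primary backup isn’t faithfully propagated to the secondary.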

A simple attack would do something like encrypting the original source files. The next step up might try to get at the backup by taking access credentials from Duplicati (or whatever tool). If a tool can get there, and an attacker attacks the tool, then the attacker can get there too. Defense must be server-side, e.g. keeping files immutable normally, with the ability to alter that policy well-protected, and without the credentials on the uploading PC.

It really depends on how far you want to go. Ordinary ransomware is probably easiest to stop. Lengthy assaults by skilled attackers inside your computers are harder, as they will try to move through your systems. Unless your data is very high-value, I’d worry more about the simpler forms of attack, but it’s your call.

Offline backups may be an option, but for online all I can say is make sure remote access can’t destroy backups. Don’t even give the uploading systems (which may be totally compromised) a way to do that.

Better still would be to restrict the web interface to localhost, SSH to the remote machine, and browse the localhost GUI. Using HTTPS protects against eavesdropping and gives you some assurance you’re on the correct site; otherwise someone could steal credentials via MITM. HTTPS doesn’t block attacks on the web server, or even simple password guessing. Something like SSH is more hardened and has attack-mitigation tools, though mitigating rapid password guessing can sometimes leave you open to denial-of-service attacks. Also note the earlier recommendation not to leave the web server accessible: best to firewall it if possible, better still to firewall SSH too and stick to localhost. It depends on how seriously you want security. There have been several forum users who worry about specific SSH crypto algorithms they find weak. Security can have weak spots, so please keep the overall system view in mind and use layers of protection.

EDIT: “ssh to remote” refers to port forwarding. Basically create an encrypted tunnel to do your browsing.
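For example, with Duplicati’s web UI bound to localhost on the remote machine (8200 is Duplicati’s default port; the user and hostname are placeholders):

```shell
# Forward local port 8200 to the remote machine's localhost:8200 over an
# encrypted SSH tunnel; -N means no remote command, just the tunnel.
ssh -N -L 8200:localhost:8200 admin@backup-server

# Then browse to http://localhost:8200 on the local machine.
```

The web server never has to be reachable from the network at all; only SSH is exposed.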

Firewall only allows access from one single IP.

And I’ll take a look at the options you suggested. Thank you!

I do use rsync, robocopy and rclone to back up the backups, plus backup-directory snapshotting with versioning. But due to serious Duplicati reliability issues I would recommend running separate backup sets instead of copying possibly non-restorable backups to multiple destinations. At least in that case there’s a possibility that you can restore from another source when the restore fails with one set. And do test your backups with a full restore; the Duplicati test option unfortunately isn’t reliable.

Edit: Repair getting rewritten is great news. Let’s hope it finally ends the extremely serious non-restorable backup issues. I’m sure the issue isn’t technically big, but the ramifications of non-restorable backups are just huge.

I can be sure that I can restore that backup using Duplicati on a third computer without any problems?

No, you can’t, unless you test it often. And if you do, you’ll find out that it works mostly, but at times it doesn’t. See my post linked above.

Edit:
Database recreate - It can be slow, sometimes finds issues, but it exists.

Yeah, it can take weeks and fail after that.

The main point of MY observations and questions here is based on the possibility of advanced ransomware, or someone with remote access, attacking the destination of the backups of the infected computer.

This is exactly why I’ve been asking for remote compact. It would allow the backup clients to work with create-only access, without modify/delete permissions; compaction would be done by a secondary system that does have delete/modify access. Of course it would also reduce the bandwidth needed, because there’s no need to download and re-upload data when compacting, but the primary motivation was security: anything that has been uploaded can’t get deleted.

Of course you can also do this by having versioned backups of the backups, but the way above would be slightly more elegant.

Interesting idea, but it would be a big design change. I think one of Duplicati’s strengths (and in some situations a weakness) is that the back end where data is stored is just dumb storage. There is no remote engine at all.

If there WERE a remote engine, then Duplicati could talk to it instead of the remote storage. The remote engine would be responsible for writing data, handling compaction, pruning, etc. This is more along the design of CrashPlan. There are advantages and disadvantages to it.

I prefer Duplicati the way it is now - client software + dumb remote storage. I’m not having the issues you describe in that other thread, so maybe that is why I’m more confident in its current design.