Duplicati created 70 GB of redundant backup data - Can I delete it?

Hi!

I use Duplicati on an Ubuntu system. I had to restore a 2 months old drive image backup of the Ubuntu system partition.
I forgot to recreate the database in Duplicati right after the restore, so when it started backing up, it seems to have backed up the 2 months of new and changed files (which had already been backed up before the drive image restore), increasing the backup from 330 GB to 404 GB. I guess these roughly 70 GB of new data are redundant.
Then I did a recreate database and I tried something called “Compact Now”, but the backup is still 404 GB.
The strangest thing is that 1-2 years ago, I had already moved the Duplicati database to another partition, that was NOT changed by the restore.
And I can add that I have “Smart Backup Retention”.
Now, the question is if there is some safe way of getting Duplicati to delete the 70 GB redundant data?

Thank you. :slight_smile:

This is a good way to break Duplicati. If you grab an old database, then Repair, newer backup is deleted. Going for backup would get complaints about all the excess files. Did you get any messages about this?

How could they be present to backup after an old image restore? Is there another restore not mentioned?

What did the image restore overwrite? Duplicati installation? Duplicati job configuration? What needed fix?

Posting the statistics from the top of your job log might help. Are pre-image-restore backup logs present?
If so, you can look at The COMPARE command in Commandline after clearing Commandline arguments.

Hi ts678!

Thank you for your reply!

I can’t see how to quote here…

I forgot to emphasize that the majority of data to backup by Duplicati is on another partition which was not restored or otherwise affected. (A non Ubuntu data-partition.)

The strange thing is that I moved the Duplicati database to that data-partition long ago (to avoid the type of problem that I am now facing…)

I did not get any complaints about excess files.

The drive image restore overwrote the Ubuntu system partition, including the Duplicati installation files. I didn’t try to fix anything before Duplicati did its first automatic daily backup after the restore.

Here is the part of Duplicati that I moved long ago to the data-partition which is unaffected by the drive image restore:

/mnt/4AF15A0435E762B4/Duplicati/ :

total 2897672
0 drwxrwxrwx 1 root root 0 Sep 7 21:40 control_dir_v2
4 drwxrwxrwx 1 root root 4096 Sep 12 2020 updates
1148724 -rwxrwxrwx 1 root root 1176293376 May 20 2021 ‘backup NQWTCGLDTL 20210521060232.sqlite’
368 -rwxrwxrwx 1 root root 376832 Nov 8 14:29 Duplicati-server.sqlite
1748576 -rwxrwxrwx 1 root root 1790545920 Nov 7 21:11 NQWTCGLDTL.sqlite

/mnt/4AF15A0435E762B4/Duplicati/control_dir_v2 :

total 1
1 -rwxrwxrwx 1 root root 6 Nov 8 14:28 lock_v2

/mnt/4AF15A0435E762B4/Duplicati/updates :

total 62
24 drwxrwxrwx 1 root root 24576 Oct 16 2019 2.0.4.23
36 drwxrwxrwx 1 root root 36864 Jan 25 2020 2.0.5.1
1 -rwxrwxrwx 1 root root 7 Jan 25 2020 current
1 -rwxrwxrwx 1 root root 283 Jun 2 2018 installation.txt
1 -rwxrwxrwx 1 root root 417 Jun 2 2018 README.txt

I tried a “Recreate database” before posting here on the forum.
In the browser interface, under “General”, there doesn’t seem to be any log before that.

Is this the part of the log you are asking for?:

  • 1 Nov 2022 02:01 - Operation: Backup

  • 1 Nov 2022 01:11 - Operation: Repair

Time Start 2022-10-31 19:44:37 End 2022-11-01 01:11:01 Duration 05:26:24

Source Files Examined 0 (0 bytes) Opened 0 (0 bytes) Added 0 (0 bytes) Modified 0 (0 bytes) Deleted 0

Recreate Database Phase

Start 2022-10-31 19:44:37 End 2022-11-01 01:10:51 Duration 05:26:14 Warnings 0 Errors 0

Warnings 1

  • 2022-10-31 21:56:17 +01 - [Warning-Duplicati.Library.Main.Database.LocalRecreateDatabase-MissingVolumesDetected]: Found 1 missing volumes; attempting to replace blocks from existing volumes

Errors 2

  • 2022-10-31 20:46:08 +01 - [Error-Duplicati.Library.Main.Operation.RecreateDatabaseHandler-MissingFileDetected]: Remote file referenced as duplicati-b692ed959eb644f0a9fc9154ea5860636.dblock.zip.aes by duplicati-i4fa7dedd5a6645409ef8490a3fd05c81.dindex.zip.aes, but not found in list, registering a missing remote file
  • 2022-10-31 20:46:08 +01 - [Error-Duplicati.Library.Main.Operation.RecreateDatabaseHandler-IndexFileProcessingFailed]: Failed to process index file: duplicati-i4fa7dedd5a6645409ef8490a3fd05c81.dindex.zip.aes

Complete log

Highlight text, get a quote option.

Database Repair from Recreate button loses old logs from database. Your newer backup then shows.

If you showed Repair, Backup was the one that would maybe explain what was seen, then backed up.
Repair log is interesting because errors aren’t normal, but seeing finish and not stop is probably good.

Now that I know a little more about what you did, wouldn’t it back up the 2 month old files from image?
What’s your retention set to? You can also check your Restore dropdown to look at the backup dates.
If you only keep them a short while in Duplicati, you might have re-added old versions rather than new.

If you can decide that the backup version is not desired, find its number on Restore dropdown and run
The DELETE command. Use Duplicati to get the more current restore, and its next backup will do little.

I am sorry, but this is beginning to sound a bit confusing and complicated to me. :frowning:
I had hoped that there was some ‘button’ on the Duplicati interface that would do the trick.
As I said earlier: I use “Smart Backup Retention”

Let me try again to follow the short initial explanation and its later amendments about partitioning, etc.

Is the partition below the one that holds what seem to be Duplicati’s databases, and also the data it backs up?

Do you recall how you moved it? Can you also look at the old location? Did reinstall mess up move?
--server-datafolder or the DUPLICATI_HOME environment variable would be the usual ways to do the move.

First looks like the server database with your configs (which is why you didn’t need to redo the config).
Large file (this looks like a big backup) would be the job database. Does that screen verify that’s right?
I don’t know what was Nov 7 and Nov 8. You only posted 1 Nov 2022 02:01 - Operation: Backup

If the drive image restore didn’t affect it, why would its recreate be helpful?

On the Restore dropdown, do you see what you like in terms of old backups, plus 1 Nov 2022 02:01?

There is always the possibility of downloading your backend to an old hard disk for the sake of security, then wipe the backend and the backup, delete the database, and create a new backup. Easier than trying to recover from two consecutive mistakes.

Perhaps, but a delete is also pretty easy. The harder chase is figuring out why it should be necessary.

Maybe “it seems to” idea could be explained by @Henr. How can we know no files changed recently?
Although we need the compare command to see actual file names, log has general info. An example is:

[Screenshot: source file statistics at the top of a job log]

Or in Complete log

  "DeletedFiles": 0,
  "DeletedFolders": 0,
  "ModifiedFiles": 6,
  "ExaminedFiles": 6526,
  "OpenedFiles": 6,
  "AddedFiles": 0,
  "SizeOfModifiedFiles": 104091431,
  "SizeOfAddedFiles": 0,
  "SizeOfExaminedFiles": 7904386730,
  "SizeOfOpenedFiles": 104091431,
...
    "BackendStatistics": {
      "RemoteCalls": 12,
      "BytesUploaded": 44419466,
      "BytesDownloaded": 53490071,
      "FilesUploaded": 4,
      "FilesDownloaded": 3,

The claim was that BytesUploaded would be 70 GB, so how does that look from source file statistics?
Uploaded data would typically be lower because only changed blocks are uploaded from modified files.
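A rough way to make that comparison, assuming the Complete log has been copied out as JSON. The field names are the ones in the excerpt above; the function name `summarize` and the trimmed sample are only illustrative:

```python
import json

# Trimmed sample built from the excerpt above; a real Complete log has many more fields.
SAMPLE = """
{
  "ModifiedFiles": 6,
  "AddedFiles": 0,
  "SizeOfModifiedFiles": 104091431,
  "SizeOfAddedFiles": 0,
  "BackendStatistics": {
    "BytesUploaded": 44419466
  }
}
"""

def summarize(log_text):
    """Return the changed-source size and the uploaded size from one job log."""
    log = json.loads(log_text)
    return {
        "changed_files": log["ModifiedFiles"] + log["AddedFiles"],
        "changed_source_bytes": log["SizeOfModifiedFiles"] + log["SizeOfAddedFiles"],
        "uploaded_bytes": log["BackendStatistics"]["BytesUploaded"],
    }

if __name__ == "__main__":
    for key, value in summarize(SAMPLE).items():
        print(key, value)
```

If the backup after the image restore really uploaded about 70 GB, `uploaded_bytes` for that job should be in that range, and `changed_source_bytes` should be at least as large.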

Easiest path, if the Restore looks right, is to drop the idea that the new data is redundant and treat it as needed data.
Linux find command could also see what source files have changed recently, if one wants that opinion.
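For anyone who prefers a script over find, here is a minimal Python sketch of the same idea. It only looks at file modification times; the function name `recently_changed` and the command-line handling are my own assumptions, not anything Duplicati provides:

```python
import os
import sys
from datetime import datetime

def recently_changed(root, cutoff_epoch):
    """List files under root whose modification time is at or after cutoff_epoch."""
    hits = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                if os.path.getmtime(path) >= cutoff_epoch:
                    hits.append(path)
            except OSError:
                # Broken symlinks or unreadable entries are skipped.
                continue
    return sorted(hits)

# Usage: python changed.py <source folder> <YYYY-MM-DD>
if __name__ == "__main__" and len(sys.argv) == 3:
    cutoff = datetime.strptime(sys.argv[2], "%Y-%m-%d").timestamp()
    for path in recently_changed(sys.argv[1], cutoff):
        print(path)
```

Run against the backup source folders with the date of the restored image as the cutoff, this gives a rough list of what a post-restore backup would see as changed.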

EDIT:

“Smart Backup Retention” is the setting which the GUI explains very briefly as:

Over time backups will be deleted automatically. There will remain one backup for each of the last 7 days, each of the last 4 weeks, each of the last 12 months.

Is above what Restore list shows? The backup version (for a delete) is the first number (before date).
Newest one is always version 0. This is what you give to the delete command as the version option.
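To illustrate the numbering only (this is not Duplicati code): sort the backup times newest first and count from 0. The 1 Nov 2022 02:01 time is from this thread; the other two timestamps and the function name `number_versions` are made up for the example:

```python
from datetime import datetime

def number_versions(backup_times):
    """Pair each backup time with its version number, newest backup = version 0."""
    newest_first = sorted(backup_times, reverse=True)
    return list(enumerate(newest_first))

# Only the 1 Nov 2022 02:01 entry comes from the thread; the rest are hypothetical.
times = [
    datetime(2022, 10, 30, 2, 0),
    datetime(2022, 11, 1, 2, 1),
    datetime(2022, 10, 31, 2, 0),
]

for version, when in number_versions(times):
    print(version, when.isoformat(sep=" ", timespec="minutes"))
```

Note that the numbers shift every time a new backup is made, so check the Restore dropdown again right before running a delete.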

One scenario could be that restoring the old image backup of the system partition wiped out changes to a systemd unit file or an /etc/default/duplicati file that did the move. This seems unlikely because of the post:

but that theory begs the question of what mechanism preserved the move despite image restore. Some methods would do it, so I’m asking how it was done. I’m not hearing it was done again to get going again.

If you don’t know where past databases may be, one guess would be ~root/.config/Duplicati, but it varies.

Looking at what’s in the database for Restore is good too. If still unsure, maybe do a restore of some sort. Date and contents of the version just before the one being complained about would be a good one to test, because if it’s a lot different, that would explain why the version complained about had to upload changes.

This is getting too complicated. You are asking questions I don’t think I can answer. :frowning:
I guess I will have to accept that there is about 70 GB redundant data (out of 404 GB) in my backup.
The saddest thing is that I moved the Duplicati database away from the Ubuntu system partition exactly to avoid this kind of problem…
And maybe I have configured Duplicati in some erroneous way that makes it backup a lot of ‘crap’ on the Ubuntu system partition?

Thank you for trying to help me!

You can’t push the Restore button then type or screenshot the view?

I’m not accepting that yet. All I know without help is that backup grew.
Maybe all is fine. Without any information, I can’t say. Test backups…

Good luck.

EDIT:

If you’re desperately trying to cut costs, lose old versions, and clean up the DB recreate glitch, deleting the database and Destination contents will let you start clean (and leave you without a full backup for a while).
Or you can (as suggested earlier) keep old for some time and also get new one. Or do some variation.

What I understand is this:

  • backup initially good.
  • restore image: database restored to 2 months earlier.
  • Duplicati starts a new backup: it sees a lot of new files because its database is missing 2 months of backups, and it can’t know that all these backups are already on the backend. So it backs up these new files.
  • at the moment, the previous backups are not in the (old + new backup) database
  • when the database is recreated, all these files are reincorporated in the new database. Duplicati does its dedup while backing up; it can’t be engineered to ‘think’ that there is duplicated data on an existing backend, because that would mean it has made bad mistakes, and software doesn’t like to admit mistakes.
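The scenario in the list above can be sketched with a toy model. This is not Duplicati’s code, only an illustration under the assumption that the local database is the sole record of which block hashes are already on the backend; all names here are hypothetical:

```python
import hashlib

def backup(blocks, known_hashes, remote):
    """Upload every block whose hash the local database does not know yet."""
    uploaded = 0
    for block in blocks:
        digest = hashlib.sha256(block).hexdigest()
        if digest not in known_hashes:
            remote.append(digest)      # stands in for writing the block to the backend
            known_hashes.add(digest)
            uploaded += 1
    return uploaded

remote = []   # what is stored on the backend
db = set()    # block hashes recorded in the local database

first = backup([b"old file"], db, remote)
old_db = set(db)   # copy of the database as it was at image time
second = backup([b"old file", b"new file"], db, remote)

# Backup after reverting to the old database: "new file" is uploaded again.
third = backup([b"old file", b"new file"], old_db, remote)

print(first, second, third)            # uploads per run
print(len(remote), len(set(remote)))   # stored blocks vs. distinct blocks
```

In this model the backend ends up holding one block twice, which is the kind of redundancy being discussed. Whether the real database was actually reverted in this case is still an open question.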

At this moment things begin to look very bad for deleting things manually on the backend, because you don’t know how Duplicati has deduplicated stuff. Theoretically deleting stuff with Duplicati (the ‘Delete’ command) up to the 2 month limit could do the trick. But I think that is pushing things.

The claim is that it was not, and that database was not changed. Truth unknown, but under your case:

You’re talking about source files, but it sees a lot of new backup files too, and it errors, looking like this:

[Screenshot: error shown when the backup finds unexpected files on the backend]

so doesn’t match unless Advanced option called no-backend-verification was used. It seems unlikely.

If this flag is set, the local database is not compared to the remote filelist on startup. The intended usage for this option is to work correctly in cases where the filelisting is broken or unavailable.

Set it anyway and continue test by Backup. In this case I get redundant data, but did it happen this way?

Why? Suggested database revert to old one would have had them. What would have removed previous?

To be specific about my test, I took a small text file. I back up, change it, back up, revert to the old DB, back up.
Files are sets of 3 with dblock, dindex, and dlist. First two sets are good. Set after DB revert is redundant.
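To count those sets in a listing of the destination folder, a small Python sketch like this could help. The dblock and dindex names are copied from the repair log earlier in the thread; the dlist name is a made-up example of the date-in-filename pattern, and the function name `classify` is just for illustration:

```python
import re
from collections import Counter

# Matches the file type embedded in a destination file name.
KIND = re.compile(r"\.(dblock|dindex|dlist)\.")

def classify(name):
    """Return 'dblock', 'dindex' or 'dlist' for a destination file name, else None."""
    match = KIND.search(name)
    return match.group(1) if match else None

names = [
    "duplicati-b692ed959eb644f0a9fc9154ea5860636.dblock.zip.aes",  # from the log above
    "duplicati-i4fa7dedd5a6645409ef8490a3fd05c81.dindex.zip.aes",  # from the log above
    "duplicati-20221101T010101Z.dlist.zip.aes",                    # hypothetical dlist name
]

counts = Counter(classify(name) for name in names)
print(counts)
```

The number of dlist files should match the number of versions that a recreated database offers in the Restore dropdown.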

Restore dropdown is missing the second backup at 2:06 because DB revert took it to 2:04. 2:18 is newer:

[Screenshot: Restore dropdown after the DB revert, without the 2:06 backup]

Recreate from destination files is prone to confusion (as you say) from earlier confusion, but try it anyway:

[Screenshot: Restore dropdown after Recreate, listing all three versions]

Now it knows all three versions, because it saw the three dlist files with those dates in their filenames.
The above kind of view just needs the user clicking the Restore button and dropdown, but I can’t get it…

[Screenshot]

EDIT:

Maybe the missing “previous backups” you meant were those after the image backup and before new?
I originally thought “previous” meant all, or the very old ones, not a hole in the middle times as in above.

You are right, I forgot about that. This needs explaining.

Is this the info you asked for:

Restoring files from a backup describes it. Please read. Have you never restored? If not, time to practice:

In the first step, select the restore point from which you want to restore some files by selecting a date and time behind Restore from. Each restore point will list all files and folders included in the backup exactly as they were at the listed timestamp.

If you compare your image to mine above, you’ll see that mine has expanded to show what backups I have. Additionally, the numbers to the left of the date are the version numbers if you decide you want to do delete. To see the dates and version numbers, click on the visible date. Down-arrow at right is a clue it will expand.

OK… I get the idea… Below is the drop-down menu.
I can add that on Oct 28, 2022 I did the drive image restore on the Ubuntu system partition of a backup from August 23.
Do we need information from clicking on some of them?: