Increasing backup duration

mmiat · April 4, 2018, 9:16am

server X:
I have 1TB of data. A backup used to take 5-6h, but over the last couple of days it has been getting longer; the last backup took 8h30m. Is it because I keep too many versions? I get 100-200MB of new data per day, not more, and I now have 37 versions.
The Duplicati sqlite DB is now 13GB.

server S:
33GB of data; new data does not arrive every day and is less than 30MB. Backup time has increased from 3h30m to more than 6 hours, with 10 versions.
The Duplicati sqlite DB is now 6.7GB.

Is it normal for it to take so much longer after a few runs?

mmiat · April 4, 2018, 9:43am

Just now it hung with this error:

Failed: Insertion failed because the database is full
database or disk is full
Details: Mono.Data.Sqlite.SqliteException (0x80004005): Insertion failed because the database is full
database or disk is full

But I have space:
150GB in /, where the Duplicati log is, and 1.5T in /srv/dev-disk-by-label-Backup, where the backup files and tmpdir are.
More than 90% of inodes are free as well.
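
For anyone hitting the same error: as far as I know, SQLite can also report "database or disk is full" when the filesystem that holds the database file itself, or the one SQLite uses for its temporary files, runs out of room, so those are worth checking too. Below is a minimal Python sketch of that check. The paths `/` and `/srv/dev-disk-by-label-Backup` come from this post; `/tmp` is only an assumption about where temporary files might end up, and `free_space_report` is a made-up helper name.

```python
import os
import shutil

def free_space_report(path: str) -> dict:
    """Report free bytes and free inodes for the filesystem holding `path`."""
    usage = shutil.disk_usage(path)   # total / used / free, in bytes
    st = os.statvfs(path)             # Unix-only; also gives inode counts
    inode_pct = 100.0 * st.f_favail / st.f_files if st.f_files else None
    return {
        "path": path,
        "free_gb": usage.free / 1e9,
        "free_inodes_pct": inode_pct,  # None if the filesystem reports no inode total
    }

# "/tmp" is an assumed temp location; adjust the list to wherever the
# database file and temporary files really live on your system.
for p in ("/", "/tmp", "/srv/dev-disk-by-label-Backup"):
    if os.path.exists(p):
        print(free_space_report(p))
```

If one of the listed filesystems turns out to be much smaller than expected, that would be a candidate for the intermittent failure.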

mmiat · April 6, 2018, 12:24pm

After a new run it seems OK, no more errors.

Does this have something to do with my problem of increasing backup time?

2018-04-02 - 2.0.3.3_beta_2018-04-02

Added a new retention policy and UI which allows backup versions to decrease over time

JonMikelV · April 10, 2018, 7:04pm

Hi @mmiat, welcome to the forum!

Assuming you’re using the standard block size of 100KB, 100-200MB of new data a day would mean around 1k-2k new block hash lookups per day.

Depending on your system specs and the size of the sqlite database for your backup job, that could be what’s taking so long. Of course, if you’ve got a slow connection to your destination that could be the issue as well (but it would have to be pretty slow to take 8 hours for 200MB).
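
The arithmetic behind that estimate can be sketched in a few lines of Python. This is only a rough lower bound: it assumes the new data packs neatly into full blocks and uses decimal units (1KB = 1000 bytes); `new_blocks_per_day` is a made-up helper name.

```python
import math

KB = 1000
MB = 1000 * KB

def new_blocks_per_day(new_data_bytes: int, block_size_bytes: int) -> int:
    """Lower bound on new blocks, assuming new data fills whole blocks."""
    return math.ceil(new_data_bytes / block_size_bytes)

block_size = 100 * KB  # the 100KB block size assumed above
for daily_mb in (100, 200):
    blocks = new_blocks_per_day(daily_mb * MB, block_size)
    print(f"{daily_mb}MB/day -> about {blocks} new blocks")
```

Many small changed files would push the real number higher, since each changed file contributes at least one block of its own.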

mmiat · April 12, 2018, 3:12pm

I’ve investigated; these are some results:
14/03 6h
19/03 7h (about 800MiB of new data, 30GiB of modified data)
26/03 8h (about 4GiB of new data, 30GiB of modified data)
03/04 11h (about 2GiB of new data, 30GiB of modified data)
Backups are scheduled from Monday to Friday; the data figures are sums over several days (not very precise).
Block size is 100MB.

JonMikelV · April 13, 2018, 4:09pm

Technically it’s not a problem, but a 100MB block size is likely to use more bandwidth than a smaller size - at least with modified data.

On the flip side, with ~6GB of new/changed data every day (~32G / 5 days?) a larger block size is likely what’s keeping your backups from being even slower.

Is your Duplicati sqlite database still at 13G or is it growing a lot every day/week as well?

mmiat · April 16, 2018, 7:31am

The sqlite DB is still 13GB.
Should I delete the backup and start again with a smaller block size? 10MB?

JonMikelV · April 16, 2018, 7:47pm

If you have the space available, I’d suggest you leave the current backup in place and try making a 2nd backup (into a different folder) with the different block size and see how the performance of that one works for you.

The current “runs-too-long” backup is still good and can be restored from, so there’s no reason to delete it unless you’re out of space. You can always turn off the scheduling of the first backup for a while to let the second backup run a few times so you can compare performance.

mmiat · April 17, 2018, 7:06am

OK, I’ll try.
I have 7.4M files and 1TB of data; the average file size is 141KB, so which block size should I choose?
Thanks
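
A rough way to compare candidate block sizes for this data set is to estimate how many blocks each would produce. The Python sketch below is a crude model, not how Duplicati actually counts: it assumes every file has the average size, that each file needs at least one block, and that nothing is deduplicated. `estimate_blocks` is a made-up helper name.

```python
import math

def estimate_blocks(file_count: int, avg_file_bytes: int, block_size_bytes: int) -> int:
    """Crude block count: every file is average-sized and uses >= 1 block."""
    blocks_per_file = max(1, math.ceil(avg_file_bytes / block_size_bytes))
    return file_count * blocks_per_file

FILES = 7_400_000        # file count from the post above
AVG_FILE = 141 * 1000    # 141KB average file size, decimal units assumed

for label, size in (("100KB", 100_000), ("1MB", 1_000_000),
                    ("10MB", 10_000_000), ("100MB", 100_000_000)):
    print(f"{label:>6}: ~{estimate_blocks(FILES, AVG_FILE, size):,} blocks")
```

Under these assumptions the count never drops below one block per file, so once the block size exceeds the typical file size, making it larger no longer reduces the number of blocks.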

JonMikelV · April 19, 2018, 2:32pm

It’s tough to suggest a block size because the results can vary depending on your CPU, RAM, disk I/O, bandwidth, etc. However, did you read this page at all?

Is this 141KB average file size on server X (with the 13GB sqlite file after only 37 versions)?

mmiat · May 2, 2018, 7:39am

I’ve run a new backup with a 100KB block size.
The 1st run took 68h, and the .sqlite file was about 5GB.
The 2nd run took 6h, the 3rd took 5h, and so on for a week.
Now, after 6 runs, the .sqlite file is about 13GB (again…) and the backup has been running for 10 hours, with no indication of how far along it is…
Now I wait, and hope…

JonMikelV · May 2, 2018, 4:26pm

I suspect this is NOT what’s going on for you, but I did want to mention that more recent versions (I think AFTER your 2.0.3.3) included a change that ended up making backups run more slowly. If it does turn out this is the cause for you, a fix is already pending.

Oh, and I just realized I never answered your question about the retention policy stuff from April 6.

The retention policy code only kicks in if you are using a “Backup retention” (on step 5 “Options” of your job edit) other than “Keep all backups”.

While an aggressive retention policy combined with lots of daily source changes could increase perceived backup times (it’s really just increasing the retention cleanup step that happens AFTER the actual backup is completed) I suspect that’s not the case for you.

@Pectojin might have a thought on how to verify, but I really think the database size is the cause of the performance issue for you. I know @kenkendk has an update coming to address the size as a few other users have asked about it, so hopefully that will be available soon.

Pectojin · May 2, 2018, 6:34pm

It sounds like the slowdown is caused by the DB size growing, but the DB growing to 260% of its size (about 5GB to 13GB) after 6 runs sounds crazy.

If just a few files are changed between backups then the DB should barely grow at all, but growth like that would require you to basically replace all your data more than one and a half times during those 6 backups, while still retaining the old backup data (i.e. no retention policy). I can’t think of any other way to cause that kind of DB growth.
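
One generic way to see where the growth is coming from is to count the rows in each table of the job database after every run and compare. The Python sketch below assumes nothing about Duplicati’s schema; it just lists whatever tables exist. `table_row_counts` is a made-up helper name, and it is safest to run it against a copy of the database while no backup is running.

```python
import sqlite3

def table_row_counts(db_path: str) -> dict:
    """Return {table_name: row_count} for every table, opening the file read-only."""
    # mode=ro keeps the sketch from modifying the database; paths containing
    # '?' or '#' would need URI escaping, which is skipped here.
    conn = sqlite3.connect(f"file:{db_path}?mode=ro", uri=True)
    try:
        names = [row[0] for row in conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table'")]
        return {
            name: conn.execute(f'SELECT COUNT(*) FROM "{name}"').fetchone()[0]
            for name in names
        }
    finally:
        conn.close()

# Usage (the path is hypothetical):
# for name, rows in sorted(table_row_counts("/path/to/copy.sqlite").items()):
#     print(name, rows)
```

On a 13GB file the counts may take a while, but a table whose row count jumps between runs would point at what is actually being added.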

It’s worth noting that the upcoming change about DB size is not necessarily a crazy improvement. It’s still storing one row per file in the DB but you save space because it doesn’t need to store the full path for each file. I expect it to help generally, but I would be very surprised to see more than 10-20% drop in DB size.

mmiat · May 3, 2018, 7:48am

I think there is some trouble with the new version… I’ve had some issues on another server as well…
So I downgraded to 2.0.3.2-1 and restarted everything. We’ll see… stay tuned…
Thanks