Very slow database rebuild

I believe that’s correct - of course it’s 2 days later now, hopefully it has progressed a little more…

Hi,

Nope, still showing exactly the same progress.

Database file hasn’t been written to since morning of July 19th…

My backups haven’t been running as a result since 3rd July. Not ideal.

Andy

Absolutely agree there - unfortunately, while we have heard from a small number of people about this issue we haven’t been able to figure out exactly what causes it.

Even worse, an aborted rebuild can’t be continued so you end up just starting over again.

@kenkendk, is there a more appropriate way to check on the status of a database rebuild than OverallProgress in the lastPgEvent property on the System Info page?

Still going. Progress is now 0.7214286, so in a week it’s only gone up by about 0.7%.

This is going to take a long time to complete…

Andy

I’m almost tempted just to bin this backup and start again. However it’s a lot of data so would take a long time to complete.

Is there any reason to expect this rebuild to complete soon? If it’s managing 0.7% a week, then extrapolating that suggests it could take something like 40 weeks to complete!

Andy
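For reference, the extrapolation above can be written out as a small calculation. This is only a sketch: it assumes progress keeps advancing at a constant rate, which the stalls reported in this thread suggest it may not, and the function name `weeks_remaining` is made up for illustration.

```python
def weeks_remaining(progress: float, gained_per_week: float) -> float:
    """Linear estimate of the time left, given the current progress (0..1)
    and how much progress was gained over the last week."""
    if gained_per_week <= 0:
        raise ValueError("no measurable progress, cannot extrapolate")
    return (1.0 - progress) / gained_per_week


# Figures quoted in the thread: progress 0.7214286, about 0.7% gained in a week.
eta = weeks_remaining(0.7214286, 0.007)
print(f"{eta:.1f} weeks")  # roughly 40 weeks
```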

That’s strange, I did a big rebuild recently and it “only” took two days for 600GB. More info here: Rebuilding Database Extremely Slow

Well, there was some surprise that it had completed that quickly. My backup is more like 2TB, I think.

Progress is still 0.7214286, so no progress in the last 19 hours.

Andy

Progress is now 0.7285714.

Is there any information / debugging I can get to you to try to work out why this is taking so long? I’m a developer myself, so if I need to do anything like provide an strace output or similar, I should be able to do that.

Is it even worth allowing this rebuild to complete? Would it be easier just to delete the backup and start again? I could probably live with that, as it’s just a lot of information that doesn’t change often, so I don’t need access to old versions etc.

I assume I can abandon the rebuild and then delete the backup set?

Thanks

Andy

Yes, that’s an option - though the backup set stored at the destination can still be used to restore FROM even if the local database doesn’t exist, so if you have the space you might want to keep it around for a bit while the new job is “filling up”.

It’s possible @kenkendk might have some input or testing requests… We know this is an issue in some instances, but have yet to figure out why it happens only some of the time. :frowning:

Ok

Best get those requests in soon, because it’s now been about a month with no backups. Can’t really allow this to continue for weeks more.

Andy

I have seen this too. What is even more surprising, sometimes during recreate Duplicati seems to be doing literally nothing. It doesn’t use CPU and Windows shows neither disk nor network activity, seemingly for minutes or even hours…

Looking at some of the queries (using joins and other not so fast approaches) in the log I also wondered if dumb simple batch inserts wouldn’t be possible…
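To illustrate what “simple batch inserts” could look like in SQLite, here is a minimal sketch using Python’s built-in `sqlite3` module. The `block` table and its columns are hypothetical and are not Duplicati’s actual schema, and whether the recreate logic could really be reduced to this is an open question.

```python
import sqlite3


def insert_blocks_batched(conn: sqlite3.Connection, rows) -> None:
    """Insert many rows inside a single transaction instead of
    committing row by row."""
    with conn:  # commits once on success, rolls back on error
        conn.executemany(
            "INSERT INTO block (hash, size) VALUES (?, ?)",  # hypothetical table
            rows,
        )


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE block (hash TEXT, size INTEGER)")

rows = [(f"hash{i}", 1024) for i in range(1000)]
insert_blocks_batched(conn, rows)

count = conn.execute("SELECT COUNT(*) FROM block").fetchone()[0]
print(count)
```

Grouping the inserts into one transaction avoids a commit per row, which is usually where most of the time goes for bulk loads into SQLite.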

Giving up on this now. Can’t afford to have it blocking all my other backups for any longer.

Will start again but keep the old files in place just in case.

Andy

Thanks for letting us know. Good luck with the new backup - at least now you can review your processes and decide if you want to split or join things into smaller or larger backup sets. :slight_smile:

Also, remember the files in the destination can still be used as a restore source via “Direct restore from backup files” so if you don’t need the space right away, it might be a good idea to keep those old backup files around while the new backup “fills in”.

For what it’s worth: I’m on my third try to delete/rebuild a huge database (>4TB target files on Google Drive) on Mac OSX (was High Sierra, now Mojave)… It takes about three weeks (or more), and has crashed twice with a “Panic” OSX failure. By the third week (now), my MacBook Pro 15" (retina, mid 2014) w/1TB SSD drive is crawling, with some functions (printing, copying, activity monitor) failing intermittently. I’m only seeing about two transaction sets per day in the log now (a download and a bunch of SQL commands, then nothing for another 12 hours). I hope that, since I’ve upgraded to Mojave, this one runs to completion without crashing my system. Meanwhile, I haven’t been able to use Duplicati for weeks, and my non-Duplicati work on my MacBook has been very slow (e.g., taking an hour or more to start printing).

What would really help right now would be the ability to checkpoint the database rebuild, so that I could reboot. I’ve seen that suggested elsewhere.

Any thoughts on how to manage this situation better are welcome.

…Steve

If your main worry is that you need to leave your mac running, why don’t you rent a virtual server for a couple of days or weeks? You can probably find some not too expensive powerful virtual machines out there. Just make sure you don’t change the backup files while it’s running.

Until we get the slow recreate issue resolved, if a checkpoint would let recreates be resumable that sounds like a good idea to me.

Of course I have no idea how much effort that would take so my support doesn’t mean much. :slight_smile:
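As a rough illustration of the checkpoint idea, here is a generic sketch of resumable processing. It is not how Duplicati’s recreate works; the names (`process_with_checkpoint`, the `vol-*` items, the checkpoint file) are all made up, and a real implementation would also have to make sure the database work for each item is durably committed before the checkpoint is advanced.

```python
import json
import os
import tempfile


def process_with_checkpoint(items, handle, checkpoint_path):
    """Process items in order, recording the next index to handle after
    each one so that a later run can resume instead of starting over.
    Returns the index this run started from."""
    start = 0
    if os.path.exists(checkpoint_path):
        with open(checkpoint_path) as f:
            start = json.load(f)["next_index"]

    for i in range(start, len(items)):
        handle(items[i])
        # Write to a temp file and rename, so an interrupted write
        # does not leave a half-written checkpoint behind.
        tmp_path = checkpoint_path + ".tmp"
        with open(tmp_path, "w") as f:
            json.dump({"next_index": i + 1}, f)
        os.replace(tmp_path, checkpoint_path)
    return start


# Simulate a lockup part-way through, followed by a restart.
checkpoint = os.path.join(tempfile.mkdtemp(), "rebuild.checkpoint.json")
volumes = ["vol-a", "vol-b", "vol-c", "vol-d"]  # hypothetical work items
processed = []
crash_once = {"armed": True}


def handle(volume):
    if volume == "vol-c" and crash_once["armed"]:
        crash_once["armed"] = False
        raise RuntimeError("simulated lockup")
    processed.append(volume)


try:
    process_with_checkpoint(volumes, handle, checkpoint)
except RuntimeError:
    pass  # the "reboot"

resumed_from = process_with_checkpoint(volumes, handle, checkpoint)
print(resumed_from, processed)
```

The second run picks up at the item that failed rather than at the beginning, which is the behaviour being asked for here.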

Update: I finally (December 4th) got a completed database rebuild and am running Duplicati normally again. The rebuild that finally completed took just about a month! Since the rebuilds cannot be checkpointed, whenever my Mac OSX Mojave locked up and had to be rebooted, I lost all progress and had to start again.

During the database rebuild, my system progressively became more sluggish due to high I/O traffic, and several subsystems and applications became unusable (e.g., printing worked intermittently and very slowly, other apps timed out accessing data and quit), which I believe contributed to the lockups. Also, since I couldn’t really use my system for normal workload (other than browsing and email), I lost several months of productivity.

Once I got a clean rebuild, I added the “auto-vacuum” and “smart backup retention” options to cull my database(s), which was over 63GB for a 4TB archive. The next backup took a week, and got caught when my system locked up and rebooted. The following backup proceeded to rebuild the database again, taking over a week, and then on the next backup resumed culling the database for another 4-5 days. Finally, the database is down to 5GB and the backup (on Google Drive) is about 800GB.

My observations: (1) “Rebuild” needs a checkpoint option, either automatic or manual; (2) Large databases are unwieldy and take too long to do anything; (3) it is impossible to “keep all backups” for a few years for an active drive, because the database becomes too large, so either use “smart backup retention” (and risk losing something you may want) or cull your backups manually.

Thank you. I appreciate all of the help and support from the Duplicati developers and community.

What caused your backup to go from 4TB to 800GB?

It seems he changed retention, cutting down on the number of versions stored in backup.

That would mean there was more than 3TB of wasted space after the retention policy cleanup. Highly unlikely, no? It depends on the type of data, of course. But it could also be a wrong blocksize, I think. To be checked/confirmed by @StevenKSanford.