Multiple hard disk failures

I have been using Duplicati at a small number of Windows sites, mostly set up with it running as a service. Running the beta with mainly default settings on Win10 PCs. And I have seen a seriously worrying trend.

Duplicati is killing hard disks. Not SSDs, but classic “spinning rust” drives are failing.

I’ve been running a small IT business for over fifteen years. Never have I seen so many hard disks failing as I am now. And they all have one thing in common - Duplicati was in use.

I only had Duplicati at half a dozen sites, testing it out as a backup option. And I am seeing hard disks failing at all of these sites. So far I have caught the PCs at a point before total failure and managed to swap the failing hard disks out for new ones without data loss.

What is so strange is this sudden blip in disk replacements. Every site with Duplicati installed has now suffered at least one hard disk failure, whereas among maybe a hundred other clients I have only seen the normal couple of worn-out laptops during this last year.

I am not complaining - take this as feedback from beta testing. It is something that needs to be seriously looked at.

As you can imagine, I have been removing Duplicati from these sites after getting expensive failures.

Interesting… my initial thought is that I don’t see how Duplicati could be the root cause of any hard drive failure, especially mechanical disks. It’s slightly more plausible on SSDs where they have a limited write cycle lifespan.

I do know that when some hard drives are nearing failure, exercising the disk may push it over the edge. A backup system that reads all files (like Duplicati does especially on first backup) does put a larger-than-normal load on the disk.

How many machines are you talking about here that have experienced a hard drive failure? Anything else common about those machines (system brand/model/manufacture date, HDD brand/model/manufacture date, etc)?

My own anecdotal experience: none of the systems on which I run Duplicati have had HDD or SSD failures, but I only run it on 10 machines.

I have noticed that although Linux ext* file systems do not generally become fragmented very easily, the external disk to which I run the Duplicati backups does get fragmented rather quickly, and I have had to run e4defrag several times.
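For reference, a minimal sketch of scripting that check, assuming e2fsprogs’ e4defrag is available and using an example mount point - the check-only mode just reports a score without touching the data:

```python
# Quick check-only pass with e4defrag before deciding whether a full defrag
# run is worth it. Assumes e4defrag is installed; the mount point is an example.
import re
import subprocess

def fragmentation_score(path="/mnt/backup"):
    out = subprocess.run(
        ["e4defrag", "-c", path],          # -c = check only, no defragmentation
        capture_output=True, text=True, check=False,
    ).stdout
    match = re.search(r"Fragmentation score\s+(\d+)", out)
    return int(match.group(1)) if match else None

score = fragmentation_score()
if score is not None:
    # e4defrag's own legend: 0-30 no problem, 31-55 a little fragmented,
    # 56 and up needs defragmentation.
    print(f"Fragmentation score: {score}")
```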

I don’t have experience with enterprise scenarios (I use Duplicati for personal/home use only). Anyway: before uploading data, Duplicati stages the files in a temporary directory. That means:

  1. For the first backup, all of the data is rewritten to disk.

  2. For the second and subsequent backups, only the differences are rewritten to disk.

Of course this stresses the disk. This problem has been discussed in other topics. The solution could be a ramdisk? See this topic: Attention SSD users: if your temp directory isn't ram-backed, Duplicati will prematurely wear your SSD
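As a rough sketch of what the linked topic is getting at, assuming a Linux host: the script below just reports which filesystem the default temp directory sits on, since that is where the staging writes from points 1 and 2 land unless the temp directory is redirected to something RAM-backed. The tmpfs idea follows the linked topic and is not an official recipe.

```python
# Minimal check: is the temp directory (where staging volumes are written)
# on a RAM-backed filesystem? On Linux, "tmpfs" means the rewrites never hit
# the physical drive.
import os
import tempfile

def fs_type(path):
    """Return the filesystem type of the mount that contains *path*."""
    path = os.path.realpath(path)
    best, best_fs = "", "unknown"
    with open("/proc/mounts") as mounts:
        for line in mounts:
            _, mount_point, fstype, *_ = line.split()
            contained = path == mount_point or path.startswith(mount_point.rstrip("/") + "/")
            if contained and len(mount_point) > len(best):
                best, best_fs = mount_point, fstype
    return best_fs

temp_dir = tempfile.gettempdir()
print(temp_dir, "is on", fs_type(temp_dir))
# "tmpfs" means the staging writes stay in RAM; "ext4" (or similar) means
# every backup volume is written to the local disk before being uploaded.
```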

I agree that this is strange, but when I got in last night to see an email from one of my last Duplicati sites reporting a failing disk, that became one too many.

I am a very small support guy. Home and small business. We are talking between one and six machines in most cases on simple workgroup type networks. Nothing enterprise here.

I think I ran Duplicati on roughly eight different sites as a test. TOTALLY separate customers. Nothing in common hardware-wise. Everything from expensive Dells to old self-build machines and cheapo PC World junk. Nothing in common apart from the Microsoft OS. Some were Win7 but all have been upgraded to Win10 now.

On three of those sites I had Duplicati just self-destruct after a Win10 auto-upgrade deleted the database (funnily enough, one of those was a posh Dell with SSDs). Once that happened I changed to an alternative backup.

Another example site was an engineering company. Roughly six PCs, some new, some old. They all pointed their data onto a common PC in the office, networked via SMB. The PC was a reconditioned one, but it had a secondary hard disk installed around three years ago. That secondary hard disk was a decent Western Digital 2TB drive used for backup. It corrupted badly, beyond repair. I have other drives from that same batch and same age without problems. This WD drive is one I had supplied and whose history I knew.

Another example: an accountant with two Dell laptops. Backups pointed at a shared external hard drive. The laptop hard disk failed in a four- or five-year-old machine.

Last night’s fail was at a Builder’s company. A cheapo build machine that I had not built. I’ve not seen the PC in six months, but Duplicati was on there backing up to OneDrive. Last night it spat out a massive Windows fault blaming a failing hard disk. I’ll get my hands on that PC tomorrow.

A few others have gone as well, but I can’t remember the details at the moment.

Ah - yeah - a home user with an eight-year-old PC backing up to an external hard drive over USB 2.0. That external drive died.

Oh, and the conservatory installation company with eight PCs. One was another refurbished machine, but again with a newer drive in it. Backups to OneDrive. That also failed. And now that I think about it in more detail, that one was also running Duplicati until an upgrade deleted the database.

Generally I had backup scripts set to pick up the Documents, Pictures and Desktop folders. I then dip into %APPDATA% and grab mail stores (usually Outlook) and browser settings (Firefox or Chrome), avoiding TEMP files and caches.
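Roughly that kind of selection, sketched here with the usual default Windows profile locations - the exact paths are illustrative, not copied from any of the actual jobs:

```python
# Sketch of a typical "user data only" source list: Documents, Pictures,
# Desktop, plus mail stores and browser profiles, skipping caches and temp
# files. Paths are common defaults and may differ per machine.
import os

profile = os.environ.get("USERPROFILE", r"C:\Users\example")
appdata = os.environ.get("APPDATA", os.path.join(profile, "AppData", "Roaming"))
localapp = os.environ.get("LOCALAPPDATA", os.path.join(profile, "AppData", "Local"))

include = [
    os.path.join(profile, "Documents"),
    os.path.join(profile, "Pictures"),
    os.path.join(profile, "Desktop"),
    os.path.join(appdata, "Thunderbird", "Profiles"),         # mail store
    os.path.join(appdata, "Mozilla", "Firefox", "Profiles"),  # browser settings
    os.path.join(localapp, "Microsoft", "Outlook"),           # OST/PST files
]
exclude = ["cache2", "Cache", "Temp", "*.tmp"]  # browser caches and temp files

for path in include:
    print(("found:   " if os.path.isdir(path) else "missing: ") + path)
```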

There is just too much of a bizarre coincidence going on for me. Other than these Duplicati examples, in the last two years I have had no other external hard drives die. NO other desktop drives die. And out of hundreds of computers I look after there have been maybe three other laptop hard disks go due to natural causes. The only desktop drives I have had fail have been in Duplicati PCs.

I have not had time to comb back through full stats for the year - but the common thread is standing out a little too much to be comfortable. The coincidence is getting a bit too freaky for me. Puzzling.

Am I understanding this correctly that you’re seeing both source and destination drives failing in different cases? It’s not always the source drive?

Yes. As noted above, I’ve had trouble on both sides. More source disks failed than destination.

Of the failed destination drives, one was an external drive on USB 2.0 for a home PC. So that one could have suffered just from being a delicate unit sitting on a desk. Also, the USB 2.0 interface is painfully slow at moving data. I have seen times when that Duplicati backup took so long it ran over 24 hours and tripped over the next backup. A hell that gets worse if the PC reboots for an update in the middle of a backup.
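A rough back-of-the-envelope, assuming ~30 MB/s of effective USB 2.0 throughput and made-up dataset sizes, shows how easily a first backup over that interface runs past a day once compression, small files and interruptions slow things down further:

```python
# Rough transfer-time estimate; 30 MB/s is a typical real-world USB 2.0
# figure, and the dataset sizes below are only examples.
def hours_to_copy(gigabytes, mb_per_sec=30):
    return gigabytes * 1024 / mb_per_sec / 3600

for size_gb in (100, 500, 2000):
    print(f"{size_gb} GB at 30 MB/s ~ {hours_to_copy(size_gb):.1f} h")
# 2 TB is already ~19 h at full speed; at an effective 10 MB/s even 500 GB
# takes over 14 hours, so a backup colliding with the next run is plausible.
```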

The other destination drive to fail did have five PCs pouring backups into it across an office network. The backups were staggered across the evening with an hour gap between them, but could have had some overlap.

It may all just be a weird coincidence, but the common theme has been Duplicati. The only desktop drives I have had fail in the last two years have been on Duplicati machines.

I didn’t monitor every backup log as they are too confusing to read. I’d check up on clients every few months. A couple of those with failed drives had also had backups “go mad” and seem to get stuck on something for hours. I think this is reboots for updates hitting at the wrong moment.

Roughly the usual pattern I had set up was full backups either monthly or every three months, and then weekly incrementals. Some sites had daily incrementals. It varied according to needs.

I am a little annoyed at myself for not having had time to get a better understanding of Duplicati logs. Maybe I’d have better data for you now. In every case of failure I thankfully caught it at an early enough stage to still have access to the original hard disks. One of my other tasks stuck on the “to do” list was to back up the database. Only on that “five machines into one” system did I have a proper backup of the Duplicati database. Which is kind of ironic, since that is the one where the destination was what failed. :smiley:

I’m now down to a single site running Duplicati. That is a simple setup of three machines backing up across the network. In that case it is not running as a service and still seems to be happy.

Hopefully all of these are just some weird coincidence. I am not concerned as it was my choice to run Beta software. I just wanted to add some feedback of my experience running this on Windows.

Just to clarify: what is the time frame in which these faults happened? How much time passed between the installation of Duplicati and the faults? Are you sure the backup jobs are configured correctly? What is their frequency? And how much data is uploaded between two consecutive backup jobs?

Your situation sounds like a bad backup configuration which copies a lot of junk files that change frequently (and are uploaded every time - I have a similar situation with the Firefox and Thunderbird profile directories).

Anyway, nowadays every HDD has SMART support and there are a lot of programs which read SMART attributes and send emails to warn of impending faults. In addition, I suggest using, for backup destinations, HDDs designed for that purpose.
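As a minimal sketch of that kind of monitoring, assuming smartmontools 7 or later is installed (for the --json output) and using an example device path; hooking the result up to email is left out:

```python
# Read SMART attributes via smartmontools' JSON output and flag the two
# counters most associated with impending mechanical failure. The device
# path is an example; run with sufficient privileges.
import json
import subprocess

WATCHED = {5: "Reallocated_Sector_Ct", 197: "Current_Pending_Sector"}

def smart_warnings(device="/dev/sda"):
    out = subprocess.run(
        ["smartctl", "--json", "-A", device],
        capture_output=True, text=True, check=False,
    )
    data = json.loads(out.stdout)
    table = data.get("ata_smart_attributes", {}).get("table", [])
    return [
        f"{WATCHED[attr['id']]} raw value is {attr['raw']['value']}"
        for attr in table
        if attr.get("id") in WATCHED and attr.get("raw", {}).get("value", 0) > 0
    ]

if __name__ == "__main__":
    for warning in smart_warnings():
        print("WARNING:", warning)   # send this by email in a real setup
```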

@xblitz these had been in place a couple of years for many of them. I used to use Crashplan for backup and Duplicati was one of the replacements I was trying out, starting around the time of the v2.0 beta release. So there had been many successful backups made by the software. I only saw a few errors reported in the client - mainly about locked files. I’d then drop those files from the list of backups. Generally it seemed to be fine.

As to “correct configuration” - that was mainly left at defaults. This is a very confusing product with far too many options, too much to understand, so I’d generally use the built-in suggestions. I never changed block sizes. I did always select a sub-set of files, as noted above. I had to back up Thunderbird email, or Outlook PST files. Email is vital to my clients.

As to the comments about SMART monitoring or HDDs made for the purpose: different clients, different needs. This is not Enterprise with multi-thousand pound budgets. This is the tea shop on the corner, or the builder on his cheapo home PC bought from PC World\Walmart. My feedback is about use on almost default settings in very small businesses of fewer than half a dozen machines. These people do not have the cash for huge backup systems. (I don’t want to go OT)

Actually I wish more users would do what you did here. I see some changing lots of advanced settings without fully understanding the consequences. For the most part I think the defaults work well for a majority of situations.

I don’t think your config is relevant though. I can’t imagine any way Duplicati could actually cause drives to fail any more than any other program that does similar reading/writing of the hard drive (regardless of configuration or mis-configuration).

While it appears there is a correlation between your Duplicati installs and failed drives, I still suspect it is just coincidental. As we all know, correlation does not in and of itself prove causation. It is good to bring it up as a topic for discussion, nonetheless.

I’ve been in this computer lark for more than a few decades, but when I come to a product like this I try and put my brain into “dumb user mode” and think like other users who are attempting to make sense of the product. It is natural that a forum like this is full of geeks who have loads of spare time to understand the details, whereas there are others of us who just got here “too late” and can’t find where on earth to start to even make sense of Duplicati.

It is like walking into the middle of a long conversation where everyone assumes you already know who is who. Attempts at making sense of it get lost when someone charges off on some other tangent of theory.

So I tried to stick to the recommended settings, mainly so I could try to feed back what I found about those methods, as this is what the average user would use.

As you can see by this post (and previous ones from me) I do like to try and drop feedback from the front line. When I first found this place in 2017 I was hoping the product would be nearer completion. Instead there have been very few beta releases, and many people seem to be on an ever-changing canary release with deep knowledge of the way everything works.

Meanwhile I now have the latest sick PC here on the workbench: a Seagate 320GB HDD (ST3320820AS). Doing a backup first, but already I can see that SMART is saying the disk is FUBAR: 22,474 hours running, the Reallocated Sectors Count has gone red and the Current Pending Sector count has gone yellow. I use CrystalDiskInfo to put a GUI on SMART.

Duplicati seems to still have been running here as there is a crashlog dated 03/02/2020 in ProgramData. I have not seen this PC in over a year. And certainly did not choose this hardware.

Eeek! That Hard disk just did something really nasty to the old recovery PC I use… so will be a delay on feedback from that disk. (Yeah - I know I am talking irrelevant rubbish now… :D)

I agree with you: despite the “how to” section of this forum, the useful information for beginners isn’t so easy to find. Fortunately the documentation is well organised, but reading it takes a lot of time (even if the manual IMHO is well divided). Nonetheless, I understand that it is impossible to cover all possible situations.

Anyway, are six HDD faults a coincidence? Maybe or maybe not; personally I find it hard to believe, and I understand your perplexity. Of course, to the mechanics of an HDD, Duplicati or any other software is irrelevant: the low time between failures (2.5 years for this HDD at 24/7 uptime) could be caused by a problem in the manufacturing process (but you have already excluded that) or by wear of the mechanical parts. It is possible that a bad Duplicati configuration causes many read and write cycles which shorten the life of the HDD. I don’t believe in a software bug, because you, I, and every other Duplicati user run the same software on different hardware.

Just a note: according to your report this HDD ran for 22,000 hours (which means 2.5 years at 24/7 uptime - reasonable), so I looked up its specifications: this HDD is marketed as a “desktop model”, but IMHO it is not suitable for the kind of use I suspect it got.

In the interest of adding another sample point, I have an Unraid NAS with several disks that is used as a duplicati backend. There is a parity drive that gets written to when any of the disks is written to, and it has been powered on for 21,000 hours and spun up for 6583 hours. 4 machines have been using Duplicati for around 2.5 years with this backend (1 linux, 1 mac, 2 windows). Crashplan has been backing up the NAS for even longer than that. All the drives are 3TB Western Digital Red and luckily I haven’t had any disk errors yet.

The problem I found is that too many of the real details of specific settings are buried in the forum posts. This is why I was trying to stick to defaults as far as possible, only a few tweaks away from what I assumed were the recommended settings. I didn’t have hours and hours spare to read every setting and dig through all the posts. It was enough of a puzzle working out how to back up Outlook data, as I had to resort to the command line to kick in the service install. (ID-10T users tend to leave Outlook open as they leave the office… :roll_eyes: )

This is also why older posts of mine tried to document how I got things to work successfully when running as a service on Windows, as I do notice this forum very much sees Windows as something to be looked down on. Trouble is, the users out there are using Windows and they need backup. I can’t convert everyone to Linux :wink:

I am not blaming Duplicati, but it is a really weird coincidence. In some ways it is funny that those client PCs out there with zero backup in place have suffered fewer hardware failures than those with backup. :smiley:

Never saw this kind of issue in the past when it used to be Crashplan hammering the same quality of kit. This is why the pattern is standing out to me as noticeable.

I know you want to blame the hardware. I cannot control what my clients buy. This is a builder who bought a cheap under-powered PC for the office junior. Of course it is a “desktop model” of hard disk. The computer is a desktop for use in an office. The way you are talking you seem to be implying that everyone should only buy the top end enterprise disks for their basic office PCs. That is not the real world.

This is a small builder and the guy who uses that PC only comes in one or two days a week to write a few emails and some worksheets. You can’t specify a monster PC for someone like that. (If you knew what the CPU was in there it would shock you). But that machine still needed to be backed up. That backup was uploading to OneDrive. Only backing up user files - documents, Outlook mailstore, Firefox data (not cache). Not a heavy load.

I have seen “desktop model” hard disks like that last a lot longer than 2.5 years. That is because they get very low use, with a one-finger typist sitting at them sending emails. When I spec a machine I’ll take them higher up the range into better CPUs and quality drives.

On my own hardware I am using Samsung SSDs and WD Reds with much higher mileages. In the backup box there are a couple of WD Blacks on 44,000 hours (but only 140 power cycles). I used to run SCSIs with even higher mileages.

To me the logic is the backup software should run on low spec kit with crappy CPUs as they are going to fail and need backup. Had over a decade of success with Crashplan on all kinds of systems. Beautifully simple to setup and didn’t care about OS versions.

I can see that Duplicati is “almost there”, but it has not really progressed much in two years. I have liked the way I can swap the destination around between SMB and FTP and it still continues the backup. But when something went wrong - ouch. When an error appeared in the block count (or whatever the term is), or if a repair was needed… then the headaches would start. It took far too many days(!) to untangle the issues, when I’d rather have had an option to say “just forget that corrupted backup and run a fresh one”. A good example of how unfriendly it gets for the user when things go odd. At that stage the forum help gets all cryptic and the commands to fix it need too much knowledge. (Pretty sure those kinds of errors occurred when a PC restarted to install an update.)

Anyway - I am waffling away here in the hope some of this feedback is of use.

I agree, those problems were not good… very difficult for regular users to recover from. A lot of progress has been made in recent months to fix the root cause of those issues, and the fixes are part of beta 2.0.5.1, which was released less than a month ago. Sounds like you may be moving on and finding a different backup solution, but if not, you should upgrade to 2.0.5.1 to enjoy the stability improvements.

I am in a confusing place with backup options. If Duplicati had not let me down so often I would keep using it. Those problems were impossible for normal users to recover from. It really needed some kind of option to say “just drop that incremental backup”. Instead, in many of the cases, I had to just delete the whole historic backup and start again. I charge clients by the hour, so I can’t sit there and run a recovery routine on beta software that takes days to complete. Just not feasible.

It is clear to me that Windows is the last environment in line for testing. Maybe I’ll try again in a couple of years when the software is nearer to a release candidate.

Hi

I don’t think Duplicati is really designed for a quick restore of data should things go south, like a desktop HDD failure. Like you, I work with SMEs, many of whom are reluctant to spend on hardware. A desktop-grade HDD will only last about three years when it is on 24/7.

My backup procedure for all sites is a local Full Image of server systems or crucial Desktop Systems. Either Windows Server Backup or Terabyte Unlimited to USB/External Drive. This keeps recovery times to a few hours.

I will then live-sync critical data to another desktop with a spare internal drive using Syncthing. It is basically a flat-file copy with up-to-date changes.

Finally, I use Duplicati for a nightly remote backup of that critical data in case the house burns down.

I never include PST/email boxes in critical data. If email is critical to a client for retention purposes then they should be using G Suite or Exchange, not POP3 to local storage.

Interesting thread. A few personal remarks:

  1. Sure, in many cases the backup job is the most demanding task “low load” servers are doing.
  2. Disk failures can be hidden, and only become visible when data is accessed / written. - I’ve encountered this way too many times.
  3. About options: sure, there are some options which reduce disk load. For example, you can move from temp to the destination instead of copying files, if you’re using the destination as temp. That already drops 50% of the redundant “wasted” temp I/O (illustrated in the sketch at the end of this post). This is something I’ve been aggressively using when running larger local backup jobs, which can later be synced elsewhere by other means. All of my backups in the terabyte range use this.
  4. In some cases the disk itself seems to be really unreliable. SMART lies; in some cases disks return corrupted data without triggering any obvious indication, and so on. @warwickmm Good luck with the 3 TB Western Digitals. I’ve replaced almost all of those at the very end of the warranty period, all disks running 24/7 and being used with Duplicati. Currently we’ve only got Gold / Black drives. Anyway, my personal record with WD and “data reliability” is really poor. Especially WD Blues, which are often obviously broken (usually only noticeable during write testing) while the SMART data still says the drives are OK.

Yet I wouldn’t say the disk failures themselves are caused by Duplicati.
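To illustrate the move-versus-copy point in (3) with placeholder paths - this is not Duplicati’s own code, just the underlying idea that a rename on the same filesystem avoids writing the data a second time:

```python
# Staging a backup volume on the destination filesystem and then publishing
# it: a rename (move) is metadata-only, a copy rewrites every byte.
import os
import shutil
import tempfile

def stage_and_publish(data: bytes, dest_dir: str, use_move: bool = True) -> str:
    fd, tmp_path = tempfile.mkstemp(dir=dest_dir)   # temp file on the destination
    with os.fdopen(fd, "wb") as tmp:
        tmp.write(data)                             # the data is written once here
    final_path = os.path.join(dest_dir, "volume.zip")
    if use_move:
        os.replace(tmp_path, final_path)            # rename: no second write
    else:
        shutil.copyfile(tmp_path, final_path)       # copy: the bytes hit disk again
        os.remove(tmp_path)
    return final_path
```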

This is an interesting point. Is that easy to implement in the GUI? Or do I need to learn some obscure command?

@ccmgr - I also use various different solutions. The one thing I like most about Duplicati is the ability to throw data onto OneDrive so quickly. Most of the backups I am doing are really simple: just a heap of Office documents which are usually tiny, and that huge Outlook PST which can be many GBs in one file. And yes, it is better if I can get these people onto Exchange Server (or Office 365), and I do succeed with many clients. BUT some people are very hard to break out of their habits, and this builder is a good example of stubborn madness, stuck on their old system because “they know how it works”. (It drives me mad, as here we have TWO people using POP and HUGE mailboxes… which obviously are not properly sync’d… but there is only so much you can do to make someone change)

I also agree about the quality of hard disks. For my own kit I lean on higher-spec disks. I have no Blues or Greens as they have also died on me. With clients I don’t always get the choice - the examples mentioned above are junk “big computer chain store” PCs bought for £200; I spend £100 on just a hard disk.