Attention SSD users: if your temp directory isn't ram-backed, Duplicati will prematurely wear your SSD

This is not a “non-issue.” I’ve read the TechReport test.

" Developed by a this handy little app includes a dedicated endurance test that fills drives with files of varying sizes before deleting them and starting the process anew."

That is not how the vast majority use their laptop, desktop, or server hard drives, and just by total coincidence, it also happens to be the most favorable scenario for an SSD’s wear longevity.

  • The test is compressed into a very short period of time, so caches have the maximum chance of absorbing re-writes
  • No already-written blocks are repeatedly re-written except for directory structure information
  • Re-writes are spread across a very large portion of the flash cells, and after every test cycle, the entire drive is freed for the wear-leveling algorithm to use again

Again: this scenario represents almost no one’s real-world usage. I can think of only two use cases that fit: videographers (many digital cinema cameras write directly to SATA or similar drive modules), and people using such drives as backup destinations for traditional backup programs (think Legato and the like), where the volume is written to, read back for verification, then recycled by re-writing it start to finish.

For most users, 50% to as much as 90% or more of the drive is filled with data, and much of that data does not change. If your drive is 90% full, then unless it implements static wear leveling, its effective wear capacity is reduced roughly tenfold, because every write has to be absorbed by the 10% of flash that is still free.

Note that I am assuming drives are not over-provisioned, partly because drive manufacturers do not generally advertise over-provisioning levels. I also don’t know how many SSDs actually implement static wear leveling; the industry similarly hides behind “we do wear leveling!” It’s probably safe to assume that most SATA and NVMe SSDs implement at least dynamic wear leveling, but I would not be surprised if static wear leveling exists only in high-end desktop and enterprise drives.

Anyway. This is why Duplicati’s backup process causes so much wear (on a drive that doesn’t do static wear leveling): Duplicati stages its backup volumes in the temp directory, so if you have a 500GB drive holding 450GB of data, of which 300GB is backed up by Duplicati, roughly 300GB of temporary writes get funneled through the 50GB of free space - about 6 wear cycles on that 10% of the drive for a single backup run. And if you have extra verify options turned on, every verification results in more writes as well, because the verification streams the block to disk, not memory.
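To make that arithmetic explicit, here is a small back-of-the-envelope sketch in Python. The sizes are the ones from the example above, and the assumption that every temporary write is absorbed by the free 10% of the flash (no static wear leveling, no over-provisioning) is the scenario being argued here, not a measured property of any particular drive:

```python
# Back-of-the-envelope wear estimate for the 500GB example above.
# Assumes dynamic-only wear leveling: all new writes land in the
# currently free flash cells.

drive_gb = 500     # total drive capacity
used_gb = 450      # static data that rarely changes
backup_gb = 300    # data staged through Duplicati's temp directory per run

free_gb = drive_gb - used_gb                 # 50 GB of "working" flash
wear_concentration = drive_gb / free_gb      # 10x: all wear hits 10% of the cells
cycles_per_backup = backup_gb / free_gb      # 6 full rewrites of the free area

print(f"Free space: {free_gb} GB")
print(f"Wear concentration factor: {wear_concentration:.0f}x")
print(f"Wear cycles on the free cells per backup run: {cycles_per_backup:.0f}")
```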

Worse, that 10% of the drive is absorbing the vast majority of re-writes due to user and system activity: logs, system updates, filesystem metadata, virtual memory (glares at Chrome), application caches, email clients, and so on.

Please re-read their testing methodology. Your bullet-points are inaccurate.

If you have a 500GB drive with only 10% free space, you need to upgrade your drive or stop hoarding those cat pictures.
Anything that Duplicati writes is no different from what the OS or apps write. So if you perform your backups even twice a day - whoopy-freaking-do! You just created 2 extra writes on each cell involved in the backup. 2 extra writes out of the tens if not hundreds of thousands of writes each cell can handle.
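For anyone who wants to sanity-check either side of this with their own numbers, a rough endurance budget is easy to compute. The endurance rating and write volumes below are illustrative assumptions, not figures from this thread; substitute your drive’s actual TBW rating and your own backup sizes:

```python
# Rough SSD endurance budget: years until the drive's rated write endurance
# is consumed, with and without the temp-file writes from backups.
# All numbers are illustrative assumptions - plug in your own.

rated_tbw = 300.0             # rated endurance in terabytes written (assumed)
other_gb_per_day = 20         # OS, apps, caches, logs, etc. (assumed)
temp_gb_per_backup = 50       # data staged through the temp dir per run (assumed)
backups_per_day = 2

backup_gb_per_day = temp_gb_per_backup * backups_per_day
total_gb_per_day = other_gb_per_day + backup_gb_per_day

years_without = rated_tbw * 1000 / other_gb_per_day / 365
years_with = rated_tbw * 1000 / total_gb_per_day / 365

print(f"Years to rated TBW without backup writes: {years_without:.1f}")
print(f"Years to rated TBW with backup writes:    {years_with:.1f}")
```

Note that a TBW rating assumes the controller can spread wear across the whole drive; the wear-concentration argument earlier in the thread is about the cases where it effectively cannot.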


It sounds to me like most of the examples so far COULD be considered edge cases depending on individual system setup / usage, but both are valid.

Regardless of whether or not SSD wear is an issue, I’m pretty sure we can all agree that performance could be improved.

Using a ram-disk (or adding a feature to do more processing in RAM when available) absolutely should improve performance and if it happens to ease wear on drives (both SSD and spinning rust) then that can be considered a bonus. :slight_smile:


“should” and “will” are not the same. Has anyone actually done any benchmarks to see how much of a performance difference a ramdisk or bigger caching makes?

I think I saw some posts from users who set up ram-disks saying it sped things up, but I don’t know how they measured or what bottlenecks they had before.

I’ve never bothered to check myself as most of my backups run on always-on machines so I care more about reducing system impact than speeding things up.

Agreed - which is why I didn’t say “will”. :smiley:

Please re-read their testing methodology. Your bullet-points are inaccurate.

I did read their testing methodology. I even quoted their testing methodology and explained, at length, how it doesn’t represent the majority of end-user storage usage, how the test is an absolute best-case scenario, and specifically how it doesn’t apply to the scenario I’m describing - one which is extremely common.

The fact that they include 10GB of static data (a Windows install, an application or two, and a few movie files) in the test doesn’t really change anything. That’s still not anywhere near a typical user usage scenario, and it also has nothing to do with the scenario I described and based my post on.

Please stop derailing the conversation.

On macOS (Mojave at least), the RAM disk takes up actual memory according to how much has been written to it at its peak. A 4GB ramdisk starts out taking up no memory according to Activity Monitor. Write 3GB to a 4GB ramdisk, and it takes up 3GB of real memory. Remove those files and it stays at 3GB. So if you create a large ramdisk, it will grow in real time depending on use. No need to ‘start big and trim down’, I think.

I haven’t looked at using hdiutil to recover used space (on /dev/rdiskn) between uses (for a permanent ramdisk), but I have noticed that using APFS is a bad idea: it becomes more difficult to free the memory (with HFS+ you just detach and you’re done, it seems for now) and also more difficult to actually use the ram disk, because of access control. After an hdiutil detach, diskimages-helper keeps using the RAM if you have formatted the disk with APFS. For now, I’m staying away from APFS.
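For reference, the create/format/detach sequence being discussed looks roughly like this. It is a minimal sketch that wraps the standard hdiutil and diskutil commands from Python; the 1 GiB size and the volume name are just examples, and HFS+ is chosen to avoid the APFS issue described above:

```python
import subprocess

SECTORS = 1 * 1024**3 // 512   # 1 GiB expressed in 512-byte sectors (example size)

# Attach a RAM-backed block device without mounting it; hdiutil prints the device node.
dev = subprocess.run(
    ["hdiutil", "attach", "-nomount", f"ram://{SECTORS}"],
    check=True, capture_output=True, text=True,
).stdout.strip()

# Format and mount it as HFS+ (avoids the APFS/diskimages-helper problem noted above).
subprocess.run(["diskutil", "erasevolume", "HFS+", "RamDisk", dev], check=True)
print(f"RAM disk mounted at /Volumes/RamDisk ({dev})")

# ... use /Volumes/RamDisk for temporary files ...

# Detach when done; with HFS+ this releases the memory again.
subprocess.run(["hdiutil", "detach", dev], check=True)
```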

On a modern Mac with an SSD, reading from RAM is roughly twice as fast as reading from SSD. Writing is about 80% faster to RAM than it is to SSD. These are not ‘order of magnitude’ speedups (but that was not the issue here anyway).
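If you want to get ballpark numbers like these on your own machine, a crude sequential-write timing is enough; the two paths below are assumptions, so point them at your ramdisk mount and an SSD-backed directory:

```python
import os, time

def write_speed(path, size_mb=512, chunk_mb=8):
    """Write size_mb of data to a file under path and return the rate in MB/s."""
    chunk = os.urandom(chunk_mb * 1024 * 1024)
    target = os.path.join(path, "speedtest.bin")
    start = time.perf_counter()
    with open(target, "wb") as f:
        for _ in range(size_mb // chunk_mb):
            f.write(chunk)
        f.flush()
        os.fsync(f.fileno())
    elapsed = time.perf_counter() - start
    os.remove(target)
    return size_mb / elapsed

# Example paths - adjust for your machine.
for label, path in [("ramdisk", "/Volumes/RamDisk"), ("ssd", os.path.expanduser("~"))]:
    print(f"{label}: {write_speed(path):.0f} MB/s")
```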

For those inclined to help out, champion this, or just follow along with the progress, you can find the bug tracking this change over here:

The forum tells me that’s linked here already, though I didn’t see it.
Note: the bug is about using RAM inside Duplicati when possible, not about using a RAM disk as the temp store.

Not sure where we have landed on this topic, but I have created pre- and post-run scripts for macOS that set the tempdir for Duplicati to a ramdisk and unmount it automatically after a run. I have been using these scripts for a few weeks without issues. Use at your own risk though! YMMV. I also advise updating the ramdisk size in the script to match your requirements; 1 GiB of RAM works for the default 50MB volume size for me.
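The poster’s actual scripts aren’t reproduced in this thread. Purely as an illustration of the idea, here is a minimal Python sketch of a mount/unmount helper; the volume name, size, and state-file path are placeholders. Wiring it into Duplicati’s --run-script-before and --run-script-after hooks and pointing the tempdir option (or TMPDIR) at the mount point is left to the reader, and the exact option names should be checked against your Duplicati version:

```python
#!/usr/bin/env python3
"""Sketch of a pre/post backup helper that provides a RAM-backed temp directory.
Run 'ramdisk.py mount' before the backup and 'ramdisk.py unmount' after it."""
import subprocess, sys

SIZE_GIB = 1                           # adjust to your Duplicati volume size
VOLUME_NAME = "DuplicatiTmp"           # placeholder; mounts at /Volumes/DuplicatiTmp
STATE_FILE = "/tmp/duplicati_ramdisk.dev"

def mount():
    sectors = SIZE_GIB * 1024**3 // 512
    dev = subprocess.run(["hdiutil", "attach", "-nomount", f"ram://{sectors}"],
                         check=True, capture_output=True, text=True).stdout.strip()
    # HFS+ so the memory is released cleanly on detach (see the APFS note above).
    subprocess.run(["diskutil", "erasevolume", "HFS+", VOLUME_NAME, dev], check=True)
    with open(STATE_FILE, "w") as f:   # remember the device for the post-run script
        f.write(dev)

def unmount():
    with open(STATE_FILE) as f:
        dev = f.read().strip()
    subprocess.run(["hdiutil", "detach", dev], check=True)

if __name__ == "__main__":
    mount() if sys.argv[1] == "mount" else unmount()
```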


As with batteries, this topic is, for lack of a better word, a bit misleading. SSDs do have a limit, sure, but I’ve been running backups (not only Duplicati, but also other container-based tools) on my drives for their entire lives and haven’t lost a single SSD over about a decade. I’ve had to correct drives, but they are all still in use today.

If someone here says their drive failed, no one here can say for certain it’s from Duplicati or anything else. It could just be a failure that was going to happen anyway. No piece of hardware is perfect.

If it were really bad, I’d expect the kind of behavior I saw from one other backup application, and I haven’t seen Duplicati do that. Either way, SSDs aren’t that bad lol. This is akin to the moving-the-browser-cache-to-RAM thing people did when SSDs first came out. No actual need to do it.

Of course, if you have really large backups (e.g. 1TB) then it might be something to think about, but it would probably be better to do those backups some other way.

I run 80-120GB backups on one computer and this is of no concern to me.

Agreed. On my main machine I have been running Duplicati since 2017, protecting about 500GB of data and running backups every 4 hours. I have had the same SSD since 2015 (Samsung 950 Pro 512GB). Health according to Samsung Magician is still “good” - I am not even halfway through the rated write lifecycle on this drive.

There might be some edge cases where SSD writes by Duplicati need to be considered, but I doubt the average person has to worry.