Poor performance

Hi,

Running Duplicati - 2.0.4.17_canary_2019-04-11.

Running on Windows 10 Pro on an i5 with 16GB RAM. Tried initially to back up via SFTP to a server on my LAN, but it was very slow. I then tried to back up to an NTFS external drive. Speeds started at about 6MB/s but dropped to a ridiculous 70KB/s overnight. My dataset is about 270GB and what I would imagine to be a typical set of files: documents, photos, etc.

What should I expect speed-wise? Is my experience typical and the software simply does not work yet, or is there something wrong with my setup?

Thanks, BM

OK, wasted hours on this. Did eventually get a small backup to complete. There is no way I will trust my data to this flaky solution. Sorry, but in its current state it is totally NOT fit for purpose. Using CloudBerry now; it just works.

Sorry, but this is a massive FAIL from me. I know it is free and that is great but it just does not work reliably.

BM

For many of us it does. You may want to re-try with the latest “beta” version as they are generally more stable than the “canary” releases. I am running beta 2.0.4.5 on 10 machines - works great on all of them.

I landed in the forum researching performance also. This is my first time using Duplicati.

Had an 8TB NTFS drive hanging off my router via USB2 and gigabit ethernet. Running Linux, and had mapped a backup drive to the Samba share. After an hour or so, only 29MB had been backed up. Realizing it was going to take way too long, I attached the drive directly to the machine being backed up via a USB3 port. I’ve now been running a 3TB backup for 15+ hours, and it shows 708GB on the backup drive. That’s about 50GB per hour, which seems reasonable. But it’s currently backing up large MythTV MPEG files and the performance is bouncing between 3KB/s and 90MB/s. I could understand 3KB/s if they were small files, but these are GB-sized TV recordings. It may be choking trying to compress files that don’t compress well; not sure?

I’m running Linux Mint with Duplicati - 2.0.4.5_beta_2018-11-28
CPU G4400 @ 3.30GHz with 32GB of memory
Raid 5 Array is the source, with three 7200RPM drives
The destination (bottleneck) is a single 8TB USB drive with NTFS, attached via USB3.

Hi, it is good that it is working for you. I really wanted this to work for me, but running backup on beta isn’t a great idea in my opinion. Maybe the latest betas are much better. Canary was simply hopeless. It was slow, and if I stopped and restarted a backup it would just error and not provide an easy remedial solution. I have completely lost confidence in this. For most general software beta is OK, but not ideal for backup, mission critical, etc. Backup is just one of those things that must work properly, as otherwise what is the point. I do applaud the open source nature of this project and I do wish it well, but I will not spend any more time trying to get it to work. My time and data are more valuable than that. I had been using Backblaze backup and it had been great, very idiot-proof and reliable, but it lacks some important features like versioning and retention plans. Also, for smaller datasets it works out expensive. I think I have now settled on CloudBerry with B2 storage.

Good luck, C

CloudBerry is a good product. I use it in addition to Duplicati. My only complaint with it is that it doesn’t do deduplication, so back end storage is not nearly as efficient. Otherwise it’s a solid product.

The de-duplication is definitely a good feature to reduce storage space. God knows how much duplication there is on my file system. In fact, it makes me wonder if there is a tool that will report on duplicate files out there…? I might come back here in the future if and when it matures a bit and I feel more confident. I really did want this to work. Who knows, I might give it another go at some point, was just getting really fed up with it not working for me and I had wasted a lot of time - I work with IT and for home I just want my stuff to work - I don’t want to have to play about with it. Having more than one backup plan is not a bad idea anyway.

C

I am very pleased with the speedy backups I see, but file restoration needs some work. As I write this, I have been waiting for 15 minutes to build the “partial temporary database” to recover a single file from my ~2TB backup. Surely there must be a way to speed this up.

Deduplication is available on Windows Server versions, and you can run ddpeval.exe to see what your savings will be at the filesystem level. Maybe someday it will be supported on client operating systems!

Let it run for another day, and it’s up to 1.2TB. Around 20GB per hour. Not sure why the performance dropped off. It’s still in the same directory chugging away at MPEG(TS) files. Perhaps this is impacting performance.

I did go back and try the main beta just for completeness' sake. For me performance still sucked - it dropped to tens of KB/s. It started much faster but bogged down - overnight it still only managed 30GB. I calculated it was going to take over 4 years to complete the initial backup of a few hundred GB to a local drive. It was also consuming about 50% of CPU. CloudBerry completed this in under 24 hours to Backblaze B2, backing up at about 5MB/s and usually consuming less than 1% of CPU, although it is more when it first starts and is calculating file changes.

There is nothing special about my desktop - recently rebuilt and fully up to date with Windows 10 patches.

What sort of backup speeds do you get?

C

My speeds are listed above. On Windows it may be the antivirus scanning each file as it is opened, slowing things down.

I’m on Linux and happily use fslint-gui for finding duplicate files. For Windows your mileage may vary: https://alternativeto.net/software/fslint/?license=opensource&platform=windows Manual dedup is time-consuming and risky if you’re not paying attention… so get your full backup first. :)
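If you just want a quick report of duplicates before reaching for a dedicated tool, a minimal hash-based sketch along these lines works on both Linux and Windows (the root path and the size cut-off below are placeholders, not anything fslint itself uses):

```python
import hashlib
import os
from collections import defaultdict

# Minimal duplicate-file report: group files by size first, then confirm
# with a SHA-256 hash. Root path and minimum size are placeholders.
ROOT = "/home/me/data"      # directory to scan (assumption)
MIN_SIZE = 1024 * 1024      # ignore files under 1MB

by_size = defaultdict(list)
for dirpath, _dirs, files in os.walk(ROOT):
    for name in files:
        path = os.path.join(dirpath, name)
        try:
            size = os.path.getsize(path)
        except OSError:
            continue  # skip unreadable files
        if size >= MIN_SIZE:
            by_size[size].append(path)

def sha256(path, chunk=1 << 20):
    h = hashlib.sha256()
    with open(path, "rb") as f:
        while block := f.read(chunk):
            h.update(block)
    return h.hexdigest()

for size, paths in by_size.items():
    if len(paths) < 2:
        continue  # unique size, cannot be a duplicate
    by_hash = defaultdict(list)
    for p in paths:
        try:
            by_hash[sha256(p)].append(p)
        except OSError:
            continue
    for digest, dupes in by_hash.items():
        if len(dupes) > 1:
            print(f"{size} bytes, {digest[:12]}:")
            for p in dupes:
                print(f"  {p}")
```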

Next up I’ll be trying BorgBackup. Haven’t tried it yet myself, but from the videos it seems to handle dedup + encryption quite well: https://borgbackup.readthedocs.io/en/stable/installation.html There’s a Vorta GUI, but these seem to be Mac/Linux only today. A likely next step for me when Duplicati finally finishes.

I recommend white-listing the backup process in the antivirus to stop that happening.

My speed continues to drop. Still stuck in the same folder of MPEG TS files, with a total of only 1.6TB backed up since I started 3 days ago. Speeds on the current file range from as slow as 291 bytes/s to as high as 101MB/s. My hypothesis is that it’s choking while trying to swallow compressed files. Perhaps the code could be updated to bypass already-compressed files (e.g. mp4, zip, etc.) to improve performance. Compressed data would still need to be encrypted, though.
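For what it's worth, you can get a feel for the cost of recompressing already-compressed data with a quick stand-in test like this (pure Python with zlib, nothing Duplicati-specific; random bytes stand in for MPEG/MP4/ZIP content, and the 50MB figure is just a convenient volume-sized buffer):

```python
import os
import time
import zlib

# Compare compressing highly compressible data vs. data that is already
# effectively compressed (random bytes simulate MPEG/MP4/ZIP content).
SIZE = 50 * 1024 * 1024  # 50MB, roughly one backup volume

compressible = b"the same line of text repeated over and over\n" * (SIZE // 45)
incompressible = os.urandom(SIZE)

for label, data in [("text-like", compressible), ("already compressed", incompressible)]:
    start = time.time()
    out = zlib.compress(data, level=6)
    elapsed = time.time() - start
    print(f"{label:>20}: {len(data) / 1e6:.0f}MB -> {len(out) / 1e6:.1f}MB "
          f"in {elapsed:.2f}s")
```

On most machines the random buffer compresses noticeably slower and shrinks by essentially nothing, which is consistent with the hypothesis that video files cost CPU for zero storage savings.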

The slow parts are where it is calculating block hashes and creating backup volumes, I believe. The bursts up around 42MB/s are when it is actually transferring the finished backup volume to storage.
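For a rough sense of why the hashing stage alone eats CPU time, here is a generic sketch (this is not Duplicati's actual code; the 100KB block size is an assumption based on Duplicati's documented default, and a real backup also compresses, encrypts, and uploads, so end-to-end throughput will be lower):

```python
import hashlib
import time

# Time SHA-256 over fixed-size blocks of an in-memory buffer to get a
# ballpark figure for the hashing stage on this machine.
BLOCK = 100 * 1024                 # assumed block size
data = bytes(512 * 1024 * 1024)    # 512MB of zeros, just for timing

start = time.time()
for offset in range(0, len(data), BLOCK):
    hashlib.sha256(data[offset:offset + BLOCK]).digest()
elapsed = time.time() - start
print(f"hashed {len(data) >> 20}MB in {elapsed:.1f}s "
      f"({len(data) / elapsed / 1e6:.0f} MB/s)")
```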

Deduplication is a wonderful feature but it carries with it overhead. If it were me I’d let it complete - you’ll probably find that subsequent backups (after the initial full) are pretty fast. The only time I have long-running jobs now is when I add or modify a lot of data (like over 100GB).

Unfortunately my backup crashed on day 6 for some reason, and on day 7 it started over from the beginning. I will be investigating alternative backup strategies. Thanks for the replies and the open source community!

Mitch123, your backup was running faster than mine. I do agree that, where possible, open source is a nice route to go down. I think maybe this one needs more time to mature. If you can entertain a paid option, I have found CloudBerry very good, although it does lack some features: no de-duplication, no encrypted filenames, and exclusion rules that are a bit clumsy. I only needed one licence and got that at half price for giving a review.

I am curious why some people are having success with this and others not. I am not curious enough to spend any more time with it, though, as I want my backup to be fire and forget. The Duplicati concept is a good one.

C

I have a few suggestions here…

  1. Do not run a single large backup of everything, especially over 1TB worth. Split it up into several smaller groups not exceeding 750GB each; 500GB or less seems much more ideal. I have 5 different backup jobs that vary from 80GB to 770GB each (around 2TB total). The biggest issue I see is new people coming in expecting to back up their entire drive and then complaining about the slow speeds. There are a lot of calculations involved; this is not simply copying files from drive A to B at whatever speed the drives, connection, and OS allow.
  2. Having a lot of individual larger files (e.g. video files and larger installers) instead of a lot of smaller files will also slow things down quite a bit, because each of those files has to be split into individual 50MB backup volumes, and Duplicati has to work out how to break, say, 8GB into a pile of 50MB files (see the rough arithmetic sketched after this list). This can be mitigated somewhat by using a 100MB volume size instead of 50MB, but it will still take a lot more time than a typical 700GB backup of 500,000 files. Yes, it is much quicker to do 700GB made of 500,000 files (avg 1.4MB per file) than 700GB made of 500 files (avg 1.4GB per file). Other backup systems I’ve used tend to keep a single backup file for every source file over their chosen block size, so they pack lots of smaller files into their 100MB backup volumes but keep the larger video files as their own larger backup volumes; then you deal with the drive/OS limitations and transfer speeds that bog down on larger files.
  3. With prior versions of Duplicati, on the INITIAL backup, I saw speeds that averaged around 10GB per hour, which meant over 70 hours for my 700GB backup. With this latest version, on a new fresh backup, I am seeing much faster speeds: the fresh initial backup of the same 700GB took 4h 29m. From 70+ hours down to 4.5.
  4. The most common reason for slow speeds on a standard Windows desktop PC is antivirus scanning every file as Duplicati reads it from the drive, and scanning every new backup file created before it is transferred to the destination folder/drive (and in some cases scanning it again after it has been written to the destination). Certain other backup systems I’ve found create their own exclusion or encrypt everything on the fly, which lets them run much faster, but the downside is that the antivirus cannot protect you: if there is a virus, it will get backed up and can potentially infect your backups.
  5. Drive speed (and sometimes network speed) is also a big factor. If it is a platter-based drive that is only 5400RPM, you need to realize that read/write speeds on those are at best in the 100MB/s range, and even a 7200RPM drive is in the 120-150MB/s range. Even in a RAID setup, 7200RPM drives typically only see around 200MB/s. Once you figure in calculating block hashes, encrypting and creating the new files and volumes, splitting them into the standard 50-100MB files, working out which blocks make up each 50MB file, splitting individual larger files across multiple 50MB volumes, and so on down the list, it is not uncommon to see it average around 8-10MB/s (as shown in Duplicati itself), which, if you do the math on all the work being done, comes out to a more realistic 60-70MB/s average on the drive.
  6. System speed/performance is much less of an issue. For example, my main backup system is a dual-CPU, 4-core (8 cores total) Xeon E5405 @ 2.00GHz with 32GB RAM, and the storage drive is 4x 7200RPM platter drives for a typical 200MB/s read/write speed. The gigabit ethernet and source files on enterprise SSDs (typically 400-450MB/s) are not limiting factors. Even then, right now it is running an initial backup of 770GB worth of files ranging from 1KB to 100MB, with maybe 30-40 that are close to 2GB in size. It is only using 25-50% of the CPU and around 200MB of RAM, and when reading files from the network drive it uses around 120Mbps of network (out of a max of 1000Mbps).
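To put point 2 in numbers, here is the back-of-the-envelope arithmetic for how many backup volumes a single large file turns into at 50MB versus 100MB volume size (the file sizes are just the examples from above, not measurements):

```python
import math

# Rough count of backup volumes produced for one large file at different
# volume sizes (1.4GB average file and 8GB installer from the examples above).
FILE_SIZES_GB = [1.4, 8]
VOLUME_SIZES_MB = [50, 100]

for gb in FILE_SIZES_GB:
    file_mb = gb * 1024
    for vol in VOLUME_SIZES_MB:
        volumes = math.ceil(file_mb / vol)
        print(f"{gb}GB file at {vol}MB volumes -> ~{volumes} volumes to build and upload")
```

Every one of those volumes has to be assembled, hashed, compressed, encrypted, and written out, which is where the extra time goes compared with a plain file copy.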