Thanks for the reply.
On that point: I tried both sizes, 4 GB for the large media files and 1 GB for the small media files, without any effect.
The numbers were per drive and per CPU core, I forgot to mention that. Increasing the parallel hashers as well as the compressors did not have any effect in my tests either (though I only tried that at the default block size, which in retrospect may explain the limited impact, since there could have been other bottlenecks).
Ah, sorry, that was me throwing in a random fact I appreciate about Duplicati. All performance tests were done locally: Duplicati running in Docker, with source and destination on SATA drives mounted via volumes, and the Duplicati container itself (database, temp files) on an SSD. So everything you mentioned regarding the network does not apply.
[Some background in case it's interesting: when writing the original post I only had smaller backups running to GDrive. The big ones never completed for performance reasons. I don't have the bandwidth at home to upload 10 TB of initial backup data to GDrive within a reasonable amount of time; the plan is to upload these from the office, where I have more bandwidth and can split the upload over multiple days given the 750 GB/day limit. Incremental backups will then go over my home line.]
Thanks for the tips on the temp dir and files; I used those, together with profiling-level debug output, to locate the bottleneck and see in which phase of volume generation it happens. Zipping was never the issue: watching iotop, throughput during zipping went up close to the SSD's limit (250 MB/s) when testing on the SSD alone. The same goes for copying the temp files over, which happens at the maximum pace of the attached USB SATA drive (60 MB/s).
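In case anyone wants to reproduce the phase timing without staring at iotop, here is a minimal Python sketch that samples temp-file growth. The dup-* name pattern and the /tmp location are assumptions on my side; adjust them to your --tempdir and to whatever your version actually writes:

```python
import glob
import os
import time

TEMP_DIR = "/tmp"     # wherever --tempdir points inside your container
PATTERN = "dup-*"     # assumed Duplicati temp-file prefix; verify on your version
INTERVAL = 5          # seconds between samples

def temp_bytes():
    """Total size of Duplicati temp files; tolerate files vanishing mid-scan."""
    total = 0
    for path in glob.glob(os.path.join(TEMP_DIR, PATTERN)):
        try:
            total += os.path.getsize(path)
        except FileNotFoundError:
            pass  # file was moved/deleted between glob and stat
    return total

prev = temp_bytes()
while True:
    time.sleep(INTERVAL)
    cur = temp_bytes()
    # negative values just mean finished volumes were removed faster than new ones grew
    print(f"temp growth: {(cur - prev) / INTERVAL / 1024**2:7.1f} MB/s")
    prev = cur
```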
Putting all of this together, the only logical conclusion I could reach was that block hashing is the bottleneck. To isolate it, I fixed everything else and set the number of concurrent hashers, compressors, etc. to the CPU core count. Then I built a testbed of scripts that each ran for 2h and then got killed, all on the same maxed-out config but with varying block sizes, against both the large media archive (10 GB avg file size, 10 TB in total) and the small media archive (2 GB avg file size, 3 TB in total).
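For reference, the testbed was roughly along these lines. A minimal sketch, assuming the duplicati-cli entry point and the --blocksize, --concurrency-block-hashers and --concurrency-compressors option names (double-check those against your version; the paths and the --no-encryption choice are just examples):

```python
import os
import subprocess

SOURCE = "/source/large-media"      # volume-mounted archive to back up (example path)
DEST_ROOT = "/destination/bench"    # one target folder per block size (example path)
BLOCK_SIZES = ["512KB", "1MB", "2MB", "4MB", "8MB", "16MB",
               "32MB", "64MB", "128MB", "256MB"]
RUN_SECONDS = 2 * 60 * 60           # kill each run after 2h
CORES = os.cpu_count()

def dest_bytes(path):
    """Bytes Duplicati managed to push into the target folder."""
    return sum(e.stat().st_size for e in os.scandir(path) if e.is_file())

for bs in BLOCK_SIZES:
    dest = os.path.join(DEST_ROOT, bs)
    os.makedirs(dest, exist_ok=True)
    cmd = [
        "duplicati-cli", "backup", f"file://{dest}", SOURCE,
        f"--blocksize={bs}",
        f"--concurrency-block-hashers={CORES}",  # option names as I understand them;
        f"--concurrency-compressors={CORES}",    # verify against your Duplicati version
        "--dblock-size=4GB",
        "--no-encryption=true",                  # keep encryption out of the measurement
    ]
    try:
        subprocess.run(cmd, timeout=RUN_SECONDS)
    except subprocess.TimeoutExpired:
        pass  # expected: we only want a fixed 2h window per block size
    rate = dest_bytes(dest) / RUN_SECONDS / 1024**2
    print(f"blocksize {bs}: ~{rate:.1f} MB/s pushed to destination")
```

Giving each block size its own target folder keeps the runs independent, so the comparison is just a directory-size check after the fixed 2h window.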
After letting it burn overnight and comparing how much data each run pushed (512 KB, 1 MB, 2 MB, 4 MB, 8 MB, 16 MB, 32 MB, 64 MB, 128 MB, 256 MB block sizes), 16 MB was the winner by a significant margin. I then re-ran both full backups with the 16 MB block size and concurrent hashers set to the CPU core count, which gave a sustained throughput of around 22 MB/s over the whole backup, with an average CPU usage of 60%. The CPU pattern is 100% for a while, then dropping to 40% for a few minutes, then back to 100%. I guess the dips happen when it hits the --asynchronous-concurrent-upload-limit and the CPU is not fully used, probably running into some I/O limit or something else that isn't obvious to spot.
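To check whether the 40% dips really coincide with upload activity, something like the following could log CPU usage against destination growth side by side (psutil is a third-party package; the destination path is again just an example):

```python
import os
import time

import psutil  # pip install psutil

DEST = "/destination/bench/16MB"  # target folder of the running backup (example)
INTERVAL = 5                      # sampling window in seconds

def dest_bytes():
    return sum(e.stat().st_size for e in os.scandir(DEST) if e.is_file())

prev = dest_bytes()
while True:
    cpu = psutil.cpu_percent(interval=INTERVAL)  # mean CPU over the window
    cur = dest_bytes()
    # if the CPU dips line up with bursts here, the upload/IO side is the limiter
    print(f"cpu={cpu:5.1f}%  dest growth={(cur - prev) / INTERVAL / 1024**2:6.1f} MB/s")
    prev = cur
```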
I still think it should be possible to get more out of it, but the performance is fine for my use case. The incremental backups I tested achieve the same throughput, and since upload bandwidth is the bottleneck there anyway, I don't mind.
Thanks for the tips and clarifications; they helped me set up a simplified testbed and rule out the various bottlenecks.