What I am doing is running Duplicati on an Ubuntu server. It mounts various shares via CIFS and the backup destination is a NAS on the same network.
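For context, the shares are mounted more or less like this; the server name, share, mount point, and credentials file below are placeholders rather than my actual values:

    # hypothetical /etc/fstab entry for one of the CIFS shares (names and paths are examples)
    //nas.local/documents  /mnt/documents  cifs  credentials=/etc/cifs-creds,iocharset=utf8,vers=3.0  0  0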
What I’ve done is recreate the backup jobs from scratch (not restoring them from the previous setup’s config files), with entirely new destinations, and so far I am not seeing the sha256 hash error. The old jobs are still running as well, and I am still getting the errors on those. I will run the new jobs concurrently for a few days and see if the errors return on the new jobs.
I’m going to call this one solved, though I don’t know exactly how. Recreating all the backup jobs from scratch seems to have fixed it; it has now been several days and I haven’t seen any errors in the logs.
I just re-ran the backup and between yesterday and today, the same 38 files appear to have the sha256 error.
“remote file duplicati-b19edfe0fbf7c4ca49e09c569e6824fac.dblock.zip.aes is listed as Verified with size 10934076 but should be 52406685, please verify the sha256 hash “sKqBc3FLBHZ1b0FDO0t0xuNG1jIyXZrON0zsKAC0aFQ=”,”
It appears that this file is in fact 10MB (so the verified size matches the actual file size on the NAS), but it was modified 1/14 during a backup operation that reported no warnings or errors.
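In case anyone wants to check one of these themselves, I believe the hash in the error is the base64-encoded SHA-256 of the remote file, so something along these lines (run against the file on the NAS or over the mount) should let you compare the actual hash and size with what the error reports:

    # compute the base64-encoded SHA-256 and the byte size of the flagged dblock file
    openssl dgst -sha256 -binary duplicati-b19edfe0fbf7c4ca49e09c569e6824fac.dblock.zip.aes | base64
    stat -c %s duplicati-b19edfe0fbf7c4ca49e09c569e6824fac.dblock.zip.aes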
I guess I will try moving the destination back to the local machine again to see if I can isolate this to the remote storage.
The volume size is 50MB (the default), and at a quick glance most of the files are 50MB. I don’t think anything is capping the file size, since the other files do reach the full 50MB, but something might certainly be truncating certain files for some reason, and I’ll look into that. I am hoping to be able to spend some time on this over the weekend.
Thanks for checking that - and good luck with the weekend time (I know I certainly need it).
If you do get time, can you check if the reported-too-small files are all about the same size (10MB)? I’m not sure yet what it would tell us if they are, but it would be good to know…
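If it helps, something like this run against the destination folder (the path is just an example) should list any dblock files well under the 50MB volume size, along with their sizes in bytes:

    # list dblock volumes smaller than ~45MB, sorted by size (destination path is an example)
    find /mnt/nas/duplicati-backup -name '*.dblock.zip.aes' -size -45M -printf '%s %p\n' | sort -n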
I deleted the broken files from the destination, then ran list broken files followed by purge broken files. (Could the GUI just ignore filters for those operations rather than telling users filters are unsupported and making us deselect them every time we try to run them?) It looks like this changed the dlist.zip.aes files to match the contents of the dblock.zip.aes files, and it deleted 17 more of those dblock files, each of which was ~49.9MB.
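(For anyone wanting to do the same outside the GUI, I believe the rough command-line equivalent is the following; the storage URL, local database path, and passphrase are placeholders for your own settings.)

    # approximate CLI equivalent of the list / purge broken files steps (all values are placeholders)
    duplicati-cli list-broken-files "file:///mnt/nas/duplicati-backup" --dbpath=/path/to/local-db.sqlite --passphrase=...
    duplicati-cli purge-broken-files "file:///mnt/nas/duplicati-backup" --dbpath=/path/to/local-db.sqlite --passphrase=...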
I re-ran the backup after that and got this rather cryptic output:
removing file listed as Deleting: duplicati-bad0d455cdae045f1882db88a10bdc5dd.dblock.zip.aes,
removing file listed as Deleting: duplicati-b7a21edefd1b94189a2a88b7fe0d16806.dblock.zip.aes,
removing file listed as Deleting: duplicati-be7a9b15c09be49c0bcfd93f3bdc1e826.dblock.zip.aes,
removing file listed as Deleting: duplicati-b01da0d24c86841859a74bab69da183c1.dblock.zip.aes,
No remote filesets were deleted
That last line, “No remote filesets were deleted” seems contrary to the fact that it deleted 4 remote dblock.zip.aes files, but maybe I’m misunderstanding what it means by fileset vs file.
I’ve made a copy of this backup job on a local partition as well; I’ll run them side by side (at different times) for a few days and see if anything of interest happens.
@JonMikeIV - sorry I didn’t see your last message until I had already deleted the files. But I did go back to a recent log file and I see a handful of sizes:
10934076 but should be 52406685
377520 but should be 52403997
8912896 but should be 52344077
589824 but should be 52400365
2031616 but should be 52399085
So it seems the answer was that they were different sizes.
This destination was on a NAS, with bit rot protection and snapshots enabled. I turned off snapshots (not sure why that would make a difference, but it’s the only thing I can think of that might affect the files somehow). But I should note that before, when I had Duplicati running on the NAS itself, those were the same settings for the (same) backup destination. So I don’t think this has any relevance but at this point I’m willing to try pretty much anything to get to the bottom of this.
First of all thanks for this nice tool, I find it great in principle.
Unfortunately I have had the same issue three times now since early December 2017: each time I tried to patch it by removing the broken files from the remote storage and then running repair, purge, and verify, and the errors were gone.
… Only to come back in a few days.
And that procedure is also an issue because each time you lose a lot of backed up data.
For what it’s worth, I have verified that the files on the remote storage really were corrupted.
Another point that may be useful for you to know: the first time this happened was roughly 10 days after I moved the remote storage from a NAS over sftp to a workstation over smb/cifs, because I had to change NAS.
Now I’ve got a new NAS and will move the storage back to NAS over sftp in a few days… of course after removing the new broken files (22, a new peak) that appeared yesterday, repairing the db, purging files and so on…
I’m starting to wonder whether that approach is really fixing anything; maybe it would be better to start from scratch.
Will let you know if I find out anything new, in the meanwhile any suggestion is really welcome.
BTW, I didn’t mention it before, but I’m on beta, not canary.
Thank you for sharing your experience as well. Let us know if things start working properly when you switch back to sftp. If this turns out to be a problem with CIFS/SMB I’ll do the same. I am still running parallel backups to local storage on my server, but I haven’t been running them long enough for the SHA256 hash problem to show up (as you say, it takes a few days though I don’t think I’ve had to wait ten days).
Ultimately I still want to store my backups on my NAS, since it has plenty of space as well as RAID.
As I had planned to do, I have reverted to the initial setup, moving the storage back to NAS over SFTP. I did this on the 25th after removing the broken files, repairing the db, and purging.
So far I haven’t seen any misbehaviour, but it’s only been 1 week. Last time it took from the 4th of January to the 21st for the problem to appear (or at least for the warning to show up).
At this stage I would say that a bug in SMB/CIFS handling is the most likely candidate, but I would wait for feedback from the devs on that.
I am inclined to agree with you on this. I have been running 3 concurrent jobs backing up the same two sources (six backup jobs in total) to a CIFS share on the NAS, an FTPS share on the NAS, and local storage on the system running Duplicati. The only jobs that continue to show this error are the ones where the CIFS share is the destination.
I am just at day 10 for one of the FTPS destination jobs, so I will continue running it, but I am hopeful that the issue is resolved by not using CIFS.
I’m wondering if there’s a destination file count limit at which point CIFS gets unhappy and the error starts manifesting…that could explain why it takes a while of working backups before the problem starts appearing…
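A quick file count at the destination would at least let us test that theory (the path is a placeholder):

    # count the files Duplicati has written to the destination
    find /mnt/nas/duplicati-backup -type f | wc -l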
Ha! I thought it was a good theory until I remembered the second (smaller) backup job. To be honest I thought I was all alone in this one until @Amanz posted. I never considered FTPS or SFTP as another option. It’s not ideal as I think it’s slower than SMB (protocol overhead maybe?), but I’ll take it if the backups work.
I don’t think it’s related to file count: I had the same issue on December 28th, with less remote storage files, corrected it and then it seemed to be working fine until January 21st.
I’m wondering if it’s just the two of us having this issue; I would think CIFS is a rather common choice for users who don’t require cloud storage…
Based on the evidence, I’d agree with that. Unfortunately I don’t yet have another thought on what the issue could be.
Unfortunately, there are no stats to tell us whether a user has CIFS at the destination, so I can’t say how many people are using CIFS vs. having issues. Perhaps there’s some other commonality between you two - maybe the CIFS version or the distribution / kernel it’s running on?
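If either of you can grab it, something like this should show the kernel version and the SMB dialect negotiated for the CIFS mount (the vers= option), which would make it easy to compare your two setups:

    # kernel version and negotiated SMB dialect for any CIFS mounts
    uname -r
    grep cifs /proc/mounts    # look for the vers= option in the mount line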
Kenkendk (the main developer of Duplicati) has posted a few times in this topic - what sort of feedback are you looking for? With over 600k “file system” destination backups in January (I don’t know how many went to a CIFS destination, but mine are among them), if this were a general CIFS issue there would be lots of reports of the problem.
So until we can replicate the issue, the best we can do is rely on your patience and willingness to try different things and provide feedback until we can start narrowing down possible causes. If it gets to the point that we think we have an idea of the cause but still can’t replicate it ourselves we could even publish a canary version with specific handling or reporting for what we THINK it might be and ask you to use that version to see if it fixes the problem.