Invoke ZipFile.open() on the current path. Allows opening for read or write, text or binary through supported modes: ‘r’, ‘w’, ‘rb’, ‘wb’. Positional and keyword arguments are passed through to io.TextIOWrapper when opened as text and ignored otherwise. pwd is the pwd parameter to ZipFile.open().
Changed in version 3.9: Added support for text and binary modes for open. Default mode is now text.
Yet fix it, it does…
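For reference, here is a minimal sketch of what the quoted zipfile.Path.open() docs describe (Python 3.9+; the archive and member names below are made up for illustration):

    import zipfile

    p = zipfile.Path("example-archive.zip", at="manifest")  # hypothetical zip and member
    # Text mode (the 3.9+ default); extra keyword arguments go to io.TextIOWrapper
    with p.open("r", encoding="utf-8") as f:
        print(f.read())
    # Binary mode; TextIOWrapper arguments would be ignored here
    with p.open("rb") as f:
        raw = f.read()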
The blocklist hash is found in the dlist file(s). When it’s missing from the index file, it can still be found in the dblock file - the index file holds a copy of all the blocklist hashes of the corresponding dblock file. It’s read here:
The readBlockList routine does read the file from the downloaded dblock file.
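For illustration, a rough Python sketch of the same idea (my understanding of the layout, not the actual C# code; it assumes an already-decrypted dblock zip, 32-byte SHA-256 block hashes, and base64 hashes used as entry names):

    import base64
    import zipfile

    HASH_SIZE = 32  # assuming SHA-256 block hashes

    def read_blocklist(dblock_zip_path, blocklist_hash_b64):
        # Inside the dblock zip, the blocklist is stored as an entry named
        # after the (base64) hash of the blocklist itself.
        with zipfile.ZipFile(dblock_zip_path) as z:
            raw = z.read(blocklist_hash_b64)
        # The entry content is a plain concatenation of fixed-size block hashes.
        return [base64.b64encode(raw[i:i + HASH_SIZE]).decode()
                for i in range(0, len(raw), HASH_SIZE)]

    # hypothetical usage, after decrypting duplicati-b....dblock.zip.aes:
    # hashes = read_blocklist("duplicati-b....dblock.zip", "<base64 blocklist hash>")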
I will send you privately the address where you can download the whole backend.
Without any blocklist hash in its blocklists list, it has no idea what data the file contains.
Possibly a “fix” would be to delete the file as hopelessly undefined – not exactly a great fix.
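For context, a file entry in the dlist’s filelist.json looks roughly like this (fields trimmed, values invented, as I recall the format); the “blocklists” list is what ties the entry to its blocklist hashes:

    {
      "type": "File",
      "path": "C:\\Data\\example.bin",
      "hash": "<base64 file hash>",
      "size": 12345678,
      "blocklists": ["<base64 blocklist hash 1>", "<base64 blocklist hash 2>"]
    }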
Not always, although taking anything but that default seems bad (except maybe for special tests).
Maybe you should make sure that the new code can handle all the cases kind of presentably.
C:\Duplicati\duplicati-2.0.7.2_canary_2023-05-25>Duplicati.CommandLine.exe help index-file-policy
--index-file-policy (Enumeration): Determines usage of index files
The index files are used to limit the need for downloading dblock files
when there is no local database present. The more information is recorded
in the index files, the faster operations can proceed without the
database. The tradeoff is that larger index files take up more remote
space and which may never be used.
* values: None, Lookup, Full
* default value: Full
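So, for example, an explicit setting would look something like this on the backup command line (destination URL and source path are placeholders):

    Duplicati.CommandLine.exe backup ssh://backup.example.com//backups/job1 "C:\Data" --index-file-policy=Full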
The dlist provides a hash. When Duplicati, in the dblock handling (the third pass), reads a dblock file, it has the list of all ‘files’ (the zip directory). When it finds a ‘file’ whose name corresponds to the hash, it reads this ‘file’ and interprets it as a collection of block hashes. That’s why it can find the blocklist hash there, while it can (theoretically at least) be missing from the index file’s list/ subdirectory.
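A quick way to picture that check on the dindex side: a Python sketch, assuming a decrypted dindex zip and the blocklist copies living under a list/ folder named by their base64 hash:

    import zipfile

    def has_blocklist_copy(dindex_zip_path, blocklist_hash_b64):
        # Return True if the dindex carries a copy of this blocklist
        # in its list/ subdirectory (my understanding of the layout).
        with zipfile.ZipFile(dindex_zip_path) as z:
            return "list/" + blocklist_hash_b64 in z.namelist()

    # hypothetical usage, after decrypting duplicati-i....dindex.zip.aes:
    # print(has_blocklist_copy("duplicati-i....dindex.zip", "<base64 blocklist hash>"))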
If the user doesn’t want index files to be useful, that’s their call, but it would be rather strange to then complain that database repair is slow. The whole purpose of index files is to make db rebuilding faster.
There is no new code. The branch is not able to fix anything new. The only change is to make it finish faster, fail or succeed, whatever. I think it is valuable to learn within 12 hours, rather than 12 days, that Duplicati will not be able to fix it, that’s all. If it fixes it, so much the better, since almost nobody will wait for 12 days.
BTW I think I forgot to say that the link I gave you is to an SFTP server.
Last line is a “block ref that IS a special blocklist hash” but not a block. Loss is not repairable.
The blocklist itself is a concatenation of block hashes, and is in dblock and usually its dindex.
From the pushback, I will assume you’re removing the list/ blocklist copy, not the dlist ref.
It wasn’t a speed issue. The “presentably” meant no messages suggesting things were wrong.
I had the “Unexpected changes” warning in mind, but I’m not sure if dindex downgrade does it.
This is very much to make it more obvious which file is causing a problem. This message is more in the hope that, once a full dblock scan becomes more useful, it will bring more chances to investigate the root cause(s).
Come to think of it, there are definitely instances where the machine could have lost power during the 12-hour-long backups, which somehow could have contributed to the excess of index files. Bit of an edge case, but a real risk at this site unfortunately.
Lots and lots and lots of mostly smallish files in a few locations, including network shares. The change rate isn’t particularly high; it just takes ages to run through and check all of it.
Kopia in comparison takes minutes, though I don’t particularly like their software.
Too bad, since it means that you can’t use USN (well, I think; I never tried it in this case, but it would make sense for USN to not work for network drives). If most of your files are local, maybe it could make sense to have a local backup (using USN) and an external backup.
Right, here goes: installed the canary and downloaded the Duplicati destination folder, imported the old job with the original metadata, and started an initial repair.
Right, if you are watching the progress, can you give a rough estimate of the time taken to process a block when there are ‘unexpected changes’ and when there are none? Can you note at least a few of the offending blocks for further analysis, if it succeeds in an acceptable time?
Unexpected changes caused by block duplicati-b231b21ca170042aeb26b60ce885f5849.dblock.zip.aes
Unexpected changes caused by block duplicati-b2d3bcf0cd64547afa9f9a5710c01b57e.dblock.zip.aes
Unexpected changes caused by block duplicati-b2e58018b4d074407a1c182fc109ccdfb.dblock.zip.aes
We’re going a bit faster, but I’m not sure if it’s related to the code changes or to me overspeccing the VM as well.
Pass 3 of 3, processing blocklist volume 1404 of 7481
Blocks are taking seconds when there are no issues; I haven’t spotted an ‘unexpected changes’ block completing while I’m looking, anyway.
This seems to nail it; before the change, each of the 1400 blocks done so far would have taken 7 min, so about 9800 min for the 1400 blocks (that is, more than six days).
Yep!
Aug 23, 2023 12:32 PM: Pass 3 of 3, processing blocklist volume 2559 of 7481
We’re on track for this taking days rather than weeks.
Compare to the other machine, where it has still been running since this forum thread was made:
23 Aug 2023 12:26: Pass 3 of 3, processing blocklist volume 1990 of 7481
At this speed it should take 1.5 days. Still abnormally slow; I don’t know why, since for me running a full test of a 1.7 TB backup (a task that does very similar work, since here the block size is 400K) takes 7 h on a server that doesn’t even have an SSD. Maybe that’s because it has fewer files (more big files).
Optimizing the database could be worth it at this point - I was using