I just pushed a short Python script that generates some stats about:

- wasted space in the repository (data in blocks that relates only to deleted backups)
- files that have never been verified (Duplicati does verify one block after each backup job, but as a job usually generates many blocks, there are a lot of blocks in storage that have never been verified)

The script takes one parameter: the path to the database.
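To give an idea of what it does, the queries are along these lines (a simplified sketch, not the full script; the table and column names here reflect my reading of the local database schema, so double-check them against your version):

```python
#!/usr/bin/env python3
"""Sketch of the kind of stats the script collects.

Assumes Duplicati's local SQLite schema: a DeletedBlock table holding
blocks that belonged only to deleted backups, and a Remotevolume table
with a VerificationCount column.
"""
import sqlite3
import sys

def main(db_path):
    con = sqlite3.connect(db_path)
    cur = con.cursor()

    # Wasted space: bytes held by blocks no current backup references
    cur.execute("SELECT COUNT(*), COALESCE(SUM(Size), 0) FROM DeletedBlock")
    count, size = cur.fetchone()
    print(f"deleted blocks: {count} ({size} bytes of wasted space)")

    # Remote dblock volumes that have never been picked for verification
    cur.execute(
        "SELECT COUNT(*) FROM Remotevolume "
        "WHERE Type = 'Blocks' AND VerificationCount = 0"
    )
    print(f"never-verified dblock volumes: {cur.fetchone()[0]}")

    con.close()

if __name__ == "__main__":
    main(sys.argv[1])
```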
I’m guessing the knowledge of which blocks have been tested is lost any time you do a database recreation. (This is something I do every so often just to validate that recreation is fast and doesn’t need any dblocks.)
Thanks! That must have taken some learning of DB tables. Feel free to keep on learning…
backup-test-samples and a newer option, shown below, can adjust that to leave fewer gaps, if you like:
--backup-test-percentage (Integer): The percentage of samples to test after a backup
After a backup is completed, some (dblock, dindex, dlist) files from the remote backend are selected for
verification. Use this option to specify the percentage (between 0 and 100) of files to test. If the
backup-test-samples option is also provided, the number of samples tested is the maximum implied by the two
options. If the no-backend-verification option is provided, no remote files are verified.
* default value: 0
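If I read that description right, the interaction between the two options is simply a max of the two counts. A sketch of the documented behavior (my own illustration, not Duplicati's actual code):

```python
# Sketch of the documented option interaction: the number of remote files
# tested is the larger of the counts implied by the two options.
def samples_to_test(remote_file_count, test_samples=1, test_percentage=0):
    from_percentage = remote_file_count * test_percentage // 100
    return max(test_samples, from_percentage)

# With 400 remote files, --backup-test-samples=5 --backup-test-percentage=10
# tests max(5, 40) = 40 sample sets after the backup.
print(samples_to_test(400, test_samples=5, test_percentage=10))  # -> 40
```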
upload-verification-file and the DuplicatiVerify scripts in utility-scripts can do heavier verification, if you have direct file access to the destination.
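The rough idea: upload-verification-file writes a duplicati-verification.json to the destination, and DuplicatiVerify re-hashes the destination files against it. In outline it amounts to something like this (a simplified sketch; I'm assuming the verification file is a JSON list of entries with Name, Hash, and Size, so use the real script from the utility-scripts folder for actual checking):

```python
import base64, hashlib, json, os, sys

def verify(folder):
    # Sketch of a DuplicatiVerify-style check. Assumes the verification file
    # is a JSON list of {"Name": ..., "Hash": <base64 SHA-256>, "Size": ...}.
    with open(os.path.join(folder, "duplicati-verification.json")) as fh:
        entries = json.load(fh)
    for entry in entries:
        digest = hashlib.sha256()
        with open(os.path.join(folder, entry["Name"]), "rb") as fh:
            for chunk in iter(lambda: fh.read(1 << 20), b""):
                digest.update(chunk)
        ok = base64.b64encode(digest.digest()).decode() == entry["Hash"]
        print(("OK " if ok else "BAD ") + entry["Name"])

if __name__ == "__main__":
    verify(sys.argv[1])
```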
The COMPACT command documentation describes some tuning options, if the defaults for compact are not as desired.
Running at verbose log level or capturing output with log-file-log-filter are other ways of getting stats about compact.
Thank you, I was not aware of that option. Wanting to know how many blocks had never been verified (so that I could eventually verify them with the verify command) is what drove me to write this short script.