How to clear the Duplicati scheduler queue

Python has some interesting standard capabilities which I’m using, and it’s used in Duplicati too (sorry macOS).
Below is random_file2.py, though I’m still not sure what tooling I’ll put around it. The focus here is on change, potentially Duplicati-style: efficiency, predictability, and flexibility to make random files or randomly change existing ones, either within the current size or by extending it randomly, depending on how the options are set.

#!/usr/bin/python3
# path --block-size= --change-size= --percent= --create=
import os
import random
import argparse
import sys

parser = argparse.ArgumentParser()
parser.add_argument('path')
parser.add_argument('--block-size', type=int, default=102400, help='size of each block of this file')
parser.add_argument('--change-size', type=int, default=2, help='size of change at start of a block')
parser.add_argument('--percent', type=int, default=10, help='percent of blocks to get changed')
parser.add_argument('--create', type=int, help='create file with this many blocks', dest='blocks')
args = parser.parse_args()

if args.blocks is not None:
    size = args.block_size * args.blocks
    # os.O_BINARY only exists on Windows; fall back to 0 elsewhere
    fd = os.open(args.path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC | getattr(os, 'O_BINARY', 0))
    os.ftruncate(fd, size)
    blocks = args.blocks
elif os.path.exists(args.path):
    fd = os.open(args.path, os.O_WRONLY | getattr(os, 'O_BINARY', 0))
    size = os.path.getsize(args.path)
    blocks = size // args.block_size
else:
    print('Please supply path of an existing file, or use --create to create a new file.', file=sys.stderr)
    sys.exit(1)

changes = blocks * args.percent // 100

for index in random.sample(range(blocks), k=changes):
    os.lseek(fd, index * args.block_size, os.SEEK_SET)
    change = random.randbytes(args.change_size)  # random.randbytes needs Python 3.9+
    os.write(fd, change)

os.close(fd)
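
As a quick end-to-end check of the same technique (a self-contained sketch with made-up sizes, separate from the script itself): create a zero-filled file, overwrite the first couple of bytes in a random 10% of blocks, then count how many block starts are no longer zero.

```python
import os
import random
import tempfile

block_size, change_size, nblocks, percent = 4096, 2, 100, 10

with tempfile.NamedTemporaryFile(delete=False) as tmp:
    path = tmp.name

# Zero-filled (possibly sparse) file of nblocks blocks
fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC | getattr(os, 'O_BINARY', 0))
os.ftruncate(fd, block_size * nblocks)

changes = nblocks * percent // 100
for index in random.sample(range(nblocks), k=changes):
    os.lseek(fd, index * block_size, os.SEEK_SET)
    os.write(fd, random.randbytes(change_size))  # Python 3.9+

os.close(fd)

# Count blocks whose first change_size bytes are no longer zero
changed = 0
with open(path, 'rb') as f:
    for i in range(nblocks):
        f.seek(i * block_size)
        if f.read(change_size) != bytes(change_size):
            changed += 1

os.remove(path)
```

With these numbers, 10 of the 100 blocks get touched; `changed` should come out at or just under 10 (a random pair of bytes can, very rarely, be zero).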

Efficiency comes from only changing a small amount in a “block”, which need not match Duplicati size.
A change as small as 1 byte makes a block “different”, although changing just 1 byte will eventually run out of unique blocks…

Predictability comes from being able to align changes to Duplicati’s block boundaries, unlike trying to predict where random byte changes land.
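
One way to see that predictability is to diff a “before” snapshot against an “after” snapshot at fixed block boundaries. A minimal sketch — `changed_blocks` is a hypothetical helper of mine, not part of the script:

```python
def changed_blocks(before: bytes, after: bytes, block_size: int) -> list:
    """Return indices of fixed-size blocks that differ between two snapshots."""
    nblocks = max(len(before), len(after)) // block_size
    return [i for i in range(nblocks)
            if before[i * block_size:(i + 1) * block_size]
            != after[i * block_size:(i + 1) * block_size]]

before = bytes(16)                      # four 4-byte blocks of zeros
after = bytes(4) + b'\xff' + bytes(11)  # flip one byte inside block 1
print(changed_blocks(before, after, 4)) # → [1]
```

Because changes always start at a block boundary, the diff lands in exactly the sampled blocks rather than smearing across neighbors.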

Flexibility lets you make a completely random-content file of arbitrary size if you want one for some reason, and I “think” you can probably also extend an existing file (though only so far, and I haven’t actually tested that).
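
For the extension idea, here is a hedged sketch of how it could work (my own illustration, not something random_file2.py currently does): growing a file with os.ftruncate pads the new region with zero bytes.

```python
# Extend an existing file in place; the new tail is zero-filled.
import os
import tempfile

BLOCK = 4096  # illustrative block size, not Duplicati's

with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b'x' * (2 * BLOCK))  # existing 2-block file
    path = tmp.name

fd = os.open(path, os.O_WRONLY | getattr(os, 'O_BINARY', 0))
os.ftruncate(fd, 4 * BLOCK)  # extend from 2 blocks to 4; new blocks are zeros
os.close(fd)

size = os.path.getsize(path)
os.remove(path)
```

The changed-block loop from the script would then work on the grown file unmodified, since it just derives the block count from the new size.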

An ambitious workload simulator would also add and delete filenames, with this doing changes in them.
I’ve got a very old very idle PC I could run tests on, if anybody can figure out how to make it a workload.
