Ok - I get it now. Thank you for the additional clarifications. What I had gathered from the material was that the information in index files was based upon zip directory listings. What I was missing is that the files have actual content, because at one point I had tried to open the entry in `vol/` in a dindex file, but unzip whined that the entry was corrupted. Probably a random event, which cost me a couple of hours of research.
So, mainly for my future reference, let me summarize all that information:
- There are volumes, which are compressed and optionally encrypted archives containing either backup lists (dlist), any kind of block (dblock), or indices and blocklist blocks (dindex)
  - These are simply files in the remote
  - dlists have a list of all the source files contained inside a backup, with their hash, metadata hash, and if necessary the hashes of the blocklist(s) that must be used to put them back together
    - They are represented as `Fileset`s, where each file is associated to the set by a `FilesetEntry`
    - `Fileset`s are also linked back to their source dlist file
  - dblocks have blocks named after their hash
  - dindexes have blocklist blocks named after their hash in `list/` and an index of their dblock named after the dblock in `vol/`
  - Volumes are represented as `Remotevolume`s in the database, where dblocks and dindexes can also be linked together through `IndexBlockLink`
- There are blocks, which are binary blobs containing either source files, metadata, or blocklists (i.e. lists of blocks which compose a source file)
  - Blocks exist in the remote in volumes, either in a dblock (all types of blocks), or in the `list/` directory of a dindex (only blocklists)
  - They are represented as `Block`s in the local database, belonging to a `Remotevolume`. In the case of blocklists, which belong to more than one volume, they are (as far as I can see) always represented as part of a dblock, even when they are also part of a dindex
- Source files are… well, we know what they are
  - They are listed for each backup in the backup’s dlist file
  - In the same file, they are linked to their metadata
  - In the same file, they are either linked to a single block through its hash (which is also the file’s hash), or to a list of blocklists, through the hashes of the blocklists
  - A file is represented as a `FileLookup`, which belongs to a `Fileset` and has a `Blockset`
- Metadata exists only as part of a source file, in the form described earlier
  - It is represented as `Metadataset`s
- Blocklists are binary files as described earlier
  - They are represented by `Blockset`s, where each block is linked to it by a `BlocksetEntry` (which obviously points to a `Block`)
  - Note that this system departs from the remote representation: when there is more than one blocklist for a file, that’s represented as a single `Blockset` in the database, which thus has the hash of the file; and when there is no blocklist for a file, a `Blockset` containing a single block is represented, with the same hash as the block and thus the file
  - The above is also true for metadata, represented as `Metadataset` (which however is not a set in the same sense as a `Blockset` or a `Fileset`), which points to a `Blockset` with a single `Block`, having the same hash as the `Blockset` itself
  - The blocklists themselves are not represented as `Blockset`s with a single entry (i.e. there are no `Blockset`s with the same hash as the hash of a list), but as `BlocklistHash`es, each associated with a `Blockset`
    - This means that not every `Blockset` has a `BlocklistHash`, because as I mentioned earlier some `Blockset`s are made up for the sake of normalization and don’t have a corresponding representation in the remote - or a corresponding blocklist
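
To check my own understanding of the `Blockset` / `BlocklistHash` relationship, here is a toy sketch in Python. It is not the real schema or the real code: the class, the field names and `build_blockset` are made up for illustration, the block size is absurdly small, and I am assuming SHA-256 hashes and that a blocklist is just the concatenation of the raw hashes of a file's blocks, with a single blocklist per file.

```python
import hashlib
from dataclasses import dataclass, field

BLOCK_SIZE = 4  # toy value for illustration; real block sizes are far larger


def h(data: bytes) -> str:
    """Hex SHA-256 of a blob (assumption: the real encoding may differ)."""
    return hashlib.sha256(data).hexdigest()


@dataclass
class Blockset:
    """Hypothetical stand-in for a Blockset row plus its related rows."""
    full_hash: str                 # hash of the whole file (or metadata blob)
    block_hashes: list[str]        # ordered, like BlocksetEntry rows
    blocklist_hashes: list[str] = field(default_factory=list)  # like BlocklistHash rows


def build_blockset(content: bytes) -> Blockset:
    # Split the content into fixed-size blocks (an empty file is one empty block).
    blocks = [content[i:i + BLOCK_SIZE]
              for i in range(0, len(content), BLOCK_SIZE)] or [b""]
    block_hashes = [h(b) for b in blocks]
    blockset = Blockset(full_hash=h(content), block_hashes=block_hashes)
    if len(blocks) > 1:
        # Assumed blocklist layout: raw block hashes concatenated in order.
        blocklist = b"".join(bytes.fromhex(x) for x in block_hashes)
        blockset.blocklist_hashes.append(h(blocklist))
    return blockset


small = build_blockset(b"abc")          # fits in one block
large = build_blockset(b"abcdefghij")   # needs three blocks

# Single-block file: the Blockset hash equals the block hash, no BlocklistHash.
print(small.full_hash == small.block_hashes[0], small.blocklist_hashes)  # True []
# Multi-block file: one Blockset carrying the file hash, plus a BlocklistHash.
print(len(large.block_hashes), len(large.blocklist_hashes))  # 3 1
```

The point of the sketch is only the asymmetry: the small file gets a `Blockset` that exists purely for normalization, while the large one is the only case where a `BlocklistHash` shows up.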