Ok - I get it now. Thank you for the additional clarifications. What I had gathered from the material was that the information in index files was based upon zip directory listings. What I was missing is that the files have actual content, because at one point I had tried to open the entry in vol/ in a dindex file but unzip whined that the entry was corrupted. Probably a random event, but it cost me a couple of hours of research.
So, mainly for my future reference, let me summarize all that information:
- There are volumes, which are compressed and optionally encrypted archives containing either backup lists (dlist), any kind of block (dblock), or indices and blocklist blocks (dindex)
  - These are simply files in the remote
  - dlists have a list of all the source files contained inside a backup, with their hash, metadata hash, and if necessary the hashes of the blocklist(s) that must be used to put them back together
    - They are represented as `Fileset`s, where each file is associated to the set by a `FilesetEntry`
    - `Fileset`s are also linked back to their source dlist file
  - dblocks have blocks named after their hash
  - dindexes have blocklist blocks named after their hash in `list/` and an index of their dblock named after the dblock in `vol/`
  - They are represented as `Remotevolume`s in the database, where dblocks and dindexes can also be linked together through `IndexBlockLink`
- There are blocks, which are binary blobs containing either source files, metadata, or blocklists (i.e. lists of blocks which compose a source file)
  - Blocks exist in the remote in volumes, either in a dblock (all types of blocks), or in the `list/` directory of a dindex (only blocklists)
  - They are represented as `Block`s in the local database, belonging to a `Remotevolume`. In the case of blocklists, which belong to more than one volume, they are (as far as I can see) always represented as part of a dblock, even when they are also part of a dindex
- Source files are… well, we know what they are
  - They are listed for each backup in the backup’s dlist file
  - In the same file, they are linked to their metadata
  - In the same file, they are either linked to a single block through its hash (which is also the file’s hash), or to a list of blocklists, through the hashes of the blocklists
  - A file is represented as a `FileLookup`, which belongs to a `Fileset` and has a `Blockset`
- Metadata exists only as part of a source file, in the form described earlier
  - It is represented as `Metadataset`s
- Blocklists are binary files as described earlier
  - They are represented by `Blockset`s, where each block is linked to it by a `BlocksetEntry` (which obviously points to a `Block`)
  - Note that this system departs from the remote representation: when there is more than one blocklist for a file, that’s represented as a single `Blockset` in the database, which thus has the hash of the file; and when there is no blocklist for a file, a `Blockset` containing a single block is represented, with the same hash as the block and thus the file
  - The above is also true for metadata, represented as `Metadataset` (which however is not a set in the same sense as a `Blockset` or a `Fileset`), which points to a `Blockset` with a single `Block`, having the same hash as the `Blockset` itself
  - The blocklists themselves are not represented as `Blockset`s with a single entry (i.e. there are no `Blockset`s with the same hash as the hash of a list), but as `BlocklistHash`es, each associated with a `Blockset`
    - This means that not every `Blockset` has a `BlocklistHash`, because as I mentioned earlier some `Blockset`s are made up for the sake of normalization and don’t have a corresponding representation in the remote, or a corresponding blocklist
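
To check my understanding of the relationship between a file, its blocks, its blocklists and its `Blockset`, here is a minimal Python sketch. It is not the actual implementation and does not mirror the real database schema: the SHA-256 hash, the tiny `BLOCK_SIZE`, the `hashes_per_blocklist` parameter and the field names of the `Blockset` class are all assumptions I made up for illustration.

```python
import hashlib
from dataclasses import dataclass, field
from typing import List

BLOCK_SIZE = 4  # hypothetical and deliberately tiny, so the example stays readable


def sha256(data: bytes) -> bytes:
    """Hash helper; SHA-256 is an assumption, not taken from the summary above."""
    return hashlib.sha256(data).digest()


@dataclass
class Blockset:
    """Toy stand-in for a Blockset row plus its related rows."""
    full_hash: bytes                  # hash of the whole content
    block_hashes: List[bytes]         # ordered; plays the role of BlocksetEntry rows
    blocklist_hashes: List[bytes] = field(default_factory=list)  # BlocklistHash rows


def build_blockset(content: bytes, hashes_per_blocklist: int = 2) -> Blockset:
    # Split the content into fixed-size blocks and hash each one.
    blocks = [content[i:i + BLOCK_SIZE] for i in range(0, len(content), BLOCK_SIZE)]
    block_hashes = [sha256(b) for b in blocks]
    blockset = Blockset(full_hash=sha256(content), block_hashes=block_hashes)

    # Only multi-block content gets blocklists. Each blocklist is the
    # concatenation of some block hashes and is itself identified by its hash.
    if len(blocks) > 1:
        for i in range(0, len(block_hashes), hashes_per_blocklist):
            blocklist = b"".join(block_hashes[i:i + hashes_per_blocklist])
            blockset.blocklist_hashes.append(sha256(blocklist))
    return blockset


small = build_blockset(b"abc")          # fits in one block
large = build_blockset(b"abcdefghij")   # needs several blocks
print(len(small.block_hashes), len(small.blocklist_hashes))
print(len(large.block_hashes), len(large.blocklist_hashes))
```

In this toy model the single-block case ends up with a `Blockset` whose hash equals the hash of its only block and which has no blocklist hash at all, while the multi-block case keeps the file hash on the `Blockset` and hangs the blocklist hashes off it. That is the normalization I was describing in the last bullets.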