Since Duplicati is cataloging everything, it would be really helpful if it could show me how much space each directory and file in the data is contributing to the size of my backups.
My current source data set is over 35GB, and I could probably trim it quite a bit if only I could easily tell what some of the bigger contributors are.
Block-level deduplication sometimes makes it impossible to say how much space a file adds. Additional copies of the same file add roughly nothing to the size of the backup; partially identical files add something in between.
Compression is sometimes a factor too, but for many people the largest files are audio or video media, and Duplicati doesn’t try to compress those again when it builds the zip file if they’re already in a compressed format.
Versioning can also apply: 10 edits of a photo might take 10 times the space, while appends to a text file might deduplicate well.
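If you want a rough, local approximation of that effect, here’s a minimal sketch (my own, not Duplicati’s actual engine, which also compresses blocks) that hashes files in fixed-size blocks and reports which files contribute the most previously-unseen data. The 100KB block size mirrors Duplicati’s long-time --blocksize default; adjust it if yours differs.

# Rough per-file "unique data" estimate via fixed-size block hashing.
# Only an approximation of Duplicati-style deduplication: the real engine
# also compresses blocks, so treat these numbers as an upper bound.
import hashlib
import os
import sys

BLOCKSIZE = 100 * 1024  # assumption: Duplicati's long-time default --blocksize

seen = set()  # hashes of blocks already counted for an earlier file

def unique_bytes(path):
    """Bytes in `path` whose blocks were not seen in any earlier file."""
    new = 0
    with open(path, "rb") as f:
        while True:
            block = f.read(BLOCKSIZE)
            if not block:
                break
            digest = hashlib.sha256(block).digest()
            if digest not in seen:
                seen.add(digest)
                new += len(block)
    return new

root = sys.argv[1] if len(sys.argv) > 1 else "."
results = []
for dirpath, _, filenames in os.walk(root):
    for name in filenames:
        path = os.path.join(dirpath, name)
        try:
            results.append((unique_bytes(path), path))
        except OSError:
            pass  # skip unreadable files

# Largest contributors of previously unseen blocks first.
for size, path in sorted(results, reverse=True)[:20]:
    print(f"{size / 1024**2:10.1f} MB  {path}")

Note that traversal order decides which copy of a duplicated block “gets the credit”, much as the first backup of a file does in Duplicati.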
Having said all that, there is a wide selection of local tools for your computer that can guide you at least a little.
Here are some examples; as local tools, I suspect they can visualize this better than Duplicati’s web UI can.
but it wasn’t the only word. The rest was “… to say how much space a file adds to the size”, which was the request, and which I don’t see done. For whole folders this could be a useful approximation, although Duplicati’s deduplication works differently. Despite having no visuals, ddpeval.exe may help with analysis.
I’m just interested in being able to spot when I’m backing up a gig or two of data from some app that I don’t care about. Whether this is in terms of the live data on my local machine or in terms of the backup archive size isn’t terribly important.
@drakar2007: The problem with using an external tool like WinDirStat is that it shows me my whole disk and can’t easily show me just what I’ve effectively selected as my backup set in Duplicati. I’ve got dozens of filters (mostly exclusions) in my Duplicati config, and I’m interested in figuring out whether what’s left contains any big files that I might want to add further exclusion filters for.
Thanks for the WizTree suggestion though, I’ll check it out.
This is far from the “visualize” need, but while the GUI restore tree doesn’t reveal sizes, the FIND command does. Run from the web UI’s Commandline screen (as below) or from Duplicati.CommandLine.exe, it lists each file with its size:
Running commandline entry
Finished!
Listing contents 0 (3/27/2019 1:09:32 PM):
C:\stop test source\
C:\stop test source\length1.txt (1 bytes)
C:\stop test source\linuxmint-18-cinnamon-64bit.iso (1.58 GB)
Return code: 0
Big files would presumably have “ GB)” in the list for a dumb search, but a regular expression can do better. My suspicion is that these are the original sizes, without compression or deduplication, but it’s something.
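For example, here’s a minimal sketch that filters a saved copy of the FIND output for entries of 1GB or more, assuming the “path (size unit)” format shown above holds for all lines (the KB/MB/TB units are my guess; only bytes and GB appear in my sample):

# Filter saved Duplicati FIND output for large entries.
import re
import sys

# Matches e.g. "C:\stop test source\linuxmint-18-cinnamon-64bit.iso (1.58 GB)"
LINE = re.compile(r"^(?P<path>.+) \((?P<size>[\d.]+) (?P<unit>bytes|KB|MB|GB|TB)\)$")
FACTOR = {"bytes": 1, "KB": 1024, "MB": 1024**2, "GB": 1024**3, "TB": 1024**4}
THRESHOLD = 1024**3  # report anything of 1 GB or more

with open(sys.argv[1], encoding="utf-8") as f:
    for line in f:
        m = LINE.match(line.strip())
        if m and float(m["size"]) * FACTOR[m["unit"]] >= THRESHOLD:
            print(line.strip())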
You can also use the COMPARE command, which will show the differences and sizes between any two versions of your backup set.
Comparing version 1 to version 0 will show the sizes of the most recent backup and the one before it.
Using binary search (start by comparing version 10 to version 1, then compare 10 to 5 or 5 to 0, etc.) can help track down a backup version where a lot of data was added.
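To sketch that bisection idea in code: the helper below is hypothetical (you would have to run the COMPARE command yourself and parse the added size out of its report, since I haven’t pinned down its exact output format), but the search logic is the one described above, with version 0 being the most recent backup:

# Bisect backup versions to find where a large amount of data was added.

def added_bytes(older: int, newer: int) -> int:
    """Hypothetical helper: run something like
       Duplicati.CommandLine.exe compare <storage-url> <older> <newer>
    and parse the total size of added files out of the report."""
    raise NotImplementedError  # fill in for your setup

def find_big_jump(old: int, new: int, threshold: int) -> int:
    """Given added_bytes(old, new) > threshold, narrow down to the single
    version where the data first appeared. Version 0 is the most recent,
    so numerically old > new. Assumes the added data stays present in
    later versions (no big deletions in between)."""
    while old - new > 1:
        mid = (old + new) // 2
        if added_bytes(old, mid) > threshold:
            new = mid  # the big addition is in the older half
        else:
            old = mid  # it must be in the newer half
    return new  # the data first appears in backup version `new`

# e.g. find_big_jump(10, 0, 2 * 1024**3) hunts for a 2 GB jump
# somewhere between version 10 and the latest backup.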
You can also subscribe to the free Duplicati Monitoring service. Its web interface can show a nice graph of the allocated backend storage per backup version.