NTFS multiple data streams

I did a test restore of a file that has multiple data streams. I restored it to an alternate location, and it seems only the primary data stream was restored.

Is it correct that Duplicati does not currently back up alternate data streams?
Any plans to add this feature?

Yes, it is correct. Duplicati currently does not query for alternate streams, so they are not backed up.

If we add support for alternate data streams, it will add a small overhead, since each file has to be queried for its streams, but other than that there are no issues.

I have looked into adding it, but ran out of time:

It should be an easy addition if someone has the time to add it.

I was wondering how tough this would be. While it’s nice that AlphaFS has some support, Windows itself (still) does not seem to include easy tools for examining and troubleshooting ADS. This makes me concerned about how to support users whose backups develop problems that are only visible or fixable with ADS tools. Even within Duplicati, design decisions would need to be made on exactly how visible ADS is, e.g. does it get UI or is it very hidden? Even if hidden, an internal representation (e.g. in the database) and processing (e.g. backup, repair, etc.) are needed.

Possibly there’s a way to box this in enough to keep its impact limited. I’m also not familiar with typical usages. Sufficiently tiny amounts of ADS could maybe even be relegated to metadata instead of normal-class backup. Regardless, ADS requests are infrequent, so maybe such design decisions will just wait for a willing volunteer.

Yeah I’m not sure how critical this is. But if it is implemented, I don’t think it would need to be exposed in the UI at all. If you back up a file, it should back up all streams. When you restore, it restores all streams (assuming you are not redirecting the restore to a non-NTFS drive).

Duplicati essentially treats all paths as opaque strings, in the sense that it does not validate them. This means that there are no problems with storing the paths in the zip archives, the json files or the database.

So I suggest just treating ADS as separate files.

We can do this either by returning the stream paths from within the snapshot implementations, transparent to the rest of the code, or by adding a ListAlternateStreams() method to the snapshot implementations.
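To make the "ADS as separate files" idea concrete, here is a minimal sketch in Python (not Duplicati's actual C# code). It assumes a hypothetical `list_alternate_streams` callback that returns the stream names for a file, and a hypothetical `include_ads` toggle. Each stream is emitted as its own path using the `file:stream` syntax that NTFS uses to address a named stream.

```python
from typing import Callable, Iterable, Iterator

# Hypothetical callback type: given a file path, return the names of its
# alternate data streams. A real implementation would query the file system.
StreamLister = Callable[[str], Iterable[str]]


def expand_with_streams(
    paths: Iterable[str],
    list_alternate_streams: StreamLister,
    include_ads: bool = True,
) -> Iterator[str]:
    """Yield each path, followed by one 'path:stream' entry per alternate stream."""
    for path in paths:
        yield path
        if not include_ads:
            continue
        for name in list_alternate_streams(path):
            # The stream is treated as just another opaque path string.
            yield f"{path}:{name}"


# Usage with a fake lister standing in for the real file system query.
fake_streams = {r"C:\data\a.txt": ["Zone.Identifier"]}
for entry in expand_with_streams(
    [r"C:\data\a.txt", r"C:\data\b.txt"],
    lambda p: fake_streams.get(p, []),
):
    print(entry)
```

Because the rest of the code only sees extra path strings, this corresponds to the "transparent" option; a separate ListAlternateStreams() method would instead leave the expansion to the caller.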

I think it should at least have an on/off switch somewhere.

For most users, I think storing ADS is just overhead. I do not know of a case where ADS restore is super important.

It would leak a bit into the UI I think, because you would be able to see “strange paths” at least when restoring. Also, the restore needs to know a bit about ADS, as it cannot restore a stream if the file is missing.
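As a rough illustration of what "knowing a bit about ADS" could mean on restore, the sketch below (Python, with hypothetical helper names `split_stream` and `restore_order`) splits a stored `file:stream` path into its base file and stream name, then orders the restore list so that every base file comes before its streams. It assumes Windows-style paths where a colon in the last path component can only introduce a stream name.

```python
import ntpath
from typing import List, Optional, Tuple


def split_stream(path: str) -> Tuple[str, Optional[str]]:
    """Split 'C:\\dir\\file.txt:stream' into ('C:\\dir\\file.txt', 'stream').

    Returns (path, None) when the last component names no stream.
    Assumption: paths carry a drive or directory prefix; a bare 'a:stream'
    would be misread as drive 'a:'.
    """
    drive, rest = ntpath.splitdrive(path)
    last_sep = max(rest.rfind("\\"), rest.rfind("/"))
    tail = rest[last_sep + 1:]
    name, colon, stream = tail.partition(":")
    if not colon:
        return path, None
    return drive + rest[: last_sep + 1] + name, stream


def restore_order(paths: List[str]) -> List[str]:
    """Sort so each base file is restored before any of its streams."""
    def key(p: str):
        base, stream = split_stream(p)
        return (base, stream is not None, stream or "")
    return sorted(paths, key=key)


print(restore_order([r"C:\d\a.txt:Zone.Identifier", r"C:\d\a.txt"]))
```

The same split could also be used to detect a stream entry whose base file was not selected for restore, so that it can be skipped or reported rather than failing.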

I don’t think it’d have to show up in the UI if alternate data streams were just considered “attributes” of the file. And I don’t think anyone would ever want to restore just a CERTAIN stream for a particular file. This is how the other backup software I’m familiar with works: it just backs up all streams, and then restores all streams when someone restores that particular file.

But like you said, I am not even sure this is all that important. In most cases the alternate streams are not critical. But maybe some people use them in a way that is.

I don’t know much about ADS but from what I’ve read it’s more of a potential security issue than a benefit. 🙂

That being said, if we’re not backing up alternate data streams, are we at least mentioning that somewhere so users can be aware of it?