Backing up Google Drive Cloud to AWS via local Duplicati


#1

This is a feature request.

The following environment applies:

  • Duplicati, installed on a Windows machine, successfully backs up to AWS S3 buckets. The backup includes user data synced from Google Drive Cloud to the local machine via Google Backup & Sync.
  • Google Backup & Sync downloads non-Google formats as discrete files, but Google Docs themselves are solely symbolic links (in the Windows environment, they are hyperlinks which launch the default browser and open the file in the browser).

Apps like Spanning save the data from Google Drive Cloud to Spanning’s own servers.

Apps like BackupGoo convert Google Doc content to an OpenOffice format for apps, like Duplicati, to back up.

At present, Duplicati has loads of places to which to save backup data, but only backs up data on the machine on which it is installed.

Is there merit in Duplicati including functionality similar to both BackupGoo and Spanning, so that Duplicati authenticates via OAuth to the user’s Google Drive and backs the data up from Google Drive Cloud to the user’s preferred backend? (Conversion to OpenOffice format shouldn’t be necessary; syncing converted data down to the local machine solely to upload it as a backup shouldn’t be necessary.)


#2

Currently I don’t believe there are plans to support backing up other sources.

This is logical because backing up remote files causes terrible performance, since you essentially download your entire Google Drive and re-upload it. And since Duplicati uses hashing for many of its comparisons (any time it’s unsure if a file has changed), it will waste a ton of time downloading files.
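
To illustrate why hashing hurts on a remote source, here is a minimal Python sketch. It is not Duplicati's actual code: the function names `file_sha256` and `has_changed` and the order of the checks are assumptions made for illustration. The point is that size and timestamp checks are cheap, while a content hash needs every byte of the file, which on a remote mount means downloading it.

```python
import hashlib
import os


def file_sha256(path, chunk_size=1024 * 1024):
    """Hash a file's contents; every byte has to be read (or downloaded)."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def has_changed(path, known_size, known_mtime, known_hash):
    """Hypothetical change check: cheap metadata first, full hash only if unsure."""
    stat = os.stat(path)
    if stat.st_size != known_size:
        return True  # size differs: changed, no need to read the content
    if stat.st_mtime == known_mtime:
        return False  # size and timestamp match: assume unchanged
    # Metadata is inconclusive, so the whole file must be read and hashed.
    return file_sha256(path) != known_hash
```

On a local disk the last branch costs a file read; over a FUSE mount of Google Drive it would presumably cost a full download per file.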

You can get away with backing up network shares by mounting them into the filesystem. Similarly, if you get something like FUSE for Google Drive, you can actually do what you want without Duplicati knowing. But performance will be lackluster, and I honestly think time is better spent working on other parts of Duplicati :slight_smile:

Some FUSE Google Drive solutions:
GitHub - astrada/google-drive-ocamlfuse: FUSE filesystem over Google Drive
GitHub - dsoprea/GDriveFS: An innovative FUSE wrapper for Google Drive.


#3

Hi @taxedserf, welcome to the forum - and thanks for the feature suggestion!

As @Pectojin mentioned, the performance (and potential bandwidth consumption) on such a feature could be troublesome. And using the referenced tools you could even test it out to get a general idea of how well it would work.

That being said, since Duplicati is an open source project, you are free to contribute code over at GitHub. So if you really would like to see that sort of feature, you are welcome to suggest it in an “issue” over there (with much more detail on how you think it should work) or even add the code yourself. :slight_smile:


#4

I looked at those apps briefly. Below are quick impressions that @taxedserf might comment on.
I’m just a Duplicati user, and quite new to a lot of this, but this is an interesting topic to consider.

What issues (besides disk use) does one run into simply backing up Backup and Sync’s drive?
Minus conversion, BackupGoo sounds like it just gets its updates some way other than on disk.
Duplicati now uses some OAuth, but I believe it’s only to get to destinations to do simple things.

Spanning runs at Google as an app. It can probably get to Google Drive fast, and AWS less so.
Interestingly, I see that Duplicati already uses App Engine, and App Engine allows C# and .NET.

Having said that, a lot of ideas have merit, but I suspect the focus is currently to get out of Beta.
Thanks for the suggestion, and perhaps more discussion will follow.

Meanwhile I wonder if rclone with B2 versions or --backup-dir can help as some sort of backup?
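
For anyone wanting to try the rclone idea, a rough sketch of what it could look like follows. The remote names (`gdrive:`, `b2:my-bucket`) and the helper `build_rclone_sync` are made up for illustration. `--backup-dir` is a real rclone flag: files that a sync would overwrite or delete are moved into that directory instead, and it has to be on the same remote as the destination. As far as I know, rclone also exports native Google Docs to Office formats by default when reading from Drive, but check its documentation for your version.

```python
import datetime


def build_rclone_sync(source, dest, archive_root, today=None):
    """Build an rclone command that keeps replaced files in a dated folder.

    All remote names passed in are placeholders chosen by the caller.
    """
    today = today or datetime.date.today()
    # One archive folder per run date, next to (not inside) the destination.
    backup_dir = f"{archive_root}/{today.isoformat()}"
    return ["rclone", "sync", source, dest, "--backup-dir", backup_dir]
```

Calling `build_rclone_sync("gdrive:", "b2:my-bucket/current", "b2:my-bucket/archive")` gives a list that can be handed to `subprocess.run(cmd, check=True)`. This is a sync with a safety net rather than a versioned, de-duplicated backup.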


#5

Many thanks for your prompt replies.

I agree that the desktop-bound Duplicati cannot efficiently back up Google Docs as part of a backup job covering a Windows Google Drive folder (synced by Google Backup & Sync).

I’ve since discovered that a Linux user running google-drive-ocamlfuse will sync Google Drive with an instant conversion of all GDoc files to ODS/ODT. The Linux version of Duplicati would easily back those files up as part of its normal process. A Windows user uses Google Backup & Sync, which syncs only a URL to GDoc files (and not the content of the GDoc files, which would typically exceed the roughly 1 KB link file).
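
To see how many files in a synced folder are only links, a small Python sketch like the one below could list them. It rests on assumptions: that the link files use the extensions `.gdoc`, `.gsheet` and `.gslides`, and that each one is a small JSON file with a `url` key. The helper name `find_pointer_files` is made up.

```python
import json
from pathlib import Path

# Assumed extensions for Google Docs link files in a synced folder.
POINTER_EXTENSIONS = {".gdoc", ".gsheet", ".gslides"}


def find_pointer_files(sync_root):
    """Yield (path, url) for each link file found under sync_root."""
    for path in sorted(Path(sync_root).rglob("*")):
        if not path.is_file() or path.suffix.lower() not in POINTER_EXTENSIONS:
            continue
        try:
            data = json.loads(path.read_text(encoding="utf-8"))
        except (ValueError, OSError):
            data = None  # not the JSON layout assumed here
        url = data.get("url") if isinstance(data, dict) else None
        yield path, url
```

Every path this yields is a document whose content a local backup never sees.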

For a Windows user, then, the missing bit is a Duplicati agent running as a Google app, paired with an existing Duplicati desktop installation on Windows, both instances working in sync with the same backup job, backing up to the same destination. What Duplicati desktop spots as a native Google Doc file, it would instruct the Duplicati agent to convert to the required file formats (an API call to Google Drive?) and squirt the result to the Duplicati backup destination. The magic is the de-duplication process: the user’s synced Google Drive folder is functionally identical to Google Drive Cloud, except that the GDoc files contain data in Google Drive Cloud but are only symbolic links/URLs on the Google Backup & Sync desktop side.
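
On the "API call to Google Drive?" point: the Drive API v3 does have a `files.export` method that takes a file ID and a target MIME type. The sketch below only builds the request URL for an ODT export. The helper name `build_export_url` is made up, and getting an OAuth token and sending the request (with an `Authorization: Bearer ...` header) are left out.

```python
from urllib.parse import quote, urlencode

# OpenDocument Text, one of the formats a Google Doc can be exported to.
ODT_MIME = "application/vnd.oasis.opendocument.text"


def build_export_url(file_id, mime_type=ODT_MIME):
    """Return the Drive v3 files.export URL for one document."""
    query = urlencode({"mimeType": mime_type})
    safe_id = quote(file_id, safe="")  # escape the ID for use in a URL path
    return f"https://www.googleapis.com/drive/v3/files/{safe_id}/export?{query}"
```

An agent along these lines would call this once per native Google Doc and stream the response body to the backup destination; spreadsheets and slides would need their own target MIME types.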

Is this a viable strategy?

Sadly, I’m not a programmer, so I can’t contribute more technical stuff!


#6

It’s an interesting design, but more research is needed to figure out whether Duplicati could handle things like metadata (is the last modified date when the file was changed on Google Drive or when it was synced?).
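
One way to research the timestamp question would be to compare a synced file's local modification time with the `modifiedTime` field that the Drive API v3 reports for the same file (an RFC 3339 string such as `2018-05-01T11:00:00.000Z`). A minimal sketch, with made-up helper names and the API call itself left out:

```python
import os
from datetime import datetime, timezone


def parse_drive_time(value):
    """Parse a Drive modifiedTime string into an aware datetime."""
    # Older Python versions do not accept a trailing "Z", so spell out UTC.
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def local_mtime(path):
    """Local last-modified time of a synced file, as an aware UTC datetime."""
    return datetime.fromtimestamp(os.stat(path).st_mtime, tz=timezone.utc)


def sync_delay_seconds(path, drive_modified_time):
    """Seconds by which the local timestamp trails the Drive one."""
    delta = local_mtime(path) - parse_drive_time(drive_modified_time)
    return delta.total_seconds()
```

A result near zero would suggest the sync client preserves Drive's timestamp; a large positive value would suggest the local date is the sync time.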

How should restores be handled? Could they go directly into the sync folder?