Backup failing with key already exists exception on OneDrive

hosting-services

#21

Yes, I think I was thinking back to my C program days 20+ years ago.


#22

I thought it might be helpful to share my crudely hacked up version of OneDrive.List and the log file it generated.

       public IEnumerable<IFileEntry> List()
        {
            int offset = 0;
            int count = FILE_LIST_PAGE_SIZE;
            
            m_fileidCache.Clear();

            var fileName = @"c:\temp\fileList.txt";
            int fileNumber = 0;

            while(count == FILE_LIST_PAGE_SIZE)
            {
                var url = string.Format("{0}/{1}?access_token={2}&limit={3}&offset={4}", WLID_SERVER, string.Format(FOLDER_TEMPLATE, FolderID), Library.Utility.Uri.UrlEncode(m_oauth.AccessToken), FILE_LIST_PAGE_SIZE, offset);
                var res = m_oauth.GetJSONData<WLID_DataItem>(url, x => x.UserAgent = USER_AGENT);

                if (res != null && res.data != null)
                {
                    count = res.data.Length;
                    foreach(var r in res.data)
                    {
                        File.AppendAllText(fileName, r.name + "," + fileNumber++ + "\n");

                        if (m_fileidCache.ContainsKey(r.name) == false)
                        {

                            m_fileidCache.Add(r.name, r.id);

                            var fe = new FileEntry(r.name, r.size.Value, r.updated_time.Value, r.updated_time.Value);
                            fe.IsFolder = string.Equals(r.type, "folder", StringComparison.OrdinalIgnoreCase);
                            yield return fe;
                        }
                    }
                }
                else
                {
                    count = 0;
                }

                offset += count;
            }

            File.AppendAllText(fileName, "Total File records = " + fileNumber + "\n");
            File.AppendAllText(fileName, "Unique file Names = " + m_fileidCache.Count + "\n");
        }

fileList.zip (316.5 KB)

The last lines of the log show that only 11066 unique filenames are reported out of a total of 16887.

I have to believe Microsoft broke the API. Another possibility is that the back-end store is corrupted, but I think we can discount that as the MS OneDrive Android app and the OneDrive web site appear to be reporting files correctly - I can find files in the Android app that are not reported by the API.

I haven’t been able to discern any pattern to when files are repeated - it seems to be random. I also don’t believe there is any file number threshold that causes the problem as the number of files I don’t believe had increased meaningfully as the number of source files probably only increased by less than 10 in the week prior to the problem appearing - although needless to say I wasn’t paying attention so I could be wrong.


#23

Thanks patman!

I was making same experiments yesterday similar to this to see if the only problem are duplicate names. It seems that the json returned by the Api has the correct number of items but the file names repeats randomly, which means that some file names are missing from the list.

This hack may be dangerous, because if during the backup, duplicati needs to access one file that is not reported by the api, the process will fail.


#24

There appear to be about 840 people using OneDrive (not the business version) as destinations. Is this API issue something we should consider somehow highlighting in the forum?

(I’m assuming there’s no real way to pro-actively let OneDrive users know…)


#25

Yes, I should have been explicit that the hack was absolutely not a fix, I was merely trying to show how I had debugged the issue. I suspect the files that are not reported would indeed cause major issues for the backup.


#26

I think I’ve solved the problem:

My backup is on a folder called “Backup”.
I have synced the onedrive account with a windows client, then I’ve
renamed this folder to “Other” and I’ve created a new “Backup” folder.
I’ve copied aaaall the files from the “Other” folder to the new
"Backup". And last, I’ve waited for windows to finish the sync.

After doing that, the database recreation works again and I guess the backup too.

I think that last week tuesday there was same problem with some onedrive
servers, and they must have some kind of file cache that was corrupted.
Moving all files from one folder to other may have forced this cache to
refresh.


#27

Did you rerun your experiments to see of the JSON returned by the API now lists the correct number of items without repeats?


#28

Not yet. I’ll do when database creation finish.

I’ll keep you informed


#29

Hi,

New Duplicati user here. I am having this issue with my first backup - thought I would report what I am seeing in case it helps.

  • Backing up from a FreeNAS 11 jail to a 1TB Onedrive account. Using 2.0.2.1_beta_2017-08-01.
  • Total backup size around 280GB.
  • Block size set to 10MB
  • Backup throttled to 250KB/s so it was expected to run over several days. (I know… I’m in Australia. Backward internet infrastructure here).
  • Backup started 19 Oct and was running continuously and flawlessly until 30 Oct.
  • On 30 Oct 0527 UTC, backup started failing once the backup directory had reached 205GB in size with 42,081 files in it (as reported by Onedrive). The first error reported as follows. This error was repeated 3 times in the following few minutes.

Duplicati.Library.Interface.UserInformationException: Failed to authorize using the OAuth service: GetResponse timed out. If the problem persists, try generating a new authid token from: Duplicati OAuth Handler —> System.Net.WebException: GetResponse timed out —> System.Net.WebException: Aborted.
at System.Net.HttpWebRequest.EndGetResponse (System.IAsyncResult asyncResult) [0x00064] in <65d66c5eaa6a48038257422d3c294dab>:0
at Duplicati.Library.Utility.AsyncHttpRequest+AsyncWrapper.OnAsync (System.IAsyncResult r) [0x00021] in <1cb5198b00f34ae59d97ee7fe7a3a16c>:0
— End of inner exception stack trace —
at Duplicati.Library.Utility.AsyncHttpRequest+AsyncWrapper.GetResponseOrStream () [0x0004d] in <1cb5198b00f34ae59d97ee7fe7a3a16c>:0
at Duplicati.Library.Utility.AsyncHttpRequest.GetResponse () [0x00044] in <1cb5198b00f34ae59d97ee7fe7a3a16c>:0
at Duplicati.Library.JSONWebHelper.GetResponse (Duplicati.Library.Utility.AsyncHttpRequest req, System.Object requestdata) [0x000b4] in <138bf26c6c1d46ad83e0ec8ca32c67c5>:0
— End of inner exception stack trace —
at Duplicati.Library.Main.Operation.BackupHandler.HandleFilesystemEntry (Duplicati.Library.Snapshots.ISnapshotService snapshot, Duplicati.Library.Main.BackendManager backend, System.String path, System.IO.FileAttributes attributes) [0x0000e] in <118ad25945a24a3991f7b65e7a45ea1e>:0
at Duplicati.Library.Main.Operation.BackupHandler.RunMainOperation (Duplicati.Library.Snapshots.ISnapshotService snapshot, Duplicati.Library.Main.BackendManager backend) [0x0018a] in <118ad25945a24a3991f7b65e7a45ea1e>:0
at Duplicati.Library.Main.Operation.BackupHandler.Run (System.String[] sources, Duplicati.Library.Utility.IFilter filter) [0x0063c] in <118ad25945a24a3991f7b65e7a45ea1e>:0

  • On 30 Oct 1411 UTC, the following error started to be reported everyday until today within a few minutes of each scheduled backup starting:

System.ArgumentException: An item with the same key has already been added.
at Duplicati.Library.Main.BackendManager.List () [0x00038] in <118ad25945a24a3991f7b65e7a45ea1e>:0
at Duplicati.Library.Main.Operation.FilelistProcessor.RemoteListAnalysis (Duplicati.Library.Main.BackendManager backend, Duplicati.Library.Main.Options options, Duplicati.Library.Main.Database.LocalDatabase database, Duplicati.Library.Main.IBackendWriter log, System.String protectedfile) [0x0000d] in <118ad25945a24a3991f7b65e7a45ea1e>:0
at Duplicati.Library.Main.Operation.FilelistProcessor.VerifyRemoteList (Duplicati.Library.Main.BackendManager backend, Duplicati.Library.Main.Options options, Duplicati.Library.Main.Database.LocalDatabase database, Duplicati.Library.Main.IBackendWriter log, System.String protectedfile) [0x00000] in <118ad25945a24a3991f7b65e7a45ea1e>:0
at Duplicati.Library.Main.Operation.BackupHandler.PreBackupVerify (Duplicati.Library.Main.BackendManager backend, System.String protectedfile) [0x00066] in <118ad25945a24a3991f7b65e7a45ea1e>:0

  • Note also that if I attempt to do a “Test Connection” on the backup config, it will fail with the following two errors. If I change the destination path to a different (but valid) directory, then the “Test Connection” function works as expected.

System.ArgumentException: An item with the same key has already been added.
at System.ThrowHelper.ThrowArgumentException (System.ExceptionResource resource) [0x0000b] in :0
at System.Collections.Generic.Dictionary2[TKey,TValue].Insert (TKey key, TValue value, System.Boolean add) [0x0008e] in <dbb16e0bacdc4a0f87478e401bc29b6c>:0 at System.Collections.Generic.Dictionary2[TKey,TValue].Add (TKey key, TValue value) [0x00000] in :0
at Duplicati.Library.Backend.OneDrive.List () [0x000c9] in <13c3feac32714e2bad328f5eb27cf3b5>:0
at Duplicati.Library.Backend.OneDrive.Test () [0x00000] in <13c3feac32714e2bad328f5eb27cf3b5>:0
at Duplicati.Server.WebServer.RESTMethods.RemoteOperation.TestConnection (System.String url, Duplicati.Server.WebServer.RESTMethods.RequestInfo info) [0x000b7] in :0
at Duplicati.Server.WebServer.RESTMethods.RemoteOperation.POST (System.String key, Duplicati.Server.WebServer.RESTMethods.RequestInfo info) [0x00091] in :0
at Duplicati.Server.WebServer.RESTHandler.DoProcess (Duplicati.Server.WebServer.RESTMethods.RequestInfo info, System.String method, System.String module, System.String key) [0x0026e] in :0

System.ObjectDisposedException: Cannot write to a closed TextWriter.
at System.IO.StreamWriter.Flush (System.Boolean flushStream, System.Boolean flushEncoder) [0x0000b] in :0
at System.IO.StreamWriter.Flush () [0x00006] in :0
at Duplicati.Server.WebServer.RESTHandler.DoProcess (Duplicati.Server.WebServer.RESTMethods.RequestInfo info, System.String method, System.String module, System.String key) [0x003aa] in :0

  • I also tried renaming the backup directory and moving it within my Onedrive account. Neither improved the situation.
  • I attempted to copy the backup directory within Onedrive, but this failed with an error in the Onedrive console.
  • Listing the files reliably in Onedrive was impossible. Even trying to sort by date to see the latest files at the top was unreliable, with a random set of files shown each time the page was refreshed.
  • I don’t have the luxury of synchronising the whole backup directory in order to make an offline copy given our 1999-era internet infrastructure here.

It would be awesome if you guys could get to the bottom of this - I’d hate to have to re-do the entire backup again! Let me know if you’d like me to try anything differently and I’ll report back.

  • Brad

#30

Well… after more or less 12 hours recreating the backup, it fails again with same error in another point:

System.ArgumentException: An item with the same key has already been added. at Duplicati.Library.Main.AsyncDownloader+AsyncDownloaderEnumerator+AsyncDownloadedFile.get_TempFile () <0x40eade40 + 0x0002f> in :0 at Duplicati.Library.Main.Operation.RecreateDatabaseHandler.DoRun (Duplicati.Library.Main.Database.LocalDatabase dbparent, Boolean updating, IFilter filter, Duplicati.Library.Main.Operation.NumberedFilterFilelistDelegate filelistfilter, Duplicati.Library.Main.Operation.BlockVolumePostProcessor blockprocessor) <0x40e432f0 + 0x0584e> in :0 at Duplicati.Library.Main.Operation.RecreateDatabaseHandler.Run (System.String path, IFilter filter, Duplicati.Library.Main.Operation.NumberedFilterFilelistDelegate filelistfilter, Duplicati.Library.Main.Operation.BlockVolumePostProcessor blockprocessor) <0x40e40b70 + 0x000b3> in :0 at Duplicati.Library.Main.Operation.RepairHandler.RunRepairLocal (IFilter filter) <0x40e40320 + 0x00327> in :0 at Duplicati.Library.Main.Operation.RepairHandler.Run (IFilter filter) <0x40e3fe10 + 0x00083> in :0 at Duplicati.Library.Main.Controller+<>c__DisplayClass20_0.b__0 (Duplicati.Library.Main.RepairResults result) <0x40e3fc40 + 0x00073> in :0 at Duplicati.Library.Main.Controller.RunAction[T] (Duplicati.Library.Main.T result, System.String[]& paths, IFilter& filter, System.Action`1 method) <0x4118a390 + 0x00243> in :0

And now, it fails again at the beginning of the repair process.

Debugging I get that the total files listed is ok (23321), but the json have 15202 unique files and 8119 repeated.

What it seems to be true is that after moving files from directory, the list report was ok several times until it fails again.


#31

More info: I’ve hacked OneDrive.cs to use “api.onedrive.com/v1.0” instead of “apis.live.net/v5.0” API, and the result is the same duplicate items.

Also, If I request from a browser the complete api url, with an offset of, for example 2400, everytime I make the request I get different items.

Also I’ve found this link that can be related: /drive API giving inconsisent results · Issue #727 · OneDrive/onedrive-api-docs · GitHub (happens the same day as us).


#32

Thank you Brad… It seems to be the same problem, and the date matches.

I hope too that we (or MS) can solve this.


#33

So we’re still pretty sure the issue is on OneDrive’s API end right?


#34

I think that yes, but only some accounts are affected.

Microsoft recommends to use the Graph REST API to access OneDrive, but the old APIs will be online until the end of 2018.
But there is no warranty that with Graph it will work, and implement Graph is more difficult than change Live API to OneDrive API because the auth method changes (although it must be done before the end of 2018, when the old APIs dissapear).


#35

Debugging the json data retrieved by the api when we make the same request with the same offset and limit, we get a random list of files when it must be equal.

I’ve seen that the “malfunction” account fails too through the web interface. When I browse the file list in the onedrive web interface, and I scroll down, the files disappears and appears randomly (in an account that is working right the list is continue). I suppose that the website is using the Graph api, and this this must be a general failure that, soon or later, will be solved by Microsoft.

Until that, I’ve made an ugly workaround: when the application fetches the list of files, I retry the download until all the unique files are retrieved. It is a little slow (in my case, the full complete list is retrieved in the 7th or 8th iteration), but it works, I could recreate the database and continue my daily backups:

The modified DLL: duplicati/Duplicati.Library.Backend.OneDrive.dll at 126c692a9d3ced006a60a7701a76a0d66b30a6ad · osmogar/duplicati · GitHub


#36

osmogar: Thanks for figuring this out!


#37

@osmogar: The answer from MS indicates that we should change the API/URL. Do you know if the change they mention is just using the new URL ?


#38

All their old api are being moved to Graph api. From what I’ve seen, the changes are deeper than changing only the url, including the authentication: Migrating from Live SDK to Microsoft Graph - OneDrive dev center | Microsoft Docs


#39

Thanks. It seems there is a library that has all the required stuff here:


#40

Throwing in a “Me Too” here. This was another multi-day backup situation (.au isn’t the only place with crap upload speeds), and Duplicati is indicating that the backup has never completed successfully. Total backup should probably be right around 660GB, and there’s currently 420GB present.

Is there any fix on the horizon and/or a Canary build that has a fix in place? I’m running on Debian Jessie (OpenMediaVault), headless, 2.0.2.1_beta_2017-08-01.

Not sure what I would need to do with @osmogar’s workaround file, just drop it in overwriting the existing DLL and restart Duplicati?