I’m left feeling like I made a mistake in choosing Duplicati. I am trying to check my backups via the web UI to make sure I am not forgetting about a file inside a folder I accidentally deleted, but as of writing it has been stuck at “Fetching path information …” for 1 hour 39 minutes. I can only tell it’s still working by checking ‘htop’ and seeing CPU activity from Duplicati. Is there any way to speed this up?
System
Arch Linux
Samsung Chromebox 3 running coreboot
PCIe USB 3.0 card for external drives
512 GB mSATA SSD internal storage
Intel Core i5-2450M CPU
16 GB RAM
Hi @ShapeShifter499, that seems like an excessive amount of time for a simple search.
What size is the local database?
How many files (roughly) do you have in the database?
Is the hanging screen the initial one, or did you see some folders and then it crashed?
If you have a screenshot, that would help in pinpointing the bottleneck.
It eventually loads a list, but it appears to take multiple hours to sort through the files.
In the web UI, I go to home → job → operations → restore files. That brings me to the restore screen, but if I don’t remember the file exactly (name or otherwise), it can take a really long time to check each folder it might have been in, per backup version (57 versions in my current list, spanning two years).
Source: 2.83 TB
Backup: 11.23 TB / 57 Versions
One of the most recent successful backups in my logs shows this
I should add that it’s taking hours to load the initial list and every time I click on a folder in the list.
If this were a bare-metal file system and a simple file explorer, that action would take seconds, a minute at worst, on most of my computers. I feel like if all I am doing is searching for files and names, it should not take hours. Only when I actually restore a file should it take any significant time.
Thanks, that gives an idea. The numbers only mention what was processed in that backup, not the total number of files in the backup (I would need NotProcessedFiles as well, to calculate total = Examined + NotProcessed).
For your use case (finding a missing file), we do not currently have a great UI.
What you can do instead, is use the “commandline” feature of the UI:
In there you can choose “find” as the operation, leave the “Target URL” as-is, and then type in the filename you are looking for, with */ before and * after. Finally, set the option --all-versions=true to get a search across all versions.
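As a rough example (report.odt is just a placeholder for whatever filename you are looking for), the “Commandline arguments” box would contain something like:
*/report.odt*
--all-versions=true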
Yes. The expression you type in as the “commandline arguments” is a filter expression, so it can match a folder as well (technically all the files in the folder are matched):
/path/to/folder/*
If you have a terminal, you may want to skip the UI and use the real commandline interface.
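A sketch of what that could look like (duplicati-cli is the usual wrapper name on Linux packages; the storage URL and filename here are placeholders, and pointing --dbpath at the job’s local database lets it reuse the existing database instead of building one):
duplicati-cli find "file:///mnt/backups/job" "*/report.odt*" --all-versions=true --dbpath=/path/to/job-database.sqlite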
Ah, that number was 0, so the search is done over ~3 million files and folders.
Thanks, that makes it easier to set up an experiment for measuring and speeding up the query.
On the topic of slow and not optimized: I ran a version of that ‘find’ command shortly after posting, and it’s still running right now. htop shows I/O and CPU usage. I’m not sure if it’s my older hardware, but does it really need to take multiple days (possibly weeks) for this sort of ‘find’ command?
It should not, but there has been little work done to optimize this part of Duplicati. It is possible that it is buffering a huge response if there are many files in the folder.
From the perspective of a user, waiting days for an answer is not useful.
I looked briefly at the code, and it supports many complex things that just slow it down.
One thing is the use of a regex, which reverts to evaluating the filters in C#; that will be a bit slow due to the back-and-forth with the database.
Essentially, it is an SQLite database, so a query over 3 million strings should take seconds, even on older hardware.
If you are familiar with SQL, you could also query the database directly to locate the path prefix and return any filenames in that folder. Let me know if you want to go that route, and I can assist in crafting the queries.
@kenkendk I am not kidding when I say that ‘find’ command is still ongoing. I’m wondering if there’s a way to restore all files under a folder, with rename for files that differ, or whether that would actually be any faster.
@kenkendk I’m going to give up on waiting. I can make a copy of the database to work on. What should I know before working on the database? I’m just trying to get a list of files from a folder in each version, to check for any deleted files.
In the “Browse Data” tab, you can find the “File” table and see all the paths there. You can filter at the top, under the column name, and drill down to the files you need.
You can also use the “Execute SQL” area to write the queries directly.
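If you prefer the terminal, the stock sqlite3 client can run the same queries against your copy of the job database (the path below is just a placeholder for wherever you put the copy):
sqlite3 /path/to/copy-of-job-database.sqlite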
To find the names of files in a folder, use this query (replace /Users/ with your prefix):
SELECT "Path" FROM "File" WHERE "BlocksetID" > 0 AND "Path" LIKE '/Users/%';
To get the timestamps of the backups these files appear in, use a query like:
SELECT "Timestamp" FROM "Fileset" WHERE "FilesetID" IN (
SELECT DISTINCT "FilesetID" FROM "FilesetEntry" WHERE "FileID" IN (
SELECT "ID" FROM "File" WHERE "BlocksetID" > 0 AND "Path" LIKE '/Users/%'));
The timestamps you get back are in Unix epoch format and can be converted to “normal” time with an online tool.
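Alternatively, SQLite can do the conversion itself with the built-in datetime() function and its 'unixepoch' modifier; for example, this lists every backup version with a readable time:
SELECT "ID", datetime("Timestamp", 'unixepoch') AS "BackupTime" FROM "Fileset";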