Box.com max number of files per folder limitation

Hi,

Box.com apparently has a per-folder file limit of about 10,000 to 20,000 files, depending on where you look.

If you set up an hourly incremental backup, what happens when the number of files Duplicati uses reaches that limit? From what I can tell, each incremental backup adds at least three files, which means 24 * 3 = 72 files per day. At that rate, the limit would be reached within a year.
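
To put rough numbers on it, here's a quick back-of-the-envelope sketch in Python (assuming 3 remote files per backup run and the 10,000-20,000 limits mentioned above):

```python
# Rough estimate of how long hourly backups take to hit Box's per-folder limit,
# assuming each backup run adds 3 remote files (dlist + dblock + dindex).
FILES_PER_BACKUP = 3
BACKUPS_PER_DAY = 24
files_per_day = FILES_PER_BACKUP * BACKUPS_PER_DAY  # 72

for folder_limit in (10_000, 20_000):
    days = folder_limit / files_per_day
    print(f"{folder_limit:>6}-file limit reached after ~{days:.0f} days (~{days / 30:.1f} months)")
```

That works out to roughly 140 to 280 days of hourly backups, so well within a year.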

Is there a way to nest Duplicati's folder structure to mitigate this? Does anyone have a suggested solution to this problem if unlimited history and frequent backups are desired?

Thanks.

This discussion is related:

I just realized that 3 files per backup is the best-case scenario. With 50MB remote volumes, an 8TB backup would take up around 160,000 files and would run into the limitation really fast, long before completing.

Are we supposed to use 2.5-5GB remote volumes in such a scenario for now? Is that the only way?
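
For reference, here's a rough sketch of how the remote volume size maps to file count for an 8TB source. The numbers ignore compression, version growth, and the accompanying dindex and dlist files, so treat them as lower bounds:

```python
# Approximate dblock (remote volume) count for an 8 TB source at various
# remote volume sizes. Ignores compression, dindex/dlist files, and growth
# from multiple versions; uses decimal units (1 TB = 1,000,000 MB).
SOURCE_MB = 8_000_000  # 8 TB

for volume_mb in (50, 500, 2_500, 5_000):
    dblocks = SOURCE_MB / volume_mb
    print(f"{volume_mb:>5} MB volumes -> ~{dblocks:,.0f} dblock files")
```

As I understand it, each dblock normally has a matching dindex file, so the real remote file count is roughly double these figures, plus one dlist per version.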

Hello @archon810 and welcome to the forum!

Have you tried filing a ticket with Box to ask them whether the limit applies to paged requests to their API?

What seems quite clear is that their FTP server has had errors at 20,000. Duplicati uses their REST API.

Unable to skip the file on the server when there is a file of the same name

Response: 550 Box: Too many items to display. Directory contains 21294 items; limit for directory listing is 20000 items.

List of FTP server return codes

550 Requested action not taken. File unavailable (e.g., file not found, no access).

Max files per folder was an all-APIs question that (in 2016) suggested filing a ticket describing your scaling needs to get advice. There's also another 2018 comment on the 20,000 limit for FTP (and the 550 above shows that limit is enforced by their FTP server).

How many files in a folder said “there is not an enforced limit,” then went on to worry about performance as things scale up.

Which is exactly what I'm going to do here: worry about scaling. Duplicati seems (based on forum reports) to run into practical issues scaling up, well before theoretical limits are reached. Choosing sizes in Duplicati explains some of the tradeoffs.

The default deduplication --blocksize of 100KB is probably too low. At 8TB you would be tracking 80 million blocks, probably resulting in a large SQLite database, slow queries, and pain on Repair and Recreate (test those).
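
A similar sketch for the block count as a function of --blocksize (the deduplication block size, as opposed to the remote volume size), again for an 8TB source:

```python
# Approximate number of deduplication blocks the local database has to track
# for an 8 TB source at various --blocksize settings (decimal units).
SOURCE_KB = 8_000_000_000  # 8 TB in KB

for blocksize_kb in (100, 1_000, 10_000):  # 100 KB (default), 1 MB, 10 MB
    blocks = SOURCE_KB / blocksize_kb
    print(f"--blocksize={blocksize_kb:>6} KB -> ~{blocks:,.0f} blocks to track")
```

Going from the 100KB default to something in the 1-10MB range cuts the tracked block count by one to two orders of magnitude, at the cost of coarser deduplication.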

Can you split that 8TB into smaller backups? That would also avoid possibly super-slow disaster recovery.

Do you need all versions all the time, or do things grow less important eventually? A retention policy can thin versions out if that's suitable, and this removes the dlist file for each deleted version. Dblocks will eventually get empty enough that compact repacks what's left of them into new dblocks (along with leftovers from other dblocks).

The COMPACT command shows some of the available tuning options. Some are touchy at extreme levels.
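
As a very simplified illustration of one of those options: my understanding is that compact is triggered once the wasted-space percentage on the backend crosses the --threshold option (25% by default). The sketch below is an assumption about that behavior for illustration, not Duplicati's actual code:

```python
# Simplified sketch of the compact trigger, assuming compact runs when the
# fraction of wasted (no longer referenced) space in the remote dblocks
# exceeds the --threshold percentage (default 25). Illustrative only.
THRESHOLD_PERCENT = 25

def should_compact(wasted_bytes: int, total_bytes: int) -> bool:
    """Return True if wasted space exceeds the threshold percentage."""
    return total_bytes > 0 and 100 * wasted_bytes / total_bytes > THRESHOLD_PERCENT

# Example: 300 GB of a 1 TB backend is waste left behind by deleted versions.
print(should_compact(wasted_bytes=300 * 10**9, total_bytes=10**12))  # True
```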

I filed a ticket with Box to clarify whether the limit applies to the Box REST API. I suspect it does, but we’ll see what they say.

I’m now testing a 10MB block size, thanks for the suggestion here and in other posts.

I'd rather not split the backup into smaller pieces (even though that's possible, of course; just select fewer dirs), so as to keep maintenance simple. I can see managing so many individual backups quickly becoming a nightmare.

I've configured the following retention policy for now: 1W:1D,1M:1W,50Y:1M. This way, I'll still keep at least one backup per month for versions older than a year.
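
Out of curiosity, I estimated how many versions (and therefore dlist files) that policy keeps over time. This is based on my reading of the retention-policy format, where each "timeframe:interval" pair keeps roughly one version per interval within its timeframe; it's an approximation, not Duplicati's exact logic:

```python
# Rough steady-state version count for --retention-policy=1W:1D,1M:1W,50Y:1M,
# assuming each "timeframe:interval" pair keeps about one version per interval
# within its timeframe. Approximation only, not Duplicati's exact algorithm.
DAYS = {"D": 1, "W": 7, "M": 30, "Y": 365}

def parse(span: str) -> int:
    """Convert a span like '1W' or '50Y' into a number of days."""
    return int(span[:-1]) * DAYS[span[-1]]

policy = "1W:1D,1M:1W,50Y:1M"
total, previous_end = 0, 0
for rule in policy.split(","):
    timeframe, interval = (parse(part) for part in rule.split(":"))
    kept = (timeframe - previous_end) // interval  # versions kept in this window
    print(f"window up to {rule.split(':')[0]}: ~{kept} versions")
    total += kept
    previous_end = timeframe
print(f"total after 50 years: ~{total} versions (dlist files)")
```

That comes out to a few hundred dlist files even after decades, so the dblock/dindex count from the data itself remains the main contributor to the per-folder file count.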