Hello everyone, I’m from Brazil, where we have accented characters like á, ã, etc. I’m having problems with files containing these characters on our backup server, an Ubuntu Server 20.04 machine where I centralize some backups and send them to the cloud with Duplicati.
I’m currently getting 1,900 warnings in my backups due to these accent issues:
2024-10-25 01:03:19 -03 - [Warning-Duplicati.Library.Main.Operation.Backup.FileEnumerationProcess-FileAccessError]: Error reported while accessing file: /home/backup/wp-content/uploads/2011/08/Caf�-3-300x225.jpg
FileNotFoundException: Could not find file '/home/backup/wp-content/uploads/2011/08/Caf�-3-300x225.jpg'.
As you can see in the hex dump, around the bytes 66 e9 2d (66 being ‘f’ and 2d ‘-’), the ‘é’ is stored as the single byte E9, which indicates the filename is CP1252-encoded (perhaps created by an old Windows system?).
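If you want to verify that byte interpretation yourself, a quick Python check works (the filename bytes below are taken from the warning above; note E9 is ‘é’ in Latin-1 as well as CP1252, so this confirms an 8-bit encoding rather than pinpointing which one):

```python
# The byte 0xE9 from the hex dump is 'é' in CP1252 (and Latin-1),
# whereas a UTF-8 filename would encode 'é' as two bytes, 0xC3 0xA9.
raw = b"Caf\xe9-3-300x225.jpg"          # bytes as seen in the dump: 66 e9 2d ...

print(raw.decode("cp1252"))              # Café-3-300x225.jpg
print("é".encode("utf-8").hex(" "))      # c3 a9 -- what a UTF-8 name would contain
```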
You can try:
convmv -f CP1252 -t utf8 filename
This will confirm it: by default convmv performs a dry run and does not actually rename anything until you re-run it with the --notest flag.
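If convmv isn’t available, a rough Python equivalent of that dry-run/apply workflow might look like the sketch below (this is not convmv itself; fix_names is a hypothetical helper, and it assumes the broken names decode cleanly as CP1252):

```python
import os

def fix_names(root: bytes, apply: bool = False) -> list:
    """Walk root bottom-up, re-encoding non-UTF-8 file/dir names from
    CP1252 to UTF-8. With apply=False this is a dry run, similar in
    spirit to running convmv without --notest."""
    renamed = []
    for dirpath, dirnames, filenames in os.walk(root, topdown=False):
        for name in filenames + dirnames:
            try:
                name.decode("utf-8")       # already valid UTF-8: leave it alone
                continue
            except UnicodeDecodeError:
                pass
            # May raise for the few bytes undefined in CP1252 (0x81, 0x8D, ...)
            new = name.decode("cp1252").encode("utf-8")
            old_path = os.path.join(dirpath, name)
            new_path = os.path.join(dirpath, new)
            renamed.append((old_path, new_path))
            print(old_path, b"->", new_path)
            if apply:                      # only rename when explicitly asked
                os.rename(old_path, new_path)
    return renamed
```

Walking bottom-up (topdown=False) means directories are renamed only after everything inside them has been handled, so the traversal never descends into a path that no longer exists.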
How does the centralizing part work? It seems to be carrying old 8-bit-encoded names onto Linux.
Those names probably look pretty strange even on Linux. Can you convert them on the way in?
As a side note, that ls wasn’t exactly the one requested. Why was the directory different?