There are a few issues with the current result reporting:
If the backup is aborted before metadata is collected (e.g. by run-script-before), empty metadata is saved and shown on the backup overview. This looks like the entire backup was deleted, even though the versions are still there when running restore.
Another problem is that some exceptions that occur during a run are not recorded in the database log of the backup, but rather in the server log. This causes the operation to essentially disappear from the protocol, even though it should show an error run: "errors occurred" but none in log
Some exceptions seem to be handled within the backup operation and those are recorded in the database, but others are handled by the server and are reported in the server log. I think all exceptions during the backup operation should also be reported for that backup.
I find it difficult to change the result type in a way that would allow checking if the metadata is correct, because it is updated at many different places and often only partially. So even if some metadata is correct (e.g. source data size), some might be wrong (e.g. versions).
My idea is to add new result codes (in addition to success, warning, error) for abort success, abort warning and abort error to ParsedResultType. These would be set if the script calls for an abort and then the metadata could be ignored.
If exceptions thrown during an operation are also supposed to be integrated in the result, like I would prefer, they might also be able to use these aborted result codes (probably aborted error, as the operation did not finish). However, it could be the case that the exception occurs after some work is already done, so I am not sure if that could be a problem (I think it should cause a rollback normally). Also I would need to think about the interaction with partial backups.
What is your opinion on this? It feels right to use the result type for this, but I am a bit hesitant to add new result codes, because it might break stuff in the UI or in other scripts (maybe it should, because those results need to be treated differently). Otherwise it could be enough to simply add a bool flag to indicate whether the operation completed or not.
I am up to my neck in backend(s) code at the moment, so not much time has been given to thinking about your post.
re: run-script-before: from the doc, this should NOT abort the job. --run-script-before-required should be the command used to abort a job. I did not try to repro with script-before-required.
re: log into job vs server. This is a pretty fearsome enterprise as you said yourself. An important point in my opinion is that Duplicati will have to evolve to adapt to new times (driver updates). However this is pretty risky. A mitigation is to consider the database structure as frozen. This way if users have trouble with an update they can re-install the old version without hassle. This seems to me the only way to make faster updates and reliability expected from a backup software at least tolerable. So I am -1000 on any database structure change.
I believe you are looking at old documentation. Some number of years ago, --run-script-before was enhanced to handle various errorlevel values so you can choose to continue the backup or cancel it, and you could select the job result: success, warning, error.
I think it was mentioned that --run-script-before-required wasn’t really needed any more after that enhancement.
There are different exit codes 0-5 which cause different behavior (not documented in the manual, but in the example scripts), to display error or warning messages and also abort:
There is no need to change the database, it already contains the result logs (what you see when you go to show log on the backup). It would only also have entries when the job failed due to exceptions, which is more intuitive in my opinion.
I also looked in the code and results are only ever written to the database log, never parsed. This means we can add any fields or enum values we like and backwards compatibility is unaffected.
It could still be used if you have an existing script that returns a different code, e.g. exit 1 to fail which is quite common.
In Combine logs in GUI #1152 kenkendk commented on this, but didn’t detail “technical reasons” causes.
It’s definitely not user friendly, especially since the GUI offers a button to show logs, but gives wrong one.
I don’t know the code well, but I think the JSON reportings that Duplicati can do is just a variables dump, however I don’t think there’s a formal format definition, so hopefully the consumers will ignore extra stuff.
I don’t know the code well, but the database log would seem to be only data source for GUI’s log display.
A link from there would be nice. Lots of the options descriptions don’t go past quoting the code help text. Example Scripts which is linked in navigation Articles section is where the example scripts are available.
You’re doing a lot. Is there anything that anyone else can help with so, e.g., you can get in a few PRs? @Jojo-1000 and I have been looking one over that’s been lingering, trying to do some test and review.
which of those currently set metadata? I’d think Success would be the clear favorite on having values. Warning might be second choice. My impression is Error and Fatal have bailed out before finishing.
I was about to mention JsonIgnoreAttribute to keep things out of JSON, but proposal is tweak to enum whose value creation I’m not clear on, but I think I’ve heard that it’s complex, maybe hinted by Parsed.
Regardless, I think preventing any wipe-out of home page values would be a nice improvement to add, although if these were equally difficult (they’re likely not), I’d propose de-confusing the two logs instead because bad logs (and this is not the only case) just make difficult support handling even more difficult.
In addition to reliability, I think Duplicati needs to be more supportable before its user base gets too big.
As long as a result is returned from the operation, the metadata is set in the server database.
Any operation that “bails” (throws an exception) does not return a result and is logged in the server log instead (this is the second part I want to change).
If there is an exception that is caught, for example while processing a specific file, that is written to the error log in the result. When the operation completes normally, the ParsedResultType is simply determined by the logs:
An alternative to an enum code could be to use the Type column in the database (which is only used with "Result" as far as I can tell), maybe use "InterruptedResult" combined with a hidden property. But I still think we should use result codes for this because this is what they are made for.
Probably it is easier to fix the log messages than the metadata deletion. However, I am not sure what even belongs in the server log and what does not (looking at mine I see mostly connection errors to clients and updater errors that belong there).
Maybe another parameter of UserInformationException could be used to forward an error to the server if it is not specific to a backup, but I think that is probably not needed and only leads to more edge cases (operation not showing in log, clicking on message leads to wrong page, etc.).
The reason why I am proposing this together is because I think backups interrupted by unexpected exceptions should also have a different result code than backups which failed to upload a file. Maybe that is not necessary.
I was sort of speculating that as the original author wrote the command line before the GUI (and both still exist), that might explain some of what we find now. There have been other evolutions over time though.
What flag are you talking about. I’d be very surprised if command line lacks the log-file option.
My concern in this area is that I think the LogData table gets log, but how to see it using CLI?
Lots of my CLI work is by Export As Command-line, so I just use the GUI to read the job logs.
What doesn’t happen is home page statistics updates, as those go into the server’s database,
however here the server’s only clue that a true CLI ran a job is that the job database changed.
An interesting test would be to see if GUI Commandline is better about reporting job results…
After trying some changes for a bit, I came to the conclusion that it would be difficult to add more ParsedResultTypes. They all need to be forwarded from child results, so I think a flag for interrupted jobs would be easier and also break less externally.
The Fatal result is supposed to be used for operations that throw exceptions, so this can just be used as is.
I made a draft PR with my changes, it would be nice to get some feedback.
There are installers if you can’t build from source. It worked with a file permission error and different script return codes, but I didn’t do a lot of testing yet.