Improve UX when dealing with backup logs


#1

The current UI for viewing errors or warnings directly related to backups is quite lacking.

When coming from a notification, you have to click on “Show”, then click the item of the last operation to open the entire log and manually search for the Errors and Warnings fields.

Right now, the BasicResults object is serialized in a custom way and stored in the local backup database, in the Message column of the LogData table.

/*
Logged events, intended to be used when
constructing an error report or when
debugging
*/
CREATE TABLE "LogData" (
	"ID" INTEGER PRIMARY KEY,
	"OperationID" INTEGER NOT NULL,
	"Timestamp" INTEGER NOT NULL,
	"Type" TEXT NOT NULL,
	"Message" TEXT NOT NULL,
	"Exception" TEXT NULL
);

When retrieving the serialized object, the current approach is to simply dump everything as a string and show it to the user. This leads to a pretty bad user experience: the string is not easy to deserialize, so you cannot easily distinguish individual fields just by reading it, among other issues.

When serializing, there’s also a collectionlimit parameter that indicates how many members of an enumerable are serialized and stored, as the amount can be quite large on bigger backups.
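The effect of such a limit can be sketched as follows (in Python for brevity; `serialize_result` and the field names are illustrative, not Duplicati’s actual API):

```python
import json

# Illustrative sketch of a collection limit: truncate every list-valued
# field to `collection_limit` entries before serializing the result.
# Function and field names here are hypothetical, not Duplicati's API.
def serialize_result(result: dict, collection_limit: int = 50) -> str:
    limited = {
        key: value[:collection_limit] if isinstance(value, list) else value
        for key, value in result.items()
    }
    return json.dumps(limited)

# Only the first two warnings survive serialization:
serialize_result({"Warnings": ["w1", "w2", "w3"], "FilesUploaded": 10},
                 collection_limit=2)
```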

A good approach that will not require a large amount of refactoring (or database changes) is to serialize the object as JSON and store it that way. When retrieving these logs, we can then parse the JSON on the front-end and display it formatted (although if there is no filtering on the back-end to limit the bandwidth used, parsing alone does not help much).
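The retrieval side could then be as simple as parsing the stored JSON and picking out the fields of interest; a minimal sketch (in Python rather than the actual front-end code, and with assumed field names):

```python
import json

# Hypothetical sketch: once the Message column holds JSON, the UI (or a
# back-end filter) can extract just the Errors/Warnings fields instead
# of dumping the whole serialized string. Field names are assumptions.
def extract_issues(message_json: str) -> dict:
    data = json.loads(message_json)
    return {
        "Errors": data.get("Errors", []),
        "Warnings": data.get("Warnings", []),
    }

extract_issues('{"Errors": ["disk full"], "Warnings": [], "FilesUploaded": 10}')
```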

Current changes that are needed:

  1. In a notification indicating errors or warnings, “Show” should redirect the user directly to the Errors/Warnings fields of that specific run.
  2. Format the data to make it user friendly.

Possible problems:

  1. The LogData table could grow large on bigger backups; we can either limit the number of items per enumerable (the current approach), clean up older logs periodically, or both.
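Periodic cleanup could be a simple retention-window delete against the schema above; a sketch (Python/sqlite3, and it assumes Timestamp holds Unix seconds, which is an assumption about the schema):

```python
import sqlite3
import time

RETENTION_DAYS = 30  # illustrative retention window

def prune_old_logs(db_path: str) -> int:
    # Delete LogData rows older than the retention window. This assumes
    # Timestamp stores Unix seconds; adjust if the schema uses another unit.
    cutoff = int(time.time()) - RETENTION_DAYS * 86400
    with sqlite3.connect(db_path) as conn:
        cur = conn.execute("DELETE FROM LogData WHERE Timestamp < ?", (cutoff,))
        return cur.rowcount
```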

These lists can and should be extended, suggestions are welcome!


#2

If setting it to 0 means we store everything, I believe the overhead of repeatedly retrieving the entire serialized object, parsing it, and extracting more errors/warnings, instead of just sending everything to the UI, is probably not worth it.

If it means we don’t store anything, where are we going to store it so that it can be retrieved later?


#3

I was hoping for the opposite, storing none!

The current approach is to store log messages in memory, using a ring-buffer like structure, such that new messages throw out old ones.

This has the upside that we limit the disk I/O and storage space required. The downside is that the messages disappear when the server restarts.
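The ring-buffer behaviour described here maps directly onto a bounded deque; a minimal sketch (the capacity of 5 is arbitrary for the demo):

```python
from collections import deque

# A bounded deque behaves like the ring buffer described above:
# appending beyond the capacity silently drops the oldest message.
log_buffer = deque(maxlen=5)
for i in range(8):
    log_buffer.append(f"message {i}")

# The three oldest messages were discarded:
list(log_buffer)  # ['message 3', 'message 4', 'message 5', 'message 6', 'message 7']
```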

But I think we should remove the log-lines from the output, or at least limit it to ensure that we do not spam the database.

If we think the current “discard older messages” approach is worth pursuing, we could periodically flush the current set of messages to disk so they can be loaded on startup. This will ensure that we can display the messages, but does not allow us to go back in time, and may not cover all messages (if the output is spamming warnings/errors).
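The periodic flush could look roughly like this (function names and the JSON file format are assumptions, not an existing mechanism):

```python
import json
from collections import deque

BUFFER_SIZE = 100  # illustrative ring-buffer capacity

def flush_to_disk(buffer: deque, path: str) -> None:
    # Persist the current window of messages; anything already pushed
    # out of the ring buffer is gone and cannot be recovered here.
    with open(path, "w") as f:
        json.dump(list(buffer), f)

def load_on_startup(path: str) -> deque:
    # Restore the last flushed window, or start empty on first run.
    try:
        with open(path) as f:
            return deque(json.load(f), maxlen=BUFFER_SIZE)
    except FileNotFoundError:
        return deque(maxlen=BUFFER_SIZE)
```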


#4

> But I think we should remove the log-lines from the output, or at least limit it to ensure that we do not spam the database.

What are the log-lines? The Error/Warning lists?

I am a bit confused on what you are saying is kept in memory and what is stored on the backup database.

My understanding is that all Error/Warning messages are stored in the serialized format, but limited to a certain amount. The same goes for the Messages array, which seems to record backend events. They don’t usually seem to cause spam.

Also, what I see being spammed are Verbose and Profiling messages that log something for each file being processed, so those should definitely not be stored.


#5

Yes, I meant the lines that are serialized into the result object.

Yes. The error/warning messages are also stored in the result object (unlimited storage!) but only a few are serialized.

Besides this, all messages (incl. profiling if you choose that level) are stored in the ring buffer. You can see these messages in the “server logs” area of the UI.

My idea was to unify these two approaches, such that we do not store the messages in the database, only in the ring buffer (in memory).

The downside is that we can only display errors/warnings until the server restarts. I am not sure if that is too poor a workaround. We could also do a hybrid, where the UI pulls messages from the in-memory buffer but can also access messages from the database. We could have a limit of, say, 50 messages per run. That should have virtually no impact on the database and would allow us to retrieve all messages in most situations.
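The hybrid could be sketched as: serve from the in-memory buffer when it still has messages for the run, and fall back to the capped rows in the database otherwise (all names here are illustrative):

```python
from collections import deque

PER_RUN_LIMIT = 50  # the suggested per-run cap from the discussion

def messages_for_run(run_id: int, memory_buffer: deque, db_rows: dict) -> list:
    # Prefer the live in-memory buffer (holding (run_id, message) pairs);
    # fall back to the capped rows persisted for that run.
    in_memory = [msg for rid, msg in memory_buffer if rid == run_id]
    if in_memory:
        return in_memory[-PER_RUN_LIMIT:]
    return db_rows.get(run_id, [])[:PER_RUN_LIMIT]
```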

There have been reported cases where every file causes a warning message (wrong timestamps, broken attributes, etc.).


#6

Alright.

What do you think of storing everything as JSON then, limiting each array to something like 50 as you said? I would guess that any backup throwing more than 50 errors is likely repeating a bunch of identical warnings/errors across different files, so no information of practical debugging value is really lost.

Then, when accessing the log pages of a backup, we simply parse each JSON blob, and with this data it is possible to build a nice UI.

This is something we are interested in doing at my company, and instead of improving it only in our version, we could make that change in base Duplicati and then open a PR, if you like the results.