Easier to read backup reports or weekly summarization

I’ve been using CrashPlan for a number of years to back up to a mate’s computer and his computer to mine. Following the recent CrashPlan announcements, I looked for an alternative and came across Duplicati. I’ve been using it for a few months and find it to be excellent; it’s not quite as easy to set up, but once there it seems very robust.

One thing I really miss since changing to Duplicati is easy-to-read backup reports. With CrashPlan I used to get a simple weekly summary email detailing how much data was in the backup source in GB, how many files there were, and whether the backups had been successful. I do have backup emails configured, but they are not the easiest in the world to read, and the friend I back up for is not hugely computer literate. Is there any way, via the CLI or similar, to generate, say, a weekly summary email showing whether the backups have been successful for a given week?

I expect quite a few users will be moving over from CrashPlan to Duplicati, and I think this would be a really good feature or workaround to include.

1 Like

JoeH,

As another Crashplan refugee I had the same problem. I am backing up 8 different machines, so having those regular summary reports was really nice, and I missed that as I moved to Duplicati.

I have been working for the past couple of weeks on a program to download and summarize Duplicati email reports into a single Crashplan-like summary email. I still have a couple of things to work out and clean up, but if you can be patient for a while longer hopefully this will fill in the reporting gap.

HG

1 Like

Hmmm… I could have sworn there was another topic about this in the last week or so, but all I’m finding is a GitHub request for better number formatting.

In any case, the current reporting is geared more towards developers being able to identify and resolve issues, so yes - having more user-friendly reports (both daily and over longer periods) would be good.

The problem is that the primary developers are still busy working on backup bugs or improving performance, so there’s not a lot of time left for cleaning things up.

The good news is this is open source and on Github, so anybody interested in tackling something is welcome to hop on board and give it a try! I myself have been annoying people all over the #developer category while learning how to fix an annoyance in the UI. :slight_smile:

I don’t have anything super helpful to add, other than I’m currently working on some “centralized reporting” methods for Duplicati.

My first test went like this:

  • script runs via crontab
  • it checks last update time of backup destination
  • it records backup name (which I had as part of the path) and last update time of directory
  • places all results in a formatted HTML table

Instead of getting a ton of emails for every single Success or Failure (or getting no emails at all when a system isn’t even functioning), I could just open one web page and immediately see when every client last connected. If I’m seeing clients connecting daily, and I’m getting no Error emails, I can hopefully assume everything is running smoothly. It would also tell me if a client was NOT connecting - something the client’s own email reporting can’t determine, since a dead client sends no email.
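In case it helps anyone picture it, here’s roughly what that script boils down to, sketched in Python. The destination root, the one-directory-per-backup layout, and the output path are placeholders rather than a finished implementation:

```python
#!/usr/bin/env python3
"""Rough sketch: walk a backup destination and report when each job last uploaded.
Assumes each backup job writes into its own subdirectory under DEST_ROOT."""
import time
from pathlib import Path

DEST_ROOT = Path("/srv/backups")             # placeholder destination root
OUTPUT = Path("/var/www/html/status.html")   # placeholder page the cron job rewrites

rows = []
for job_dir in sorted(p for p in DEST_ROOT.iterdir() if p.is_dir()):
    # newest mtime of any file in the job's directory = last time the client uploaded
    newest = max((f.stat().st_mtime for f in job_dir.rglob("*") if f.is_file()),
                 default=None)
    last = time.strftime("%Y-%m-%d %H:%M", time.localtime(newest)) if newest else "never"
    rows.append(f"<tr><td>{job_dir.name}</td><td>{last}</td></tr>")

OUTPUT.write_text(
    "<table border='1'><tr><th>Backup</th><th>Last update</th></tr>"
    + "".join(rows)
    + "</table>"
)
```

Run that from crontab once a day (or hour) and the page always shows when each client last touched its destination.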

I’m also testing something with Duplicati’s “send-http-url” option. After a backup, Duplicati can send a report via HTTP POST, which you can then capture and parse with something like PHP. It can record data to a database and build a nice status page that could be viewed via web or sent weekly via email.
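If you’d rather not use PHP, a few lines of Python will catch the POST just as well. Here’s a minimal sketch - the port and log path are placeholders, and it simply appends each report to a file where your own parsing or database insert would go:

```python
#!/usr/bin/env python3
"""Minimal sketch of a receiver for Duplicati's --send-http-url option.
Port, log path, and storage are placeholders; swap in your own parsing/DB code."""
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.parse import parse_qs

class ReportHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        body = self.rfile.read(length).decode("utf-8", errors="replace")
        # the report text arrives in a form field called "message"
        message = parse_qs(body).get("message", [body])[0]
        with open("/var/log/duplicati-reports.log", "a") as f:  # placeholder path
            f.write(message + "\n---\n")
        self.send_response(200)
        self.end_headers()

if __name__ == "__main__":
    HTTPServer(("", 8000), ReportHandler).serve_forever()  # placeholder port
```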

BitingChaos, those sound like some pretty useful items you’re working on!

I feel like across the board I’m seeing everybody wanting “better” reporting, but the specifics seem to break into two-and-a-half groups:

  1. People who want per-backup (or per-machine) reports
  2. People who somehow got stuck maintaining all their friends’ & families’ computers and want single centralized reports (like CrashPlan provided)
  3. People who want FAILURE reporting such as “hey, that computer over there hasn’t backed up in over a week!” (again, like CrashPlan provided)

Getting #1 going in the current codebase should be doable once people have time to work on it.

Getting #2 & #3 going is much more difficult as it “requires” a centralized reporting service similar to what it sounds like you’re working on.

There is a third option, which is basically that Duplicati could start saving its log (and potentially backup settings) information with the archive history, allowing server OR client based code to review the log info and generate a notification. That way people who are using cloud services where they don’t have the ability to run code could still get weekly summaries and the like, while people who CAN run centralized code would be able to detect missed backups.
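To make that concrete, the idea is something like this purely hypothetical sketch - Duplicati doesn’t write such a status file today, so the file name, fields, and paths are invented for illustration. Each client would drop a tiny status file next to its backup files, and anything that can read the destination could flag stale backups:

```python
#!/usr/bin/env python3
"""Hypothetical sketch only: IF each client dropped a small status.json next to its
backup files, a checker (client- or server-side) could flag backups gone stale.
The file name, fields, and paths are invented for illustration."""
import json
import time
from pathlib import Path

DEST_ROOT = Path("/srv/backups")   # example destination visible to the checker
MAX_AGE_DAYS = 7

for status_file in DEST_ROOT.glob("*/status.json"):   # hypothetical per-job file
    info = json.loads(status_file.read_text())
    age_days = (time.time() - info["last_backup_unixtime"]) / 86400
    if age_days > MAX_AGE_DAYS:
        print(f"WARNING: {info['backup_name']} last backed up {age_days:.1f} days ago")
```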

2 Likes

A central monitoring service is available at https://www.duplicati-monitoring.com.
It is a test version with limited functionality, but more useful features like mail reports are planned. Right now it looks very promising and seems to work well.

Website made by crazy4chrissi, announced on GitHub.

4 Likes

Many thanks for all the feedback and comments on my query thus far. I’ve just set myself up on www.duplicati-monitoring.com, so fingers crossed :grinning:

1 Like

Well, crap, I didn’t know there was already a monitoring service out there.

Using the “send-http-url” feature, I have Duplicati “report” to a server after every backup. The server parses the “message” POST sent by Duplicati and saves the data and a timestamp as a MySQL record.

A cron job then queries the MySQL database daily and builds HTML output that can be viewed on a server or sent as email.
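The daily job itself is nothing fancy. Stripped down to a Python sketch it looks roughly like this - the table and column names here are illustrative, not necessarily what I ended up with:

```python
#!/usr/bin/env python3
"""Rough sketch of the daily job: pull the newest report per backup from MySQL and
emit an HTML status table. Schema (reports: name, received_at, message) is illustrative."""
import pymysql  # or mysql.connector - whichever MySQL driver you have installed

conn = pymysql.connect(host="localhost", user="duplicati",
                       password="secret", database="backups")
with conn.cursor() as cur:
    # newest row per backup name
    cur.execute("""
        SELECT r.name, r.received_at, r.message
        FROM reports r
        JOIN (SELECT name, MAX(received_at) AS latest
              FROM reports GROUP BY name) m
          ON r.name = m.name AND r.received_at = m.latest
        ORDER BY r.name
    """)
    rows = cur.fetchall()
conn.close()

html = ["<table border='1'><tr><th>Backup</th><th>Last report</th><th>Result</th></tr>"]
for name, received_at, message in rows:
    # the report text includes a "ParsedResult:" line we can key off
    result = "Success" if "ParsedResult: Success" in message else "Check log"
    html.append(f"<tr><td>{name}</td><td>{received_at}</td><td>{result}</td></tr>")
html.append("</table>")

print("".join(html))  # redirect into the web root, or pipe into mail for the weekly email
```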

1 Like

@BitingChaos would you be willing to share your code so I can try it myself?

I’m working on getting it cleaned up and straightened out. Part of it is on a local server at home, and another part is on a local server at my place of work. I haven’t had a lot of time to work on it, so each location just has some parts working!
At home I have the script recording to the MySQL database, and at work (where I did the most recent screenshot) I have the script reading from text files with the backup report output (what you see as part of %result% in the email reports).
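In case it’s useful, the text-file side is mostly just splitting the %result% report lines into key/value pairs, roughly like this:

```python
def parse_result(text):
    """Rough sketch: split a Duplicati %result% report into key/value pairs.
    Lines look like "ExaminedFiles: 12345"; anything without a colon is ignored."""
    stats = {}
    for line in text.splitlines():
        if ":" in line:
            key, _, value = line.partition(":")
            stats[key.strip()] = value.strip()
    return stats
```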

1 Like

www.duplicati-monitoring.com is interesting, but it would be nice to have a self-hosted solution as well :slight_smile:

1 Like

Hi all,

Being in a similar situation and missing my daily CrashPlan summary email, I have recently put together dupReport, a utility for gathering email reports from Duplicati and creating a summary email similar to the one CrashPlan used to provide.

The description and code can be found here.

It’s only one step removed from a prototype and it hasn’t been tested anywhere but on my personal system, but I’m keen for people to take a look, try it out on their own systems (Linux for now, other OSes in the future), and let me know what they think. If there is sufficient interest I will continue developing it into something more stable and portable.

I haven’t set up a GitHub project yet, so please comment here. If it gets too unwieldy we can move the discussion to a separate thread.

Enjoy!

1 Like

The linked-to stuff looks pretty good!

I’ll try to poke around with it in the coming days and see what I can make of it.

Before I spend too much time on it, do you know if it will work on the Windows 10 Linux Subsystem?

I have only been able to test on a vanilla Debian 8 system, so that’s all I can say will definitely work. The main code is all bash scripting & awk, so you should have no problem there. Getmail, ssmtp, and sqlite will be your dependency issues, if anything. If you can get those running you should be OK.

Hey guys!
Cool that you already found our monitoring service: https://www.duplicati-monitoring.com/
We are continuously improving it. Just go ahead and try it out, and let me know what you miss most. Nice, readable e-mail reports are definitely at the top of our to-do list.

Nice to see that @handyguy has already published a self-hosted solution. We love Duplicati and think that its main disadvantage is that there is/was no central monitoring. It’s nice that multiple options are now emerging for users to solve this.

Greetings,
Chris

4 Likes

Hi folks,

I’ve spent the past couple of weeks completely rewriting dupReport to be better/stronger/faster. Here are some of the highlights:

  • Rewritten entirely in Python into a single self-contained executable. No more bash/awk madness and no more need to install supporting programs like getmail and ssmtp

  • Based on that, it should be able to run on OSs other than Linux (still testing that)

  • Built-in support for multiple email transports (IMAP/POP3/SMTP) (still testing that also)

  • Automatically discovers source/destination pairs for reporting. No more need to manually specify them in the .rc file (rough sketch of the idea below).

  • Lots of other configuration and reporting options
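To give a feel for the source/destination discovery, here’s a stripped-down sketch of the idea. The IMAP server and credentials are placeholders, and the subject format and Source-Destination job-naming convention shown here are simplified for illustration rather than exactly what dupReport requires:

```python
#!/usr/bin/env python3
"""Stripped-down sketch of the auto-discovery idea: scan report email subjects and
collect source/destination pairs. Server, credentials, and the subject/naming
convention are placeholders for illustration."""
import email
import imaplib
import re

PATTERN = re.compile(r"Duplicati Backup report for (\S+)-(\S+)")  # e.g. "Laptop-B2"

imap = imaplib.IMAP4_SSL("imap.example.com")        # placeholder server
imap.login("reports@example.com", "app-password")   # placeholder credentials
imap.select("INBOX")

pairs = set()
_, data = imap.search(None, "ALL")
for num in data[0].split():
    _, msg_data = imap.fetch(num, "(BODY.PEEK[HEADER.FIELDS (SUBJECT)])")
    subject = email.message_from_bytes(msg_data[0][1]).get("Subject", "")
    match = PATTERN.search(subject)
    if match:
        pairs.add((match.group(1), match.group(2)))  # (source, destination)

imap.logout()
print("Discovered source/destination pairs:", sorted(pairs))
```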

Here’s a sample report output. It’s still a bit messy because (1) it’s running against a lot of mail for testing, and (2) I’m still working on the formatting:

I’ll need another week or two to let it burn in, clean up the code, and complete testing. As soon as it’s ready for public consumption I’ll update the web site and post an announcement here. Hopefully people will find it useful.

5 Likes

That looks awesome!

Based on the 9/21 B2 row having fewer files but a 0 in the Deleted column, is it safe to assume Deleted refers to historical versions cleaned out of the archive files?

Begin greedy mode… :slight_smile:

Is it possible for the report to include links, such as to the web GUI command line for a specific job with parameters pre-populated, to generate a file list for that specific job?

I realize it would only work if clicked while on the machine from which the backup ran, but it might still be useful…

1 Like

That looks awesome!

Thanks!

Based on the 9/21 B2 row having fewer files but a 0 in the Deleted column, is it safe to assume Deleted refers to historical versions cleaned out of the archive files?

Interesting question. The “Deleted” column comes directly from the “Deleted:” line in the report email. Your assumption sounds correct, but I’ll need to do some more research to verify. The “+/-” column shows the actual calculated difference (size and file count) between the previous run and the current run. The fact that the “+/-” and “Deleted” columns are different would indicate that “Deleted” refers to a different type of calculation. Perhaps kenkendk can clarify what that means.
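For reference, the “+/-” column is literally just a subtraction between consecutive runs for the same source/destination pair, along these lines (the field names here are illustrative):

```python
# Illustrative sketch of the "+/-" calculation: subtract the previous run's
# examined-file count and size from the current run's values.
def plus_minus(previous, current):
    return {
        "files": current["examined_files"] - previous["examined_files"],
        "size": current["size_of_examined_files"] - previous["size_of_examined_files"],
    }
```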

Is it possible for the report to include links, such as to the web GUI command line for a specific job with parameters pre-populated, to generate a file list for that specific job?

Not greedy at all! I’m looking for feedback and suggestions for future updates. I’m currently trying to finish up the basic functionality and get the program “released.” Once that is done I can turn to wish-list items. I already have a few, but I will add yours to the list and see how it might be done.

Thanks for the feedback! :grinning:

Doesn’t it make more sense to analyse the LogData table in the database? It has all the data you need, and this way you don’t need to check your email client. The data in LogData looks exactly the same as in the email report.

I quickly checked, and it seems LogData only keeps the last 30 days by default, but there is an option to extend this, so you could go back further.
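For example, something along these lines pulls recent entries straight out of a job’s local database. The path is a placeholder, and the exact columns and timestamp format may differ between Duplicati versions, so treat it as a sketch:

```python
#!/usr/bin/env python3
"""Sketch: read recent entries from the LogData table of a Duplicati job database.
Database path is a placeholder; column names/timestamp format may vary by version."""
import sqlite3
import time

DB_PATH = "/home/user/.config/Duplicati/ABCDEFGHIJ.sqlite"  # placeholder job database
SINCE = time.time() - 7 * 86400                              # last 7 days

conn = sqlite3.connect(DB_PATH)
cur = conn.execute(
    "SELECT Timestamp, Message FROM LogData WHERE Timestamp >= ? ORDER BY Timestamp",
    (SINCE,),
)
for ts, message in cur:
    # Timestamp appears to be stored as Unix seconds; adjust if your version differs
    print(time.strftime("%Y-%m-%d %H:%M", time.localtime(ts)), message[:120])
conn.close()
```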

Doesn’t it make more sense to analyse the LogData table in the database?

Unfortunately, in my setup I have 9 different systems in multiple locations backing up to 2 back-end storage locations (one local, one cloud). “The database” in this case is distributed across all those systems. The only thing they have in common (and the only way to easily correlate all their data) is to parse through the result emails.

If all your backups are controlled from one system, I agree that searching through the database is probably a lot easier. I wasn’t so lucky.

1 Like