Announcing dupReport - A Duplicati Email Report Summary Generator

Hey folks, version 2.2.8 is now in the pre_prod branch on GitHub. Really excited about this release. In addition to bug fixes, I finally broke down and allowed the use of the read/unread flag on IMAP to determine which emails to scan. When used properly, this can dramatically reduce the time needed to scan the emails and run the report. More info in the changelog and readme files. Check it out and let me know if you find any problems.

HG

I’ll try it tonight and let you know if I hit any issues

Can report no breakage with the new version

Having heard no objections, 2.2.8 has been released to the wild. Enjoy, everyone!

HG

Please forgive the painfully newbie question and reflections, but I had a tough time getting off the ground with Gmail.

So having finally gotten basic functionality, does someone have a tutorial on how to specifically integrate dupReport.py with the Windows 10 Scheduler? Tried it over the years for other things and had no luck - works manually but doesn’t auto trigger - stuff like that.

Other cautionary tales:

  • 1a. When you load Python (quick step 1.) on Windows (10 1903 in this case), check ON the box to add Python to the PATH (off by default). For what peculiar reason I don’t know, Python loads itself many folders deep in the User section, not under Program Files where one would expect executable files. BTW, the Python version included in Libre Office doesn’t work for this; don’t bother with it (found while trying to find where in **** they put it).
  • 1b. Start a new PowerShell session after loading Python so the above PATH will be in play.

I’d gotten Duplicati to send the individual backup reports to the Gmail account, so I knew I could at least access the account. After the initial setup, I got a raft of authentication and timeout errors. I was sure I wasn’t setting up the .rc file correctly, because I didn’t understand what to enter.

  • In particular, what should inserver = look like? The help file says # DNS name of email server with Duplicati emails, but I don’t know what that means. I don’t know what Google’s DNS name for Gmail is, and could it change at any time?

Tried various things, but what wound up working was smtp.gmail.com - this may be obvious to sysadmins who work with email systems, but unfortunately not to me.

inserver = smtp.gmail.com

Got the hint from the Duplicati settings for

--send-mail-url=smtp://smtp.gmail.com:587/?starttls=when-available

Along the way, I figured out not to use the full descriptive form of the email account.
Unlike Duplicati, which needed it,

--send-mail-from="My Name <myname @gmail.com>"

the dupReport.rc file can only have the simple version:

outaccount = myname @gmail.com

Somewhere, in the copious good help folks have posted, including @Marc_Aronson & others, there was a note in the Duplicati email setup articles that any string with blank characters in the string should be surrounded by " marks. This seems to work well enough but the side effect is that if you do this for the

--send-mail-subject="Duplicati %PARSEDRESULT%, %OPERATIONNAME% report for %backup-name%_hostname"

dR won’t parse and you get a blank report.
[Note I changed the source_dest delimiter to an underscore (srcdestdelimiter = _ ), so don’t copy this verbatim if you aren’t doing that.]
So either change to this with the leading " mark:

subjectregex = ^"Duplicati ([\w ]*, |)Backup report for

or do it without the " marks and then this:

--send-mail-subject=Duplicati %PARSEDRESULT%, %OPERATIONNAME% report for %backup-name%_hostname

subjectregex = ^Duplicati ([\w ]*, |)Backup report for

In other words, Duplicati doesn’t need quotes around a --send-mail-subject value that contains spaces. I didn’t test other strings.
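To see why the leading " mark matters, here is a minimal sketch using Python's re module. It assumes dR applies subjectregex from the start of the subject line, roughly the way re.search would with a ^ anchor; I haven't checked how dR actually calls it. The sample subject line and the helper name matches() are made up for illustration.

```python
import re

# The two subjectregex variants from the post above.
REGEX_PLAIN = r'^Duplicati ([\w ]*, |)Backup report for'
REGEX_QUOTED = r'^"Duplicati ([\w ]*, |)Backup report for'

# Hypothetical subject lines, as Duplicati might send them
# without and with literal quote marks around the subject.
subject_plain = 'Duplicati Success, Backup report for mysource_hostname'
subject_quoted = '"' + subject_plain + '"'


def matches(pattern: str, subject: str) -> bool:
    """Return True if the pattern matches at the start of the subject."""
    return re.search(pattern, subject) is not None


if __name__ == "__main__":
    for rname, pattern in (("plain", REGEX_PLAIN), ("quoted", REGEX_QUOTED)):
        for sname, subject in (("plain", subject_plain), ("quoted", subject_quoted)):
            print(f"{rname} regex vs {sname} subject: {matches(pattern, subject)}")
```

Each regex only matches the subject style it was written for, which fits the blank report seen when the quotes and the regex disagree.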

Lots of good information here, @Geek4eye. For the Windows Scheduler, I used to do this a while back during the original dupReport testing. The test system has long since been re-commissioned so I can’t check for sure, but I recall the secret to using the Windows scheduler with dupReport successfully was to include the full path on all the executables. For example:

C:\path\to\python\python.exe c:\path\to\dupReport\dupReport.py <options>

I didn’t use PowerShell during the testing so I’m not sure how that would interact with the environment. You should be able to run Python/dupReport directly from the Scheduler’s command line without needing additional environment settings.

Good notes/comments on the dupReport options documentation. I will look at enhancing the descriptions in future releases.

HG

dupReport version 2.2.9 has been quietly released into the wild. Nothing really new here except a fix for a weird parsing bug which only I seem to get and some docs updates. Unless you’re especially retentive about keeping up with the latest version you can probably skip this one.

HG

dupReport 2.2.10 has been released. This one finally fixes the bug that 2.2.9 was supposed to fix. I think I got it this time. :wink:

Unless you (like me) insist on always having the latest version of any software you are running, or you’re experiencing the problem described in Issue #126, this update is not required.

Enjoy!

HG

Looks nice! Congrats, will definitely give it a go.

Sorry for the time dilation, but life intervened. It turns out that adding the dupReport.py to the Windows Task Scheduler was easy enough. There is a nice tutorial on windowscentral.com that makes it easy to follow.
The “Basic Task” was sufficient and it didn’t need any exotic permissions or anything; ran in user space.
I did use the full path-names as you suggested, so set python as the program, in my case:
C:\Users\Me\AppData\Local\Programs\Python\Python37\python.exe
and Q\DupReport-master\dupReport.py as the Optional Argument
and its folder Q\DupReport-master as the Starting in.

Another newbie challenge was getting the source & destination names to print right. After rattling around awhile I found your post on regex in @Marc_Aronson’s excellent Duplicati to gmail article with the link to the https://pythex.org/ tester, which also has a cheat sheet. For me \S* was the ticket - all chars except whitespace. It wouldn’t surprise me if certain characters that this would permit would have special meaning to Python and cause some disruption and thus may not be generally advisable, but it worked for me using dashes and periods with underscore as the delimiter. In the Duplicati Settings I wound up putting the hostname before the %backup-name% in --send-mail-subject. This made it more consistent and looks good in the report.

While I can see the value of the Duplicati version in the report, it literally is of least importance to me relative to the other values in the report. I decided not to remove it but to move it to the far right. Turns out it wasn’t that hard; I had to relocate dupversion to the end of the lists:
in report.py the order of the fields in rptColumns, and likewise in rpt_bysource.py the order of sqlStmt, the for loop, titles and fields. [I’d guess the other styles are similar]. It would be nice if one could accomplish this in dupReport.rc instead, say as an ordered list (as in rptColumns) or some sort of table. Biggest challenge there might be error handling for folks like me with elbows and cat enabled keyboards…

So all in all I’m fairly close to operational, except for one thing: I’d like to have a weekly report always, and daily report only if there are unsuccessful backups. The challenge here is that doing daily reports via -c with warnoncollect=true accomplishes the latter, but then the db considers them seen. So without intervention, the weekly would be all No new activity. This is handled by the -b switch that reverts to a given date/time, but in the context of the Windows Task Scheduler I’m not sure how to approach it, probably some scripting that I’m not good at. What would be wonderful would be a switch that was relative to now rather than absolute date and time like -b. So, for example -R with values of nD, nH, (or even) nM for some number of days, hours or minutes to revert; or just dd:hh:mm for that matter… Then the only thing I’d have to do with the Scheduler is add to the argument field -c for the daily and -R 7D for the weekly tasks.
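Until such a switch exists, a small wrapper could do the arithmetic and hand the result to -b. This is only a sketch of the idea: the -R syntax (nD, nH, nM) is the hypothetical one proposed above, the function name rollback_timestamp() is made up, and the MM/DD/YYYY HH:MM:SS output format is an assumption that has to match whatever date and time format dupReport is configured to expect.

```python
import re
from datetime import datetime, timedelta

# Map the proposed unit letters to timedelta keyword arguments.
_UNITS = {"D": "days", "H": "hours", "M": "minutes"}


def rollback_timestamp(spec: str, now: datetime,
                       fmt: str = "%m/%d/%Y %H:%M:%S") -> str:
    """Turn a relative spec such as '7D' into an absolute date/time string."""
    m = re.fullmatch(r"(\d+)([DHM])", spec.strip().upper())
    if m is None:
        raise ValueError(f"unrecognized relative spec: {spec!r}")
    amount, unit = int(m.group(1)), m.group(2)
    delta = timedelta(**{_UNITS[unit]: amount})
    return (now - delta).strftime(fmt)


if __name__ == "__main__":
    # Prints the date/time one week before now, in the assumed format.
    print(rollback_timestamp("7D", datetime.now()))
```

The printed string is what a scheduled script would pass as the value of -b.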

Thanks for all your work on this - the more I get into it the more astounding it is…

Just set up two tasks in Windows: one with the -c option (and warnoncollect=true) that runs daily, and a second with the -t option (Report Only) that runs weekly.

Just make sure your “report only” task runs after the collect or you will miss a day.
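For anyone who prefers the command line to the Task Scheduler GUI, the same two tasks could be sketched with schtasks. This is an untested outline, not a recipe: the task names, start times, Saturday as the weekly day, and the placeholder paths are all assumptions, and the helper schtasks_args() is made up. It only builds and prints the argument lists; warnoncollect=true still has to be set in the .rc file.

```python
from pathlib import PureWindowsPath
from typing import List, Optional


def schtasks_args(task_name: str, python_exe: str, script: str, option: str,
                  schedule: str, start_time: str,
                  day: Optional[str] = None) -> List[str]:
    """Build a 'schtasks /Create' argument list that uses full paths."""
    for p in (python_exe, script):
        if not PureWindowsPath(p).is_absolute():
            raise ValueError(f"not a full path: {p}")
    # The command the task runs: python.exe, then dupReport.py and its option.
    run = f'"{python_exe}" "{script}" {option}'
    args = ["schtasks", "/Create", "/TN", task_name, "/TR", run,
            "/SC", schedule, "/ST", start_time]
    if day is not None:
        args += ["/D", day]
    return args


if __name__ == "__main__":
    # Placeholder paths; substitute your own.
    py = r"C:\path\to\python\python.exe"
    dr = r"C:\path\to\dupReport\dupReport.py"
    # Daily collect first, weekly report-only a little later.
    print(schtasks_args("dupReport collect", py, dr, "-c", "DAILY", "06:00"))
    print(schtasks_args("dupReport report", py, dr, "-t", "WEEKLY", "06:30", "SAT"))
```

On Windows each list could be handed to subprocess.run to create the task. The weekly start time is set after the daily one so the report-only run follows the collect.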

That worked! Sorry I didn’t grok the function of -t. To simulate where it would normally be I had to manually run dupReport.py in powershell with the -B argument to set the DB back to last week. So after that running the two tasks in order, -c daily then -t weekly performed as advertised. I’ll find out next Saturday if it is fully automagic, or earlier if something burps (like those pesky “file not processed” warnings). Thanks!!!

@Geek4eye,

Thanks for the nice words, I’m glad you like the program.

There are a bunch of things I’d like to update in the program, the top of the list being the reporting engine. It’s kinda kludgy and the code has gotten bloated and redundant. My ideal is to get it to work through a simple .rc file configuration (like you proposed) to make it much more customizable by the end user. Unfortunately, every time I start thinking about how to make it easier it just gets more complicated, I get frustrated, and give up. :frowning:

There’s other things I’d also like to update, including OAuth support, streamlining/clarifying the option functionality, and lots of code optimization, so it looks like I may have to start working on Version 3.0 at some point soon. I’ll add your suggestions to the update list (like relative database rollbacks - brilliant!).

It looks like @dcurrey set you on the right path toward your scheduling issue, so hopefully you’re good to go there. Let me know if you have any other problems or questions.

HG

Discussion Topic: Should Source-Destination Pairs still be a thing in dupReport?

Hi folks. I’ve been kicking around the idea of a dupReport upgrade/rewrite (version 3.0.0?) and one of the things I’ve been considering is dropping the requirement that backup jobs be named in the <Source>-<Destination> format. I’m not entirely sure yet how that would work, but I thought I’d poll the users and get their feedback on the following questions:

  • Does the S-D naming format cause problems for you?
  • Would your backup jobs (and subsequent reporting) be better if dupReport didn’t require S-D job naming?
  • Do you have any suggestions on what the output reports might look like if they weren’t ordered/sorted by S-D pairs?
  • Any other comments you’d like to share on the concept?

So as not to clog up this thread, please head over to the issue opened up on GitHub to discuss this topic - Issue #132. I’m really interested in hearing everyone’s thoughts on the topic before investing what will be a lot of time unraveling the S-D format in dupReport.

Thanks in advance,

HG

Thank you for the interesting idea; I will reply on GitHub. But regarding version 3.0, have you considered some kind of visualization/graphing? There are many libraries for Python, for example Altair (see the Example Gallery in the Altair 4.0.0 documentation).
Having some kind of generated images in the report would certainly be very useful :slight_smile:

@mr-flibble, I opened up a new issue for this on GitHub. Would you be willing to go there to add some more explanation on what type of graphics might be helpful? Is there anything you’ve seen elsewhere that might be a good fit?

Version 2.2.11 has been released. This fixes an obscure SMTP connection bug. Thanks to @g1bs0nsg for helping to fix this one.

Hi all,

I have spent the last several weeks working on dupReport version 3.0.0. Lots of exciting changes coming to this version and most of the base code is being rewritten. I’m looking for some brave souls to act as early beta testers of the new code. Before you jump in, here’s what you should know:

  • The code is not yet complete. Up to now, I’ve only managed to rewrite the reporting engine and .rc/.db file conversion routines. But since the reporting engine is brand new I really want to shake out this code as much as possible.
  • The code is fairly stable, but I wouldn’t use it as your production system just yet. I’ve been using it for a couple of weeks and it’s working OK for me, but count on at least a few crashes during this testing as you uncover issues.

If I haven’t scared you off yet and you’re willing to help bullet-proof this code, please head over to the dupReport 3.0.0 Beta Testing Thread on GitHub to read more and download the code. If you plan on doing some testing please add a comment to that discussion thread so I can keep track of who is helping out.

Also, any comments, bug reports, or discussion should happen on that GitHub thread. Please do not post comments to the Duplicati Forum (to keep this clear for Duplicati issues).

Thanks in advance,

HG

Hi All,

I’m happy to announce that dupReport version 3.0 is now available in Beta on the pre_prod branch in GitHub. This new version has many new and exciting features, including:

  • The reporting engine has been re-built to allow for user-defined reporting formats (see the new reporting documentation for more details).
  • Added the ability to specify multiple inbound (IMAP/POP3) and outbound (SMTP) email servers.
  • Standardized log format for easier searching and organization.
  • Added ability to send output to syslog server or log aggregator.
  • Added a Guided Setup for new users. If there is no .rc file when the program runs, the Guided Setup will take the user through the most common configurable options.
  • Can now send output to a JSON file.
  • Added ability to roll back (-b and -B) to a relative time (e.g., 1w,3h) instead of an absolute date and time (e.g., “04/11/2020 8:00:00”).
  • Re-structured the documentation for better readability and made it easier to find specific settings.

Please download and check out the new code. The new version will (should) automatically convert your .db and .rc files to the new format, but as always I suggest you back up your .db and .rc files to a safe place before proceeding.

Bug reports, suggestions, and complaints can be put in the Issue Section of the GitHub site.

Thanks,

HG

Hi All,

dupReport 3.0 has been released to the master branch on GitHub. Please see my previous post and the changelog file for a list of the new features, changes, and fixes. Bug reports, suggestions, and complaints can be put in the Issue Section of the GitHub site.

Enjoy!

HG