Announcing dupReport - A Duplicati Email Report Summary Generator


#42

I opened a new issue on GitHub for this (https://github.com/HandyGuySoftware/dupReport/issues/64). We can track progress there so as not to clog up this thread.

I made a couple of changes and added some debug statements to see if we can figure this out. Changes have been pushed to the Beta branch. If you can download the latest beta version and try again that will give us some more information.


#43

Feature request for DupReport. A summary of the summary. LOL.

I love how it compiles all the backup runs and generates the table. It would be helpful in troubleshooting. What would really help me more is a simple table at the top that lists all devices from the report and has a single second column that says Days since last Successful Backup. An added feature for this table would be color coding: anything less than 3 days is green, between 3 and 7 days is highlighted yellow, and more than 10 days is highlighted red. It appears the information is already in the database to do this, but I am not a fluent programmer. The 3, 7, 10 day settings could be configured in the RC file for flexibility. I like the report as is because it compiles all the info nicely, but with several systems it is not the nicest to read. A simple summary report at the top with days since last successful backup would, I think, be helpful to many.
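A minimal sketch of the color-coding rule described above, in Python (the language dupReport is written in). The names `status_color` and `days_since_backup` are hypothetical and not part of dupReport. The thresholds follow the 3, 7, and 10 day values in the request; since the request does not say what should happen between 7 and 10 days, that range is left uncolored here.

```python
from datetime import date
from typing import Optional


def days_since_backup(last_success: date, today: date) -> int:
    """Whole days between the last successful backup and today."""
    return (today - last_success).days


def status_color(days: int, green_below: int = 3,
                 yellow_max: int = 7, red_above: int = 10) -> Optional[str]:
    """Map a day count to a highlight color (hypothetical helper).

    Returns None for the 8-10 day range, which the request leaves unspecified.
    """
    if days < green_below:
        return "green"
    if days <= yellow_max:
        return "yellow"
    if days > red_above:
        return "red"
    return None


# Example: a device last backed up five days ago is highlighted yellow.
age = days_since_backup(date(2018, 1, 31), date(2018, 2, 5))
print(age, status_color(age))
```

Keeping the thresholds as parameters mirrors the suggestion that they be configurable rather than hard-coded.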


#44

Interesting idea. I’ll add it to the 2.2 feature list and let you know if it’s doable.


#45

I’m happy to announce that dupReport version 2.1.0 has been officially released into the wild. A brief list of the new features includes:

  • Added new reports to organize backup jobs by source, destination, and run date
  • Easy modification of date/time formats for international use
  • Date/time format can be specified per src/dest pair, if jobs are running in different locales
  • Dates can now be displayed in 12- or 24-hour format
  • Report can now use keyword substitution for subheading titles
  • Reports can now be sent to one or more files in HTML, txt, or csv formats
  • Column headings can be customized
  • User can select which columns to display in a report
  • UTC offset information from email header now applied to backup endDate and startDate fields
  • Database & .rc file upgrades now handled automatically
  • Ability to roll back database to a specific time/date. Useful for failed runs or testing.
  • Ability to remove a source/destination pair from the database if it is no longer in use
  • Send separate warning email when a backupset has not been seen in a certain number of days
  • Added “Friendly Name” support to outgoing emails
  • Send report and warning emails to multiple recipients

See the changelog file for more details on new and changed features. The updated and expanded readme.md file has complete documentation of features and options.

The program can be downloaded from the master branch of the dupReport GitHub repository.

Bug reports and new feature requests can be made on the GitHub issues list page.

Enjoy!


#46

I’ve been playing with this tool for the last hour, and I have to be honest. I don’t get it.

  • The readme is big on technical details, but sparse on getting started for the first time. The key paragraph that describes getting started simply says: “Once the files are created the program will exit. You will then need to edit the dupReport.rc file with the appropriate entries to point to your database and log files as well as providing the locations and credentials for your email servers. More information on the .rc file configuration can be found below under “RC File Configuration.””

I’m guessing the database and log files are dupReport DB and log files, and not Duplicati log files?

  • I tried running a non-email version using --noemail -f test,txt, but got this error:
./dupReport.py --nomail -f test,txt
Traceback (most recent call last):
  File "./dupReport.py", line 193, in <module>
    retVal = globs.inServer.connect(globs.opts['intransport'], globs.opts['inserver'], globs.opts['inport'], globs.opts['inaccount'], globs.opts['inpassword'], globs.opts['inencryption'])
  File "/opt/dupReport/dremail.py", line 115, in connect
    self.server = imaplib.IMAP4_SSL(self.address,self.port)
  File "/usr/lib/python3.6/imaplib.py", line 1283, in __init__
    IMAP4.__init__(self, host, port)
  File "/usr/lib/python3.6/imaplib.py", line 197, in __init__
    self.open(host, port)
  File "/usr/lib/python3.6/imaplib.py", line 1296, in open
    IMAP4.open(self, host, port)
  File "/usr/lib/python3.6/imaplib.py", line 294, in open
    self.sock = self._create_socket()
  File "/usr/lib/python3.6/imaplib.py", line 1286, in _create_socket
    sock = IMAP4._create_socket(self)
  File "/usr/lib/python3.6/imaplib.py", line 284, in _create_socket
    return socket.create_connection((self.host, self.port))
  File "/usr/lib/python3.6/socket.py", line 724, in create_connection
    raise err
  File "/usr/lib/python3.6/socket.py", line 713, in create_connection
    sock.connect(sa)
ConnectionRefusedError: [Errno 111] Connection refused

I don’t understand why a no email test would generate IMAP4 messages. When I created a gmail account and put in the account settings, I got the same error message. When I tried a regular run without the --noemail, I got the same error message.

  • How does dupReport interface with Duplicati anyway? I couldn’t see anywhere in the readme file where this is specified.

Overall, I’m just plain confused, I have no idea how to get started with this.


#47

Now I get it. DupReport doesn’t read a Duplicati log file and/or settings and then send emails based on that. DupReport instead scans an email folder (has to be IMAP, can’t be POP3 that sends email and then deletes it server-side) for pre-existing Duplicati emails, scrapes/parses the email text for relevant information, then tries to summarize them and sends out another email.

Ewwwwwww.

I’m sure this approach works for others, but I’ll need to find a different solution.


#48

@brad, sorry you’re having so much trouble with the program. I’ll try to take your points one by one.

That’s the problem with having the guy who wrote the code also write the docs. I guess I assumed the user would know the program like I do. :frowning:

dupReport is supposed to be self-initializing, in that running the program for the first time creates the database and initializes the .rc file with a bunch of default values. The database and log file entries are initialized to the program directory, so unless you move those files somewhere else you can ignore those entries. The only thing it doesn’t know about is the technical specifics of your incoming and outgoing mail servers (location, ID, password, transport, encryption, etc.). If you edit those entries in the .rc file, everything is supposed to work like magic. Unfortunately, it sounds like that’s not happening for you.

That is correct, dupReport maintains its own .rc file and SQLite database in the program directory. The .rc file controls all aspects of the program.

That seems to be a quirk in the program. dR will open a connection to the outgoing SMTP server even if you select the --noemail option. That is because it has the option to send warning emails for backups it hasn’t seen in a while. This feature is independent from the normal report email, so --noemail will have no effect on this. It shouldn’t work that way. I’ll open up an issue in GitHub to fix this.

Have you enabled IMAP or POP3 on your Gmail account? If not, these can be enabled under the “Settings” menu in Gmail. You have to specifically enable one or both of these protocols for Gmail to allow you to access your mail remotely. I recommend IMAP, but either should work.

dR doesn’t interface directly with Duplicati. It reads the backup report emails Duplicati sends out and parses them to create its reports.
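As a rough illustration of that approach, the sketch below pulls `Key: Value` lines out of a report email body using Python's standard `email` module. The sample message, its field names, and the `parse_report` helper are assumptions made up for illustration. They are not dupReport's actual parser, and real Duplicati reports may be formatted differently.

```python
import email
import re

# Made-up sample message; real Duplicati report emails may differ.
SAMPLE = """\
From: backup@example.com
Subject: Duplicati Backup report for Laptop-NAS
Date: Wed, 31 Jan 2018 00:44:17 -0500

ExaminedFiles: 1200
AddedFiles: 3
ParsedResult: Success
"""

# A line of the form "Name: value" in the message body.
FIELD = re.compile(r"^(\w+):\s*(.*)$")


def parse_report(raw: str) -> dict:
    """Collect the subject and any 'Key: Value' body lines (hypothetical helper)."""
    msg = email.message_from_string(raw)
    fields = {"subject": msg["Subject"]}
    for line in msg.get_payload().splitlines():
        match = FIELD.match(line.strip())
        if match:
            fields[match.group(1)] = match.group(2)
    return fields


report = parse_report(SAMPLE)
print(report["subject"], report["ParsedResult"])
```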

Try enabling the IMAP/POP3 settings on your Gmail account and see if that gets you any further. If not, let me know and I’ll try to help you debug the issue.
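Before digging into dupReport itself, it can help to confirm that the mail server is reachable at all, since a `ConnectionRefusedError` is raised at the socket level before any login happens. The following is a small stand-alone sketch, not part of dupReport. `can_connect` is a hypothetical helper, and the host and port in the comment are an assumption about the usual Gmail IMAP-over-SSL endpoint that you should check against your own .rc settings.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 5.0) -> bool:
    """Return True if a plain TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False


# Example (assumed Gmail endpoint; substitute the values from your .rc file):
#   can_connect("imap.gmail.com", 993)
```

If this returns False for the server and port you configured, the problem is with the address, port, or network rather than with the account credentials.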


#49

Thanks for the quick reply on a Sunday night! I appreciate it.

I was frustrated because I expected the tool to work one way, and just didn’t conceive that it worked this different way. Had I understood that, I think the instructions would have made sense.

Are there ever any plans to have dupReport scan log files/settings directly? I understand if it’s not on the radar…I’ve done open source work before…with so many people asking for features when you are doing it all for free. I’m just curious, otherwise, I may pick this project up several months from now when my current big projects in my life are done.


#50

I’ll take another look at the docs. They should be clearer for a first-time user.

The program was really designed for those who run backups from multiple locations and are getting multiple Duplicati emails per day. For example, in my case I have 14 separate emails coming in per day, way too much to keep track of manually. In these cases, dR will collect and collate all those emails and create a single report that summarizes all of them. If you only have a single Duplicati instance running backups you probably don’t need dR.

You can set Gmail to leave your emails on the server if you’re using POP3. I don’t know if that feature is available on other servers.


#51

Unfortunately, dR is typically run on systems other than the one Duplicati is running on, so it doesn’t have direct access to the Duplicati database. I would love the ability to read the Duplicati database directly, but that just isn’t an available option. I have considered setting up some sort of relay code that sits on the Duplicati system, reads the Duplicati database, and then relays that information to dR. I sketched out a basic architecture, but the whole idea quickly got too complicated to be practical. Maybe someday…


#52

@brad, I updated the docs to be a bit more descriptive in what the program does and how it works. Take a look and let me know if that would’ve made your life a bit easier this weekend.


#53

A beta version of dupReport version 2.2 has been released into the pre_prod branch of the GitHub repository. A short list of new features includes:

  • Optimized email retrieval. Seeing 40%-60% reduction in program running time during testing
  • Added options to use keepalive logic for larger inboxes to prevent timeout errors
  • Added ‘date’ header to outgoing emails for RFC compliance (issue #77)
  • Added optional progress indicator to stdout to show emails are being read (issue #72)
  • Added option to add a “last seen date” summary table for all backup sets on report (issue #73)

See the changelog file for more details. As always, swim at your own risk, but I’d like folks to test out the new features. The email handling routines have been mostly rewritten for the first time since early in the 2.0 cycle, so I’m hoping people can give them a good workout.

Bugs and issues can be reported in the project’s Issues board.

Enjoy!


#54

Big changes in date/time processing for dupReport version 2.2 were just posted to the pre_prod branch on GitHub. dupReport will now be more forgiving of date format errors and exit more gracefully if it can’t resolve them. Check it out!


#55

@Charles_Beckler, I added your suggestion to version 2.2. A beta of the code is now available on the pre_prod branch on GitHub. Check it out and let me know if it fits what you were looking for.


#56

Wow, this is amazing. It is exactly what I was looking for. Thanks!

A couple of notes.

When I run the report back to back (while testing, but I could conceive of this happening in real life), I notice that if a computer hasn’t backed up since the last run, its “No recent activity” line in the main original report is highlighted red even though the days since last backup is 0. Not sure if that is by design or a glitch.

Another bug (however, it doesn’t bother me) is that some backups show “No new activity. Last activity on 01/31/2018 at 00:44:17 (-1 days ago)”. Notice the -1; I’m not sure why it is doing that. Of my 18 computers showing in dupReport, 5 of them have a -1 for days since last backup.

Thanks again. If I can be of more help in troubleshooting these, please let me know.


#57

Good to hear it’s working for you. The red highlight is used for any backup that hasn’t been seen (i.e., hasn’t sent a result email) since the last run, even if that run was only a few minutes ago. It’s just an artifact of the program design.

The “-1 days” problem is something I’ve seen before but haven’t been able to fully trace. I’ve opened an Issue for the problem on the GitHub repository. We can continue the discussion there.
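One general way a “-1 days” figure can appear in Python, offered only as a possibility and not as a diagnosis of dupReport’s behavior: `timedelta.days` rounds toward negative infinity, so a last-activity timestamp that is even slightly later than the comparison time (for example, because of a time zone mismatch) yields -1 rather than 0. The timestamps and the `days_ago` helper below are made up for illustration.

```python
from datetime import datetime, timedelta

last_activity = datetime(2018, 1, 31, 0, 44, 17)
# Hypothetical "now" that lands one hour *before* the last activity,
# as could happen if the two values were recorded in different time zones.
now = last_activity - timedelta(hours=1)

delta = now - last_activity
print(delta.days)  # -1, because -1 hour is stored as -1 day + 23 hours


def days_ago(now: datetime, then: datetime) -> int:
    """Day difference clamped at zero (hypothetical helper)."""
    return max((now - then).days, 0)


print(days_ago(now, last_activity))  # 0
```

Clamping hides the symptom only; comparing timestamps that share one time zone would address this particular cause at its source.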


#58

Hi all!

dupReport Version 2.2 is almost ready for release. I’m going to let it soak for another week or so, then release it. If you want to get an early look, it’s available from the pre_prod branch in the GitHub repository. See the changelog file for what’s new in this release.

Enjoy!


#59

(Drumroll please…)

dupReport Version 2.2 is out! Here are some of the bigger changes:

  • Optimized email retrieval by analyzing headers first before downloading the entire email. Seeing 40%-60% reduction in program running time for large IMAP mailboxes
  • Added options to use “keepalive” logic for larger inboxes to prevent server timeout problems
  • Added optional progress indicator to show emails are being read
  • Added option to add a “last seen date” summary table for all backup sets on report
  • Fixed “negative date difference” problems
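The headers-first idea in the first bullet can be sketched with Python’s standard `imaplib` conventions. This is an illustration of the general technique, not dupReport’s actual code: `fetch_matching` and its `subject_marker` parameter are hypothetical, and `conn` is assumed to be an `imaplib.IMAP4`-style object that is already logged in with a mailbox selected.

```python
import email


def fetch_matching(conn, subject_marker: str) -> list:
    """Download full messages only when the Subject header matches.

    conn is assumed to behave like imaplib.IMAP4: search() and fetch()
    return (status, data) pairs.
    """
    _, found = conn.search(None, "ALL")
    messages = []
    for num in found[0].split():
        # Step 1: fetch only the Subject header. BODY.PEEK leaves the
        # message's \Seen flag untouched.
        _, hdr = conn.fetch(num, "(BODY.PEEK[HEADER.FIELDS (SUBJECT)])")
        subject = email.message_from_bytes(hdr[0][1]).get("Subject", "")
        if subject_marker not in subject:
            continue  # skip the expensive full download
        # Step 2: fetch the whole message for matches only.
        _, full = conn.fetch(num, "(RFC822)")
        messages.append(email.message_from_bytes(full[0][1]))
    return messages
```

With a real server, `conn` would come from something like `imaplib.IMAP4_SSL(host, port)` followed by `login()` and `select()`. The saving comes from skipping the full download for every message that is not a backup report.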

As usual, the changelog file has all the boring details. The code is available on GitHub. Bug reports and suggestions can be made in the Issues section of the repository.

Enjoy!


#60

dupReport version 2.2.1 has been released to the pre_prod branch on GitHub. One small bug fix and a new reporting option in this one. Nothing major, but wanted to get those closed. As always, bug reports, comments, and suggestions can be made in the dupReport Issues area.

Enjoy!

HG


#61

Happy to say that dupReport 2.2.1 has been released to the Master branch on GitHub. No big fanfare for this release, but as of now all the bugs (that we know about) and suggestions have been addressed. Enjoy!

HG