Client-side agent and centralized management / dashboard

Pectojin · February 9, 2018, 10:10am

Interesting read. I hadn’t thought about it before, but a client would definitely benefit from this.

I believe Python’s urllib3 supports keep-alive, so it can similarly keep an HTTPS connection open. That’ll work well for the proof of concept.

LKits · February 9, 2018, 3:41pm

I like where this is heading.

I totally agree that the better solution would be for Duplicati to have something built in: checking some URL every XX seconds, waiting for commands.
I am all for security, and installing custom software on a client system is not something anyone would want to do (me included).

SSH is a bad idea, because then you’d need administrative access on either side. Many web hosts don’t allow SSH. But almost every host allows MySQL+PHP, and that would be enough for a simple REST JSON PHP API.

About bandwidth: I’m pretty sure it’s negligible compared to even the simplest off-site backup sizes. If you keep the connection open, it would be even more efficient.
Keeping the connection open might not be possible for long durations, though, since web servers tend to time out. Still, it is quite easy to check on the client side whether the connection is open or not.
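The polling idea mentioned above (check a URL every few seconds and wait for commands) could be sketched roughly as follows. This is only an illustration, not how Duplicati or any existing client works: the function name `poll_for_commands` and its parameters are hypothetical, and the actual HTTP call is left to a caller-supplied `fetch` function so the loop itself stays independent of any server.

```python
import time
from typing import Callable, Optional


def poll_for_commands(
    fetch: Callable[[], Optional[str]],
    handle: Callable[[str], None],
    interval_seconds: float,
    max_polls: int,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Call fetch() up to max_polls times and pass each command to handle().

    fetch() is assumed to return None when the server has nothing to do.
    Returns the number of commands that were handled.
    """
    handled = 0
    for attempt in range(max_polls):
        command = fetch()
        if command is not None:
            handle(command)
            handled += 1
        # Wait between polls, but not after the final one.
        if attempt < max_polls - 1:
            sleep(interval_seconds)
    return handled
```

In a real client, `fetch` would wrap the HTTP request and `max_polls` would likely be replaced by an endless loop with a shutdown signal.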

Right now, I’m adding graphs to the dashboard, which is not hard because SQL queries make it quite easy to do dynamically. Next, I probably have to implement some kind of cache, because it’s not reasonable to query the database on every action.

Pectojin · February 10, 2018, 1:57am

Heh, the password authentication is giving me a lot more trouble than I expected. It does quite a few things in JavaScript when hashing (salt, nonce, parsing utf8/base64/hex).

And as usual, I totally overengineered my proof of concept, so I’m sitting on 260 lines of python code. The CLI looks very nice, but it can only fetch information so far and I haven’t done any work to filter the output.

Passwordless connections work and have basic functionality. The rest is just placeholders.

And of course notifications are important

Pectojin · February 10, 2018, 9:02pm

It’s getting close to functional.
I can start/stop backups, and I set some filters on the output so it’s more readable.

Haven’t looked at the password auth again yet, but it’ll fit right into the framework once I get it working.

You can check out what I have so far:

Pectojin · February 11, 2018, 5:14pm

I took a crack at password auth again and I got it working now. So it works for both password protected servers and non-protected servers.

So it can connect to any Duplicati server and perform operations:

list backups/notifications/serversettings
get more info on backups
describe all info on backups
schedule a backup run
abort a backup run

I’m polishing a few of the operations off and working on fetching general/live logs and backup logs. Then I’ll look into the Daemon mode so it can serve as a Central management client.

Here is the Python example: duplicati-client/duplicati.py at commit d14974a35a4a7f9866ffe4ba2edc44e28c66b0da (Pectojin/duplicati-client on GitHub)

Pectojin · February 25, 2018, 12:45am

@handyguy and @crazy4chrissi, my Duplicati Client is starting to be able to do a lot of the configuration tasks as a CLI tool.
I’m looking to start working on the Daemon mode, where it will fetch instructions from a task server and execute them. I drafted a task server API specification. It’s pretty early stage and needs to be formalized and actually implemented.

Would you be interested in doing a bit of collaboration on that part? I can easily make a tiny proof of concept, but it won’t be as interesting or useful as an actual integration into Duplicati Monitoring or dupReport.

@JonMikelV, how do you think we should approach getting a currently external tool included in Duplicati?

crazy4chrissi · February 25, 2018, 12:14pm

@Pectojin Great work so far. Yeah, I would definitely like to collaborate. First things I noticed:

I would add a version field, so if the API changes in the future, client and server know which API version the other end is using.
A server with multiple users like Duplicati Monitoring needs to know which user this client belongs to. Therefore, any request initiated by the client would need to contain some kind of username. I would say it’s up to the server what these look like; they could be user IDs, email addresses, or usernames.
Usually, one user would manage multiple systems that are backed up with one server. In Duplicati Monitoring, users usually group multiple “backup sets” in groups. I guess one client would then usually belong to one group. So additionally, the server would need some kind of client-identifier to know which group this client belongs to. I guess usually it would be the hostname of the host where the client is running.
We also need to authenticate the client against the server (e.g. using a password). Even if the client opens the connection and only takes commands from the server, we still want to be sure that it is our client receiving the commands. Also, you guys suggest keeping a long-lived HTTP connection. An attacker could easily open thousands of such connections as a DoS attack if no authentication is required.
Because the client takes commands from the server, authentication in the other direction is even more critical. Even though the server is configured at the client, I would authenticate the server to the client as well. Otherwise, a MitM could manipulate the DNS and return its own IP for the hostname of the server, and then send malicious commands to the client when it connects to the attacker’s server.
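The identification points above (a version field, a user identifier, and a client identifier such as the hostname) could be carried in a small request envelope. The sketch below is only an illustration: the field names `api_version`, `user`, `client_id`, and `action`, and the version number 1, are hypothetical and not taken from the drafted task server specification. Authentication is deliberately left out here.

```python
import json
import socket
from dataclasses import asdict, dataclass
from typing import Optional, Sequence

API_VERSION = 1  # hypothetical version number, for illustration only


@dataclass
class ClientRequest:
    api_version: int  # lets client and server detect incompatible APIs
    user: str         # user ID, email address, or username (server decides)
    client_id: str    # tells the server which group this client belongs to
    action: str       # what the client is asking for


def build_request(user: str, action: str, client_id: Optional[str] = None) -> str:
    """Serialize a request, defaulting the client identifier to the hostname."""
    if client_id is None:
        client_id = socket.gethostname()
    request = ClientRequest(API_VERSION, user, client_id, action)
    return json.dumps(asdict(request), sort_keys=True)


def is_compatible(raw: str, supported: Sequence[int] = (API_VERSION,)) -> bool:
    """Return True if the message declares an API version we understand."""
    return json.loads(raw).get("api_version") in supported
```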

Just my first thoughts. I will try your client soon and think more about the API.

Pectojin · February 25, 2018, 1:12pm

Definitely.

Excellent points. I haven’t entirely made up my mind about the best way to do this, so I refrained from mentioning authentication/identification in detail just yet.
I was thinking something along the lines of a pre-shared key that could at least signify user and group identifiers. The key can be generated on the task server and used to configure the client.

Then there’s the two-way authentication, which is a bit more tricky since we can’t take for granted that everything is SSL encrypted. It shouldn’t be possible to just do replay attacks, so relying purely on a pre-shared key isn’t enough.

crazy4chrissi · February 25, 2018, 1:43pm

Mmhh, replay attacks can be mitigated if all messages have sequence numbers, so that another message with the same sequence number would simply be ignored.
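A minimal sketch of the sequence-number idea, assuming each client keeps its own counter. The class name `ReplayGuard` is hypothetical. Note that this version is slightly stricter than what is described above: it rejects any number that is not higher than the last one seen, not only exact repeats. It also only helps if the sequence number is covered by some form of message authentication, since otherwise an attacker could simply change the number.

```python
from typing import Dict


class ReplayGuard:
    """Remember the highest sequence number seen per client and reject replays."""

    def __init__(self) -> None:
        self._last_seen: Dict[str, int] = {}

    def accept(self, client_id: str, sequence: int) -> bool:
        """Return True for a fresh message, False for a repeated or older one."""
        last = self._last_seen.get(client_id, -1)
        if sequence <= last:
            return False  # already seen (or older): ignore the message
        self._last_seen[client_id] = sequence
        return True
```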

And I think we could require that the server uses HTTPS; nowadays, with Let’s Encrypt, this is easy and free to set up.

Maybe it would be easiest to have a pre-shared key and have both client and server sign each message using this key. The key would be generated on the client when it’s installed and then configured on the server side (e.g. in Duplicati Monitoring, the key would be pasted into a form in the web interface).
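Signing each message with a pre-shared key is commonly done with an HMAC. The following is a sketch using Python's standard `hmac` module; the choice of HMAC-SHA256, the 32-byte key length, and the function names are assumptions made for illustration, not something agreed on in this thread.

```python
import hashlib
import hmac
import secrets


def generate_key() -> bytes:
    """Create a random 32-byte key, e.g. once when the client is installed."""
    return secrets.token_bytes(32)


def sign(key: bytes, message: bytes) -> str:
    """Return a hex HMAC-SHA256 signature of the message."""
    return hmac.new(key, message, hashlib.sha256).hexdigest()


def verify(key: bytes, message: bytes, signature: str) -> bool:
    """Check a signature using a constant-time comparison."""
    expected = sign(key, message)
    return hmac.compare_digest(expected.encode(), signature.encode())
```

Both sides would call `sign` on outgoing messages and `verify` on incoming ones with the same key.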
We could also use a private/public key pair for server and client, but I guess it’s a little overkill.

Pectojin · February 25, 2018, 1:50pm

Well, yes and no.
It’s safe to assume Duplicati Monitoring does, because we know it does. But Client to Duplicati Server communication, for example, will often be unencrypted. Since we see that tendency with Duplicati Server, I’d prefer not to make hopeful assumptions about other services that may also want to implement this.

I don’t need full data secrecy, since that’s overkill, but I’d prefer replay attacks and MitM not to be feasible even when it’s not SSL.

I’d settle for symmetric encryption using the pre-shared key. Just encrypt something that changes on each request and is somewhat verifiable on the other end. It could be a timestamp or a hash of the message contents. That way it could both be verified and, to some extent, validated.
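As a rough sketch of the "something that changes on each request" idea: the example below binds a timestamp and a hash of the message contents to the pre-shared key. It swaps in an HMAC rather than symmetric encryption, because Python's standard library has no symmetric cipher; that substitution, the 300-second window, and all names are assumptions for illustration only.

```python
import hashlib
import hmac
import time
from typing import Optional

MAX_SKEW_SECONDS = 300  # assumed acceptance window, not from the discussion


def _tag(key: bytes, timestamp: int, body: bytes) -> str:
    """HMAC over the timestamp and a hash of the message contents."""
    body_hash = hashlib.sha256(body).hexdigest()
    data = f"{timestamp}:{body_hash}".encode()
    return hmac.new(key, data, hashlib.sha256).hexdigest()


def make_token(key: bytes, body: bytes, now: Optional[float] = None) -> str:
    """Build a 'timestamp:tag' token to send along with the message."""
    timestamp = int(time.time() if now is None else now)
    return f"{timestamp}:{_tag(key, timestamp, body)}"


def check_token(key: bytes, body: bytes, token: str,
                now: Optional[float] = None) -> bool:
    """Accept only tokens that are recent and match the message contents."""
    try:
        timestamp_text, tag = token.split(":", 1)
        timestamp = int(timestamp_text)
    except ValueError:
        return False  # malformed token
    current = time.time() if now is None else now
    if abs(current - timestamp) > MAX_SKEW_SECONDS:
        return False  # too old or too far in the future
    expected = _tag(key, timestamp, body)
    return hmac.compare_digest(expected.encode(), tag.encode())
```

A token replayed within the time window would still be accepted, so this would need to be combined with something like sequence numbers to rule replays out completely.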

handyguy · February 25, 2018, 2:25pm

Hi folks, a bit late to the party here, been distracted on some other stuff. Happy to take a look at the tool (hopefully today) and see how it works, then I’ll be better informed to comment on features.

HG

JonMikelV · February 25, 2018, 2:35pm

Check in with @kenkendk and get his approval.

I haven’t looked at the code, but if it could be treated as a Duplicati module, that might be best. That way there is a “built in” way to disable it for users who may not want that functionality running in their environment.

kenkendk · February 27, 2018, 11:49am

Approval granted; we just need the details on how (pull from repo, or create subfolder in Duplicati repo?)

@Pectojin Is it cross platform? (I know it is Python, but there are always details)

Windows users: is it normal to have Python installed, or do we need some helpers there?

kenkendk · February 27, 2018, 12:03pm

I am not a fan of roll-your-own crypto stuff, it tends to have a bunch of non-obvious holes/problems.

Setting up SSL is not that hard, and it can optionally use self-signed certificates, with the certificate hash known by the client (similar to shared key setup). The Duplicati webserver supports SSL with self-signed certificates, so it should not be hard to set it up.
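The "certificate hash known by the client" approach could look roughly like this on the client side. This is only a sketch: using SHA-256 over the DER-encoded certificate and the function names below are assumptions, not how Duplicati implements it.

```python
import hashlib
import hmac


def fingerprint(der_certificate: bytes) -> str:
    """Return the SHA-256 fingerprint of a DER-encoded certificate as hex."""
    return hashlib.sha256(der_certificate).hexdigest()


def matches_pinned(der_certificate: bytes, pinned_hex: str) -> bool:
    """Compare a certificate against the fingerprint configured on the client.

    Accepts fingerprints written with or without colons, in any letter case.
    """
    normalized = pinned_hex.replace(":", "").lower()
    actual = fingerprint(der_certificate)
    return hmac.compare_digest(actual.encode(), normalized.encode())
```

On a live TLS connection, the DER bytes can be obtained from Python's `ssl.SSLSocket.getpeercert(binary_form=True)`; the client would refuse to continue if `matches_pinned` returns False.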

Something that executes stuff on the other machines needs to be really carefully thought through, otherwise you get deltree C:\Users or rm -rf /*.

@crazy4chrissi and @LKits I did a bunch of work towards setting up a monitoring service like duplicati-monitoring.com, but in the end I saw that I needed to do much more work to get it ready, and decided to fix other things in Duplicati instead, but I think a monitoring service is absolutely crucial for anyone setting up a backup. Without external monitoring you only discover that it did not work, when you really need it.

My plan back then was to have a plugin module (tiny bit of C# code) that allows registering a custom input field in the shared application settings, as well as on the last page in the edit process. The idea was that the user could register something like --service-key=xyz --send-report-url=abc either on individual jobs or for all jobs through the settings menu.

The module itself would then just register these extra fields, and invoke the HTTP reporting module and submit the contents. If your services can work something like that, we can add them to Duplicati, such that the user can simply choose a service (+maybe a service key) and have the monitoring configured in a simple way.

kenkendk · February 27, 2018, 12:05pm

Do we need to add another more straightforward REST-like login method?

Something like:

C: GET /login
S: <nonce>,<salt> or "No login required"
C: POST /login?key=base64(sha256(key|nonce|salt))
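The client side of the exchange proposed above could be computed as follows. This is a sketch of the proposal only, not of Duplicati's existing login: it assumes that `|` means plain byte concatenation in the order written, and the function name `login_token` is hypothetical.

```python
import base64
import hashlib


def login_token(key: bytes, nonce: bytes, salt: bytes) -> str:
    """Compute base64(sha256(key|nonce|salt)), reading '|' as concatenation."""
    digest = hashlib.sha256(key + nonce + salt).digest()
    return base64.b64encode(digest).decode("ascii")
```

Because the server issues a fresh nonce for each login, the resulting token differs every time even though the key stays the same. Standard base64 output can contain `+`, `/`, and `=`, so it would need percent-encoding before being placed in a query string.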

Pectojin · February 27, 2018, 12:35pm

I think the major hassle for me was that Python and JavaScript have different crypto libraries, which made it hard for me to understand exactly what was going wrong.
I wanted it to be a simple login/password REST request, but that’s not very viable on a non-SSL connection.

Edit: We should update the REST API to provide a /login method, but I’m not sure there is a better way than the current nonce authentication.

Python works for all my platforms (Linux, Windows, macOS). I’ve also had success bundling it for each platform as a single self-contained binary in the last release, requiring no Python and no 3rd party Python libraries: Pectojin/duplicati_client/releases/tag/0.1.26_alpha. This seems to work well. There is no installer, but that’s very distro specific on Linux anyway.
Edit: I should add I’ve tested on RHEL 7.4, Ubuntu 16.04, macOS High Sierra, and Windows 10

I imagined cloning my entire repository into a new repository under the Duplicati organization.
Then perhaps having the releases bundled into the regular Duplicati releases as well as providing standalone releases?

I’m not entirely sure how it’s best to distribute. It would be nice if it was “just there” on Duplicati systems but it should also not require the entire Duplicati package (and dependencies) to install on a system that won’t be running a Duplicati Server.

JonMikelV · February 27, 2018, 12:52pm

I’d say probably not.

I like the sound of that, though I’m curious if you ran into any issues testing the self contained binary on machines that DID have Python installed.

Pectojin · February 27, 2018, 1:12pm

I’ve actually not been able to find a Linux or macOS machine without Python. But it works fine there, since it references internal libraries in the binary (the binary is “huge”: the entire GitHub source is 97KB and the binary is almost 6MB).

On Windows it also seems to work fine regardless of whether Python is installed or not, but I think @Stephen has been testing more on Windows than I.

crazy4chrissi · February 27, 2018, 2:58pm

@kenkendk So what you propose is that we could introduce a “Duplicati-Monitoring.com Plugin for Duplicati” that can be installed in Duplicati where the user just configures his duplicati-monitoring.com account name and key and then the plugin automatically adds all the parameters to the backups? That of course would be great. But I guess this plugin would need to do a little more, like inform the monitoring service about the configured backups. And tell it, if backups get added, removed or changed (e.g. the schedule). And the service would then also need to send emails to the user whenever the plugin does something automatically. Especially, when a backup is deleted in Duplicati, it should not be deleted in the monitoring service without asking the user for confirmation by mail. Otherwise, some ignorant colleague who thinks backups only take system resources might delete all backups on his machine and the monitoring service would not tell the admin that the backups have been removed.

But I don’t see any big problem, and I would love to simplify the setup process of the monitoring service even more.

crazy4chrissi · February 27, 2018, 3:13pm

The bigger problem with Python is that some distros can’t have python2 and python3 at the same time, but some stuff needs Python 2 and other stuff needs 3, and then there are the libs in different versions etc. And the worst idea is writing the package manager of your distro in Python, using the same Python installation and libs that it manages itself. A colleague of mine managed to break dnf while messing with Python packages in dnf itself, but this is another story.
I see that your client currently runs with Python 2 and 3. I hope it is possible to keep it like this.

I would also say no, it is not. But for Windows users, it is normal that each 10KB program comes with 10MB of libs (you can easily end up with 10 copies of GTK on a Windows system). So I think Windows users would not mind the 6MB for the binary.