Hi there, I am new here, and a cursory search didn’t show me what I was looking for, but I promise I did search the forum first.
I am looking to see whether Duplicati is a possible contender to replace my existing backup system for multiple separate networks. I currently use a system with backup agents on the remote computers in different networks; these agents send their data to S3, but the hosted server controls the backup sets and scheduling.
A good example of this would be CloudBerry MBS; it does what I am doing. Is this ‘a thing’ that Duplicati is designed to do, or am I barking up the wrong tree?
Pectojin is correct in that Duplicati itself is client-centric, assuming a basic “dumb” destination, and there is a separate tool designed to provide a centralized location from which to control clients.
I’m curious what functionality you want from the server - is it just job control, or do you want more, like on-destination verification or other stuff?
Agreed, if this means being the central administration server. If it means serving its web UI to a central server, Duplicati doesn’t need to know it’s doing that; however, its web server only accepts incoming connections, so it needs help when those are hard to arrange. This probably depends on barriers such as firewalls, NAT, company security, etc. Duplicati on a headless Linux box can be managed through either direct browsing or SSH port forwarding. Reverse port forwarding through ssh -R or PuTTY plink could go outbound and bring a remote port back to Duplicati.
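For instance, the reverse-forwarding idea could look like this (hostnames and ports are examples I made up, not anything Duplicati prescribes):

```shell
# Run on the remote client: connect outbound to the central server and ask it
# to listen on port 9101, relaying those connections back to this machine's
# Duplicati web UI on port 8200. -N means "no remote command, just forward".
ssh -N -R 9101:localhost:8200 admin@central.example.com
```

On the central server, browsing to http://localhost:9101 then reaches that client’s Duplicati UI, with SSH providing the encryption in transit.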
The dumb storage destination seems to be the assumption of CloudBerry MBS too. The layout difference appears to be that there’s a managed backup site that works with both the on-client backup agent and the service provider. Duplicati (with its focus on open-source software, not on the sale of services) lacks this, preferring direct access from a browser or Pectojin’s CLI client to the Duplicati server on that specific client.
Agreed. Depending on your needs and environment, it’s either pretty easy (if you can connect into clients) or takes some network work to set up. Monitoring status centrally is somewhat easier thanks to third-party software that builds on the built-in reporting options. Examples include duplicati-monitoring and dupReport.
Your central server might be just an SSH server plus a browser going through an SSH tunnel to the Duplicati instance of your choice; however, if you want less client autonomy and more centralized scripting, have the CLI client run the show. Unlike the command-line interface that ships with Duplicati, the Pectojin client works through the lower-level API of the web server.
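A rough idea of what driving clients from a central box might look like with that client (the subcommand names are from my reading of the project’s README and may have changed; treat them as illustrative, and the port assumes a tunnel like the one above):

```shell
# Hypothetical session from the central server to one client's Duplicati:
duplicati_client login http://localhost:9101   # authenticate against that client
duplicati_client list backups                  # see the jobs defined there
duplicati_client run backup 1                  # kick off job ID 1
```

Looping over a host list in a shell script would give crude central scheduling without any server-side component.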
The problem with dumb clients is that it’s a lot of work to set up auth and TLS on each client. Portability is also basically nil, because you can’t move your client without the central server losing the connection. Additionally, ensuring security is tough if you need to reach clients over the internet, because they’ll end up exposed on the internet.
IMO, the right way to implement a central server is to have the client be aware that it’s being managed, so that the traffic goes from the client to the central server.
However, before the client can be aware it’s managed we’re still missing the central management component, and obviously the client connection to it.
Moving clients can be solved by dynamic DNS; however, opening Duplicati to inbound connections from the Internet adds hazards (it is not hardened) even if client auth and TLS are set up. The encrypted SSH tunnel avoids TLS setup and (assuming it works) initiates SSH from the client system outbound without Duplicati knowing. All Duplicati sees is an ordinary unencrypted HTTP connection coming in from the SSH client when the central web browser or CLI client connects to the assigned port and is tunneled through to the Duplicati server.
There’s the “right” way, and then there’s “what can we do now”. Traffic (requests and responses) goes both directions, so I speak not in terms of traffic direction, but connection direction. Why must the client be aware?
While I agree Duplicati could someday reach out to a central server to create the encrypted connection that management traffic can flow over, one can get pretty much the same result by having the connection set up externally.
Linux systems appear to have better tools to set up communications. For example socat could be helpful.
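As a sketch, socat on an intermediate Linux box could relay connections toward a client’s Duplicati UI (the hostname and ports here are invented for illustration):

```shell
# Listen on port 9101 and relay each connection to the client's web UI on 8200.
# "fork" handles multiple concurrent connections; "reuseaddr" allows quick restarts.
socat TCP-LISTEN:9101,fork,reuseaddr TCP:client.internal:8200
```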
I proposed all of that. Obviously it deserves a POC from someone (not likely me) to see if it really works…
EDIT: Obviously a fancy central management component would be lots of work. The proposal is for a minimal one.
I interpreted the “Is * possible” generously, and turned it back to the requestor to find out if it’s too ugly for them.
Sometimes “Is * possible” sounds like a request for feature commit. If that ask was here too, I just missed it.
Perhaps there’s a revenue opportunity (at least to cover their costs) for somebody to write/host server code, and maybe it wouldn’t have to be one of the developers writing current Duplicati code (thanks for doing that). Sorry about your hurt.
Thank you all for your replies. I am in the Alaska time zone, so I was pleasantly surprised to see an inbox full of all the responses. You all pretty much nailed it on the head: the headaches of managing different networks from OUTSIDE the network start to stack up, and while things like dyndns, reverse proxies, and SSH tunnels are great, they don’t scale well.
I agree, someone could really wrap the tool into a centralized server, but I lack the skills or time to tackle that. If I find another way (maybe I don’t need central server just central scheduling and logging via CLI tools), then I will write it up for the benefit of the community.
I will probably start using Duplicati more for non-managed backup client scenarios, since it has been really slick in the trials I have used it for. If I can get managed logging and purely CLI (i.e. scripted) operation, there might be a place for it on my larger managed clients as well.
All the best, and I hope someone in my position benefits from your wisdom as well!
If you’re into scripting your way out of it, then I made duplicati client just for you.
It’s not too much trouble to get an external schedule running using the tool and a systemd timer or something similar. Then you could control the configuration centrally with Ansible, Chef, Puppet, or whatever your shop usually does.
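A minimal sketch of that external schedule with a systemd timer (unit names, paths, and the job ID are examples; check the duplicati client invocation against the project’s README):

```ini
# /etc/systemd/system/duplicati-nightly.service
[Unit]
Description=Trigger a Duplicati backup via duplicati_client

[Service]
Type=oneshot
ExecStart=/usr/local/bin/duplicati_client run backup 1
```

```ini
# /etc/systemd/system/duplicati-nightly.timer
[Unit]
Description=Run the Duplicati backup nightly at 02:00

[Timer]
OnCalendar=*-*-* 02:00:00
Persistent=true

[Install]
WantedBy=timers.target
```

Then `systemctl enable --now duplicati-nightly.timer`, and Ansible/Puppet can template these files per host.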
I kind of hesitated to mention it originally because it in no way solves the original problem without a good amount of setup.
As it happens, I am in the (slow) process of DevOpsifying my business. It is tough going from 20 years of lazy sysadmin to full orchestration, but I am making progress… steady by jerks. I will come back around to this for certain. Thanks!
I’ve been thinking about a server/client setup as well, but mostly in a context where compaction could be run on the server side. At least in theory that should be possible. As an example, the client could write expired-block information, which could then finally be pruned by the server.
As mentioned, compaction can be a huge burden with large data sets, consuming lots of bandwidth and time. I’ve written about this topic earlier, but there was no good solution. If compaction is done locally on the destination, it could be done more often (saving disk space) and also save a lot of bandwidth (the stuff not being compacted wouldn’t move back and forth).
In the optimal case that could even be possible on raw blocks, without the encryption key(s), but that depends on many factors and I don’t know if it’s realistic. On the other hand, using smaller files can in some sense already solve this problem.
This is a bit of an edge case for people with destinations they can control, however I agree it would be nice. There was a comment a while ago about possibly saving some of the local database content out to the destination.
If that “remote database content” included flagged-for-deletion items, then destination compacting becomes much more likely. HOWEVER, there’s still a lot of work involved in coordinating.
For example, what if the destination is compacting when the source decides to ALSO compact? Just starting a source backup while a destination compact is going on could cause issues due to “missing destination files”.
Some sort of communication mechanism needs to be set up to avoid scenarios like this. I suspect it will end up being a “status / semaphore log” type file on the destination that both work-ends update and check.
But even then there are still edge cases such as “Destination updated the semaphore file to indicate a compact has started, then crashed before flagging completion - now Source won’t run any backups because it always thinks a compact is running.”
Not to mention the usual (extremely unlikely) ACTUAL contention of two work-ends trying to update the status / semaphore file at the same time.
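The crash edge case above suggests giving the semaphore a timestamp plus a staleness timeout, so a dead compact doesn’t block backups forever. A minimal sketch (the file name, layout, and timeout are all made up, not anything Duplicati implements today):

```shell
#!/bin/sh
# Hypothetical "status / semaphore" file both work-ends could check before
# starting a compact. The lock holds the start time as a Unix timestamp.
LOCK="compact.lock"
MAX_AGE=3600   # seconds; an older lock is assumed left by a crashed compact

acquire_lock() {
    if [ -f "$LOCK" ]; then
        age=$(( $(date +%s) - $(cat "$LOCK") ))
        if [ "$age" -lt "$MAX_AGE" ]; then
            echo "busy"        # a compact appears to be running; back off
            return 1
        fi
        # Stale lock: the other end probably crashed mid-compact (the edge
        # case above), so reclaim it instead of waiting forever.
    fi
    date +%s > "$LOCK"
    echo "locked"
}

release_lock() {
    rm -f "$LOCK"
    echo "released"
}
```

This still has the (unlikely) simultaneous-update race mentioned above; it only addresses the crashed-compact deadlock.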
@Pectojin & @ts678 - depending on how much one chooses to trust, a “private network” such as ZeroTier could resolve (or at least centralize) a lot of the security issues. If your client and “server” are both on an encrypted ZeroTier network they can talk to each other over those IPs ‘safely’.
It potentially even solves portability issues as I believe the ZeroTier IPs are individually manageable so even if you move to a different machine you could just move the IP to the new MAC.
I’ve done the client backup installation using Puppet. When the backup job runs, it just updates the configuration from the central server. Still, there are just so many ways of doing it. You don’t technically need to be able to connect to the client; the client can of course fetch the required configuration data from some other host. No need for a private network.
But as mentioned earlier, technical guys can implement this in 10k different ways using different solutions and tools.
I’m personally trying to maintain a perfect balance (whatever that means), between centralized and distributed solutions, which all of course affect data security aspects. With current configuration / solution I’m quite happy.
Just started looking at the client/server options here as well. I have Duplicati running in Docker on an Ubuntu laptop, which has a large external storage drive. Now I’m looking to back up my Windows workstations to that system, but NFS isn’t an option on Windows. Having the Windows workstations push their backups to Duplicati on Ubuntu would make storage management easier: one instance that has all the target destinations, and clients that push their backups to an available central target. (Though overall, having a single console with all the remote backups would help keep track of what has and hasn’t backed up, and setting policies centrally would be even better.)
For now I’ll try and get the SFTP option configured, as that should take care of the drive access.
No, a client/server setup is possible; I’ve posted about such a solution earlier. But as long as key functionality is broken, there’s no point in doing client/server. Why? It adds lots of complexity, and if the system isn’t reliable, it would just make things exponentially worse. Yet I don’t see any reason why it couldn’t work, e.g. a remote compact request with key information and a return package.