Issues to address to get out of beta

For what it’s worth, database rebuilding has gotten REALLY good on my laptop after https://github.com/duplicati/duplicati/pull/3758
I have some periodic issues where the local DB is corrupted when a backup over an SSH tunnel times out (e.g. over a company VPN), so I’ve been rebuilding a good number of times over the last few weeks. I’m still surprised at how quick the rebuilding is without scavenging :smiley:

One of the sticking points I see with releases is that back-ends really need to be plugins and not part of a monolithic package. Releases get pushed out because something external breaks in a back-end, but other, less tested development has been done in the core in the meantime. Then we may have issues (sometimes not known immediately), and the users of the previously broken back-end are caught between a rock and a hard place.

Also, I think we need to look at using milestones (as was mentioned before). This helps set users’ expectations and lets developers focus.

Also, one of the big things I think we need to start using is ‘feature freezes.’ It seems like Duplicati is doing feature development at breakneck speed, but reliability and stability are going by the wayside. Feature freezes will help let the dust settle on bugs and issues. Right now, upgrading is always a source of anxiety for me because backup/recovery is a critical aspect of data security, especially in the age of ransomware. Having a backup/restore system that may just not work is very wearing.

I completely agree on reliability. The file handling and database need to be pounded so we can identify weak points. I’m thinking more unit tests are really needed to identify such issues and to always know if any change impacts reliability. This is something I want to work on ASAP.
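To make that concrete, here is a minimal sketch of the kind of round-trip test I have in mind (back up, delete the local DB, recreate it). The Controller calls below reflect how I understand the Duplicati.Library.Main API is used; the paths, option names, and result properties are my assumptions and would need checking against master:

```csharp
// Sketch only: backup -> delete local DB -> recreate, NUnit style.
// Assumes Duplicati.Library.Main.Controller works roughly as in master;
// paths and option names are illustrative, not verified.
using System.Collections.Generic;
using System.IO;
using System.Linq;
using NUnit.Framework;

[TestFixture]
public class RecreateReliabilityTest
{
    [Test]
    public void BackupThenRecreateDatabase()
    {
        const string dbpath = "/tmp/dupl-test/local.sqlite";
        var options = new Dictionary<string, string>
        {
            ["passphrase"] = "unittest",
            ["dbpath"] = dbpath,
        };

        // Back up a small source tree to a local file destination.
        using (var c = new Duplicati.Library.Main.Controller("file:///tmp/dupl-test/target", options, null))
            Assert.IsFalse(c.Backup(new[] { "/tmp/dupl-test/source" }).Errors.Any());

        // Simulate the "corrupted local DB" case: drop the DB and recreate it.
        File.Delete(dbpath);
        using (var c = new Duplicati.Library.Main.Controller("file:///tmp/dupl-test/target", options, null))
            Assert.IsFalse(c.Repair().Errors.Any());
    }
}
```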

p.s. The only two features I’m working on are subfolders and adding parity files. Subfolders are needed for some backends to be usable for large backups. Parity files add a good safety net against bit rot, etc. Neither is needed for the next release nor for coming out of beta, but I think both are important to have.

And as for getting a new canary or beta release out, @kenkendk says there are some blockers on getting it out, but if we’re fixing bugs, why not get these fixes out to users? Besides adding more unit tests as mentioned, how else will we know when it is ready for a release?

I’m not sure that’s what he was saying if you’re talking about

it seemed like agreement with your proposal, quoted above that. I’ll continue, based on that assumption.

This discussion is at various levels, but there are similarities. For example, if we say damage to the backup destination is a big issue, and damage to the database is a smaller issue (because the backup destination can usually recreate the DB), then we can apply the same idea to any release variety, e.g. regarding regressions or newfound fixes. Sometimes a big issue simply has no fix available yet, although seeing it on a list might make people focus more, and one doesn’t need to be a coder to help the coders find good test cases.

To decide when to release something, can we look at what’s fixed-but-not-shipped versus desired fixes? Big-issue fixes after v2.0.4.5-2.0.4.5_beta_2018-11-28 and v2.0.4.22-2.0.4.22_canary_2019-06-30 exist. We could discuss them (I can nominate some), or we could put them on a milestone as already closed, so the number of closed issues can be weighed against the not-yet-closed ones when weighing the decision to release.
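As an aside, that closed-versus-open weighing could even be pulled mechanically. A small sketch using only the public GitHub REST API (the open_issues/closed_issues fields are part of the documented milestone object; nothing here is Duplicati-specific):

```csharp
// Sketch: print each Duplicati milestone with its closed vs. open issue counts,
// via the public GitHub REST API (no auth needed for light use).
using System;
using System.Net.Http;
using System.Text.Json;
using System.Threading.Tasks;

class MilestoneWeigher
{
    static async Task Main()
    {
        using var http = new HttpClient();
        http.DefaultRequestHeaders.UserAgent.ParseAdd("milestone-weigher"); // GitHub requires a User-Agent

        var json = await http.GetStringAsync(
            "https://api.github.com/repos/duplicati/duplicati/milestones?state=all");

        foreach (var m in JsonDocument.Parse(json).RootElement.EnumerateArray())
            Console.WriteLine($"{m.GetProperty("title").GetString()}: " +
                              $"{m.GetProperty("closed_issues").GetInt32()} closed / " +
                              $"{m.GetProperty("open_issues").GetInt32()} open");
    }
}
```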

My nominations for things worth squeezing into the next canary, which is probably the path to the next beta:

CheckingErrorsForIssue1400 and FoundIssue1400Error test case, analysis, and proposal #3868 is the database-breaker I’d propose we ensure makes the next beta (which may mean catching the next canary).

The somewhat obscure backup-breaker is that FluentFTP needs upgrading. Problems after upgrade to 2.0.4.21 describes how the parallel-uploads code change tickled a bug (now maybe fixed) in the aftp library.
Update framework to 462 #3844 is needed to update FluentFTP, and then someone needs to actually test it.
EDIT: In the forum post, you can see how this regression was a big beta issue once, so let’s get it fixed.
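To make the line-endings failure mode concrete, here is a rough sketch of the pattern involved, as I understand the bug. FtpClient, EnableThreadSafeDataConnections, DownloadDataType, and OpenRead are real FluentFTP members as far as I know, but which versions mishandle the cloned connection is exactly the part under discussion, so treat the comments as my reading, not gospel:

```csharp
// Sketch of the failure mode as I understand it (not verified against each version).
using System.Net;
using FluentFTP;

var client = new FtpClient("ftp.example.com")       // hypothetical server
{
    Credentials = new NetworkCredential("user", "pass"),
    EnableThreadSafeDataConnections = true,          // clones the connection per transfer
    DownloadDataType = FtpDataType.Binary,           // we ask for binary...
};
client.Connect();

using (var stream = client.OpenRead("/backup/duplicati-b1.dblock.zip.aes"))
{
    // ...but affected versions reportedly let the cloned data connection fall
    // back to ASCII, so a server on a different OS could translate line endings
    // inside a binary volume and corrupt it. Reportedly fixed in FluentFTP 26.0.0.
}
```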

If the blockers only apply to getting out of beta, what is preventing a canary or beta from going out more often? For the canary, is it just automation?

I don’t agree. I think that was just his answer to “out of beta” (who knows). I think a beta can have blockers too, typically big regressions found in canary or experimental, or maybe a critical fix that’s worth waiting for.

The person who does releases should answer that, but one opinion (I hope it’s still valid) on that issue is:

Personally, I think it’s best for canary to have hand-curated documentation on the changes. That takes time.
There’s also some need to keep track of the state of things and not push a release at an unstable spot.
That’s where milestones could help the release-maker. Automated test success is not the total story…

Nightlies might have to make do with GitHub’s list of source changes, unless automation can do better.

That’s a what-if with blockers; I want to understand whether there are currently blockers to putting a new canary or beta out. Like you said, that’s a question for the person handling releases.

For canary, if release documentation is desired, that’s fine, but then there should be automated nightly builds. IMHO, early-and-often builds should be getting out to testers.

“Blockers” is ambiguous. I proposed two “blockers” as in “suggest-waiting-for-fix”. Any others? Where is:

Fix pausing and stopping after upload #3712

Stopping immediately also works, but due to issues of corrupting the database when aborting, the stop-now button has been removed until the corruption issues are fixed.

Fix ‘stop after current file’ #3836 is being actively worked on; I don’t know if its impact is as bad as the above one’s.

Anybody else have nominations for fixes that you really want to fit in the next release? If so, please describe them.

If you mean non-code “blockers” from a process viewpoint, we can use more thoughts. I gave one person’s, and his “anything meaningful” part is what I hope one can view by seeing closed items on some milestone. Such items would have been largely pre-chosen as worthy of planning, thus likely worthy of a release note.

In addition to Pectojin’s Discussion: release cycle, here’s my view from Release: 2.0.4.23 (beta) 2019-07-14:

Those are already broken in the current beta release. If some things are not fixed, why hold up a beta release when other items are ready to go? Not everything needs to be fixed to deliver a new release.

I’m not sure I understand the requirements that need to be met for a new release. If there are incremental fixes, get them out. For issues still not fixed, then, like other projects, put them in the release notes as “Known Issues”. We certainly would not want to knowingly push out a release that is worse.

I must be missing something. Thanks for your patience.

I likely confused things by referencing an issue’s post, then asking for talk on two others. Do you mean the Issue1400-and-friends one, which is an old bug but very impactful, as seen in the impact analysis of the issue?
That had so much advance work, including analysis and a proof-of-concept fix, that it seemed worth fixing.

The FTP (Alternative) bug does not exist in the current beta. If you refer to both stop issues, they exist; however, I wanted to see if anybody would speak to them as worthy of squeezing in, given, as you say:

This is exactly why I’m proposing we weigh the fixes against the not-yet-fixed. Some fixes just have to wait. Which of the issues (if any) do you think should get fixed before we PM kenkendk to release a canary?

This is exactly why I was lobbying for just an incremental fix for aftp instead of the TargetFramework move; however, the consensus of those with opinions (including on the current thread) seemed to favor the bigger step.

What I’d really like would be to fix the beta blockers (including the aftp backend fix I keep mentioning

Reasonable idea if we get better at release notes (volunteers?) and can filter the 800+ issues down to a limited list, possibly an “Errata” idea with known workarounds, with other issues at least having to be describable… Describable issues might also be candidates for a milestone. Random breakages are hard to deal with.

Anyone else have input? Do you want to PM for a canary right now, or if not, what should we wait for?

As long as the current master branch has no known additional bugs, then I would say do a canary release and wait for no other PRs; just get the canary out. There is no need to wait for the .net462 framework move; that can go in another canary release.

There doesn’t seem to be a reason to hold back a canary or beta release.

Regarding AFTP, according to that link, that bug was known about by at least July 3rd, and the last beta was July 14th. So the AFTP bug is not in the current beta?

The bug was introduced in v2.0.4.16-2.0.4.16_canary_2019-03-28, and 2.0.4.23 is basically 2.0.4.5. Release note:

v2.0.4.23-2.0.4.23_beta_2019-07-14

Changes in this version:
This update only contains warnings to inform users on Amazon Cloud Drive that the service is discontinued.

To confirm that, click to see what the latest beta brought compared to November 2018:

https://github.com/duplicati/duplicati/compare/v2.0.4.5-2.0.4.5_beta_2018-11-28...v2.0.4.23-2.0.4.23_beta_2019-07-14

I’d like to know how others feel. I don’t know of other issues holding back a canary, but I’d want to get another canary out soon to pick up the aftp regression fix and the huge corruption fix, which is the Issue1400 one.

The question is: what’s the Canary plan before Experimental and Beta? More Canary == a slower Beta; however, the post that Pectojin commented on pointed out that in times past Canary came very often:

before 2.0.4.15_canary new versions were released every week

although I don’t think kenkendk would like the rapid pace because he’s incredibly busy at the moment.

What does everyone think about the path to Beta? There’s also the forced Google Drive issue (still TBD).

EDIT:

The release has a link you can click to see what’s changed since then, but go back too far and it overloads.

v2.0.4.22-2.0.4.22_canary_2019-06-30


I’m not seeing anything too earth-shattering after that, though we’d lose some good fixes if this were called Experimental. 2.0.4.22 canary actually seems canary-adjacent to 2.0.4.21 experimental, which I think was 2.0.4.20 canary. An alternative beta push would be to say testing is enough, and do the beta at the 2.0.4.22 code. Code prior to 2.0.4.22 would lose fixes for the backup-breaker Upload throttle corrupts backup, especially OneDrive. Analyzed, with code proposed. #3787 and the database-breaker “Unexpected difference in fileset” test case and code clue #380, and those look pretty critical, if we agree we want to focus on those breakage types.

Pushing past 2.0.4.22 onto current master fixes an occasional restore-breaker, Hash mismatch download corruption, likely from throttle buffer offset being reused #3782, and a limited backend-breaker, Backends are not disposed after upload operations #3808, plus the 2.0.4.13 DB bug-report issue, code keeps original FileLookup Path intact (privacy regression) #3790. Shall we pick those up too, then try Canary - Experimental - Beta, and perhaps release-note aftp not working when the server OS has different line endings than Duplicati has? I’m not sure I want to release-note Issue1400 and say we decided to Beta without it, so those users get to wait…

I don’t know of any issues currently that should block a canary release. I think we should get a canary out soon.

The pull request from @BlueBlock fixing the very painful Issue1400 went into master an hour ago via @warwickmm, and I’m not going to push hard for a standalone FluentFTP update, but it’d still be nice if it can squeak in if anyone will do it. FTP (Alternative) users can (to some extent) move to regular FTP instead, and I’m not certain what fraction are affected anyway, extrapolating from a single canary report.

The most common client OS is Windows, so the main server mismatch may be when a Linux FTP server (e.g. a NAS) is used.

@warwickmm would you like to ask kenkendk for a canary? I’m not sure how fast he can do that anyway.

Doesn’t this PR fix it, or is there another issue? #3866

AFAIK it requires FluentFTP 26.0.0 to pick up this fix, which isn’t exactly what I described, but might work:

OpenRead with EnableThreadSafeDataConnections = true does ASCII when it shouldn’t #428

The #3866 PR does not look like it does anything to fix FTP’s mistaken conversion between OS line endings. That is entirely as expected, because this was a bug in the .dll that was exposed by Duplicati canary code.

PR Update framework to 462 #3844 updates FluentFTP from 21.0.0 to 27.0.1 but is being proposed not to hit canary. The latest nuget.org FluentFTP version is 27.0.3, and I haven’t looked to see what has changed since 26.0.0.

I see… thanks. That would seem to be an important fix to get out for those impacted.

Yes, although the impact is unknown. Avoiding breaking such users is one reason why 2.0.4.23 was just 2.0.4.5 plus a warning. Of course, the other big reason for not going Beta was that Experimental and Canary had only about two weeks of testing, whereas now they’ve had about two months and seem to work OK.

For past FluentFTP background, please see the current post here or the framework update discussion here.