Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

(Enhance) dependency update process #27442

Open
MichaIng opened this issue Jun 9, 2021 · 10 comments
Open

(Enhance) dependency update process #27442

MichaIng opened this issue Jun 9, 2021 · 10 comments
Labels
0. Needs triage Pending check for reproducibility or if it fits our roadmap automated pr developer experience enhancement overview

Comments

@MichaIng
Copy link
Member

MichaIng commented Jun 9, 2021

Hey guys, not sure if a meta discussion is right here, but I couldn't find a better matching sub forum either. Sorry for the long text, I hope to understand things a little better and probably I'm then able to effectively help bots finishing their job during calm night hours or so 😄.

To fix CI/CD tests for a PR of mine, I reviewed them a bit, compared results with other PRs to verify whether a failure is caused by my changes or not, and indeed some checks were solved by a rebase onto current master.

I then noticed that a lot of dependabot PRs hang as well on such "formal" check failures, while the bot reports 100% compatibility, nextcloud-bot requested a merge (what is actually the trigger for that?) and approved it together with github-actions bot, so that it could have been merged automatically without human interaction, if the CI/CD tests were gone through. I played a bit with the bots PRs and indeed in some cases a simply rebase request solved it and hence triggered an automated merge, so far so good.

Now there are some other possible issues which may delay or break those automated dependency bumps, also for security-related ones, being open for months. Aside of the fact that the related security patch is then missing, it also accumulated a long list of open PRs, making it difficult to keep human-made PRs in view, which are moved back page by page.

One commonly failing check is the one for JavaScript build changes. From what I understand, this is as the repository does not only contain source code, but compiled JavaScript as well. So when a Node module is bumped, which has an effect on any of these scripts, a re-build within the test leads to changed files and then we get something like this:

Run bash -c "[[ ! \"`git status --porcelain `\" ]] || ( echo 'Uncommited changes in webpack build' && git status && exit 1 )"
Uncommited changes in webpack build
HEAD detached at pull/26730/merge
Changes not staged for commit:
  (use "git add <file>..." to update what will be committed)
  (use "git restore <file>..." to discard changes in working directory)
	modified:   apps/files/js/dist/personal-settings.js
	modified:   apps/files/js/dist/personal-settings.js.map
	modified:   apps/user_status/js/user-status-menu.js
	modified:   apps/user_status/js/user-status-menu.js.map
	modified:   apps/weather_status/js/weather-status.js
	modified:   apps/weather_status/js/weather-status.js.map
	modified:   apps/workflowengine/js/workflowengine.js
	modified:   apps/workflowengine/js/workflowengine.js.map
	modified:   core/js/dist/login.js
	modified:   core/js/dist/login.js.map
	modified:   core/js/dist/main.js
	modified:   core/js/dist/main.js.map
	modified:   core/js/dist/unified-search.js
	modified:   core/js/dist/unified-search.js.map

To solve this, as far as I understand, /compile amend / can be done to trigger a rebuild and amend by npmbuildbot-nextcloud. The issue then is that dependabot denies to further handle the PR, as it has been modified by someone/something else. So a manual merge is required then.

  • I'm not sure how these bots can be configured, but wouldn't it be possible to have a JavaScript module bump directly combined with a rebuild, and/or allowing dependabot to auto-merge requests which were modified by npmbuildbot-nextcloud, as long as all tests went through, of course?
  • Else what is the best approach to handle such cases?
    EDIT: In the meantime, dependabot is able to auto-merge PRs after the compile bot did its job, given all checks passed and two approvals.
  • EDIT: Another suggestion is to have the compile bot remove obsolete files as well: Remove renamed compiled files skjnldsv/npmbuildbot#174

And then there is dependabot-preview, which is still opening PRs (at least it did 1.5 days ago), but is now not able to further deal with them anymore, so all it's PRs need to be in case rebased or merged manually. Open PRs by it block dependabot (non-preview) from opening own ones, as the branch it would create does already exist. As the PR is only handled by the bot (version) that opened it, I guess dependabot-preview should be disabled, and all it's PRs either closed (to have them re-opened by dependabot) or manually reviewed and merged. This is where most security dependency bumps are hold a long time, like #26730. EDIT: Disabled in the meantime.

Then another minor issue with the drone Selenium acceptance tests:

Starting ChromeDriver 90.0.4430.24 (4c6d850f087da467d926e8eddb76550aed655991-refs/branch-heads/4430@{#429}) on port 21084
Only local connections are allowed.
Please see https://chromedriver.chromium.org/security-considerations for suggestions on keeping ChromeDriver safe.
ChromeDriver was started successfully.
[1623151991.858][SEVERE]: bind() failed: Cannot assign requested address (99)

All logs are spammed with these errors, while it seems to not affect the test results. I just recognised as I had a look into the long taking ones, and thought that this might be the reason.
EDIT: In the meantime I think I understand the tests better and Selenium is just running as a provider for the actual tests and prints the above warning on every actual test scenario, right?

And generally: While the GitHub tests are for free (?), the drone has some limit, doesn't it? I mean not only for concurrent tests but also overall tests in a certain time range, so that commits/pushes shouldn't be spammed (anyway)? I see the drone sometimes Waiting for status to be reported, but sometimes it starts regardless 🤔.

@MichaIng MichaIng added enhancement 0. Needs triage Pending check for reproducibility or if it fits our roadmap automated pr labels Jun 9, 2021
@szaimen
Copy link
Contributor

szaimen commented Jul 27, 2021

cc @nextcloud/server for some more feedback/discussion on this

@MichaIng
Copy link
Member Author

Some things have been sorted in the meantime, either bot-wise or because I have a better understanding how things work. It's now basically two things that may be great to automate dependency bumps as much as possible:

  • Have the compile bot doing its job automatically, either when dependabot opens a PR or even before that, invoked by dependabot to amend on the branch but before the PR is opened, to prevent doubled checks. Or as a trigger when that git status --porcelain test fails, to avoid compilation when not required? Not sure what weights heavier.
  • Have the compile bot automatically remove obsolete files instead of only adding new/modified ones: Remove renamed compiled files skjnldsv/npmbuildbot#174

@skjnldsv skjnldsv changed the title META | (Enhance) dependency update process (Enhance) dependency update process Jul 30, 2021
@skjnldsv
Copy link
Member

Hi @MichaIng !
Thanks for diving into this and taking the time to expand your thoughts 🙂
There is ongoing internal discussions about removing those shipped bundles. There are two reasoning behind this:

  1. The repo is working as-is just by cloning (easy for newcomers)
  2. The release script does not have a building step. It would need to be added and can have various potential issues (different configs, one app failing, etc etc)

We will discuss this more in-dept in the upcoming weeks 😉

@MichaIng
Copy link
Member Author

The repo is working as-is just by cloning (easy for newcomers)

Indeed, I do also benefit from this, when there is no need to install a development stack for testing a branch 🙂. Another argument is that many CI/CD tests need the building step as well when the bundles are gone. Most drone tests currently do the clone only, so a drone run would be significantly heavier, I guess. On the other hand, without bundles, no additional compile bot call is needed and hence 1-2 drone calls less on every PR opened/rebased where re-compiling is need.

In this case it makes sense to wait for the outcome of your internal discussion before putting any efforts into the bots, which (the efforts, not the bots) may become obsolete.

@MichaIng
Copy link
Member Author

MichaIng commented Jul 31, 2021

Currently, the compile actions take > 10 minutes. I think this would lengthen the drone checks by far too much to be applicable. If there was a single clone + compile step per drone run only, then probably, but currently this would mean ~70 times 10 minutes additional processing time, assuming that the drone takes similarly long then the GitHub action runners.

@skjnldsv
Copy link
Member

We also need to improve the compile build.
Stop having multiple webpack configs, but one big config file at the root, so we can also split vendors and merge dependencies.
That will also reduce the size and build time

@max-nextcloud
Copy link
Contributor

Some ideas about how to make live easier with compiled js are also discussed here: nextcloud/.github#28

@skjnldsv
Copy link
Member

Stop having multiple webpack configs, but one big config file at the root, so we can also split vendors and merge dependencies.
That will also reduce the size and build time

This is done btw, building server is muuuch faster!

@MichaIng
Copy link
Member Author

Indeed. The only left thing I have in mind, is when dependabot PRs/commits would trigger the compile command automatically, so that there is a chance to have it merge automatically even when the dependency chance causes actual JavaScript changes, given no CI errors, of course. But it implies some issues:

  • Often unnecessary additional actions runners load, as most dependabot PRs do not cause JavaScript changes but compile command still takes quite some time before realising that.
  • Quickly repeating CI triggers, first when the PR is opened, then with some delay due to compile command commit. Skipping the CI on dependabot's commit and only triggering it on compile command commit does not work as of above point: In most cases there is no compile command commit.
  • dependabot cannot automatically rebase its PR when the compile command added/amended its changes. When on Saturdays a bunch of dependency bumps are created, it currently is a likely scenario that the first may get merged, after following ones are created, so that in case of JavaScript changes the compile command creates conflicts, requiring a manual recreate+compile loop, hence no time saved but just CI queue/load unnecessarily raised. Individual dependabot PRs would need to be opened with sufficient delay so that one has a chance to be merged before the next is even created, to avoid conflicts.

@max-nextcloud
Copy link
Contributor

Would it be feasible to use app build artifacts as the source for the server build?
We do have some automation in place for releasing apps but we don't use it for apps shipped with the server. Instead of git cloning shipped apps - could they be retrieved from a package and then unzipped?

That way compiling the assets could still be the responsibility of the app itself - but it would not have to keep them in the git repo at all times. In addition further steps such as removing unnecessary files or signing could be part of the packaging of the app before it's included in the server.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
0. Needs triage Pending check for reproducibility or if it fits our roadmap automated pr developer experience enhancement overview
Projects
None yet
Development

No branches or pull requests

5 participants