
prov/opx: Fix Coverity scan waiting while holding lock errors. #9959

Merged
j-xiong merged 1 commit into ofiwg:main from ben.lynam/coverity_lock on Apr 1, 2024

Conversation

belynam
Contributor

belynam commented Mar 31, 2024

Remove busy-waiting/sleeping while the SHM FIFO has not been created/initialized. Cast the result of CPU_ISSET() to a uint64_t to fix a Coverity warning about potentially left-shifting by more than 31 bits. Whitespace/formatting fixups.

PR Details

This should fix most, if not all, of the 37 "Waiting while holding a lock" Coverity scan defects (numerous CIDs).
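
For illustration, here is a minimal sketch of what a "waiting while holding a lock" defect looks like and the general shape of the fix; the names (ep_lock, shm_fifo_ready, try_send_*) are invented for this sketch and are not the actual prov/opx code.

```c
/* Hypothetical sketch only: these names are made up for illustration
 * and do not come from the opx provider. */
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>
#include <unistd.h>

static pthread_mutex_t ep_lock = PTHREAD_MUTEX_INITIALIZER;
static volatile bool shm_fifo_ready = false;

/* Before: Coverity flags this because the thread sleeps/spins while the
 * mutex is held, blocking every other thread that needs the lock. */
static void try_send_old(void)
{
	pthread_mutex_lock(&ep_lock);
	while (!shm_fifo_ready)
		usleep(100);		/* waiting while holding a lock */
	/* ... enqueue onto the SHM FIFO ... */
	pthread_mutex_unlock(&ep_lock);
}

/* After: if the FIFO is not yet initialized, fail fast with an
 * EAGAIN-style return and let the caller retry on its next progress
 * pass, so nothing ever waits inside the critical section. */
static int try_send_new(void)
{
	pthread_mutex_lock(&ep_lock);
	if (!shm_fifo_ready) {
		pthread_mutex_unlock(&ep_lock);
		return -EAGAIN;
	}
	/* ... enqueue onto the SHM FIFO ... */
	pthread_mutex_unlock(&ep_lock);
	return 0;
}
```

Pushing the retry out to the caller removes the sleep from the critical section, which is what clears this defect class.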

It should also address the bad bit-shift operation (CID 376792) defect.
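
Again purely as a hedged illustration of the bad bit-shift defect (CID 376792), not the opx source: CPU_ISSET() yields an int, so shifting that result left by a CPU index above 31 is a 32-bit shift overflow; casting to uint64_t first makes the shift 64-bit. The helper below is invented for the sketch.

```c
/* Hypothetical sketch: cpu_mask_from_set() is an invented helper used only
 * to show the cast; it is not the function changed in this PR. */
#define _GNU_SOURCE
#include <sched.h>
#include <stdint.h>

static uint64_t cpu_mask_from_set(cpu_set_t *set)
{
	uint64_t mask = 0;

	for (int cpu = 0; cpu < 64; ++cpu) {
		/* Before: mask |= CPU_ISSET(cpu, set) << cpu;
		 * CPU_ISSET() returns an int, so the shift is done in 32 bits
		 * and is flagged by Coverity for cpu > 31. */
		mask |= ((uint64_t) (CPU_ISSET(cpu, set) != 0)) << cpu;
	}
	return mask;
}
```

Performing the shift on a 64-bit operand (the `!= 0` just normalizes the macro's result to 0 or 1) is what resolves the scan warning.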

Lots of whitespace/formatting changes (removing trailing whitespace, proper tab use).

Remove busy-waiting/sleeping while SHM FIFO has not been created/initialized.
Cast result of CPU_ISSET() to a uint64_t to fix a coverity warning about
potentially left-shifting by more than 31 bits. Whitespace/formatting fixups.

Signed-off-by: Ben Lynam <[email protected]>
@j-xiong
Contributor

j-xiong commented Apr 1, 2024

@zachdworkin I can no longer access Intel CI status pages by clicking the "details" link of the check results. I got a page referring to an AWS site.

This is likely related to #9951 and the installation of the AWS CI app: https://github.com/ofiwg/libfabric/settings/installations. Could you take a look?

@a-szegel Could you check from the AWS CI app side?

@wenduwan
Contributor

wenduwan commented Apr 1, 2024

AWS CI failure is unrelated.

@wenduwan
Contributor

wenduwan commented Apr 1, 2024

bot:aws:retest

@shijin-aws
Contributor

shijin-aws commented Apr 1, 2024

I can no longer access Intel CI status pages by clicking the "details" link of the check results. I got a page referring to an AWS site.

@j-xiong Is it still happening now? If I click the Intel CI details link, it directs me to https://cje-fm-owrp-prod03.devtools.intel.com/satg-dse-ofilibfabric/job/OFIWG_Libfabric/job/ofi_libfabric/job/PR-9959/2/display/redirect which seems to be an Intel page.

@zachdworkin
Contributor

@shijin-aws I just manually triggered Intel CI. It looks like both Jenkins CIs get triggered on the PR, but whichever one reports last takes over the GitHub PR check link. I am still investigating how we can fix this while keeping both Jenkinsfiles.

@shijin-aws
Contributor

@zachdworkin Thanks, @a-szegel is OOO today. Let us know if you need help from our side.

@zachdworkin
Contributor

@shijin-aws I can't find examples on Google of users sharing a Git repo, connecting multiple controllers, and hitting this problem. The original AWS hook (which is a different controller) still works, so I'm curious why the new test hook is overwriting the Intel one. We may need to add a new Git hook for the new Jenkinsfile, or we could try what the only article I could find suggests: naming the Jenkinsfiles differently, e.g. "Jenkinsfile.intel" and "Jenkinsfile.aws". I also have questions about the configuration of your controller for this job: What is the quiet period? Where does it look for the Jenkinsfile? Which Git plugins are you using?

Our configuration is:
Jenkinsfile is in contrib/intel/jenkins
Quiet period is 5 min
Plugins: Git Plugin, GitHub Branch Source Plugin, GitHub Plugin, GitHub PR Comment Build Plugin, GitHub Notify Step Plugin

@j-xiong do we know how the original Jenkins hooks got created (last week, when we only had one job for the AWS controller and one for the Intel controller)? Can we add a new one from the Git side, or are they only creatable from the Jenkins side?

@shijin-aws
Contributor

shijin-aws commented Apr 1, 2024

@zachdworkin I am not quite familiar with the questions you are asking. I think the best we can do right now is to temporarily uninstall this GitHub app and re-install it when @a-szegel is back.

@zachdworkin
Contributor

@shijin-aws Agreed. I can work this out with @a-szegel when they get back. In the meantime, do you think you can disable the new multibranch pipeline job from #9951 that is causing the issues? There should be a button in the top right of the job that says "Disable Multibranch Pipeline"; clicking it will prevent the job from running/reporting.

@shijin-aws
Contributor

@zachdworkin I disabled this multibranch pipeline project. Let's see whether that fixes it.

@j-xiong
Contributor

j-xiong commented Apr 1, 2024

I uninstalled the AWS CI GitHub app.

@shijin-aws
Contributor

@j-xiong @zachdworkin Is everything all good this time?

@j-xiong
Contributor

j-xiong commented Apr 1, 2024

@shijin-aws It appears to work properly now.

j-xiong merged commit b60d462 into ofiwg:main on Apr 1, 2024
13 checks passed
@a-szegel
Contributor

a-szegel commented Apr 2, 2024

I think I found the solution. From this issue, it seems we need to install this plugin and name the context so that the results do not get overridden. This should be done by both AWS and Intel to prevent this from happening again (in case another provider wants to add a Jenkins multibranch pipeline to libfabric for CI).

belynam deleted the ben.lynam/coverity_lock branch on April 2, 2024 at 17:38
@j-xiong
Contributor

j-xiong commented Apr 4, 2024

@zachdworkin Could you check the above comment from @a-szegel and take the necessary actions for Intel CI?

@zachdworkin
Contributor

@j-xiong Yes, it's been applied. The new tag for Intel CI is continuous-integration/intel/jenkins.

@j-xiong
Contributor

j-xiong commented Apr 4, 2024

@zachdworkin Great!
