Become root #50

Merged
merged 1 commit into from
Apr 3, 2024

Member

@dominikl dominikl commented Mar 7, 2024

Fixes the issue on rocky9:

TASK [ome.omero_web : omero web | delete mod file] *********************************************************************************************************************************
fatal: [2f5d1f82-2aad-433b-ab9f-98c5fe25ffa5]: FAILED! => {"changed": false, "gid": 0, "group": "root", "mode": "0644", "msg": "unlinking failed: [Errno 1] Operation not permitted: b'/tmp/django.mod' ", "owner": "root", "path": "/tmp/django.mod", "secontext": "unconfined_u:object_r:user_tmp_t:s0", "size": 1104, "state": "file", "uid": 0}

Also added an SELinux command to allow Nginx to serve OMERO.web, based on https://github.com/openmicroscopy/management_tools/pull/1710/files#r1516332291 (thanks @pwalczysko!)
There was a comment saying "SELinux should be handled by openmicroscopy.omero-web-runtime", but omero-web-runtime is archived ("This repository has been archived by the owner on Jan 8, 2021. It is now read-only."), so maybe we shouldn't rely on that.

@@ -42,4 +42,7 @@
name: nginx
state: started

# SELinux should be handled by openmicroscopy.omero-web-runtime

Contributor

@khaledk2 khaledk2 Mar 10, 2024

I think we may not need this change, as a task for it is already there:
name: omero web | selinux ports
But I think the issue comes from the selinux_utils role: the selinux_enabled variable does not seem to reflect the actual SELinux status, so this task has never run.

I think it is better to check the SELinux status in this task using the Ansible facts, i.e.
when: ansible_facts.selinux.status == 'enabled'
instead of
when: selinux_enabled
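
As a minimal sketch of the facts-based condition suggested above, a guarded task might look like the following. The task name and the boolean come from elsewhere in this conversation; the fully qualified module name `ansible.posix.seboolean` is an assumption about how the module is installed, and fact gathering has to be enabled for `ansible_facts.selinux` to be populated.

```yaml
# Sketch only: guard an SELinux task on gathered facts instead of a role variable.
# Needs fact gathering (gather_facts: true) so that ansible_facts.selinux exists.
- name: omero web | selinux booleans
  become: true
  ansible.posix.seboolean:   # assumed collection name
    name: httpd_read_user_content
    state: true
    persistent: true
  # status is 'enabled' in both enforcing and permissive mode
  when: ansible_facts.selinux.status == 'enabled'
```

This keeps the decision local to the task, at the cost of no longer sharing whatever logic the selinux_utils role uses to compute selinux_enabled.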

Member

So should httpd_can_network_connect be added to the booleans set in

- name: omero web | selinux booleans
  become: true
  seboolean:
    name: "{{ item }}"
    state: true
    persistent: true
  with_items:
    - httpd_read_user_content
    - httpd_enable_homedirs
  when: selinux_enabled
instead?

This might also fix the idempotence failure

Contributor

I think what is needed is to delete the added task and check the SELinux status using Ansible facts. I have made these changes in this local branch. It has passed the molecule tests. I think Dom should apply these changes to his branch and test on a Rocky Linux deployment.

Member Author

👍 Sounds good, thanks @khaledk2!

Member

Happy to test the latest commit with the upcoming prod121 deployment, but as a data point, the logs show that some of the SELinux booleans have been set:

[root@pilot-rocky9-omeroreadwrite audit]# grep httpd audit.log.2 
type=MAC_CONFIG_CHANGE msg=audit(1709807975.836:7503): bool=httpd_read_user_content val=1 old_val=0 auid=1000 ses=1AUID="rocky"
type=MAC_CONFIG_CHANGE msg=audit(1709807981.584:7540): bool=httpd_enable_homedirs val=1 old_val=0 auid=1000 ses=1AUID="rocky"

so I am not convinced that the selinux_enabled conditional logic was really skipped, and we might still be missing an SELinux boolean on Rocky Linux 9. This would also match the findings of @pwalczysko while working on the UoD deployments.

On the usage of ansible_facts, I have no objection to replacing selinux_enabled with ansible_facts.selinux.status == 'enabled'. Note that several roles and playbooks should be reviewed (see https://github.com/search?q=org%3Aome%20selinux_enabled&type=code), so we might need to decide how to roll out this change across the board.

Let's discuss at the weekly infrastructure call.

Member

sbesson commented Mar 13, 2024

As an update on this front: using the latest commit on this role for the deployment of prod121, Nginx was returning 502 Bad Gateway on all OMERO VMs. Setting httpd_can_network_relay using

 for i in omeroreadwrite omeroreadonly-1 omeroreadonly-2 omeroreadonly-3 omeroreadonly-4; do ssh "$i" sudo setsebool -P httpd_can_network_relay 1; done

was sufficient to fix the proxying of the OMERO.web application through Nginx.

Contributor

khaledk2 commented Mar 13, 2024

I have looked into the 502 error seen after the deployment. Port 4080 is already open, so the selinux ports task has run.

Turning on httpd_can_network_connect works and fixes the 502 error. But it keeps working even after deleting the rule that opens port 4080, so I think it allows connections to any port, even one without a rule.

Turning on httpd_can_network_relay works and fixes the issue only if port 4080 is open. I think it only works with opened ports, so I think we should use httpd_can_network_relay.
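
For context, a port rule of the kind mentioned above could be expressed roughly as follows. This is a sketch under assumptions, not the role's actual `omero web | selinux ports` task, which is not shown in this conversation: the module name `community.general.seport` and the `http_port_t` type are assumptions, while port 4080 comes from the comment above.

```yaml
# Hypothetical port-labelling task; the role's real task may differ.
- name: omero web | selinux ports (illustrative)
  become: true
  community.general.seport:   # assumed collection name
    ports: 4080               # port mentioned in this conversation
    proto: tcp
    setype: http_port_t       # assumed type, not confirmed here
    state: present
  when: selinux_enabled
```

If the observation above holds, httpd_can_network_relay would depend on a label like this being present, whereas httpd_can_network_connect would not.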

I think the only change needed is adding a new item, httpd_can_network_relay, to the omero web | selinux booleans task in the web-dependencies.yml file.
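
A sketch of the proposed change, based on the `omero web | selinux booleans` task quoted earlier in this conversation with `httpd_can_network_relay` appended. The indentation is reconstructed, and the change that finally lands in the role may differ.

```yaml
- name: omero web | selinux booleans
  become: true
  seboolean:
    name: "{{ item }}"
    state: true
    persistent: true
  with_items:
    - httpd_read_user_content
    - httpd_enable_homedirs
    # proposed addition, so that Nginx can proxy requests to OMERO.web
    - httpd_can_network_relay
  when: selinux_enabled
```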

I think we should keep using the selinux_enabled variable until I have checked the ome.selinux_utils role: I found comments inside the role about a bug that led to the current logic for determining the value of selinux_enabled.

@khaledk2
Contributor

I added httpd_can_network_relay to the items of the omero web | selinux booleans task and copied the task into an Ansible playbook. I tested it on pilot-rocky9-omeroreadwrite; it seems to work fine and fixes the issue.

Member

sbesson commented Mar 14, 2024

I agree with the discussion and the proposal above. In the typical deployment scenario where OMERO.web is served via Nginx, my understanding is that httpd_can_network_relay should be sufficient.
For more advanced scenarios where OMERO.web is under multiple proxy layers like IDR, the other proxies should be responsible for setting httpd_can_network_connect accordingly as in https://github.com/ome/ansible-role-nginx-proxy/blob/58be997c4a68674187e91a6359d42bbc75b53888/tasks/nginx-selinux.yml#L4-L9

@pwalczysko it is probably worth testing that setting this SELinux boolean is sufficient to fix the OMERO.web issues in the context of the UoD RHEL9 OMERO systems by running

sudo setsebool -P httpd_can_network_connect 0
sudo setsebool -P httpd_can_network_relay 1

@pwalczysko
Member

sudo setsebool -P httpd_can_network_network 0

@sbesson - is it a typo? Please see below:

[pwalczysko@ome-ci-upgrade ~]$ sudo setsebool -P httpd_can_network_network 0
[sudo] password for pwalczysko: 
Boolean httpd_can_network_network is not defined

Member

sbesson commented Mar 14, 2024

Yes, it was a typo. Updated my comment

@pwalczysko
Member

I agree with the discussion and the proposal above. In the typical deployment scenario where OMERO.web is served via Nginx, my understanding is that httpd_can_network_relay should be sufficient. For more advanced scenarios where OMERO.web is under multiple proxy layers like IDR, the other proxies should be responsible for setting httpd_can_network_connect accordingly as in https://github.com/ome/ansible-role-nginx-proxy/blob/58be997c4a68674187e91a6359d42bbc75b53888/tasks/nginx-selinux.yml#L4-L9

@pwalczysko it is probably worth testing that setting this SELinux boolean is sufficient to fix the OMERO.web issues in the context of the UoD RHEL9 OMERO systems by running

sudo setsebool -P httpd_can_network_network 0
sudo setsebool -P httpd_can_network_relay 1

@sbesson Yes, after adjusting your two suggested commands to

sudo setsebool -P httpd_can_network_connect 0
sudo setsebool -P httpd_can_network_relay 1

I get a Bad Gateway after running the first command (....can_network_connect 0), where the webclient was previously functional.
This is fixed by the second command, as (I hope) the test was expecting.

Member

@sbesson sbesson left a comment

Coming back to this and checking the status.

  • should we merge this with only the become: true fix and open a follow-up PR fixing the SELinux boolean?
  • are we happy about using when: ansible_facts.selinux.status == 'enabled' or should that be reverted and/or handled separately?

Contributor

khaledk2 commented Mar 25, 2024

We should revert the change and keep using the selinux_enabled variable from the ome.selinux_utils role, as PR 15 uses Ansible facts to check the SELinux status.

Member

@sbesson sbesson left a comment

We should revert the change and keep using the selinux_enabled variable from the ome.selinux_utils role, as ome/ansible-role-selinux-utils#15 uses Ansible facts to check the SELinux status.

I extracted all the relevant SELinux changes to #51. @dominikl could you force-push to drop the latest commits from this PR and restore it to the state of 8f50390?

@jburel jburel merged commit c493ecb into ome:master Apr 3, 2024
10 checks passed