Issues installing a fully working openminted platform #6

Open
2 tasks
bscopenminted opened this issue Jul 9, 2019 · 45 comments

@bscopenminted

bscopenminted commented Jul 9, 2019

We are having trouble successfully deploying a working setup of the OpenMinTeD platform following the current installation tutorial.

I'll briefly describe our setup here. I have forked the two repos and pushed our changes to the bsc_changes branches.

https://github.com/bscopenminted/install-tutorial
https://github.com/bscopenminted/omtd-standalone-setup

The following is a brief description of how we set up the test nodes:

Current setup

General components

  • Web Frontend

    • Services:
      • Nginx server (proxy to different components)
    • Interfaces
      • public interface
      • private interface (10.43.0.3)
  • Manager node:

    • Services:
      • rmohr/activemq:5.14.0-alpine
      • docker.openminted.eu/omtd-registry:latest
      • bitnami/redis:latest
      • docker.openminted.eu/omtd-workflow-server:latest
      • docker.openminted.eu/omtd-content-service:latest
      • docker.openminted.eu/omtd-postgres:latest
      • docker.elastic.co/elasticsearch/elasticsearch:5.2.2
      • docker.openminted.eu/omtd-platform:latest
      • docker.openminted.eu/omtd-store-service:latest
    • Interfaces
      • Private interface (10.43.0.11)
  • OMTD worker node:

    • Services:
      • Apache reverse proxy to galaxy instances
      • Galaxy editor
      • Galaxy executor
    • Interfaces:
      • private interface (10.43.0.12)

Current status

  • We can log in using Google, search, and view the applications, corpora, etc. that were uploaded during the installation.

We can't:

  • Run applications: this fails with "Unable to locate named workflow" (galaxy editor and worker seem empty)
  • Create a new application: using Add->Application->Build an application with existing components

When creating an application, I see connections to the Apache proxy in the logs of the OMTD worker node:

omtd-worker:80 10.43.0.11 - - [09/Jul/2019:13:35:20 +0200] "GET /galaxy/api/workflows?key=5e2b68d881835eb265aca70fa800e2da HTTP/1.1" 200 300 "-" "Java/1.8.0_212"
omtd-worker:80 10.43.0.11 - - [09/Jul/2019:13:39:44 +0200] "POST /editor/api/workflows?key=4ffe33f541cb31be5eb7dbee2ad9bda1 HTTP/1.1" 200 632 "-" "Java/1.8.0_201"
omtd-worker:80 10.43.0.3 - - [09/Jul/2019:13:39:44 +0200] "GET //workflow/editor?id=8237ee2988567c1c HTTP/1.0" 200 3991 "http://bscopenmint01.bsc.es/buildWorkflow/09802384-9a12-4a74-be05-64a42b99c9a4" "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/75.0.3770.100 Safari/537.36"

However, we can't see anything in the browser. The Galaxy instances do seem to be empty, but something else might still be missing, as nothing shows up in the browser except the 'powered by Galaxy' banner.

Is there something that we might have missed during the setup, or some other tests/checks we can make?

Let me know if you need me to clarify or provide configs or other data.

Regards,

@galanisd
Member

galanisd commented Jul 9, 2019

> However, we can't see anything in the browser. The Galaxy instances do seem to be empty, but something else might still be missing, as nothing shows up in the browser except the 'powered by Galaxy' banner.

Do the Galaxy instances have the omtdImporter tool installed? Please look on the left ... under "Tools".

I do not know which applications you are trying to run, but I think that the respective workflows
do not exist in the Galaxies. Where did you find these applications and how did you load them into the Registry?

@bscopenminted
Author

> > However, we can't see anything in the browser. The Galaxy instances do seem to be empty, but something else might still be missing, as nothing shows up in the browser except the 'powered by Galaxy' banner.
>
> Do the Galaxy instances have the omtdImporter tool installed? Please look on the left ... under "Tools".

Both Galaxy instances have the omtdImporter installed. Should this enable application creation from within Galaxy by itself, or do I need to load components into Galaxy by hand first, so that they are available in the portal when we try to create an application?

> I do not know which applications you are trying to run, but I think that the respective workflows
> do not exist in the Galaxies. Where did you find these applications and how did you load them into the Registry?

The data available in the registry is what the upload.yml Ansible script uploads; it curls dump.zip into the registry:

   - name: Upload dump.zip to {{ groups['manager'][0] }}
     shell: curl -F 'datafile=@../properties/dump.zip' http://{{ groups['manager'][0] }}:8080/omtd-registry/restore/

I assumed the data is valid and intended for testing purposes.

@galanisd
Member

> Both Galaxy instances have the omtdImporter installed.

OK, great. This means that at least part of the whole thing has been installed correctly.

> Should this enable application creation from within Galaxy by itself, or do I need to load components into Galaxy by hand first, so that they are available in the portal when we try to create an application?

If the Registry is correctly installed and integrated with the Galaxies, you can register TDM components
using the respective forms (see https://services.openminted.eu/resourceRegistration/component).
The registered components will automatically appear in the Galaxy editor and you will be able to create
workflows/applications.

> The data available in the registry is what the upload.yml Ansible script uploads; it curls dump.zip into the registry:
> ....
> I assumed the data is valid and intended for testing purposes.

Yes, I think that the data are valid.
I believe that we have to check 2 things:

  1. Does component and application importing work as it should? OK, the components and applications are imported into the Registry,
    but does this trigger the creation of
    a. the respective Galaxy XML tool files?
    b. the respective Galaxy workflows?
    @antleb @Jodee90 ????
    If not, then this is why the Galaxies are empty.

  2. Are the Galaxies and Registry integrated correctly?
    E.g. for the Galaxy executor, the correct IP/URL and API key must be configured in the Registry config files. Can you please send me the properties and values that you have configured?

I assume that if these two things are working as they should, then it is pretty easy to have a replica of
OpenMinTeD in your lab, given that everything else is already installed.

@bscopenminted
Author

> > Both Galaxy instances have the omtdImporter installed.
>
> OK, great. This means that at least part of the whole thing has been installed correctly.
>
> > Should this enable application creation from within Galaxy by itself, or do I need to load components into Galaxy by hand first, so that they are available in the portal when we try to create an application?
>
> If the Registry is correctly installed and integrated with the Galaxies, you can register TDM components
> using the respective forms (see https://services.openminted.eu/resourceRegistration/component).
> The registered components will automatically appear in the Galaxy editor and you will be able to create
> workflows/applications.

I can create components and they show up in the Galaxy editor, but not in the executor. I'm trying to find what the issue might be here. If I then upload an application (one from the dump.zip), it shows up in the executor, but not in the editor.

> > The data available in the registry is what the upload.yml Ansible script uploads; it curls dump.zip into the registry:
> > ....
> > I assumed the data is valid and intended for testing purposes.
>
> Yes, I think that the data are valid.
> I believe that we have to check 2 things:
>
>   1. Does component and application importing work as it should? OK, the components and applications are imported into the Registry,
>     but does this trigger the creation of
>     a. the respective Galaxy XML tool files?
>     b. the respective Galaxy workflows?
>     @antleb @Jodee90 ????
>     If not, then this is why the Galaxies are empty.

Where can I see the log for this? In the registry container, or somewhere else?

>   2. Are the Galaxies and Registry integrated correctly?
>     E.g. for the Galaxy executor, the correct IP/URL and API key must be configured in the Registry config files. Can you please send me the properties and values that you have configured?

All my config changes are in the forked repo, including the registry data and my nginx config.
I checked the Galaxy instances' API keys against the ones in my config and they match. I think something might be misconfigured, either at the nginx level or in one of the service URLs, but I can't find which one.

> I assume that if these two things are working as they should, then it is pretty easy to have a replica of
> OpenMinTeD in your lab, given that everything else is already installed.

@galanisd
Member

> I can create components and they show up in the Galaxy editor, but not in the executor. I'm trying to find what the issue might be here. If I then upload an application (one from the dump.zip), it shows up in the executor, but not in the editor.

There are 2 things that should be synced:

  • Components: When a TDM component is registered in the OMTD Registry, the respective Galaxy XML wrapper is generated and stored in the dedicated folder of the Galaxy editor. The Galaxy editor instance scans this folder and the component appears in the Galaxy UI almost immediately (under Tools). This folder is shared via NFS with the Galaxy executor, so the 2 Galaxy instances should always be in sync. If they are not, please check the mounts and the NFS installation. The scripts that install NFS (+ Galaxies, + other things) are here
    https://github.com/openminted/omtd-standalone-setup
    OR here
    https://github.com/openminted/omtd-stack-setup/blob/master/docs/deployment_guide.md
    depending on the setup (standalone vs. cloud-based) that you have tried to install.

  • Workflows: When a workflow is created in the Galaxy editor and you press the save button, you are redirected to the Registry. There you provide the required metadata and finally an "Application" record is saved that includes the Galaxy workflow definition. I am not sure, but I think that when processing is started for this Application, the workflow definition is imported into the Galaxy executor via the Galaxy REST API. So you have to check the Galaxy API key. You said that you did that. Also check the Registry logs for any errors/messages in the communication with the Galaxy executor.
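
A minimal sketch of the API key check mentioned above, using only the Python standard library. It calls the same `/api/workflows?key=...` endpoint that appears in the Apache access log earlier in this thread. The base URL and the key in the usage note are placeholders (assumptions), and the sketch assumes the endpoint returns a JSON list of workflow objects that each have a `name` field.

```python
import json
import urllib.parse
import urllib.request


def workflows_url(base_url, api_key):
    """Build the 'list workflows' URL for one Galaxy instance and API key."""
    query = urllib.parse.urlencode({"key": api_key})
    return f"{base_url.rstrip('/')}/api/workflows?{query}"


def list_workflow_names(base_url, api_key, timeout=10.0):
    """Return the names of the workflows visible to this API key.

    Raises urllib.error.HTTPError if the server rejects the request,
    for example because the key is wrong.
    """
    with urllib.request.urlopen(workflows_url(base_url, api_key), timeout=timeout) as resp:
        # Assumption: the response body is a JSON list of workflow dicts.
        return [workflow["name"] for workflow in json.load(resp)]
```

Calling something like `list_workflow_names("http://omtd-worker/galaxy", "<executor API key>")` from the manager node, with the URL and key taken from the Registry config, should show whether the Registry can reach the executor with the configured values.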

@galanisd
Member

galanisd commented Jul 10, 2019

It seems that in your case the editor and executor are installed on the same machine. I think that in this
specific case NFS is not used, but something much simpler: a symbolic link between the 2 folders.
@saxtouri is this correct?

https://github.com/openminted/omtd-standalone-setup/blob/master/roles/editor/tasks/main.yaml#L70
?

@saxtouri
Collaborator

saxtouri commented Jul 10, 2019 via email

@galanisd
Member

> NFS is still needed to share data between Galaxy and workers.

Correct, but in the case of BSC (standalone setup) it seems that editor <-> executor syncing is done with the symbolic link https://github.com/openminted/omtd-standalone-setup/blob/master/roles/editor/tasks/main.yaml#L70.

I am not sure about that, but I think that you have to check whether the link was created, because a missing link might cause the problem you have mentioned.
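
One possible way to run that check, as a small Python sketch. The two folder paths are passed in as arguments. The sketch only assumes that both Galaxy instances read their tool wrappers from a local folder; the function name and the shape of the result are made up for illustration.

```python
import os


def compare_tool_dirs(editor_tools, executor_tools):
    """Report whether two tool folders resolve to the same place and how their contents differ."""
    # realpath follows symbolic links, so a correct link yields identical paths.
    editor_real = os.path.realpath(editor_tools)
    executor_real = os.path.realpath(executor_tools)
    editor_files = set(os.listdir(editor_tools))
    executor_files = set(os.listdir(executor_tools))
    return {
        "same_target": editor_real == executor_real,
        "only_in_editor": sorted(editor_files - executor_files),
        "only_in_executor": sorted(executor_files - editor_files),
    }
```

If `same_target` is false, or either of the two lists is non-empty, the folders are not actually shared and the link (or the mount) is the first thing to look at.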

@bscopenminted
Author

The link was created and is shared between both Galaxy instances: /srv/editor/tools points to /srv/executor/tools. However, I don't see the same thing in both. Restarting the Galaxy instances does not help. I have also checked the Postgres DB, and the contents of the workflow table differ between the executor and editor databases:


executor=# select * from workflow;
 id |        create_time         |        update_time         | stored_workflow_id |                 name                 | has_cycles | has_errors |               uuid               | parent_workflow_id
----+----------------------------+----------------------------+--------------------+--------------------------------------+------------+------------+----------------------------------+--------------------
  1 | 2019-07-10 11:18:11.805486 | 2019-07-10 11:18:11.805508 |                  1 | 976aa474-7c13-4e69-a014-c44e131ea1a3 | f          | t          | 85d2a1e878664fd0959b74e12d09598e |
  2 | 2019-07-10 11:18:12.249107 | 2019-07-10 11:18:12.249122 |                  2 | 976aa474-7c13-4e69-a014-c44e131ea1a3 | f          | t          | 0538fe50ddf043e48e511dca9563bf91 |
(2 rows)


editor=# select * from workflow;
 id |        create_time         |        update_time         | stored_workflow_id |                         name                          | has_cycles | has_errors |               uuid               | parent_workflow_id
----+----------------------------+----------------------------+--------------------+-------------------------------------------------------+------------+------------+----------------------------------+--------------------
  1 | 2019-06-14 12:23:09.787717 | 2019-06-14 12:23:09.787734 |                  1 | 0931730851637864-8b0dd040-c56d-4ce5-af5d-460fcae2cb2e | t          | f          | 6e5afe3991b14256b67ea976c1bd3a95 |
  2 | 2019-06-14 12:28:35.100396 | 2019-06-14 12:28:35.100413 |                  2 | 0931730851637864-24b9f889-3371-47ba-be4b-8bf6bc4198a8 | t          | f          | 8206f2bb61ad4efc86a8bd1768b7bf8c |
  3 | 2019-06-20 07:27:12.502402 | 2019-06-20 07:27:12.502443 |                  3 | 0931730851637864-ad05f235-8a74-4299-bdbb-c5d980dbf956 | t          | f          | 6e8a2575b43d47699ceef069f4010923 |             
  4 | 2019-06-20 07:28:36.150766 | 2019-06-20 07:28:36.150777 |                  4 | 0931730851637864-5b0c173b-ab12-4527-8ac2-bb3b9972840b | t          | f          | bde65d45b26f4d77bc0d886f40d7dd5d |             
  5 | 2019-06-20 07:30:06.910266 | 2019-06-20 07:30:06.91028  |                  5 | 0931730851637864-cdaa7704-d3b4-42ae-9933-51c2ac700911 | t          | f          | 83bfe11e739c489eaa1a48a2a5f6b238 |             
  6 | 2019-06-20 07:34:35.634265 | 2019-06-20 07:34:35.634332 |                  6 | 0931730851637864-0eb9e34b-4613-4eb3-9018-0b95b20bb040 | t          | f          | 7b641faf9844401c9b0bd5a6899f157b |             
  7 | 2019-06-20 07:35:26.736496 | 2019-06-20 07:35:26.736508 |                  7 | 0931730851637864-eeb49756-00c2-41ad-b638-a5d5cf03d687 | t          | f          | dd5d803e128d40bea4fd552540f6969f |             
  8 | 2019-06-20 07:40:47.195782 | 2019-06-20 07:40:47.195793 |                  8 | 0931730851637864-fd92c670-4f91-4c87-ad0f-8d84f0e7b41e | t          | f          | 5af29971ed3943a7957c867abed9398d |             
  9 | 2019-06-20 07:42:33.930992 | 2019-06-20 07:42:33.931003 |                  9 | 0931730851637864-fd007e21-7509-4841-821e-383655d001e1 | t          | f          | 070d35b38070414daeac51d1002d1d02 |             
 10 | 2019-06-20 07:43:47.469144 | 2019-06-20 07:43:47.469161 |                 10 | 0931730851637864-281b2ee4-2b50-4690-98e8-1b7f70c239ce | t          | f          | 8a37a1dcc1ab4057bba61faa50893dec |             
 11 | 2019-06-20 09:00:09.741264 | 2019-06-20 09:00:09.741279 |                 11 | 0931730851637864-26ff76d9-8879-4a8e-ae26-2ccac859996f | t          | f          | 249fa6139e174fa18d0c844a9c4815f1 |             
 12 | 2019-06-20 09:01:52.132312 | 2019-06-20 09:01:52.132325 |                 12 | 0931730851637864-4c7cb3d8-7d0f-453d-9f0a-3ec64e675059 | t          | f          | bae990c5f737451b84c2ae6eb477072e |             
 13 | 2019-06-20 09:06:07.31313  | 2019-06-20 09:06:07.313157 |                 13 | 0931730851637864-2e8921af-0a59-41a5-b61a-229970751a8c | t          | f          | 39a0b7e6e7954e70819c467d62862d28 |             
 14 | 2019-06-20 09:06:29.489084 | 2019-06-20 09:06:29.489102 |                 14 | 0931730851637864-dff02d41-3f87-45bb-b007-5e07fffe0e9b | t          | f          | 540e31d94bf7400798b8769e26deb6bf |             
 15 | 2019-06-20 09:27:46.951689 | 2019-06-20 09:27:46.951716 |                 15 | 0931730851637864-353076c0-5fa6-408a-b74a-7d5ce3d944a4 | t          | f          | 2cb26b71266a4b01b1c0de10e5bfd5df |             
 16 | 2019-06-20 09:31:25.039601 | 2019-06-20 09:31:25.039613 |                 16 | 0931730851637864-af668690-1a02-4b7a-9f5c-7da5b5ce6a6f | t          | f          | ac883ac1783243cb847defb8df9d13d1 |             
 17 | 2019-06-20 09:33:27.364426 | 2019-06-20 09:33:27.364439 |                 17 | 0931730851637864-ee2cb99d-76fb-4600-8459-efc21e1857e3 | t          | f          | acd0a85ae5f048baa20e651d74411bc6 |             
 18 | 2019-07-08 12:31:54.910331 | 2019-07-08 12:31:54.910346 |                 18 | 0931730851637864-5c05dc3f-033d-48c5-8650-3ab35b8a3716 | t          | f          | 1c9064628c9f451cba292a11d25cedb8 |             
 19 | 2019-07-09 06:32:46.950173 | 2019-07-09 06:32:46.950185 |                 19 | 0931730851637864-2bd0770f-4712-4e42-bc37-5517f5518171 | t          | f          | d9b34018d694451e9a3966d7248c7b4e |             
 20 | 2019-07-09 06:43:52.420454 | 2019-07-09 06:43:52.420469 |                 20 | 0931730851637864-3641d2f0-076c-4aba-8059-c2ae3350fb1f | t          | f          | b36d7c2be4c74e928ec18f00f2ffda8b |             
 21 | 2019-07-09 07:01:44.23534  | 2019-07-09 07:01:44.235355 |                 21 | 0931730851637864-d936094d-c308-4d43-8c50-62ed7153a6bf | t          | f          | e7f1a27013f741c7a156d69a66d44a6b |             
 22 | 2019-07-09 11:39:44.146836 | 2019-07-09 11:39:44.146858 |                 22 | 0931730851637864-e180794c-4a2d-4e76-95cc-926c0998648f | t          | f          | 3c3c685050bc4319b080adc803792802 |             
 23 | 2019-07-10 09:45:40.250306 | 2019-07-10 09:45:40.250317 |                 23 | 0931730851637864-a3df46c1-ea74-42b3-97a6-a3fc068be03a | t          | f          | 9a3ceab8cc8b420896f496d546218bcb |             
 24 | 2019-07-10 09:46:22.03181  | 2019-07-10 09:46:22.03182  |                 24 | 0931730851637864-f02eedd1-b985-445f-b1d2-1cc9bef26752 | t          | f          | 54d2665785de4c63bec0807bd5c1e708 |             
 25 | 2019-07-10 10:43:17.821563 | 2019-07-10 10:43:17.821584 |                 25 | 0931730851637864-92c53b59-381a-4c02-b323-bc3958949196 | t          | f          | 8b6bad491376457080626f0ff039953b |             
 26 | 2019-07-10 10:44:36.224329 | 2019-07-10 10:44:36.224343 |                 26 | 0931730851637864-f3ec360f-737c-4dd6-abea-e62aa7931de0 | t          | f          | e02e776feabe410096a3915b66b952aa | 
 27 | 2019-07-10 10:47:28.532489 | 2019-07-10 10:47:28.532505 |                 27 | 0931730851637864-04e8c9e3-2e27-4b70-be5d-ddb558aa6945 | t          | f          | f4ae927ccf6547b2ad87dc752018511d |             
 28 | 2019-07-10 10:54:34.639236 | 2019-07-10 10:54:34.639256 |                 28 | 0931730851637864-d81064f6-b81f-4e1b-93ab-65499810c3ab | t          | f          | ed9d8f419f034644b3338919168ed711 |             
 29 | 2019-07-10 10:57:52.207035 | 2019-07-10 10:57:52.207053 |                 29 | 0931730851637864-868c1bad-25c8-4467-bc58-e60470995ca0 | t          | f          | 4f941e6f35dc43ce9756aa2cd1138ca0 |             
 30 | 2019-07-10 11:02:59.001316 | 2019-07-10 11:02:59.001326 |                 30 | 0931730851637864-53ddbf10-017c-4a26-8590-25d2e4ccd8fc | t          | f          | c1038ad694b642fe9f42fdf89f91635b |             
 31 | 2019-07-10 11:07:39.006896 | 2019-07-10 11:07:39.006935 |                 31 | 0931730851637864-8af64da8-55f6-4a34-8ec8-f7c5335f8dd2 | t          | f          | 75a6795cf78542ff82f8ba74178ddf59 |             
 32 | 2019-07-11 09:25:27.77419  | 2019-07-11 09:25:27.774228 |                 32 | 0931730851637864-5a0b28f5-fa13-4fb6-9181-597e2e808c0b | t          | f          | cacd5c76aebb43a5894cbab0f8364a81 |             
 33 | 2019-07-11 09:28:06.674375 | 2019-07-11 09:28:06.674396 |                 33 | 0931730851637864-c29078dd-5ddc-4194-82b7-932fa87dfd42 | t          | f          | 0a01130c187e43689d7e4f7cc9b47921 |                        
(33 rows)
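
To make the mismatch explicit, here is a rough Python sketch that compares the `name` column of the two result sets. The names are copied from the query output above (only the first two editor rows are included to keep it short); the helper function is just an illustration.

```python
def diff_workflow_names(editor_names, executor_names):
    """Split workflow names into editor-only, executor-only, and shared groups."""
    editor, executor = set(editor_names), set(executor_names)
    return {
        "only_in_editor": sorted(editor - executor),
        "only_in_executor": sorted(executor - editor),
        "in_both": sorted(editor & executor),
    }


# Names taken from the "name" column of the two tables above.
editor_names = [
    "0931730851637864-8b0dd040-c56d-4ce5-af5d-460fcae2cb2e",
    "0931730851637864-24b9f889-3371-47ba-be4b-8bf6bc4198a8",
]
executor_names = [
    "976aa474-7c13-4e69-a014-c44e131ea1a3",
    "976aa474-7c13-4e69-a014-c44e131ea1a3",
]

result = diff_workflow_names(editor_names, executor_names)
print(result["in_both"])  # prints []
```

For these rows the two instances share no workflow name, and the two executor rows carry the same name.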

Regarding errors in the registry: if I try to create an application, the following messages show up in the registry container:

10:24:41.548 INFO  [http-nio-8080-exec-9] WorkflowDefinitionImpl - Created a new workflow with name 0931730851637864-9aa1be58-aa52-499d-8a2b-f8cf23338df5
10:24:42.056 INFO  [http-nio-8080-exec-9] OtherGenericService - Creating resource for resourceType: workflow

Then, after some time, this appears (and repeats every few minutes):

10:27:34.877 INFO  [pool-4-thread-1] StatsServiceImpl - Checking for totals
10:27:37.605 ERROR [pool-4-thread-1] StatsServiceImpl - Request on http://10.43.0.11:8888/content-connector-service/ failed
org.springframework.web.client.HttpServerErrorException$InternalServerError: 500 
	at org.springframework.web.client.HttpServerErrorException.create(HttpServerErrorException.java:79) ~[spring-web-5.1.0.RELEASE.jar:5.1.0.RELEASE]
	at org.springframework.web.client.DefaultResponseErrorHandler.handleError(DefaultResponseErrorHandler.java:99) ~[spring-web-5.1.0.RELEASE.jar:5.1.0.RELEASE]
	at org.springframework.web.client.DefaultResponseErrorHandler.handleError(DefaultResponseErrorHandler.java:79) ~[spring-web-5.1.0.RELEASE.jar:5.1.0.RELEASE]
	at org.springframework.web.client.ResponseErrorHandler.handleError(ResponseErrorHandler.java:63) ~[spring-web-5.1.0.RELEASE.jar:5.1.0.RELEASE]
	at org.springframework.web.client.RestTemplate.handleResponse(RestTemplate.java:777) ~[spring-web-5.1.0.RELEASE.jar:5.1.0.RELEASE]
	at org.springframework.web.client.RestTemplate.doExecute(RestTemplate.java:735) ~[spring-web-5.1.0.RELEASE.jar:5.1.0.RELEASE]
	at org.springframework.web.client.RestTemplate.execute(RestTemplate.java:669) ~[spring-web-5.1.0.RELEASE.jar:5.1.0.RELEASE]
	at org.springframework.web.client.RestTemplate.postForEntity(RestTemplate.java:444) ~[spring-web-5.1.0.RELEASE.jar:5.1.0.RELEASE]
	at eu.openminted.registry.service.omtd.StatsServiceImpl.totals(StatsServiceImpl.java:82) [classes/:2.1.1-SNAPSHOT]
	at eu.openminted.registry.service.omtd.StatsServiceImpl.scheduled(StatsServiceImpl.java:101) [classes/:2.1.1-SNAPSHOT]
	at sun.reflect.GeneratedMethodAccessor373.invoke(Unknown Source) ~[?:?]
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:1.8.0_201]
	at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_201]
	at org.springframework.scheduling.support.ScheduledMethodRunnable.run(ScheduledMethodRunnable.java:84) [spring-context-5.1.0.RELEASE.jar:5.1.0.RELEASE]
	at org.springframework.scheduling.support.DelegatingErrorHandlingRunnable.run(DelegatingErrorHandlingRunnable.java:54) [spring-context-5.1.0.RELEASE.jar:5.1.0.RELEASE]
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [?:1.8.0_201]
	at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308) [?:1.8.0_201]
	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180) [?:1.8.0_201]
	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294) [?:1.8.0_201]
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_201]
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_201]
	at java.lang.Thread.run(Thread.java:748) [?:1.8.0_201]
10:27:37.646 [pool-4-thread-1] ERROR o.s.s.s.TaskUtils$LoggingErrorHandler - Unexpected error occurred in scheduled task.
eu.openminted.registry.core.service.ServiceException: 500 
	at eu.openminted.registry.service.omtd.StatsServiceImpl.totals(StatsServiceImpl.java:91)
	at eu.openminted.registry.service.omtd.StatsServiceImpl.scheduled(StatsServiceImpl.java:101)
	at sun.reflect.GeneratedMethodAccessor373.invoke(Unknown Source)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at org.springframework.scheduling.support.ScheduledMethodRunnable.run(ScheduledMethodRunnable.java:84)
	at org.springframework.scheduling.support.DelegatingErrorHandlingRunnable.run(DelegatingErrorHandlingRunnable.java:54)
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
	at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
	at java.lang.Thread.run(Thread.java:748)
org.mitre.openid.connect.model.OIDCAuthenticationToken@5c4ac0a: Principal: {[email protected], iss=https://aai.openminted.eu/oidc/}; Credentials: [PROTECTED]; Authenticated: true; Details: null; Granted Authorities: ROLE_USER

It refers to content-connector-service, which nginx routes to the omtd-content-service container.

The omtd-content-service container logs show:
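
A quick way to tell "unreachable" apart from "reachable but failing" for that URL is a small probe like the sketch below. It sends a plain GET, whereas the stack trace shows the registry using a POST (`postForEntity`), so the status code may differ from what the registry sees; the function name and the convention of returning 0 for an unreachable host are my own choices.

```python
import urllib.error
import urllib.request


def probe(url, timeout=5.0):
    """Return the HTTP status code for a GET on url, or 0 if the host cannot be reached."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        # The server answered, but with an error status such as 500.
        return err.code
    except (urllib.error.URLError, OSError):
        # DNS failure, connection refused, timeout, and similar.
        return 0
```

Running `probe("http://10.43.0.11:8888/content-connector-service/")` from inside the registry container would show whether the 500 comes from the service behind nginx or whether the request never gets that far.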

10:21:02.752 [main] DEBUG org.springframework.beans.factory.xml.PluggableSchemaResolver - Loaded schema mappings: {http://www.springframework.org/schema/task/spring-task-3.0.xsd=org/springframework/scheduling/config/spring-task-3.0.xsd, http://www.springframework.org/schema/oxm/spring-oxm-3.0.xsd=org/springframework/oxm/config/spring-oxm-3.0.xsd, http://www.springframework.org/schema/aop/spring-aop-4.3.xsd=org/springframework/aop/config/spring-aop-4.3.xsd, http://www.springframework.org/schema/lang/spring-lang-2.0.xsd=org/springframework/scripting/config/spring-lang-2.0.xsd, http://www.springframework.org/schema/tx/spring-tx-3.0.xsd=org/springframework/transaction/config/spring-tx-3.0.xsd, http://www.springframework.org/schema/data/repository/spring-repository-1.5.xsd=org/springframework/data/repository/config/spring-repository-1.5.xsd, http://www.springframework.org/schema/lang/spring-lang-3.0.xsd=org/springframework/scripting/config/spring-lang-3.0.xsd, http://www.springframework.org/schema/cache/spring-cache-4.0.xsd=org/springframework/cache/config/spring-cache-4.0.xsd, http://www.springframework.org/schema/util/spring-util-4.1.xsd=org/springframework/beans/factory/xml/spring-util-4.1.xsd, http://www.springframework.org/schema/tx/spring-tx-4.0.xsd=org/springframework/transaction/config/spring-tx-4.0.xsd, http://www.springframework.org/schema/beans/spring-beans.xsd=org/springframework/beans/factory/xml/spring-beans-4.3.xsd, http://www.springframework.org/schema/tool/spring-tool-4.1.xsd=org/springframework/beans/factory/xml/spring-tool-4.1.xsd, http://www.springframework.org/schema/beans/spring-beans-3.0.xsd=org/springframework/beans/factory/xml/spring-beans-3.0.xsd, http://www.springframework.org/schema/beans/spring-beans-4.0.xsd=org/springframework/beans/factory/xml/spring-beans-4.0.xsd, http://www.springframework.org/schema/task/spring-task-4.3.xsd=org/springframework/scheduling/config/spring-task-4.3.xsd, 
http://www.springframework.org/schema/context/spring-context-3.2.xsd=org/springframework/context/config/spring-context-3.2.xsd, http://www.springframework.org/schema/oxm/spring-oxm-4.3.xsd=org/springframework/oxm/config/spring-oxm-4.3.xsd, http://www.springframework.org/schema/context/spring-context-4.2.xsd=org/springframework/context/config/spring-context-4.2.xsd, http://www.springframework.org/schema/tx/spring-tx-2.0.xsd=org/springframework/transaction/config/spring-tx-2.0.xsd, http://www.springframework.org/schema/jee/spring-jee-2.5.xsd=org/springframework/ejb/config/spring-jee-2.5.xsd, http://www.springframework.org/schema/util/spring-util-3.1.xsd=org/springframework/beans/factory/xml/spring-util-3.1.xsd, http://www.springframework.org/schema/jms/spring-jms.xsd=org/springframework/jms/config/spring-jms-4.3.xsd, http://www.springframework.org/schema/tool/spring-tool-3.1.xsd=org/springframework/beans/factory/xml/spring-tool-3.1.xsd, http://www.springframework.org/schema/lang/spring-lang-4.3.xsd=org/springframework/scripting/config/spring-lang-4.3.xsd, http://www.springframework.org/schema/beans/spring-beans-2.0.xsd=org/springframework/beans/factory/xml/spring-beans-2.0.xsd, http://www.springframework.org/schema/jdbc/spring-jdbc-3.1.xsd=org/springframework/jdbc/config/spring-jdbc-3.1.xsd, http://www.springframework.org/schema/jms/spring-jms-4.3.xsd=org/springframework/jms/config/spring-jms-4.3.xsd, http://www.springframework.org/schema/jdbc/spring-jdbc-4.1.xsd=org/springframework/jdbc/config/spring-jdbc-4.1.xsd, http://www.springframework.org/schema/tool/spring-tool.xsd=org/springframework/beans/factory/xml/spring-tool-4.3.xsd, http://www.springframework.org/schema/jee/spring-jee-4.1.xsd=org/springframework/ejb/config/spring-jee-4.1.xsd, http://www.springframework.org/schema/data/repository/spring-repository-1.8.xsd=org/springframework/data/repository/confi...skipping...
        at org.springframework.beans.factory.support.DefaultListableBeanFactory.doResolveDependency(DefaultListableBeanFactory.java:1136)
        at org.springframework.beans.factory.support.DefaultListableBeanFactory.resolveDependency(DefaultListableBeanFactory.java:1064)
        at org.springframework.beans.factory.annotation.AutowiredAnnotationBeanPostProcessor$AutowiredFieldElement.inject(AutowiredAnnotationBeanPostProcessor.java:585)
        ... 39 common frames omitted
Caused by: org.springframework.beans.factory.BeanCreationException: Error creating bean with name 'serviceConfiguration': Injection of autowired dependencies failed; nested exception is java.lang.IllegalArgumentException: Could not resolve placeholder 'db.username' in string value "${db.username}"
        at org.springframework.beans.factory.annotation.AutowiredAnnotationBeanPostProcessor.postProcessPropertyValues(AutowiredAnnotationBeanPostProcessor.java:372)
        at org.springframework.beans.factory.support.AbstractAutowireCapableBeanFactory.populateBean(AbstractAutowireCapableBeanFactory.java:1225)
        at org.springframework.beans.factory.support.AbstractAutowireCapableBeanFactory.doCreateBean(AbstractAutowireCapableBeanFactory.java:552)
        at org.springframework.beans.factory.support.AbstractAutowireCapableBeanFactory.createBean(AbstractAutowireCapableBeanFactory.java:483)
        at org.springframework.beans.factory.support.AbstractBeanFactory$1.getObject(AbstractBeanFactory.java:306)
        at org.springframework.beans.factory.support.DefaultSingletonBeanRegistry.getSingleton(DefaultSingletonBeanRegistry.java:230)
        at org.springframework.beans.factory.support.AbstractBeanFactory.doGetBean(AbstractBeanFactory.java:302)
        at org.springframework.beans.factory.support.AbstractBeanFactory.getBean(AbstractBeanFactory.java:197)
        at org.springframework.beans.factory.support.ConstructorResolver.instantiateUsingFactoryMethod(ConstructorResolver.java:372)
        at org.springframework.beans.factory.support.AbstractAutowireCapableBeanFactory.instantiateUsingFactoryMethod(AbstractAutowireCapableBeanFactory.java:1134)
        at org.springframework.beans.factory.support.AbstractAutowireCapableBeanFactory.createBeanInstance(AbstractAutowireCapableBeanFactory.java:1028)
        at org.springframework.beans.factory.support.AbstractAutowireCapableBeanFactory.doCreateBean(AbstractAutowireCapableBeanFactory.java:513)
        at org.springframework.beans.factory.support.AbstractAutowireCapableBeanFactory.createBean(AbstractAutowireCapableBeanFactory.java:483)
        at org.springframework.beans.factory.support.AbstractBeanFactory$1.getObject(AbstractBeanFactory.java:306)
        at org.springframework.beans.factory.support.DefaultSingletonBeanRegistry.getSingleton(DefaultSingletonBeanRegistry.java:230)
        at org.springframework.beans.factory.support.AbstractBeanFactory.doGetBean(AbstractBeanFactory.java:302)
        at org.springframework.beans.factory.support.AbstractBeanFactory.getBean(AbstractBeanFactory.java:202)
        at org.springframework.beans.factory.config.DependencyDescriptor.resolveCandidate(DependencyDescriptor.java:207)
        at org.springframework.beans.factory.support.DefaultListableBeanFactory.doResolveDependency(DefaultListableBeanFactory.java:1136)
        at org.springframework.beans.factory.support.DefaultListableBeanFactory.resolveDependency(DefaultListableBeanFactory.java:1064)
        at org.springframework.beans.factory.annotation.AutowiredAnnotationBeanPostProcessor$AutowiredFieldElement.inject(AutowiredAnnotationBeanPostProcessor.java:585)
        ... 52 common frames omitted
Caused by: java.lang.IllegalArgumentException: Could not resolve placeholder 'db.username' in string value "${db.username}"
        at org.springframework.util.PropertyPlaceholderHelper.parseStringValue(PropertyPlaceholderHelper.java:174)
        at org.springframework.util.PropertyPlaceholderHelper.replacePlaceholders(PropertyPlaceholderHelper.java:126)
        at org.springframework.core.env.AbstractPropertyResolver.doResolvePlaceholders(AbstractPropertyResolver.java:236)
        at org.springframework.core.env.AbstractPropertyResolver.resolveRequiredPlaceholders(AbstractPropertyResolver.java:210)
        at org.springframework.context.support.PropertySourcesPlaceholderConfigurer$2.resolveStringValue(PropertySourcesPlaceholderConfigurer.java:172)
        at org.springframework.beans.factory.support.AbstractBeanFactory.resolveEmbeddedValue(AbstractBeanFactory.java:823)
        at org.springframework.beans.factory.support.DefaultListableBeanFactory.doResolveDependency(DefaultListableBeanFactory.java:1084)
        at org.springframework.beans.factory.support.DefaultListableBeanFactory.resolveDependency(DefaultListableBeanFactory.java:1064)
        at org.springframework.beans.factory.annotation.AutowiredAnnotationBeanPostProcessor$AutowiredFieldElement.inject(AutowiredAnnotationBeanPostProcessor.java:585)
        at org.springframework.beans.factory.annotation.InjectionMetadata.inject(InjectionMetadata.java:88)
        at org.springframework.beans.factory.annotation.AutowiredAnnotationBeanPostProcessor.postProcessPropertyValues(AutowiredAnnotationBeanPostProcessor.java:366)
        ... 72 common frames omitted
10:27:37.524 [Finalizer] DEBUG org.apache.http.impl.nio.conn.PoolingNHttpClientConnectionManager - Connection manager is shutting down
10:27:37.524 [Finalizer] DEBUG org.apache.http.impl.nio.conn.PoolingNHttpClientConnectionManager - Connection manager shut down
10:27:37.525 [Finalizer] DEBUG org.apache.http.impl.conn.PoolingHttpClientConnectionManager - Connection manager is shutting down
10:27:37.525 [Finalizer] DEBUG org.apache.http.impl.conn.PoolingHttpClientConnectionManager - Connection manager shut down
10:27:37.525 [Finalizer] DEBUG org.apache.http.impl.conn.PoolingHttpClientConnectionManager - Connection manager is shutting down
10:27:37.525 [Finalizer] DEBUG org.apache.http.impl.conn.PoolingHttpClientConnectionManager - Connection manager shut down
10:27:37.525 [Finalizer] DEBUG org.apache.http.impl.conn.PoolingHttpClientConnectionManager - Connection manager is shutting down
10:27:37.525 [Finalizer] DEBUG org.apache.http.impl.conn.PoolingHttpClientConnectionManager - Connection manager shut down
10:27:37.525 [Finalizer] DEBUG org.apache.http.impl.conn.PoolingHttpClientConnectionManager - Connection manager is shutting down
10:27:37.525 [Finalizer] DEBUG org.apache.http.impl.conn.PoolingHttpClientConnectionManager - Connection manager shut down
10:27:37.525 [Finalizer] DEBUG org.apache.http.impl.conn.PoolingHttpClientConnectionManager - Connection manager is shutting down
10:27:37.525 [Finalizer] DEBUG org.apache.http.impl.conn.PoolingHttpClientConnectionManager - Connection manager shut down
10:27:37.526 [Finalizer] DEBUG org.apache.http.impl.conn.PoolingHttpClientConnectionManager - Connection manager is shutting down
10:27:37.526 [Finalizer] DEBUG org.apache.http.impl.conn.PoolingHttpClientConnectionManager - Connection manager shut down
10:27:37.526 [Finalizer] DEBUG org.apache.http.impl.nio.conn.PoolingNHttpClientConnectionManager - Connection manager is shutting down
10:27:37.526 [Finalizer] DEBUG org.apache.http.impl.nio.conn.PoolingNHttpClientConnectionManager - Connection manager shut down
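
The innermost `Caused by` in the trace above is the placeholder error: Spring could not resolve `${db.username}`, which normally means that none of the loaded property sources defines that key. As a minimal sketch, a properties file can be checked for the keys a service expects. Which file the service actually reads is not shown in this thread, so the sample content and the helper names `parse_properties` and `missing_keys` are assumptions for illustration only.

```python
def parse_properties(text):
    """Parse simple Java-style .properties text into a dict (no line continuations)."""
    props = {}
    for raw in text.splitlines():
        line = raw.strip()
        # Skip blank lines and comment lines ('#' or '!').
        if not line or line[0] in "#!":
            continue
        # Split on the first '=' or ':' separator.
        for i, ch in enumerate(line):
            if ch in "=:":
                props[line[:i].strip()] = line[i + 1:].strip()
                break
    return props


def missing_keys(text, required):
    """Return the required keys that the properties text does not define."""
    defined = parse_properties(text)
    return [key for key in required if key not in defined]


# Hypothetical file content, for illustration only.
sample = """
# database settings
db.host=10.43.0.11
db.password=changeme
"""
print(missing_keys(sample, ["db.host", "db.username", "db.password"]))
```

Any key the check reports as missing would have to be added to whatever property source the container loads.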

@bscopenminted
Author

Another detail, when I try to create an application with the registry galaxy editor, the page stays blank, but the registry container is able to create the workflow on the galaxy editor:

11:08:49.857 INFO  [http-nio-8080-exec-10] WorkflowDefinitionImpl - Created a new workflow with name 0931730851637864-96b9cadc-0d91-4b0d-9611-2a53d0cfd376
11:08:50.333 INFO  [http-nio-8080-exec-10] OtherGenericService - Creating resource for resourceType: workflow

And in the postgres instance (and also the galaxy editor)

 37 | 2019-07-11 11:08:50.230456 | 2019-07-11 11:08:50.230484 |                 37 | 0931730851637864-96b9cadc-0d91-4b0d-9611-2a53d0cfd376 | t          | f          | 7f4d41bfa41b42149f3f72ba5602f109 |

@galanisd
Member

The link was created and is shared between both galaxy instances, the /srv/editor/tools points to /srv/executor/tools. However I don't see the same thing on both. Restarting galaxy instances improves nothing.

This is very strange. I have tested the standalone installation and I remember that everything worked as it should. The standalone installation was also tested by another engineer. A few questions that will help in understanding what happens...

  • Do you mean that if you ls /srv/editor/tools and ls /srv/executor/tools you see different folders/files?
    Please send me the results of the ls for both of them. If the ls outputs are the same then it might be something in the configuration of galaxies that allows to automatically load a new tool from /srv/{editor|executor}/tools dirs.

  • What do you see in the UI of the editor and what in the executor? Please send 2 screenshots.

  • Does the omtdImporter appear in both of them?
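
The comparison asked for in the first question can also be scripted rather than done by eye. The following is a minimal sketch that walks two directories and reports files present in only one of them. The demo runs on temporary directories with made-up file names; on the worker node the arguments would be /srv/editor/tools and /srv/executor/tools.

```python
import os
import tempfile


def list_files(root):
    """Return the set of file paths under root, relative to root (following symlinks)."""
    found = set()
    for dirpath, _dirnames, filenames in os.walk(root, followlinks=True):
        for name in filenames:
            full = os.path.join(dirpath, name)
            found.add(os.path.relpath(full, root))
    return found


def compare_tool_dirs(editor_dir, executor_dir):
    """Return (only_in_editor, only_in_executor) as sorted lists of relative paths."""
    editor = list_files(editor_dir)
    executor = list_files(executor_dir)
    return sorted(editor - executor), sorted(executor - editor)


# Self-contained demo; the file names below are hypothetical.
with tempfile.TemporaryDirectory() as tmp:
    editor_dir = os.path.join(tmp, "editor")
    executor_dir = os.path.join(tmp, "executor")
    os.makedirs(os.path.join(editor_dir, "omtdTools"))
    os.makedirs(os.path.join(executor_dir, "omtdTools"))
    for d in (editor_dir, executor_dir):
        open(os.path.join(d, "omtdTools", "omtdImporter.xml"), "w").close()
    open(os.path.join(executor_dir, "omtdTools", "extra.xml"), "w").close()
    demo_result = compare_tool_dirs(editor_dir, executor_dir)
    print(demo_result)
```

Two empty lists mean both directories hold the same files, which is what a working symbolic link should give.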

@bscopenminted
Author

The link was created and is shared between both galaxy instances, the /srv/editor/tools points to /srv/executor/tools. However I don't see the same thing on both. Restarting galaxy instances improves nothing.

This is very strange. I have tested the standalone installation and I remember that everything worked as it should. The standalone installation was also tested by another engineer. A few questions that will help in understanding what happens...

  • Do you mean that if you ls /srv/editor/tools and ls /srv/executor/tools you see different folders/files?
    Sorry, I meant in the browser, not on the CLI.
    Please send me the results of the ls for both of them. If the ls outputs are the same then it might be something in the configuration of galaxies that allows to automatically load a new tool from /srv/{editor|executor}/tools dirs.
  • What do you see in the UI of the editor and what in the executor? Please send 2 screenshots.
  • The omtdImporter appears in both of them?
    Editor: [screenshot: galaxy_editor]
    Executor: [screenshot: galaxy_executor]

@galanisd
Member

galanisd commented Jul 12, 2019

I suggest the following plan

Step 1:
Given that
a. the symbolic link is created
b. the ls commands are returning the same things and
c. that both Galaxies show the same tools; i.e. just the omtdImporter (as it is shown in the screenshots)

then ... it seems that the tools are synced. If you want to verify that, you have to register a component/tool by using the respective form (https:///resourceRegistration/component e.g. https://services.openminted.eu/resourceRegistration/component).
The component/tool should appear in both instances (executor, editor).

Step 2:
If step 1 succeeds then create a workflow in Galaxy
omtdImporter -> registered tool of Step 1

and save it as application in the Registry.
Then process a corpus with this application and tell me what happened.


Based on the screenshots it seems that the only component/tool that is available right now in the Galaxies is omtdImporter. I assume that if you open (in Galaxy) one of the workflows that are shown in the screenshots you will get errors for non-existent tools. As I said above, a component/tool might exist in the Registry but not in the Galaxies. You will not get any errors only if the workflow contains just the omtdImporter.

@bscopenminted
Author

I suggest the following plan

Step 1:
Given that
a. the symbolic link is created
b. the ls commands are returning the same things and
c. that both Galaxies show the same tools; i.e. just the omtdImporter (as it is shown in the screenshots)

then ... it seems that the tools are synced. If you want to verify that, you have to register a component/tool by using the respective form (https:///resourceRegistration/component e.g. https://services.openminted.eu/resourceRegistration/component).
The component/tool should appear in both instances (executor, editor).

I tried adding a component as suggested, and registry reports everything going ok, but I see nothing new on the galaxy tools.

10:10:15.303 INFO  [http-nio-8080-exec-9] OmtdGenericService - Added new Resource with id 72e6ae56-ae8c-4e65-96b0-8d4bf6ef5533
10:10:15.304 INFO  [http-nio-8080-exec-9] ComponentListener - Added component
10:10:15.304 INFO  [http-nio-8080-exec-9] WorkflowEngineComponentRegistry - Registering component -> UIMA
10:10:15.345 INFO  [http-nio-8080-exec-9] WorkflowEngineComponentRegistry - selected galaxyTrgFolderName: omtdTools/AnnotatorOfSemanticRoleLabels/
Assigned docker image:null
Framework:UIMA
10:10:15.376 INFO  [http-nio-8080-exec-9] WorkflowEngineComponentRegistry - Wrapper tmp:/usr/local/tomcat/temp/wrapper_72e6ae56-ae8c-4e65-96b0-8d4bf6ef55336232513920883107935.xml
10:10:15.376 INFO  [http-nio-8080-exec-9] WorkflowEngineComponentRegistry - copyViaNFSToGalaxyToolsFolder -> make parent:true
10:10:21.786 INFO  [pool-4-thread-1] StatsServiceImpl - Checking for totals
10:10:23.324 ERROR [pool-4-thread-1] StatsServiceImpl - Request on http://10.43.0.11:8888/content-connector-service/ failed
org.springframework.web.client.HttpServerErrorException$InternalServerError: 500 
	at org.springframework.web.client.HttpServerErrorException.create(HttpServerErrorException.java:79) ~[spring-web-5.1.0.RELEASE.jar:5.1.0.RELEASE]
	at org.springframework.web.client.DefaultResponseErrorHandler.handleError(DefaultResponseErrorHandler.java:99) ~[spring-web-5.1.0.RELEASE.jar:5.1.0.RELEASE]
	at org.springframework.web.client.DefaultResponseErrorHandler.handleError(DefaultResponseErrorHandler.java:79) ~[spring-web-5.1.0.RELEASE.jar:5.1.0.RELEASE]
	at org.springframework.web.client.ResponseErrorHandler.handleError(ResponseErrorHandler.java:63) ~[spring-web-5.1.0.RELEASE.jar:5.1.0.RELEASE]
	at org.springframework.web.client.RestTemplate.handleResponse(RestTemplate.java:777) ~[spring-web-5.1.0.RELEASE.jar:5.1.0.RELEASE]
	at org.springframework.web.client.RestTemplate.doExecute(RestTemplate.java:735) ~[spring-web-5.1.0.RELEASE.jar:5.1.0.RELEASE]
	at org.springframework.web.client.RestTemplate.execute(RestTemplate.java:669) ~[spring-web-5.1.0.RELEASE.jar:5.1.0.RELEASE]
	at org.springframework.web.client.RestTemplate.postForEntity(RestTemplate.java:444) ~[spring-web-5.1.0.RELEASE.jar:5.1.0.RELEASE]
	at eu.openminted.registry.service.omtd.StatsServiceImpl.totals(StatsServiceImpl.java:82) [classes/:2.1.1-SNAPSHOT]
	at eu.openminted.registry.service.omtd.StatsServiceImpl.scheduled(StatsServiceImpl.java:101) [classes/:2.1.1-SNAPSHOT]
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[?:1.8.0_201]
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) ~[?:1.8.0_201]
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:1.8.0_201]
	at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_201]
	at org.springframework.scheduling.support.ScheduledMethodRunnable.run(ScheduledMethodRunnable.java:84) [spring-context-5.1.0.RELEASE.jar:5.1.0.RELEASE]
	at org.springframework.scheduling.support.DelegatingErrorHandlingRunnable.run(DelegatingErrorHandlingRunnable.java:54) [spring-context-5.1.0.RELEASE.jar:5.1.0.RELEASE]
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [?:1.8.0_201]
	at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308) [?:1.8.0_201]
	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180) [?:1.8.0_201]
	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294) [?:1.8.0_201]
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_201]
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_201]
	at java.lang.Thread.run(Thread.java:748) [?:1.8.0_201]
10:10:23.326 [pool-4-thread-1] ERROR o.s.s.s.TaskUtils$LoggingErrorHandler - Unexpected error occurred in scheduled task.
eu.openminted.registry.core.service.ServiceException: 500 
	at eu.openminted.registry.service.omtd.StatsServiceImpl.totals(StatsServiceImpl.java:91)
	at eu.openminted.registry.service.omtd.StatsServiceImpl.scheduled(StatsServiceImpl.java:101)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at org.springframework.scheduling.support.ScheduledMethodRunnable.run(ScheduledMethodRunnable.java:84)
	at org.springframework.scheduling.support.DelegatingErrorHandlingRunnable.run(DelegatingErrorHandlingRunnable.java:54)
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
	at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
	at java.lang.Thread.run(Thread.java:748)

Step 2:
If step 1 succeeds then create a workflow in Galaxy
omtdImporter -> registered tool of Step 1

and save it as application in the Registry.
Then process a corpus with this application and tell me what happened.

Based on the screenshots it seems that the only component/tool that is available right now in the Galaxies is omtdImporter. I assume that if you open (in Galaxy) one of the workflows that are shown in the screenshots you will get errors for non-existent tools. As I said above a component/tool might exist in the Registry but not in the Galaxies. You will not get any errors only if the workflow contains just the omtdImporter.

@bscopenminted
Author

I've been trying to figure out why what I import/create from the registry does not show up on the galaxy instances. After checking the registry logs, these two lines stood out:

10:10:15.345 INFO  [http-nio-8080-exec-9] WorkflowEngineComponentRegistry - selected galaxyTrgFolderName: omtdTools/AnnotatorOfSemanticRoleLabels/

and:

10:10:15.376 INFO  [http-nio-8080-exec-9] WorkflowEngineComponentRegistry - copyViaNFSToGalaxyToolsFolder -> make parent:true

I checked the registry repo in search of a path or variable that defines the target location the tool is copied to and could not find any (all references use the relative path omtdTools/). I then went into the registry container and found multiple XML files in /usr/local/tomcat/temp. I copied one into /srv/executor/tools/omtdTools/AnnotatorOfSemanticRoleLabels and it now shows on both galaxy instances, but it complains that it could not find the tool with id...
How is the data transferred from the registry to the galaxy instances? Via the NFS mount? If so, which path should be defined, and where? Or should I make the NFS mount available as a volume in the registry container?

@galanisd
Member

  • 1

How is the data transferred from the registry to the galaxy instances? via the NFS mount?

Yes via NFS. However (as far as I understand) for some reason copying does not work. You have to check that NFS mount is OK.

  • 2

I copied one inside the /srv/executor/tools/omtdTools/AnnotatorOfSemanticRoleLabels and it shows now on both galaxy instances.

Great. So syncing works

  • 3

But it complains that it could not find the tool with id

I remember that some tool ids were causing issues in specific Galaxy pages. E.g. we were getting errors like the following: "Could not find tool with id 'mvn:de.tudarmstadt.ukp.dkpro.core:de.tudarmstadt.ukp.dkpro.core.clearnlp-asl:1.9.2'."
But I think that you can drag-and-drop the tool and create a workflow when you are in the workflow editor env. Correct? In the workflow editor page these tools were not causing an issue. So I think this is not a problem since we have created and tested workflows with such tool IDs and they appear as they should in the workflow editor environment page.


The problem is 1.
It is probably NFS. @saxtouri any ideas?
Are the Registry and Galaxy machines different? If yes, is it possible to ping one from the other?

I also see exceptions that have to do with the content service

10:10:21.786 INFO  [pool-4-thread-1] StatsServiceImpl - Checking for totals
10:10:23.324 ERROR [pool-4-thread-1] StatsServiceImpl - Request on http://10.43.0.11:8888/content-connector-service/ failed

Not sure how this affects NFS copying. It shouldn't .... I think
@antleb @Jodee90 ?

@bscopenminted
Author

The problem is 1.
It is probably NFS. @saxtouri any ideas?
Are the Registry and Galaxy machines different? If yes, is it possible to ping one from the other?

I also see exceptions that have to do with the content service

10:10:21.786 INFO  [pool-4-thread-1] StatsServiceImpl - Checking for totals
10:10:23.324 ERROR [pool-4-thread-1] StatsServiceImpl - Request on http://10.43.0.11:8888/content-connector-service/ failed

Not sure how this affects NFS copying. It shouldn't .... I think
@antleb @Jodee90 ?

Registry resides in 10.43.0.11 and galaxy instances in 10.43.0.12.

10.43.0.12 exports /srv/executor/tools via NFS, and 10.43.0.11 (the node that runs the registry and the other containers) mounts it at /media/galaxy. I copied the tool that I imported via NFS, so the mountpoint and sync work as expected. I took the mountpoint from the ansible playbook:

shell: echo "83.212.99.0:/srv/galaxy /media/galaxy/ nfs rsize=8192,wsize=8192,timeo=14,intr" >> /etc/fstab

However, I could not find /media/galaxy either in the environment configuration files or in any of the repository code.

I'm not sure how or where I should point the registry to that folder.

@saxtouri
Collaborator

saxtouri commented Jul 15, 2019 via email

@bscopenminted
Author

The exports:

root@omtd-worker:/root/omtd-installations/omtd-store/scripts# cat /etc/exports
/srv/executor/database 10.43.0.10(rw,sync,no_subtree_check,no_root_squash)
/srv/executor/database 10.43.0.11(rw,sync,no_subtree_check,no_root_squash)

/srv/executor/tools 10.43.0.10(rw,sync,no_subtree_check,no_root_squash)
/srv/executor/tools 10.43.0.11(rw,sync,no_subtree_check,no_root_squash)

The client mount options in /etc/fstab

10.43.0.12:/srv/executor/tools   /media/galaxy/  nfs rsize=8192,wsize=8192,timeo=14,intr
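
A quick consistency check between the two files above can be sketched in a few lines: parse the exports, parse the fstab entry, and confirm that the path the client mounts is exported to the client's address. The helper names are made up for illustration, and the parsing only covers the simple line shapes shown here, not the full exports or fstab syntax.

```python
def parse_exports(text):
    """Map each exported path to the set of client hosts allowed to mount it."""
    exports = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()
        if not line:
            continue
        path, *clients = line.split()
        for client in clients:
            # Drop the "(options)" suffix to keep only the host part.
            host = client.split("(", 1)[0]
            exports.setdefault(path, set()).add(host)
    return exports


def parse_fstab_nfs(line):
    """Split an fstab NFS entry into (server, remote_path, mount_point)."""
    device, mount_point, fstype = line.split()[:3]
    if not fstype.startswith("nfs"):
        raise ValueError("not an NFS entry: " + line)
    server, remote_path = device.split(":", 1)
    return server, remote_path, mount_point


exports_text = """
/srv/executor/database 10.43.0.10(rw,sync,no_subtree_check,no_root_squash)
/srv/executor/database 10.43.0.11(rw,sync,no_subtree_check,no_root_squash)

/srv/executor/tools 10.43.0.10(rw,sync,no_subtree_check,no_root_squash)
/srv/executor/tools 10.43.0.11(rw,sync,no_subtree_check,no_root_squash)
"""
fstab_line = "10.43.0.12:/srv/executor/tools   /media/galaxy/  nfs rsize=8192,wsize=8192,timeo=14,intr"

exports = parse_exports(exports_text)
server, remote_path, mount_point = parse_fstab_nfs(fstab_line)
client_ip = "10.43.0.11"  # the registry host, per this thread
allowed = client_ip in exports.get(remote_path, set())
print(remote_path, "exported to", client_ip, ":", allowed)
```

This only confirms that the host-level NFS configuration is consistent; it says nothing about what is visible inside a container on the client.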

@galanisd
Member

This is the OMTD Registry code that writes the Galaxy XML wrappers.
https://github.com/openminted/omtd-registry/blob/develop/omtd-registry-service/src/main/java/eu/openminted/registry/service/tool/WorkflowEngineComponentRegistryGalaxyImpl.java

The Registry runs in a Docker container and the code uses a specific path in the container.
So, 2 mounts are required so that an XML is correctly transferred to Galaxies.

  1. "OMTD Registry Docker container" <-> "OMTD Registry host machine (10.43.0.11)".

  2. "OMTD Registry host machine (10.43.0.11)" <-> "OMTD Galaxies machine (10.43.0.12)"

Please check that mount 1 is also OK.
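
One way to look at mount 1 is the JSON that `docker inspect <container>` prints, which carries a `Mounts` array per container. The sketch below extracts type, source and destination from such output. The sample JSON is a trimmed, hypothetical stand-in, not output captured from this installation.

```python
import json


def container_mounts(inspect_json):
    """Return (type, source, destination) for every mount in `docker inspect` JSON output."""
    mounts = []
    for container in json.loads(inspect_json):
        for m in container.get("Mounts", []):
            # Named volumes report Type "volume"; host paths report Type "bind".
            mounts.append((m.get("Type"), m.get("Source"), m.get("Destination")))
    return mounts


# Trimmed, hypothetical sample of what `docker inspect` returns.
sample = """
[
  {
    "Name": "/registry",
    "Mounts": [
      {
        "Type": "volume",
        "Source": "/var/lib/docker/volumes/registrydata/_data",
        "Destination": "/media/maven-data"
      }
    ]
  }
]
"""
print(container_mounts(sample))
```

If no entry has a source under the host's NFS mount point, the container cannot write to the Galaxy tools folder, whatever the host itself can see.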

@galanisd
Member

Please check that mount 1 is also OK.

"docker inspect ..." shows mounts I think.

@saxtouri
Collaborator

saxtouri commented Jul 15, 2019 via email

@bscopenminted
Author

I managed to get it working by editing the extra mount at

- registrydata:/media/maven-data

I changed it so that it mounts the local NFS client mount:

    volumes:
      - /media/galaxy:/opt/galaxy/tools

This follows the recommendation to check the Java code that @galanisd pointed out, where I found: https://github.com/openminted/omtd-registry/blob/0f3a0bec7c76411bd58c09bd32e94425941205fa/omtd-registry-service/src/main/java/eu/openminted/registry/service/tool/WorkflowEngineComponentRegistryGalaxyImpl.java#L32

After applying the changes and restarting the registry, I can now add a maven tool and it shows up in the galaxy instances :)
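
The two volume entries above differ in kind: in Compose short syntax, a source that looks like a path is a bind mount of a host directory, while a bare name refers to a named Docker volume. The sketch below classifies entries on that basis. The function name `classify_volume` is made up for illustration and the parsing ignores the optional access-mode suffix.

```python
def classify_volume(spec):
    """Classify a Compose short-syntax volume entry as a bind mount or a named volume."""
    parts = spec.split(":")
    if len(parts) == 1:
        # Only a container path: an anonymous volume.
        return {"kind": "anonymous", "source": None, "target": parts[0]}
    source, target = parts[0], parts[1]
    # A source that looks like a path is a host bind mount; otherwise it names a volume.
    kind = "bind" if source.startswith(("/", ".", "~")) else "volume"
    return {"kind": kind, "source": source, "target": target}


for entry in ["registrydata:/media/maven-data", "/media/galaxy:/opt/galaxy/tools"]:
    print(entry, "->", classify_volume(entry)["kind"])
```

Only the bind mount exposes the host's /media/galaxy directory, and with it the NFS share, to the container.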

Now what I'm missing is the galaxy workflow editor: when I try to add an app, it shows a blank frame where the galaxy editor canvas should be. Any ideas? Could this be the nginx conf or something else?

Thanks!

@galanisd
Member

galanisd commented Jul 15, 2019

Nginx acts as a proxy for the Galaxy editor. So, please check this URL (https://<nginx IP>/galaxy) and tell me what happens. Nginx should be configured with the secret key of the Galaxy editor etc.

@galanisd
Member

Before checking https://<nginx IP>/galaxy I would also restart the Galaxy editor. Sometimes it gets stuck.

@bscopenminted
Author

Frontend nginx

      location /galaxy {
          proxy_set_header        Host $host;
          proxy_set_header        X-Real-IP $remote_addr;
          proxy_set_header        X-Forwarded-For $proxy_add_x_forwarded_for;
          proxy_set_header        X-Forwarded-Proto $scheme;
          proxy_set_header    REMOTE_USER <REMOTE_USER>;
          proxy_set_header    GX_SECRET <executoromtdsecret>;


          rewrite ^/galaxy(.*) /$1 break;
          proxy_pass          http://10.43.0.12;
          proxy_read_timeout  90;

          sub_filter "/static/(.*)" "/galaxy/static/$1";
      }

galaxy node 10.43.0.12 has apache as reverse proxy for both galaxies

<VirtualHost *:80>
  ServerName omtd-worker


  ProxyPass /executor http://localhost:8080
  ProxyPassReverse /executor http://localhost:8080

  ProxyPass /editor http://localhost:8081
  ProxyPassReverse /editor http://localhost:8081



</VirtualHost>

This is what the omtd-standalone-setup apache role sets (albeit with ProxyPass twice instead of ProxyPass and ProxyPassReverse): https://github.com/openminted/omtd-standalone-setup/blob/master/roles/apache2/templates/vhosts.conf.j2

The nginx conf also contains (from the config you sent me by mail )

      location /faq {
          proxy_set_header        Host $host;
          proxy_set_header        X-Real-IP $remote_addr;
          proxy_set_header        X-Forwarded-For $proxy_add_x_forwarded_for;
          proxy_set_header        X-Forwarded-Proto $scheme;

          proxy_pass          http://10.43.0.12:5555/api;
          proxy_read_timeout  90;
      }

The proxy_pass is adapted to the IP of the container host, but I see no services listening on 5555, nor any config that references it.

If I try to reach http://nginx-frontend/galaxy I reach the apache server on 10.43.0.12.

Do you see anything that stands out as wrong here? And should port 5555 be configured for any of the containers?
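
To see what the `rewrite ^/galaxy(.*) /$1 break;` line in the /galaxy location does to a request path, the substitution can be reproduced with Python's `re` module. This is only a sketch of the regex substitution itself; it does not model the rest of nginx's request processing, and the helper name `rewritten_path` is made up for illustration.

```python
import re

# Same pattern as in the nginx rewrite line.
REWRITE = re.compile(r"^/galaxy(.*)")


def rewritten_path(path):
    """Apply the regex substitution from the rewrite line to a request path."""
    return REWRITE.sub(r"/\1", path, count=1)


for p in ["/galaxy", "/galaxy/", "/galaxy/workflow/editor"]:
    print(p, "->", rewritten_path(p))
```

Any path below /galaxy/ comes out with a leading double slash, because the captured group already starts with a slash. With the configuration as pasted, none of the rewritten paths begins with /editor or /executor, the two prefixes the Apache VirtualHost proxies, which may be worth comparing against the live configuration.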

@galanisd
Member

proxy_set_header REMOTE_USER <REMOTE_USER>;

<REMOTE_USER> should be substituted with something else. I think that it does not matter with what
(e.g. workflow_editor). When the Galaxy editor receives a remote user access request, it just needs a user name to use in the respective messages/menus, e.g. "Logged in as workflow_editor".
Not 100% sure though. Please try it.

proxy_set_header GX_SECRET <executoromtdsecret>;

<executoromtdsecret> should be substituted with the value that is set to
remote_user_secret var in the .../config/galaxy.ini file of the Galaxy editor.

There are some instructions on how to set this secret when the editor is installed here.
https://github.com/openminted/omtd-standalone-setup
-->

editor_remote_user_secret: your-secret-here

Check also in the .../config/galaxy.ini file that
use_remote_user=True
and what the remote_user_maildomain variable is set to.

@galanisd
Member

Any updates? Did it work?

@bscopenminted
Author

Hi, sorry I've been caught up with other issues :/

I did check the config and updated the nginx config to remove the <> as the values were correct. I had galaxy configured without use_remote_user=True to debug it, and changed it as suggested. I get a permission error if I try to go to http://nginx-proxy/galaxy or if I try to use the app editor (iframe for galaxy) within the platform.

I checked the headers nginx is passing to the apache proxy that sits in front of galaxy and the values are correct:

+24404:5d2ece20:1|GET /galaxy HTTP/1.0|Host:bscopenmint01.bsc.es|X-Real-IP:xxx.xxx.xxx.xxx|X-Forwarded-For:xxx.xxx.xxx.xxx|X-Forwarded-Proto:http|REMOTE_USER:REMOTE_USER|GX_SECRET:editoromtdsecret|Connection:close|Cache-Control:max-age=0|Upgrade-Insecure-Requests:1|User-Agent:Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/75.0.3770.100 Safari/537.36|DNT:1|Accept:text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3|Accept-Encoding:gzip, deflate|Accept-Language:en-US,en;q=0.9,es;q=0.8,fr;q=0.7|Cookie:_ga=GA1.2.1537331807.1544788358; cookieLawSeen-openminted=true; galaxysession=c6ca0ddb55be603affd19651ed476fc7a92a8df759ade8ac00edd8e4ec18bf30ca33f0d969986daf; SESSION=6e7f7abc-4769-419d-9b53-50860b8cc613; [email protected]
-24404:5d2ece20:1

Then in the galaxy log

Jul 17 09:25:44 omtd-worker run.sh[24306]: xxx.xxx.xxx.xxx - - [17/Jul/2019:09:25:44 +0200] "GET /editor/ HTTP/1.1" 403 - "-" "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/75.0.3770.100 Safari/537.36"

The lines in galaxy.ini for these values:

use_remote_user = True
...
remote_user_secret = "editoromtdsecret"

From what I understand, the user field is not relevant, but required. Could this be the issue?
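
One detail that might be worth ruling out is the quoting of the secret. Python's standard `configparser` keeps quote characters as part of a value, as the sketch below shows. Whether Galaxy's own loader treats galaxy.ini the same way is an assumption that this thread does not confirm, and the `[app:main]` section name is likewise assumed; only the two settings come from the lines above.

```python
import configparser

# Section name is an assumption; the two settings are copied from the thread.
ini_text = """
[app:main]
use_remote_user = True
remote_user_secret = "editoromtdsecret"
"""

parser = configparser.ConfigParser(interpolation=None)
parser.read_string(ini_text)
secret = parser["app:main"]["remote_user_secret"]

header_value = "editoromtdsecret"  # GX_SECRET as seen in the dumped request
print(repr(secret), secret == header_value)
print(secret.strip("\"'") == header_value)
```

If the loader behaves like this, the stored secret would include the quote characters and would not equal the GX_SECRET header value; writing the value without quotes would settle the question either way.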

@galanisd
Member

If I understand well you got the same message as in #4
Correct? Please send a screenshot.

At the nginx side

  1. proxy_set_header REMOTE_USER <user>;
    <user> = ?
    However as already said I think that it does not matter what is the value of <user>

  2. proxy_set_header "editoromtdsecret" ????

At the Galaxy side (Galaxy.ini)

remote_user_maildomain = ????

@bscopenminted
Author

If I understand well you got the same message as in #4
Correct? Please send a screenshot.

Correct, the same message

At the nginx side

  1. proxy_set_header REMOTE_USER <user>;
    <user> = ?
    I set it to "REMOTE_USER" as well.
    However as already said I think that it does not matter what is the value of <user>
  2. proxy_set_header "editoromtdsecret" ????

Yes, not secure but it's just for testing

At the Galaxy side (Galaxy.ini)

remote_user_maildomain = ????
remote_user_maildomain = 'bscopenmint01.bsc.es'

The headers that nginx sends are:

+24404:5d2ece20:1|GET /galaxy HTTP/1.0|Host:bscopenmint01.bsc.es|X-Real-IP:xxx.xxx.xxx.xxx|X-Forwarded-For:xxx.xxx.xxx.xxx|X-Forwarded-Proto:http|REMOTE_USER:REMOTE_USER|GX_SECRET:editoromtdsecret|Connection:close|Cache-Control:max-age=0|Upgrade-Insecure-Requests:1|User-Agent:Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/75.0.3770.100 Safari/537.36|DNT:1|Accept:text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3|Accept-Encoding:gzip, deflate|Accept-Language:en-US,en;q=0.9,es;q=0.8,fr;q=0.7|Cookie:_ga=GA1.2.1537331807.1544788358; cookieLawSeen-openminted=true; galaxysession=c6ca0ddb55be603affd19651ed476fc7a92a8df759ade8ac00edd8e4ec18bf30ca33f0d969986daf; SESSION=6e7f7abc-4769-419d-9b53-50860b8cc613; [email protected]
-24404:5d2ece20:1

It correctly sends REMOTE_USER:REMOTE_USER and GX_SECRET:editoromtdsecret. From the galaxy.ini file, if I understood correctly, the Galaxy WSGI layer prepends HTTP_ to any header it receives, so there is no need to set HTTP_ for REMOTE_USER or GX_SECRET in nginx?
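
The naming rule referred to here is the CGI convention that WSGI (PEP 3333) inherits: a request header name is upper-cased, hyphens become underscores, and `HTTP_` is prefixed. A minimal sketch of that mapping follows; the function name is made up for illustration.

```python
def wsgi_environ_key(header_name):
    """Return the WSGI/CGI environ key for an HTTP request header name."""
    key = header_name.upper().replace("-", "_")
    # Content-Type and Content-Length are the two headers stored without the prefix.
    if key in ("CONTENT_TYPE", "CONTENT_LENGTH"):
        return key
    return "HTTP_" + key


for name in ["REMOTE_USER", "GX_SECRET", "X-Forwarded-For"]:
    print(name, "->", wsgi_environ_key(name))
```

Under this rule a header sent as REMOTE_USER reaches the application as HTTP_REMOTE_USER, so the proxy sets the name without the prefix.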

@galanisd
Member

Your config seems similar to ours.
The only difference I see is that proxy_pass in our config points to an https address (not http).

I do not know if there are any other settings that should be changed/added in
nginx @Jodee90 ? or in Galaxy @saxtouri ?

@galanisd
Member

galanisd commented Jul 18, 2019

Also just confirmed that REMOTE_USER value does not matter. I changed it in nginx config and restarted it.
I had access to Galaxy server with the new REMOTE_USER value.

@bscopenminted
Author

I was wondering if the apache that proxies the galaxy instances was not forwarding all headers, so I changed the nginx conf to point directly to the galaxy port of the editor, but the result is the same.

Here's the data received by galaxy editor and sent from nginx:

Jul 18 13:25:08 omtd-worker run.sh[24306]: xxx.xxx.xxx.xxx - - [18/Jul/2019:13:25:08 +0200] "GET /editor/api/page/route?q=/resourceRegistration/application HTTP/1.0" 404 - "http://bscopenmint01.bsc.es/resourceRegistration/application" "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/75.0.3770.100 Safari/537.36"
Jul 18 13:25:11 omtd-worker run.sh[24306]: 10.43.0.11 - - [18/Jul/2019:13:25:11 +0200] "POST /editor/api/workflows?key=4ffe33f541cb31be5eb7dbee2ad9bda1 HTTP/1.1" 200 - "-" "Java/1.8.0_201"
Jul 18 13:25:11 omtd-worker run.sh[24306]: xxx.xxx.xxx.xxx - - [18/Jul/2019:13:25:11 +0200] "GET /editor/api/page/route?q=/buildWorkflow/dce9851f-ae77-4ebb-b165-16473e2401b2 HTTP/1.0" 404 - "http://bscopenmint01.bsc.es/buildWorkflow/dce9851f-ae77-4ebb-b165-16473e2401b2" "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/75.0.3770.100 Safari/537.36"
Jul 18 13:25:11 omtd-worker run.sh[24306]: xxx.xxx.xxx.xxx - - [18/Jul/2019:13:25:11 +0200] "GET /editor/galaxy/workflow/editor?id=b8a0d6158b9961df HTTP/1.0" 403 - "http://bscopenmint01.bsc.es/buildWorkflow/dce9851f-ae77-4ebb-b165-16473e2401b2" "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/75.0.3770.100 Safari/537.36"

*The redacted IP is that of the client connecting to nginx (my computer), not that of the nginx server.
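
Log lines like the ones above are easier to compare once reduced to method, path and status. The sketch below does that with a regular expression. The sample entries are shortened excerpts of the request and status part of those lines, with the API key replaced by a placeholder.

```python
import re

# Matches the quoted request line and the status code that follows it.
LOG_RE = re.compile(r'"(?P<method>[A-Z]+) (?P<path>\S+) HTTP/[\d.]+" (?P<status>\d{3})')


def summarize(lines):
    """Return (method, path without query string, status) for each matching log line."""
    out = []
    for line in lines:
        m = LOG_RE.search(line)
        if m:
            path = m.group("path").split("?", 1)[0]
            out.append((m.group("method"), path, int(m.group("status"))))
    return out


# Shortened excerpts of the log lines in this comment.
sample_lines = [
    '"GET /editor/api/page/route?q=/resourceRegistration/application HTTP/1.0" 404 -',
    '"POST /editor/api/workflows?key=REDACTED HTTP/1.1" 200 -',
    '"GET /editor/galaxy/workflow/editor?id=b8a0d6158b9961df HTTP/1.0" 403 -',
]
for entry in summarize(sample_lines):
    print(entry)
```

In this excerpt the registry's own API call succeeds with 200, while the browser's request for the workflow editor page is the one answered with 403.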

@galanisd
Member

galanisd commented Jul 18, 2019

Not sure whether the problem is in the nginx side or in the Galaxy side.

A way to narrow this down is the following

  1. Install a browser plugin for setting headers, e.g. https://chrome.google.com/webstore/detail/modheader/idgpnmonknjnojddfkpgkljpfnnfcklj?hl=en

  2. Set the required headers and access Galaxy directly...not through nginx

If you get the same message then probably the problem is in Galaxy configuration.
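
The same test can be run without a browser plugin. The sketch below builds a request that carries the two headers, using Python's standard `urllib`. The address http://10.43.0.12:8081/ is an assumption derived from the Apache VirtualHost in this thread (the editor is proxied to localhost:8081 on that host); whether that port is reachable from another machine is not confirmed here, and the user name is arbitrary.

```python
import urllib.request


def build_direct_request(base_url, remote_user, secret):
    """Build a request that carries the remote-user headers Galaxy is expected to check."""
    return urllib.request.Request(
        base_url,
        headers={"REMOTE_USER": remote_user, "GX_SECRET": secret},
    )


# Assumed address of the Galaxy editor; adjust to the real host and port.
req = build_direct_request("http://10.43.0.12:8081/", "workflow_editor", "editoromtdsecret")
print(req.full_url, req.header_items())

# To actually send it (needs network access to the Galaxy host).
# urlopen raises urllib.error.HTTPError for a 403 response:
# with urllib.request.urlopen(req, timeout=10) as resp:
#     print(resp.status)
```

`urllib` normalises the capitalisation of header names before sending; HTTP header names are case-insensitive, so the server still receives the same two headers.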

@bscopenminted
Author

I get the same error, so it seems to be a Galaxy issue then.

The Galaxy version I'm using is the one pulled by the ansible playbook used in omtd-standalone-setup, which clones the editor branch of https://github.com/openminted/galaxy (commit: 23bb846ef20188a941bcf7721c6ee2083f42c29d). Is this the version I should be using?

@galanisd
Member

As far as I know this is the correct one. My guess is that you are installing the correct version but that the problem is in the configuration.

I just had a look at the config.ini file of the editor and at https://github.com/openminted/omtd-standalone-setup/blob/master/roles/editor/templates/galaxy.ini.j2, which is the file that the ansible scripts use to generate the galaxy.ini.

I noticed one difference.
allow_user_creation = False in our config (the one for services.openminted.eu/galaxy)
vs.
allow_user_creation = True in the galaxy.ini.j2 template file.

Please try it. Set allow_user_creation = False

@galanisd
Member

> I noticed one difference.
> allow_user_creation = False in our config (the one for services.openminted.eu/galaxy)
> vs.
> allow_user_creation = True in the galaxy.ini.j2 template file.
>
> Please try it. Set allow_user_creation = False

However I do not understand why allow_user_creation should affect "remote_user" logins.

@galanisd
Member

> I tried allow_user_creation = True but it didn't make any difference.

> so I changed the nginx conf to point directly to the galaxy port of the editor, but the result is the same.

Good idea. Strange that it didn't work.

@galanisd
Member

I just discussed the issue with @saxtouri from GRNET, and he remembered that enabling remote_user access was a little bit tricky.

Please try the following steps:

Step 1:
Set use_remote_user = False
Comment out the remote_user_secret line
Set allow_user_creation = True
Restart Galaxy

Step 2:
Use a browser and the Galaxy UI to register a user.

Step 3:
Set use_remote_user = True
Uncomment the remote_user_secret line
Set allow_user_creation = False
Restart Galaxy and retry

It seems that Galaxy requires one user in order to enable remote_user access.

@bscopenminted
Author

Yes, that's required, as it is the only way to generate the API token.
I'm going to reinstall the Galaxy instances from scratch, as I've discovered some errors in the deploy logs regarding the installation of some packages (pathtools):

pkg_resources.VersionConflict: (Paste 2.0.2 (/srv/editor/.venv/lib/python2.7/site-packages), Requirement.parse('Paste>=3.0'))

Not sure if this might be the cause, but it's worth checking.

@bscopenminted
Author

bscopenminted commented Jul 26, 2019

I managed to make the auth work; it seems that I shouldn't use quotes around the secret string in the ini file.
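
A short sketch of why the quotes matter, assuming Galaxy's ini loader treats values the same way as Python's standard `configparser`: quotes are not stripped, so they become part of the secret and no longer match what the proxy sends. The `[app:main]` section name follows the usual galaxy.ini layout, the secret value is a placeholder, and both helper functions are hypothetical.

```python
import configparser

# Placeholder ini fragment with a quoted secret.
INI_TEXT = """
[app:main]
use_remote_user = True
remote_user_secret = "changeme"
"""


def read_secret(ini_text):
    """Return remote_user_secret exactly as an ini parser sees it."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(ini_text)
    return parser["app:main"]["remote_user_secret"]


def has_stray_quotes(value):
    """True when the value is wrapped in matching single or double quotes."""
    return len(value) >= 2 and value[0] == value[-1] and value[0] in "\"'"


secret = read_secret(INI_TEXT)
print(repr(secret))              # prints '"changeme"' (quotes included)
print(has_stray_quotes(secret))  # prints True
```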

After that I got the following error when trying to add an app (no issue with reaching /galaxy):

StoredWorkflow is not owned by the current user

I can comment out the https://github.com/openminted/galaxy/blob/23bb846ef20188a941bcf7721c6ee2083f42c29d/lib/galaxy/managers/base.py#L61-L62 lines, and then the Galaxy frame shows up in the registry. I tried printing both values, but both are empty when I print them in the same error line. I'm not sure if just removing the check is the best idea, but I also don't know why it happens. Any ideas?

StoredWorkflow is not owned by the current user item.user __ trans.user __

Using:

        if item.user != trans.user:
            raise exceptions.ItemOwnershipException("%s is not owned by the current user item.user _%s_ trans.user _%s_" % (item.__class__.__name__,item.user,trans.user), type='error')

@galanisd
Member

> I managed to make the auth work, it seems that I shouldn't use quotes for the secret string in the ini file.
> ....(no issue with reaching /galaxy)

Great!

> I can comment the https://github.com/openminted/galaxy/blob/23bb846ef20188a941bcf7721c6ee2083f42c29d/lib/galaxy/managers/base.py#L61-L62 lines and then the galaxy frame shows up in the registry. Tried getting both values but both are empty when I print them in the same error line. I'm not sure if just removing the check is the best idea, but also I don't know why it happens. Any ideas?

I remember that we had some ownership issues, and these were solved (I think, not sure) either by changing the source code in Galaxy or by generating the appropriate calls to the Galaxy API. I suspect that our installation (services.openminted.eu) uses a version of the Galaxy code that contains the required changes. I just checked the commit of the editor that is deployed for services.openminted.eu: 15372f7369137b964534e9ddb8e81fb150dc2677. You reported a different commit in a previous comment. I hope there are no additional manual changes.

I can't provide any help on this; I didn't work on this part at all.
@saxtouri any ideas why the commit ids are different? Why does the installation script install a different version of Galaxy than the one we run for services.openminted.eu?

@greenwoodma
Member

This new issue about the workflow not being owned by the current user rings a vague bell. I have a feeling that this happened if the workflow had been added to Galaxy by a different user than the one the proxy was logging in as. In our deployment version (assuming it hasn't changed since I worked on the project) we ensured that the registry pushed workflows into Galaxy using the same user as the proxy was configured to use. Is this the case in your configuration? (Apologies that I've not followed the thread closely, as I no longer work on the project, so you might have covered this already.)

There were also two bugs we found related to permissions in Galaxy galaxyproject/galaxy#4401 and galaxyproject/galaxy#5955 but both of those issues were fixed and should have been merged into the version you are using so I doubt they are relevant, but if the issue persists after checking that the usernames match then they might be a useful starting point for further debugging.

@bscopenminted
Author

@greenwoodma I checked the database and found two users. Yesterday, during the worker reinstall, I did some tests with the HTTP headers and created a second user by mistake; I was now authenticating as that user rather than the original one. Reverting to the first user fixes the StoredWorkflow is not owned by the current user error.

@galanisd the commit I'm using for the editor is the latest of the editor branch, but with the user issue fixed it seems to work.
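
For anyone hitting the same ownership error, a rough sketch of how the two identities could be compared without opening the database. It assumes Galaxy's `GET /api/users/current` endpoint returns JSON with an `email` field for the account that owns the API key; the helper names and the base URL are illustrative only.

```python
import json
import urllib.parse
import urllib.request


def current_user_url(galaxy_base, api_key):
    """URL that asks Galaxy which account an API key belongs to."""
    query = urllib.parse.urlencode({"key": api_key})
    return galaxy_base.rstrip("/") + "/api/users/current?" + query


def fetch_api_user_email(galaxy_base, api_key, timeout=10):
    """Return the e-mail of the account that owns `api_key` (needs a live Galaxy)."""
    url = current_user_url(galaxy_base, api_key)
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.load(response)["email"]


def same_account(api_user_email, proxy_user_email):
    """True when the registry's API user and the proxy's remote user match exactly."""
    return api_user_email.strip() == proxy_user_email.strip()


# Hypothetical usage (placeholder base URL and key):
# email = fetch_api_user_email("http://omtd-worker/editor", "<api key>")
# same_account(email, "user@example.org")
```

If the two e-mails differ, workflows pushed by the registry belong to a different Galaxy user than the one the proxy logs in as, which is the situation described above.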
