CWL workflow outputs within nested binding not handled #371

Closed
fmigneault opened this issue Nov 24, 2021 · 3 comments · Fixed by #372
Labels
process/workflow Related to a Workflow process. triage/bug Something isn't working

Comments

@fmigneault
Collaborator

As presented in: #358 (comment)

Workflows that define outputs in nested directories, such as outputBinding: {glob: "somedir/*.pattern"}, do not find those outputs when mapping step(i) out -> step(i+1) in. The somedir hierarchy is not preserved, because results for the remote step are staged directly under the job UUID directory.
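
To make the failure mode concrete, here is a minimal sketch of what happens when files are staged flat and then looked up with a nested glob. It is not Weaver's actual staging code: `stage_flat`, `find_outputs`, `demo_flat_staging`, the `job-uuid` directory name and the file names are all hypothetical, chosen only to mirror the description above.

```python
import glob
import os
import shutil
import tempfile


def stage_flat(src_files, job_dir):
    """Copy each file directly under job_dir, dropping its parent directories.

    Hypothetical helper that mimics the flattened staging described in the issue.
    """
    os.makedirs(job_dir, exist_ok=True)
    for path in src_files:
        shutil.copy(path, os.path.join(job_dir, os.path.basename(path)))


def find_outputs(job_dir, pattern):
    """Resolve a CWL-style glob pattern relative to job_dir."""
    return sorted(glob.glob(os.path.join(job_dir, pattern)))


def demo_flat_staging():
    with tempfile.TemporaryDirectory() as tmp:
        # Pretend a remote step produced two files under "output/".
        remote_out = os.path.join(tmp, "remote", "output")
        os.makedirs(remote_out)
        produced = []
        for name in ("a.tif", "b.tif"):
            path = os.path.join(remote_out, name)
            with open(path, "w"):
                pass  # an empty placeholder file is enough for the demo
            produced.append(path)

        job_dir = os.path.join(tmp, "job-uuid")
        stage_flat(produced, job_dir)

        nested = find_outputs(job_dir, "output/*.tif")  # pattern declared in the CWL
        flat = find_outputs(job_dir, "*.tif")           # where the files actually are
        return ([os.path.basename(p) for p in nested],
                [os.path.basename(p) for p in flat])
```

In this sketch the nested pattern matches nothing while the flat pattern finds both files, which is consistent with the next step receiving no value for its input.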

Sample definitions to test:

  1. There is an application that generates the images
{
    "processDescription": {
        "process": {
            "visibility": "public",
            "id": "test_generation",
            "title": "Test image generation",
            "abstract": "Generates some test GeoTIFF",
            "version": "0.0.1",
            "outputs": [
                {
                    "id": "output_tifs",
                    "title": "GeoTIFF Images",
                    "formats": [
                        {
                            "default": true,
                            "mimeType": "image/tiff"
                        }
                    ]
                }
            ]
        },
        "processVersion": "2.0"
    },
    "immediateDeployment": true,
    "executionUnit": [
        {
            "unit": {
                "cwlVersion": "v1.0",
                "class": "CommandLineTool",
                "baseCommand": [
                    "generate",
                    "--output",
                    "output/"
                ],
                "requirements": {
                    "DockerRequirement": {
                        "dockerPull": "image_utils:latest"
                    }
                },
                "inputs": {
                    "base_name": {
                        "type": "string",
                        "inputBinding": {
                            "position": 1,
                            "prefix": "--base-name"
                        }
                    },
                    "count": {
                        "type": "int",
                        "inputBinding": {
                            "position": 2,
                            "prefix": "--count"
                        }
                    }
                },
                "outputs": {
                    "output_tifs": {
                        "type": {
                            "type": "array",
                            "items": "File"
                        },
                        "outputBinding": {
                            "glob": "output/*.tif"
                        }
                    }
                }
            }
        }
    ],
    "deploymentProfileName": "http://www.opengis.net/profiles/eoc/dockerizedApplication"
}
  2. There is an application that blurs the images.
{
    "processDescription": {
        "process": {
            "visibility": "public",
            "id": "test_blurring",
            "title": "Test image blurring",
            "abstract": "Test image blurring",
            "version": "0.0.1",
            "inputs": [
                {
                    "id": "input_files",
                    "title": "Input images",
                    "formats": [
                        {
                            "mimeType": "image/tiff",
                            "default": true
                        }
                    ],
                    "minOccurs": "1",
                    "maxOccurs": "unbounded"
                }
            ],
            "outputs": [
                {
                    "id": "output_tifs",
                    "title": "Blurred images",
                    "formats": [
                        {
                            "mimeType": "image/tiff",
                            "default": true
                        }
                    ]
                }
            ]
        },
        "processVersion": "2.0"
    },
    "immediateDeployment": true,
    "executionUnit": [
        {
            "unit": {
                "cwlVersion": "v1.0",
                "class": "CommandLineTool",
                "baseCommand": [
                    "blur",
                    "-o",
                    "output/"
                ],
                "requirements": {
                    "DockerRequirement": {
                        "dockerPull": "image_utils:latest"
                    }
                },
                "inputs": {
                    "input_files": {
                        "type": {
                            "type": "array",
                            "items": "File"
                        },
                        "inputBinding": {
                            "position": 1,
                            "prefix": "--image"
                        }
                    }
                },
                "outputs": {
                    "output_tifs": {
                        "type": {
                            "type": "array",
                            "items": "File"
                        },
                        "outputBinding": {
                            "glob": "output/*.tif"
                        }
                    }
                }
            }
        }
    ],
    "deploymentProfileName": "http://www.opengis.net/profiles/eoc/dockerizedApplication"
}
  3. Register the workflow
{
    "processDescription": {
        "process": {
            "visibility": "public",
            "id": "test_workflow",
            "title": "Workflow to test Weaver",
            "abstract": "Workflow to test Weaver",
            "version": "0.0.1"
        }
    },
    "executionUnit": [
        {
            "unit": {
                "cwlVersion": "v1.0",
                "class": "Workflow",
                "inputs": {
                    "base_name": "string",
                    "count": "int"
                },
                "outputs": {
                    "output": {
                        "type": {
                            "type": "array",
                            "items": "File"
                        },
                        "outputSource": "blurring/output_tifs"
                    }
                },
                "steps": {
                    "generation": {
                        "run": "test_generation",
                        "in": {
                            "base_name": "base_name",
                            "count": "count"
                        },
                        "out": [
                            "output_tifs"
                        ]
                    },
                    "blurring": {
                        "run": "test_blurring",
                        "in": {
                            "input_files": "generation/output_tifs"
                        },
                        "out": [
                            "output_tifs"
                        ]
                    }
                }
            }
        }
    ],
    "deploymentProfileName": "http://www.opengis.net/profiles/eoc/workflow"
}
  4. Execute the workflow
{
    "mode": "async",
    "response": "document",
    "inputs": [
        {
            "id": "base_name",
            "data": "castellon"
        },
        {
            "id": "count",
            "data": 4
        }
    ],
    "outputs": [
        {
            "id": "output",
            "transmissionMode": "reference"
        }
    ]
}
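
For anyone reproducing this, the execution body above could be submitted along the following lines. This is only a sketch: the host `https://weaver.example.com` is a placeholder, `build_execute_request` is a hypothetical helper, and the `/processes/{id}/jobs` path is an assumption about the deployed Weaver instance that should be checked against its API documentation. The request is built but not sent.

```python
import json
import urllib.request


def build_execute_request(base_url, process_id, inputs):
    """Build (without sending) a POST request carrying the execution body.

    The endpoint path is an assumption; verify it against your instance.
    """
    body = {
        "mode": "async",
        "response": "document",
        "inputs": [{"id": key, "data": value} for key, value in inputs.items()],
        "outputs": [{"id": "output", "transmissionMode": "reference"}],
    }
    url = "{}/processes/{}/jobs".format(base_url.rstrip("/"), process_id)
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json", "Accept": "application/json"},
        method="POST",
    )


request = build_execute_request(
    "https://weaver.example.com",  # placeholder host, not a real deployment
    "test_workflow",
    {"base_name": "castellon", "count": 4},
)
```

Passing `request` to `urllib.request.urlopen` would actually submit the job; that step is left out so the sketch stays free of network calls.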

Again, the workflow is failing with the following error:

gip-ems-worker  | [2021-11-23 21:38:07,374] INFO     [MainThread][weaver.processes.execution] 00:00:00   5% accepted   Following updates could take a while until the Application Package answers...
gip-ems-worker  | [2021-11-23 21:38:07,383] ERROR    [MainThread][PYWPS] Missing parameter value: input_files
gip-ems-worker  | [2021-11-23 21:38:07,383] ERROR    [MainThread][PYWPS] Exception: code: 400, description: input_files, locator: input_files
gip-ems-worker  | NoneType: None
gip-ems-worker  | [2021-11-23 21:38:07,383] ERROR    [MainThread][weaver.processes.execution] Failed running [Job <0f8d2d9f-2826-4bd3-8d10-c643bcda91d2>]
gip-ems-worker  | Traceback (most recent call last):
gip-ems-worker  |   File "/opt/local/src/weaver/weaver/processes/execution.py", line 165, in execute_process
gip-ems-worker  |     mode=mode, job_uuid=job.id, remote_process=process)
gip-ems-worker  |   File "/opt/local/src/weaver/weaver/wps/service.py", line 271, in execute_job
gip-ems-worker  |     wps_response = super(WorkerService, self).execute(worker_process_id, wps_request, job_uuid)
gip-ems-worker  |   File "/usr/local/lib/python3.7/site-packages/pywps/app/Service.py", line 79, in execute
gip-ems-worker  |     return self._parse_and_execute(process, wps_request, uuid)
gip-ems-worker  |   File "/usr/local/lib/python3.7/site-packages/pywps/app/Service.py", line 136, in _parse_and_execute
gip-ems-worker  |     inpt.identifier, inpt.identifier)
gip-ems-worker  | pywps.exceptions.MissingParameterValue: 400 MissingParameterValue: input_files

Originally posted by @lhcorralo in #358 (comment)

@fmigneault
Collaborator Author

@lhcorralo
I managed to track down the cause of the problem you found, and I will follow it in this separate issue.
I will work on a solution later (my time is somewhat limited right now).

Note that your step test_blurring was missing the "blur" mode in the command.

@fmigneault fmigneault self-assigned this Nov 24, 2021
@fmigneault fmigneault added process/workflow Related to a Workflow process. triage/bug Something isn't working labels Nov 24, 2021
fmigneault added a commit that referenced this issue Nov 24, 2021
@lhcorralo

lhcorralo commented Nov 25, 2021

Thank you for noticing the missing "blur" parameter :)

I just wanted to confirm that if I do not use nested folders, version 4.4 is running fine 👍

Let me know if I can help in any way

@fmigneault
Collaborator Author

@lhcorralo
I have this PR (with highlighted fix for nested directories): https://github.com/crim-ca/weaver/pull/372/files#diff-5a27985507181f33a326cac07412db4739b9491a154c1ad74dbc1f926aa8c70eR284-R290
I want to test it further before integration in a new version, but you can give it a try if you need nested dirs.
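
For readers who want the general idea without opening the diff, one way to keep nested directories is to recreate the directory part of the glob pattern when staging files. The sketch below only illustrates that approach and is not the code from the PR: `stage_for_glob`, `demo_nested_staging` and the `job-uuid` directory are hypothetical names.

```python
import glob
import os
import shutil
import tempfile


def stage_for_glob(src_files, job_dir, glob_pattern):
    """Copy files under job_dir, inside the directory part of glob_pattern.

    Hypothetical helper: for "output/*.tif" the files land in job_dir/output/.
    """
    sub_dir = os.path.dirname(glob_pattern)  # "output/*.tif" -> "output"
    target_dir = os.path.join(job_dir, sub_dir)
    os.makedirs(target_dir, exist_ok=True)
    staged = []
    for path in src_files:
        dst = os.path.join(target_dir, os.path.basename(path))
        shutil.copy(path, dst)
        staged.append(dst)
    return staged


def demo_nested_staging():
    with tempfile.TemporaryDirectory() as tmp:
        # Placeholder source files standing in for a remote step's results.
        src_dir = os.path.join(tmp, "remote")
        os.makedirs(src_dir)
        sources = []
        for name in ("a.tif", "b.tif"):
            path = os.path.join(src_dir, name)
            with open(path, "w"):
                pass
            sources.append(path)

        job_dir = os.path.join(tmp, "job-uuid")
        stage_for_glob(sources, job_dir, "output/*.tif")

        # The same nested pattern declared in the CWL now resolves.
        found = sorted(glob.glob(os.path.join(job_dir, "output/*.tif")))
        return [os.path.relpath(p, job_dir) for p in found]
```

With the directory recreated, the pattern `output/*.tif` resolves relative to the job directory, so a following step would have files to consume.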
