Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Copy or move the files instead of creating symlinks in the LocalizeReads task for compatibility with CoA/TES. #619

Merged
merged 5 commits into from
Dec 8, 2023

Conversation

VJalili
Copy link
Member

@VJalili VJalili commented Nov 22, 2023

The task LocalizeReads creates symlinks for the localized cram and its corresponding index file in the current working directory to simplify referencing them in the task output. Specifically, the files are localized in the following directory, and symlinks are created in the /cromwell_root/.

/cromwell_root/…/analysis/filename.cram

Symlinks as this are incompatible with GA4GH TES and Cromwell on Azure.

Therefore, this PR updates the script to move or copy the files instead of creating symlinks. Moving files (mv) is faster and requires less disk space on the VM than copying the files. However, run on HPC, moving files could lead to moving them from their source directory compared to moving them from the localized directory to the root directory on the VM. Therefore, we provide both options for moving and copying to avoid moving files unexpectedly and set the copy as default.

Cromwell submission ID for a successful test of this PR is bd10381c-53b3-4c67-a396-291135af0f6f.

The following can be an alternative to moving the files, though it still needs to be tested.

File output_file = basename(reads_path)  --> File output_file = reads_path

@VJalili VJalili requested a review from mwalker174 November 22, 2023 13:45
Copy link
Collaborator

@mwalker174 mwalker174 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @VJalili! Did you happen to test using cp as well? We've had complaints in the past about moving mv on input files from folks using shared filesystems. I realize it's slower and would require extra disk, but I would think that the cost would be small.

@VJalili
Copy link
Member Author

VJalili commented Nov 22, 2023

That is a good point, thank you, @mwalker174.

Yes, I first tested using cp, but it failed with an out-of-disk error, so I tried with mv. I think this is moving a file localized to a directory on the runner's VM to another directory on the VM. I think even when using a shared file system, this VM still has its own copy/localized version of the file, so it should not be moving a file on a different file system. Though I have not tested, and am happy to test, I will only need some references to files on the shared file system.

@VJalili VJalili changed the title Move the file instead of using symlinks for compatibility with CoA/TES. Copy or move the files instead of creating symlinks in the LocalizeReads task for compatibility with CoA/TES. Dec 6, 2023
Copy link
Collaborator

@mwalker174 mwalker174 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good! Minor comments

wdl/GatherSampleEvidence.wdl Show resolved Hide resolved
wdl/GatherSampleEvidence.wdl Outdated Show resolved Hide resolved
@VJalili
Copy link
Member Author

VJalili commented Dec 8, 2023

Thank you, @mwalker174!!

@VJalili VJalili merged commit 7607a32 into broadinstitute:main Dec 8, 2023
2 checks passed
@VJalili VJalili deleted the localize-reads branch December 8, 2023 21:18
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants