Check broken links #89
I'm a little puzzled, and unsure what the best approach is. Your idea is to first check the presence of files using your curl-based module, then let them go to staging? If you did it the other way around, first stage, then check, you'd lose the point of this, I suppose?
You're calling the new module for every file. Some sort of loop over a collected channel would be much more efficient. You could perhaps have a module that just returns a channel of correct URLs after looping over all of them? (One might end up with long commands, but that could perhaps be dealt with by splitting. I can't see a way of splitting a channel now though; only based on reading files.)
Even better would be if one could do this directly with nextflow/groovy and avoid the module altogether.
Let's discuss tomorrow.
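To make the batching idea concrete, here is a rough shell sketch of a single script that loops over all URLs and prints only the ones that respond. The function names `probe_status` and `filter_working_urls` are hypothetical and not part of this PR, and the sketch assumes the links are plain HTTP(S) URLs passed one per line on stdin.

```shell
# Hypothetical helpers for illustration; names are not taken from the PR.

# Print the final HTTP status code for a URL ("000" if the request itself failed).
probe_status() {
    curl -s -I -L -o /dev/null --max-time 30 -w '%{http_code}' "$1" </dev/null || true
}

# Read URLs from stdin, one per line, and print only those answering with a 2xx code.
# Broken links are reported on stderr so that stdout stays a clean list of URLs.
filter_working_urls() {
    local url status
    while IFS= read -r url || [ -n "$url" ]; do
        # Skip empty lines.
        if [ -z "$url" ]; then
            continue
        fi
        status=$(probe_status "$url")
        case "$status" in
            2??) printf '%s\n' "$url" ;;
            *)   printf 'Broken link (%s): %s\n' "$status" "$url" >&2 ;;
        esac
    done
}
```

Something like `filter_working_urls < urls.txt > working_urls.txt` would then run once for the whole list instead of once per file, and reading from a file sidesteps the long-command concern. Whether the resulting file is best turned back into a channel by reading it line by line is a separate design question.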
# Use curl to check if the URL returns 404
if curl -Is "${genome_fna}" | grep -q "404 Not Found"; then
    echo "Broken link: ${genome_fna}"
    exit 0 # Exit successfully but don't emit anything
This only works on remote files, right?
yes, it does.
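If local paths should also pass through the check, one possible variant is sketched below. The helper name `link_is_usable` and the rule "anything that is not http(s) is a local path" are assumptions for illustration, not what the module in this PR does. It also compares the numeric status code instead of grepping for "404 Not Found", since responses served over HTTP/2 typically carry no reason phrase and other failures (403, 500, timeouts) would otherwise slip through.

```shell
# Hypothetical helper for illustration; not the module's actual logic.
# Returns 0 if the target looks usable, non-zero otherwise.
link_is_usable() {
    local target status
    target=$1
    case "$target" in
        http://*|https://*)
            # Remote: follow redirects and accept any 2xx final status code.
            status=$(curl -s -I -L -o /dev/null --max-time 30 -w '%{http_code}' "$target" </dev/null || true)
            case "$status" in
                2??) return 0 ;;
                *)   return 1 ;;
            esac
            ;;
        *)
            # Assumption: everything else is a local path that must already exist.
            [ -e "$target" ]
            ;;
    esac
}
```

A caller could then write `if link_is_usable "${genome_fna}"; then ...; fi`. Note that some servers reject HEAD requests, so a 2xx-only rule on `curl -I` may need loosening for those hosts.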
Co-authored-by: Daniel Lundin <[email protected]>
PR checklist
- nf-core lint
- nextflow run . -profile test,docker
- docs/usage.md is updated.
- docs/output.md is updated.
- CHANGELOG.md is updated.
- README.md is updated (including new tool citations and authors/contributors).

I tried to address bug #90, which stops the pipeline when it finds a broken link from an outside source (like NCBI).
There is now a module, "check_broken_links.nf", that checks whether each link is broken. If it isn't, the link is used to stage the genome; otherwise the link is discarded.