Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refactoring + other issues with cadd and samtools merge #538

Merged
merged 25 commits into from
May 17, 2024
Merged

Conversation

ramprasadn
Copy link
Collaborator

PR checklist

  • This comment contains a description of changes (with reason).
  • If you've fixed a bug or added code that should be tested, add tests!
  • If you've added a new tool - have you followed the pipeline conventions in the contribution docs
  • If necessary, also make a PR on the nf-core/raredisease branch on the nf-core/test-datasets repository.
  • Make sure your code lints (nf-core lint).
  • Ensure the test suite passes (nextflow run . -profile test,docker --outdir <OUTDIR>).
  • Ensure the test suite passes (nextflow run . -profile test_one_sample,docker --outdir <OUTDIR>).
  • Check for unexpected warnings in debug mode (nextflow run . -profile debug,test,docker --outdir <OUTDIR>).
  • Usage Documentation in docs/usage.md is updated.
  • Output Documentation in docs/output.md is updated.
  • CHANGELOG.md is updated.
  • README.md is updated (including new tool citations and authors/contributors).

Copy link

github-actions bot commented Apr 16, 2024

nf-core lint overall result: Passed ✅ ⚠️

Posted for pipeline commit 9da1d8d

+| ✅ 183 tests passed       |+
#| ❔   7 tests were ignored |#
!| ❗   1 tests had warnings |!

❗ Test warnings:

  • pipeline_todos - TODO string in awsfulltest.yml: You can customise AWS full pipeline tests as required

❔ Tests ignored:

  • files_exist - File is ignored: conf/modules.config
  • files_unchanged - File ignored due to lint config: .github/CONTRIBUTING.md
  • files_unchanged - File ignored due to lint config: .github/PULL_REQUEST_TEMPLATE.md
  • files_unchanged - File ignored due to lint config: assets/nf-core-raredisease_logo_light.png
  • files_unchanged - File ignored due to lint config: docs/images/nf-core-raredisease_logo_light.png
  • files_unchanged - File ignored due to lint config: docs/images/nf-core-raredisease_logo_dark.png
  • modules_config - modules_config

✅ Tests passed:

Run details

  • nf-core/tools version 2.14.1
  • Run at 2024-05-17 08:44:15

@fa2k
Copy link
Contributor

fa2k commented Apr 17, 2024

The test profile works perfectly for this patch branch on RHEL 9 and Fedora 39 with singularity and local executor (no executor/cluster setting). [I am testing 2.0.1 with a full-scale dataset and will also test the patch branch, if it can complete in time]

@Jakob37
Copy link
Contributor

Jakob37 commented Apr 18, 2024

Hi! I am trying out running this patch PR on the dataset I have been running before (i.e. GIAB on our server).

I run into errors, trying to see whether they are on my part or in the code.

ERROR ~ Cannot get property 'case_id' on null object

 -- Check script '<path>/raredisease/./workflows/raredisease.nf' at line: 166 or see '.nextflow.log' file for more details
ERROR ~ No such variable: Exception evaluating property 'gtbi' for nextflow.script.ChannelOut, Reason: groovy.lang.MissingPropertyException: No such property: gtbi for class: groovyx.gpars.dataflow.DataflowBroadcast

 -- Check script '<path>/raredisease/./workflows/../subworkflows/local/call_snv.nf' at line: 77 or see '.nextflow.log' file for more details

In a quick look it indeed looks like there is a mismatch for the Sentieon workflow.

Here is what is asked for from the "parent" processes using the Sentieon SNV calling subprocess.

ch_sentieon_gtbi = CALL_SNV_SENTIEON.out.gtbi

Here are the emitted arguments. Maybe a typo for gvcf_tbi?

emit:
vcf = BCFTOOLS_ANNOTATE.out.vcf // channel: [ val(meta), path(vcf) ]
tabix = TABIX_ANNOTATE.out.tbi // channel: [ val(meta), path(tbi) ]
gvcf = SENTIEON_DNASCOPE.out.gvcf // channel: [ val(meta), path(gvcf) ]
gvcf_tbi = SENTIEON_DNASCOPE.out.gvcf_tbi // channel: [ val(meta), path(gvcf_tbi) ]
versions = ch_versions // channel: [ path(versions.yml) ]

Need to run for a meeting now, will continue testing (and checking the case error) later today

@jemten
Copy link
Collaborator

jemten commented Apr 18, 2024

Here are the emitted arguments. Maybe a typo for gvcf_tbi?

indeed looks like a typo. Thanks for testing and reporting

@Jakob37
Copy link
Contributor

Jakob37 commented Apr 19, 2024

Quick update. I am running into more downstream issues, but I think these are on my side. I will continue working through the GIAB run and raise any issues I find, but will probably complete when you are done with this PR.

I would like to run the test data, but we have an offline-only server, and it is not feasible to pull all containers to my local computer.

saveAs: { filename -> filename.equals('versions.yml') ? null :
filename.contains('_renam.vcf.gz.tbi') ? "${meta.id}_repeat_expansion.vcf.gz.tbi" :
filename },
saveAs: { filename -> filename.equals('versions.yml') ? null : filename },
]
}
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jemten I am thinking if it might be better to add a parameter to generate reviewer compatible output files, like we are doing for cram files at the moment... Then we can publish the compressed files by default, and we can choose to publish the uncompressed files based on whether that parameter is set to true or not.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good suggestion!

@jemten jemten linked an issue May 7, 2024 that may be closed by this pull request
@fa2k
Copy link
Contributor

fa2k commented May 13, 2024

I've tested the most recent version of this PR and I do get an error. I made a separate issue because I also get the error in pipeline version 2.0.x #542 . The error is this:

[a8/b58d57] NFCORE_RAREDISEASE:RAREDISEASE:CALL_MOBILE_ELEMENTS:ME_INDEX_SPLIT_ALIGNMENT (NA12878)          [100%] 25 of 25, cached: 25 ✔

Join mismatch for the following entries:
- key=[id:NA12878, sample:NA12878, lane:1, sex:2, phenotype:2, paternal:0, maternal:0, case_id:NA12878, num_lanes:1, read_group:'@RG\tID:NA12878\tPL:illumina\tSM:NA12878', single_end:false, interval:chr13, nr_of_intervals:25] values=[/data0/paalmbj/na12878_med_pipeline/work/83/20f04adedf5dc4bb5ee4b6b85e262c/NA12878_chr13.bam, /data0/paalmbj/na12878_med_pipeline/work/d0/156dd6f94e30b5c404ec292aad5b8c/NA12878_chr13.bam.bai]

@ramprasadn ramprasadn changed the base branch from master to dev May 17, 2024 08:01
@ramprasadn ramprasadn changed the title Patch Refactoring + other issues with cadd and samtools merge May 17, 2024
@ramprasadn ramprasadn marked this pull request as ready for review May 17, 2024 08:57
Copy link
Collaborator

@jemten jemten left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Overzealous html escapes in VCF?
4 participants