-
Notifications
You must be signed in to change notification settings - Fork 42
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Dev ratio #288
base: dev
Are you sure you want to change the base?
Conversation
Coda workflow. Files that are unique to the coda workflow have been added to the corresponding folders (conf, modules, subworkflows...). Files that are present in the pipeline have been added as ***_coda to avoid overwriting the original ones (modules_coda.config, nextflow_coda.config, main_coda.nf... etc).
Update dev-ratio with changes from dev branch
Include CoDa workflow in main workflow
- use contrast file information for differential analysis with PropD
add GPROFILER2 module to ENRICHMENT subworkflow
Co-authored-by: WackerO <[email protected]>
Co-authored-by: WackerO <[email protected]>
Co-authored-by: WackerO <[email protected]>
322 add module deseq2
move pathway related logic out of differential and correlation subworkflows
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we may have a difference of opinion here, but I really think components should be kept simple, and complexity around channel manipulations should be pushed up to workflows, it's just the nf-core way.
I think I'm going to have a go at pushing the differential workflow up into nf-core/modules (in simplified form).
take: | ||
ch_counts // [ meta_exp, counts ] with meta keys: method, args_diff | ||
ch_samplesheet // [ meta_exp, samplesheet ] | ||
ch_contrasts // [ meta_contrast, contrast_variable, reference, target ] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ch_contrasts // [ meta_contrast, contrast_variable, reference, target ] | |
ch_contrasts // [ meta_contrast, contrast_variable, reference, target ] | |
method // limma, deseq2 ... |
ch_counts | ||
.branch { | ||
propd: it[0]["method"] == "propd" | ||
deseq2: it[0]["method"] == "deseq2" | ||
limma: it[0]["method"] == "limma" | ||
} | ||
.set { ch_counts } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ch_counts | |
.branch { | |
propd: it[0]["method"] == "propd" | |
deseq2: it[0]["method"] == "deseq2" | |
limma: it[0]["method"] == "limma" | |
} | |
.set { ch_counts } |
Sorry, I still don't like this. We should keep the interface simple rather than forcing people to stuff things into meta maps.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't see your problem. The meta is existing in every nf-core module and subworkflow and from what I see it is also not uncommon to use the meta information for downstream analysis decisions. Take for example the single_end flag in RNAseq, ChIPseq, ATACseq,... Both modules (e.g. trimgalore ) and subworkflows are using it, and thus forcing the flag to exist. This is even also used for branching.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The single-end thing is a special case, we try and discourage the propagation of assumptions about fields in the meta.
The branching you point at is necessitated by the need to direct outputs form the first processes to the right place. It's not engineered in to the inputs. If I were writing that subworkflow without that first process I'd be saying the same thing- the single end status would be a straighforward subworkflow input.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The point here is that the tool status has nothing to do with the count file. You're forcing people to build it into the meta to suit the use case, but it would just make life harder for other users of the subworkflow.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
|
||
workflow DIFFERENTIAL { | ||
take: | ||
ch_counts // [ meta_exp, counts ] with meta keys: method, args_diff |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ch_counts // [ meta_exp, counts ] with meta keys: method, args_diff | |
ch_abundance // [ meta_exp, counts ] with meta keys: method, args_diff |
This won't always be counts- e.g. for arrays it will be 'intensities', for PROTEUS it will be whatever that techology uses.
// Perform differential analysis with propd | ||
// ---------------------------------------------------- | ||
|
||
// TODO propd currently don't support blocking, so we should not run propd with same contrast_variable, reference and target, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can assume that will be resolved at the workflow level (filtering the contrasts channel)
include { DESEQ2_DIFFERENTIAL } from '../../../modules/nf-core/deseq2/differential/main' | ||
include { DESEQ2_DIFFERENTIAL as DESEQ2_NORM } from "../../../modules/nf-core/deseq2/differential/main" | ||
include { LIMMA_DIFFERENTIAL } from '../../../modules/nf-core/limma/differential/main' | ||
include { FILTER_DIFFTABLE as FILTER_DIFFTABLE_LIMMA } from '../../../modules/local/filter_difftable' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This subworkflow will never run BOTH Limma AND DESeq2, so you don't need to import and call it twice. You should output DESeq2 + Limma to the same channels, which get filtered by the same process.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
but that is the idea with the multiply running options no?
If the user wants to run --pathway deseq2_gsea,limma_gsea
Both deseq2 and limma would be run, but the output would be channeled into different places.
How would you deal with that instead?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ahh yes, I think maybe this is down to our difference in thoughts on architecture.
In my view the subworkflow would run twice to make that happen.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ahhhhh so you would have import various subworkflows (eg DIFFERENTIAL_DESEQ2, DIFFERENTIAL_LIMMA) inside the workflow and run them.... actually that is nice :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yep. We keep all components as simple as possible, and manage complexity via channel operations in the calling workflow.
'community.wave.seqera.io/library/bioconductor-limma_r-ggplot2_r-propr:17abd3f137436739' }" | ||
|
||
input: | ||
tuple val(meta), path(count), path(samplesheet), val(contrast_variable), val(reference), val(target) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is very un-nf-core, and doesn't match the analagous differential modules. I think this will need to be split for nf-core
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
okay that is not a problem
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
even though why this is not recommended by nf-core?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ahh- has there been a change? Could you point at that guideline?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
not sure if there a change/guideline for that...that is why i asked because i was not sure why you say this is un-nf-core.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I mean, the convention in modules is not to combine everything into a single input channel, but only to combine related things. You could copy the interface of the other differential modules, which would have better consistency and would prevent us having to do special operations to fit this module.
|
||
// conda "${moduleDir}/environment.yml" | ||
container "${ workflow.containerEngine == 'singularity' && !task.ext.singularity_pull_docker_container ? | ||
'oras://community.wave.seqera.io/library/bioconductor-limma_r-ggplot2_r-propr:209490acb0e524e3' : |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We need to use the https versions of the singularity URIs for Seqera Containers. It's a bit of a pain, but here's how: https://nfcore.slack.com/archives/CJRH30T6V/p1729023924708159?thread_ts=1729023915.530259&cid=CJRH30T6V
…nd not arrayLists
…annels and not arrayLists" This reverts commit 655ad61.
Fix tests; add version channels; remove params access in subworkflow
311 add module gsea
PR checklist
nf-core lint
).nf-test test main.nf.test -profile test,docker
).nextflow run . -profile debug,test,docker --outdir <OUTDIR>
).docs/usage.md
is updated.docs/output.md
is updated.CHANGELOG.md
is updated.README.md
is updated (including new tool citations and authors/contributors).