Homologous Recombination Deficiency (HRD) Prediction Workflow using sig.tools
```
java -jar cromwell.jar run hrDetect.wdl --inputs inputs.json
```
Parameter | Value | Description |
---|---|---|
outputFileNamePrefix | String | Sample name, matching the tumor sample in the .vcf |
structuralVcfFile | File | Input VCF file of structural variants (e.g. from delly) |
smallsVcfFile | File | Input VCF file of SNVs and indels (small mutations) (e.g. from mutect2) |
smallsVcfIndex | File | Index file for smallsVcfFile |
segFile | File | Segmentation file, used to estimate the number of segments in loss of heterozygosity (LOH) (e.g. from sequenza) |
reference | String | Reference genome version |
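
The required parameters above map onto the `inputs.json` file passed to Cromwell. The following is a minimal sketch of generating that file, not the project's own tooling. It assumes the workflow is named `hrDetect` (so keys take the form `hrDetect.<parameter>`) and that the index is a `.tbi` file next to the small-variant VCF; the sample name, paths, and the `hg38` value are hypothetical placeholders.

```python
import json


def build_inputs(sample, structural_vcf, smalls_vcf, seg_file, reference,
                 workflow="hrDetect"):
    """Return a Cromwell inputs mapping for the required parameters."""
    required = {
        "outputFileNamePrefix": sample,
        "structuralVcfFile": structural_vcf,
        "smallsVcfFile": smalls_vcf,
        # Assumption: the index sits next to the VCF with a .tbi suffix.
        "smallsVcfIndex": smalls_vcf + ".tbi",
        "segFile": seg_file,
        "reference": reference,
    }
    # Assumption: keys are qualified by the workflow name, here "hrDetect".
    return {f"{workflow}.{name}": value for name, value in required.items()}


if __name__ == "__main__":
    # All values below are hypothetical placeholders.
    inputs = build_inputs(
        sample="SAMPLE_01",
        structural_vcf="/data/SAMPLE_01.delly.vcf",
        smalls_vcf="/data/SAMPLE_01.mutect2.vcf.gz",
        seg_file="/data/SAMPLE_01.sequenza.seg",
        reference="hg38",
    )
    print(json.dumps(inputs, indent=2))
```

Redirecting the printed JSON to `inputs.json` gives a file that can be passed to the Cromwell command shown earlier.
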
Parameter | Value | Default | Description |
---|---|---|---|
filterStructural.modules | String | "bcftools/1.9" | Required environment modules |
filterStructural.structuralQUALfilter | String | "PASS" | FILTER value of calls to keep, e.g. PASS |
filterStructural.jobMemory | Int | 5 | Memory allocated for this job (GB) |
filterStructural.threads | Int | 1 | Requested CPU threads |
filterStructural.timeout | Int | 1 | Hours before task timeout |
filterINDELs.VAF | Float | 0.01 | Minimum variant allele frequency required to retain a variant |
filterINDELs.QUALfilter | String | "FILTER~'haplotype' \| FILTER~'clustered_events' \| ... | Expression for calls to exclude, passed to bcftools filter -e |
filterINDELs.jobMemory | Int | 10 | Memory allocated for this job (GB) |
filterINDELs.threads | Int | 1 | Requested CPU threads |
filterINDELs.timeout | Int | 2 | Hours before task timeout |
filterSNVs.VAF | Float | 0.01 | Minimum variant allele frequency required to retain a variant |
filterSNVs.QUALfilter | String | "FILTER~'haplotype' \| FILTER~'clustered_events' \| ... | Expression for calls to exclude, passed to bcftools filter -e |
filterSNVs.jobMemory | Int | 10 | Memory allocated for this job (GB) |
filterSNVs.threads | Int | 1 | Requested CPU threads |
filterSNVs.timeout | Int | 2 | Hours before task timeout |
hrdResults.modules | String | "sigtools/2.4.1 sigtools-data/1.0 hrdetect-rscript/1.5.8" | Required environment modules |
hrdResults.sigtoolrScript | String | "$HRDETECT_RSCRIPT_ROOT/scripts/sigTools_runthrough.R" | .R script that runs sigtools |
hrdResults.SVrefSigs | String | "$SIGTOOLS_DATA_ROOT/RefSigv0_Rearr.tsv" | Reference signatures for SVs |
hrdResults.SNVrefSigs | String | "$SIGTOOLS_DATA_ROOT/COSMIC_v1_SBS_GRCh38.txt" | Reference signatures for SNVs |
hrdResults.sigtoolsBootstrap | Int | 200 | Number of bootstraps for sigtools |
hrdResults.indelCutoff | Int | 50 | Minimum number of indels required to run the analysis |
hrdResults.jobMemory | Int | 50 | Memory allocated for this job (GB) |
hrdResults.threads | Int | 1 | Requested CPU threads |
hrdResults.timeout | Int | 15 | Hours before task timeout |
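
Optional task parameters can be overridden in the same `inputs.json`. The sketch below checks a set of overrides against a subset of the defaults from the table and qualifies the keys. It assumes the workflow name `hrDetect` as the key prefix; the `DEFAULTS` dictionary and the `task_overrides` helper are illustrative names, not part of the workflow.

```python
# Subset of the defaults listed in the table above, keyed as <task>.<parameter>.
DEFAULTS = {
    "filterStructural.structuralQUALfilter": "PASS",
    "filterStructural.jobMemory": 5,
    "filterINDELs.VAF": 0.01,
    "filterSNVs.VAF": 0.01,
    "hrdResults.sigtoolsBootstrap": 200,
    "hrdResults.indelCutoff": 50,
    "hrdResults.jobMemory": 50,
    "hrdResults.timeout": 15,
}


def task_overrides(overrides, workflow="hrDetect"):
    """Validate overrides against DEFAULTS and return workflow-qualified keys."""
    result = {}
    for name, value in overrides.items():
        if name not in DEFAULTS:
            raise KeyError(f"unknown optional parameter: {name}")
        expected = type(DEFAULTS[name])
        # Accept an int where a Float is expected, as JSON numbers allow.
        ok = isinstance(value, expected) or (
            expected is float and isinstance(value, int)
        )
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(value, bool) or not ok:
            raise TypeError(f"{name} expects {expected.__name__}")
        # Assumption: keys are qualified by the workflow name, here "hrDetect".
        result[f"{workflow}.{name}"] = value
    return result


if __name__ == "__main__":
    print(task_overrides({"hrdResults.sigtoolsBootstrap": 500,
                          "filterSNVs.VAF": 0.05}))
```

The returned mapping can be merged into the required inputs before writing the JSON file.
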
Output | Type | Description | Labels |
---|---|---|---|
hrd_signatures | File | JSON file of hrdetect signatures | vidarr_label: hrd_signatures |
SBS_exposures | File | JSON of single base substitution (SBS) signature exposures | vidarr_label: SBS_exposures |
SV_exposures | File | JSON of structural variant (SV) signature exposures | vidarr_label: SV_exposures |
ID_catalog | File | JSON cataloguing indels | vidarr_label: ID_catalog |
This section lists the command(s) run by the hrDetect workflow.
- Running hrDetect
```
# Filter structural variants: keep calls whose FILTER matches structuralQUALfilter,
# then record the number of calls before and after filtering
set -euo pipefail
$BCFTOOLS_ROOT/bin/bcftools view -f '~{structuralQUALfilter}' ~{structuralVcfFile} > ~{outputFileNamePrefix}.structural.PASS.vcf
awk '$1 !~ "#" {print}' ~{structuralVcfFile} | wc -l >~{outputFileNamePrefix}.structural.filteringReport.txt
awk '$1 !~ "#" {print}' ~{outputFileNamePrefix}.structural.PASS.vcf | wc -l >>~{outputFileNamePrefix}.structural.filteringReport.txt
```
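
The filtering report written above holds two lines: the number of calls in the input VCF, then the number that survived filtering. A minimal sketch of reading such a report follows; the function name and the returned keys are illustrative choices, not part of the workflow.

```python
def parse_filtering_report(text):
    """Parse a two-line report: calls before filtering, then calls after."""
    lines = [line.strip() for line in text.splitlines() if line.strip()]
    if len(lines) != 2:
        raise ValueError(f"expected 2 counts, found {len(lines)}")
    total, kept = int(lines[0]), int(lines[1])
    # Guard against an empty input VCF to avoid dividing by zero.
    fraction = kept / total if total else 0.0
    return {"total": total, "kept": kept, "fraction_kept": fraction}


if __name__ == "__main__":
    # Hypothetical report contents: 120 calls in, 30 calls kept.
    print(parse_filtering_report("120\n30\n"))
```

The small-variant reports written later in the workflow have the same two-line layout, so the same parser should apply to them.
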
```
# Filter small variants (SNVs or indels, per smallType): split multiallelic sites,
# keep one variant type, drop calls matching QUALfilter, keep calls at or above the VAF threshold
set -euo pipefail
$BCFTOOLS_ROOT/bin/bcftools norm --multiallelics - --fasta-ref ~{genome} ~{difficultRegions} ~{smallsVcfFile} | \
$BCFTOOLS_ROOT/bin/bcftools filter -i "TYPE='~{smallType}'" | \
$BCFTOOLS_ROOT/bin/bcftools filter -e "~{QUALfilter}" | \
$BCFTOOLS_ROOT/bin/bcftools filter -i "(FORMAT/AD[0:1])/(FORMAT/AD[0:0]+FORMAT/AD[0:1]) >= ~{VAF}" >~{outputFileNamePrefix}.~{smallType}.VAF.vcf
# Compress and index the filtered VCF, then record call counts before and after filtering
bgzip ~{outputFileNamePrefix}.~{smallType}.VAF.vcf
tabix -p vcf ~{outputFileNamePrefix}.~{smallType}.VAF.vcf.gz
zcat ~{smallsVcfFile} | awk '$1 !~ "#" {print}' | wc -l >~{outputFileNamePrefix}.~{smallType}.filteringReport.txt
zcat ~{outputFileNamePrefix}.~{smallType}.VAF.vcf.gz | awk '$1 !~ "#" {print}' | wc -l >>~{outputFileNamePrefix}.~{smallType}.filteringReport.txt
```
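
The VAF filter in the command above divides the alternate-allele depth by the sum of reference and alternate depths, read from the `AD` field of the first sample in the VCF (`AD[0:0]` is the reference depth, `AD[0:1]` the alternate depth). The sketch below restates that arithmetic for a single biallelic record. The function names are illustrative, and treating a zero-depth site as failing is a choice made for this sketch, not something taken from the workflow.

```python
def variant_allele_fraction(ad):
    """Return alt / (ref + alt) for an AD pair, or None when depth is zero."""
    ref_depth, alt_depth = ad[0], ad[1]
    depth = ref_depth + alt_depth
    if depth == 0:
        return None
    return alt_depth / depth


def passes_vaf(ad, min_vaf=0.01):
    """Mirror the '>= VAF' test; 0.01 is the default from the parameter table."""
    vaf = variant_allele_fraction(ad)
    # Sketch-level choice: a site with no reads fails the filter.
    return vaf is not None and vaf >= min_vaf


if __name__ == "__main__":
    for ad in [(90, 10), (199, 1), (0, 0)]:
        print(ad, variant_allele_fraction(ad), passes_vaf(ad))
```

Because the comparison is `>=`, a call sitting exactly at the threshold is retained.
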
```
# Run the sigtools R script on the filtered SNV, indel, SV, and LOH inputs
set -euo pipefail
Rscript ~{sigtoolrScript} \
--sampleName ~{outputFileNamePrefix} \
--snvFile ~{snvVcfFiltered} \
--indelFile ~{indelVcfFiltered} \
--SVFile ~{SV_vcf_location} \
--LOHFile ~{lohSegFile} \
--bootstraps ~{sigtoolsBootstrap} \
--genomeVersion ~{genomeVersion} \
--indelCutoff ~{indelCutoff} \
--SVrefSigs ~{SVrefSigs} \
--SNVrefSigs ~{SNVrefSigs}
```
For support, please file an issue on the GitHub project or send an email to [email protected].
Generated with generate-markdown-readme (https://github.com/oicr-gsi/gsi-wdl-tools/)