Nextflow Conversion of BlastSimilarityTask

THIS REPO IS 🚧 UNDER CONSTRUCTION 🚧 AND NOT USED IN ANY PRODUCTION CODE

blastSimilarity

flowchart TD
    p0((Channel.fromPath))
    p1([splitFasta])
    p2(( ))
    p3[nonConfiguredDatabase:createDatabase]
    p4(( ))
    p5[nonConfiguredDatabase:blastSimilarity]
    p6([collectFile])
    p7(( ))
    p8([collectFile])
    p9(( ))
    p10([collectFile])
    p11(( ))
    p0 --> p1
    p1 -->|seqs| p5
    p2 -->|newdb.fasta| p3
    p3 --> p5
    p4 -->|fastaName| p5
    p5 --> p6
    p5 --> p8
    p5 --> p10
    p6 --> p7
    p8 --> p9
    p10 --> p11

Explanation of nextflow.config file parameters:

param	value type	description
blastProgram	string	Name of NCBI blast tool you want to run
seqFile	string	Path to input file
preConfiguredDatabase	boolean	If you already have database files generated by NCBI's makeblastdb, there is no need to generate them again. If this is set to true, you must supply databaseDir and databaseBaseName.
databaseDir	string	The path to the directory containing the database files. There can be other files in this directory, but any file beginning with the databaseBaseName will be brought into the process.
databaseBaseName	string	The root name of your database files. For example, "newdb.fasta" would be used for the files in blastSimilarity/data/database.
databaseFasta	string	The location of the fasta file that you would like to use to create your database. Needed if preConfiguredDatabase is false.
databaseType	string	The type of database you are using. Either "prot" or "nucl". Only needed if preConfiguredDatabase is false.
dataFile	string	How you would like the main output file to be named.
logFile	string	How you would like the log file to be named.
outputDir	string	Path to where you would like output files stored
saveAllBlastFiles	boolean	If true, the blast output from each blast run is saved. If you have 9 sequences in your input file and fastaSubsetSize is 1, you will receive 9 zipped files. If fastaSubsetSize is 3, you will receive 3 zipped files. Each zipped file is named after the sequence identifier of the first sequence in the group being run that is put into the file (this sequence will also be the last in the zip file).
saveGoodBlastFiles	boolean	Similar to saveAllBlastFiles, except only files that contain a hit will be saved. saveGoodBlastFiles and saveAllBlastFiles should not both be true.
doNotParse	boolean	This tool operates in two steps: running blast, then grepping through the output to collect and return values. If doNotParse is true, only the blast output is generated and returned. If false, the output continues on to the processing step.
printSimSeqsFile	boolean	Changes the output format of dataFile. Returns the sequence accession from seqFile, the taxon it matched with from the database, the p-value, the exponent for the p-value, and some stats per identity and per match.
blastParamsFile	string	The path to the file containing additional blast parameters. These can be written out in the file as if you were using them on the command line.
fastaSubsetSize	integer	Number of sequences per split of seqFile passed to the blastSimilarity process.
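
As an illustration of how the parameters above fit together, a config file might look like the sketch below. This is not the repository's actual nextflow.config: it assumes the parameters live in the standard Nextflow `params` scope, and every path, file name, and the `blastp` value are hypothetical placeholders. Only the parameter names are taken from the table.

```groovy
// Hypothetical example config; all values are placeholders, not repository defaults.
params {
    blastProgram          = 'blastp'                    // assumed value; use the NCBI blast tool you need
    seqFile               = '/path/to/input.fasta'      // hypothetical input path

    // Use an existing makeblastdb database instead of building one.
    preConfiguredDatabase = true
    databaseDir           = '/path/to/database'         // hypothetical directory
    databaseBaseName      = 'newdb.fasta'               // files starting with this name are used

    // Only needed when preConfiguredDatabase is false:
    // databaseFasta      = '/path/to/database.fasta'
    // databaseType       = 'prot'                      // "prot" or "nucl"

    dataFile              = 'blastSimilarity.out'       // hypothetical output file name
    logFile               = 'blastSimilarity.log'       // hypothetical log file name
    outputDir             = '/path/to/results'          // hypothetical output directory

    // These two should not both be true.
    saveAllBlastFiles     = false
    saveGoodBlastFiles    = true

    doNotParse            = false                       // false: continue to the parsing step
    printSimSeqsFile      = false
    blastParamsFile       = '/path/to/blastParams.txt'  // hypothetical path
    fastaSubsetSize       = 3                           // 9 input sequences -> 3 blast runs
}
```

A file like this would be passed to the run command with the -c option.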

Get Started

Install Nextflow

curl https://get.nextflow.io | bash

Run the script

nextflow run VEuPathDB/blastSimilarity -with-trace -c <config_file> -r main
