Spritz commandline usage
Anthony edited this page Aug 4, 2021
·
16 revisions
If you are working on a protected-access machine, please perform the external Spritz setup before running Spritz.
1. Install or activate `conda`, such as by installing miniconda3.
2. Install `git` with `conda install git`.
3. Clone Spritz with `git clone https://github.com/smith-chem-wisc/Spritz.git; cd Spritz/Spritz/workflow/`.
4. Create a `conda` environment for Spritz by running `conda env create --name spritzbase --file envs/spritzbase.yaml; conda activate spritzbase`.
5. Adapt the `config/config.yaml` file manually. Briefly:
   - Specify your analysis directory, which should have any input FASTQ files, and which will be used for saving output. If you downloaded SRAs externally in Spritz setup, make sure the SRA
   - Please place the data you intend to use in `sra`, `fq`, `fq_se`, `sra_se`. Leave empty the ones you don't intend to use; for example, `sra: []` indicates you do not intend to use paired-end SRAs.
   - Please note that FASTQ filenames should be located in the specified analysis directory. Input FASTQs must have a filename with the format `{prefix}_1.fastq`, and the prefixes should be listed in the `fq` or `fq_se` fields, respectively. The filenames themselves should not be listed, just the prefixes.
   - Specify the organism, genome version, and gene model version.
6. Run Spritz with `snakemake -j {threads} --use-conda --conda-frontend mamba --resources mem_mb={memory_megabytes}`, where `{threads}` and `{memory_megabytes}` are replaced with your specifications. For example, this would be `snakemake -j 24 --use-conda --conda-frontend mamba --resources mem_mb=100000` if using 24 threads and 100 GB of RAM.
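Before launching the run, the FASTQ naming rule from the configuration step can be checked with a few lines of shell. The following is a minimal sketch, not part of Spritz: `check_fastq_prefixes` is a hypothetical helper name, and it only tests for the `{prefix}_1.fastq` pattern stated above.

```shell
#!/usr/bin/env bash
# Sketch: confirm that each prefix listed under fq / fq_se in config.yaml
# has a matching {prefix}_1.fastq file in the analysis directory.
# check_fastq_prefixes is a hypothetical helper, not a Spritz command.
check_fastq_prefixes() {
  local dir="$1"; shift          # first argument: analysis directory
  local missing=0 prefix
  for prefix in "$@"; do         # remaining arguments: FASTQ prefixes
    if [ -f "${dir}/${prefix}_1.fastq" ]; then
      echo "found   ${dir}/${prefix}_1.fastq"
    else
      echo "missing ${dir}/${prefix}_1.fastq"
      missing=1
    fi
  done
  return "$missing"              # 0 if every file exists, 1 otherwise
}
```

For example, `check_fastq_prefixes /path/to/analysis sampleA sampleB` (placeholder directory and prefixes) reports one line per prefix and returns a non-zero status if any file is absent.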
For the external Spritz setup, use a machine that can reach the URLs Spritz requires to perform its setup:
http://www.uniprot.org
https://api.nuget.org
http://ftp.ensembl.org
https://ftp.ncbi.nih.gov/
https://github.com/
You can test whether your analysis machine can access these addresses by running `ping www.uniprot.org`, and likewise for the other hosts. Note that `ping` takes a hostname, not a full URL.
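Because ICMP is sometimes blocked on networks that still allow HTTP(S), an HTTP-level check may be more representative than `ping`. The sketch below assumes `curl` is installed; `check_spritz_urls` is a hypothetical helper name, and the 10-second timeout is an arbitrary choice.

```shell
#!/usr/bin/env bash
# The URLs Spritz needs for setup, as listed above.
SPRITZ_URLS=(
  "http://www.uniprot.org"
  "https://api.nuget.org"
  "http://ftp.ensembl.org"
  "https://ftp.ncbi.nih.gov/"
  "https://github.com/"
)

# Hypothetical helper: report which URLs answer an HTTP HEAD request.
check_spritz_urls() {
  local url failed=0
  for url in "${SPRITZ_URLS[@]}"; do
    # --head sends a HEAD request; --max-time caps each attempt at 10 s.
    # Without --fail, curl exits 0 for any HTTP response, so this tests
    # reachability only, not whether the server returned a 2xx status.
    if curl --silent --head --max-time 10 --output /dev/null "$url"; then
      echo "OK   $url"
    else
      echo "FAIL $url"
      failed=1
    fi
  done
  return "$failed"   # 0 if all URLs were reachable, 1 otherwise
}
```

Calling `check_spritz_urls` prints one `OK` or `FAIL` line per URL. If any line reads `FAIL`, the external setup described here is likely the safer route.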
1. Install or activate `conda`, such as by installing miniconda3.
2. Install `git` with `conda install git`.
3. Clone Spritz with `git clone https://github.com/smith-chem-wisc/Spritz.git; cd Spritz/Spritz/workflow/`.
4. Create a `conda` environment for setting up Spritz by running `conda env create --name spritzbase --file envs/spritzbase.yaml; conda activate spritzbase`.
5. Specify SRAs, organism, and gene model version in the `config/config.yaml` file manually. Briefly:
   - Please place the data you intend to use in `sra`, `fq`, `fq_se`, `sra_se`. Leave empty the ones you don't intend to use; for example, `sra: [SRR629563]` specifies to download this SRA, and `sra: []` indicates you do not intend to use paired-end SRAs.
   - Specify the organism, genome version, and gene model version.
   - If you intend to use your own FASTQs, specify them in the next section after setting up Spritz and copying it to your analysis server.
6. Run `snakemake -j {threads} --use-conda --conda-frontend mamba ../resources/setup.txt` to set up Spritz, with `{threads}` replaced with the number of threads on your machine. For example, use `snakemake -j 16 --use-conda --conda-frontend mamba ../resources/setup.txt` if 16 threads are available.
7. Run `cd ../../../` to exit the Spritz folder.
8. Bundle and compress Spritz with `tar cvzf Spritz.tar.gz Spritz`.
9. Copy `Spritz.tar.gz` to the server for your analysis.
10. Uncompress Spritz on the analysis server with `tar xvzf Spritz.tar.gz`.
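The bundling, copying, and unpacking steps can be wrapped in two small functions. This is a sketch under assumptions: `bundle_spritz` and `unpack_spritz` are hypothetical helper names, the `v` (verbose) flag is dropped to keep output short, and the `scp` destination is a placeholder for whatever transfer method your site allows.

```shell
#!/usr/bin/env bash
# bundle_spritz and unpack_spritz are hypothetical helper names.

# Run on the setup machine, from the directory that contains the Spritz folder.
bundle_spritz() {
  tar czf Spritz.tar.gz Spritz &&
    tar tzf Spritz.tar.gz > /dev/null   # listing the archive confirms it is readable
}

# Run on the analysis server, in the directory holding Spritz.tar.gz.
unpack_spritz() {
  tar xzf Spritz.tar.gz
}

# The copy step is shown as a comment because the destination is site-specific
# (user, host, and path here are placeholders):
#   scp Spritz.tar.gz user@analysis-server:/path/to/workdir/
```

Listing the archive right after creating it is a cheap way to catch a truncated or corrupt bundle before it is transferred to a machine without internet access.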
After this setup, you can follow the steps above, starting at step 5. Some notes:
- You may need to run `module load conda` instead of downloading and installing miniconda.
- You may need to run `snakemake` using a SLURM command, such as `srun -A sens2020### -t 2-0 -c 16 snakemake -j 16 --resources mem_mb=112000`.
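In the SLURM example, the CPU count requested from `srun` (`-c`) matches the job count given to `snakemake` (`-j`). One way to keep the two in sync is to build the command from a single thread value. The helper below is a hypothetical sketch, not part of Spritz; it only prints the command so it can be inspected before being run.

```shell
#!/usr/bin/env bash
# spritz_srun_cmd is a hypothetical helper that assembles the srun command
# from an account, a time limit, a thread count, and a memory limit in MB.
spritz_srun_cmd() {
  local account="$1" walltime="$2" threads="$3" mem_mb="$4"
  # The same thread count is passed to srun (-c) and to snakemake (-j).
  echo "srun -A ${account} -t ${walltime} -c ${threads} snakemake -j ${threads} --resources mem_mb=${mem_mb}"
}
```

Calling `spritz_srun_cmd 'sens2020###' 2-0 16 112000` reproduces the command from the note above. Replace the account, time limit, thread count, and memory with the values for your own allocation.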