This repository contains useful code for bioinformatics analysis of next-generation sequencing (NGS) data.
- bash_commands:
One-liners and small scripts useful in environment setup, package management, resource requirements, and NGS file processing.
- guideseq:
Tips for running the Python 2.7 build of the guideseq analysis package.
- recipes_rev_comp:
Recipe to reverse complement fastq files in batch.
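
As an illustration of what a batch reverse-complement step involves, here is a minimal Python sketch. It is not the recipe stored in `recipes_rev_comp`. The function names, the `*.fastq` glob pattern, and the directory layout are all assumptions made for this example. It assumes uncompressed, four-line FASTQ records, reverse complements each sequence, and reverses the quality string so the scores stay aligned with their bases.

```python
from pathlib import Path

# Complement table for A/C/G/T/N in both cases (other IUPAC codes are left unchanged).
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a nucleotide sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def rev_comp_fastq(src: Path, dst: Path) -> int:
    """Reverse complement every record in one FASTQ file.

    Assumes plain-text FASTQ with exactly four lines per record.
    Returns the number of records written.
    """
    count = 0
    with open(src) as fin, open(dst, "w") as fout:
        while True:
            header = fin.readline().rstrip("\r\n")
            if not header:  # end of file (or a blank line)
                break
            seq = fin.readline().rstrip("\r\n")
            plus = fin.readline().rstrip("\r\n")
            qual = fin.readline().rstrip("\r\n")
            # Reverse the qualities too, so each score still matches its base.
            fout.write(f"{header}\n{reverse_complement(seq)}\n{plus}\n{qual[::-1]}\n")
            count += 1
    return count


def rev_comp_batch(in_dir, out_dir, pattern="*.fastq"):
    """Process every file matching `pattern` in `in_dir` (hypothetical layout).

    Output files keep their original names and are written to `out_dir`.
    Returns a dict mapping file name to record count.
    """
    in_dir, out_dir = Path(in_dir), Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    results = {}
    for src in sorted(in_dir.glob(pattern)):
        results[src.name] = rev_comp_fastq(src, out_dir / src.name)
    return results
```

A call such as `rev_comp_batch("raw_reads", "rev_comp_reads")` (both directory names are placeholders) would write one converted file per input. For gzipped input or large datasets, a dedicated tool is likely a better fit; seqkit, linked below, offers `seqkit seq -r -p` for the same operation.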
Data management:
https://genomespot.blogspot.com/2021/02/storing-your-sequence-data.html
Packages:
https://bioinf.shenwei.me/seqkit/tutorial/ (seqkit)
https://sourceforge.net/projects/bbmap/ (bbmap)
https://anaconda.org/bcbio/htseq (htseq for python2.7 support)
https://www.golinuxcloud.com/tmux-commands/#Create_your_first_tmux_session (tmux tutorial; tmux is a terminal multiplexer that keeps processes running after you disconnect)
Linux:
https://www.linuxvasanth.com/category/shell-scripting/ Linux blog with useful tips
https://meta.stackexchange.com/questions/82718/how-do-i-escape-a-backtick-within-in-line-code-in-markdown Markdown tips on escaping special chars
https://stackoverflow.com/questions/36374267/how-to-fix-fatal-error-zlib-h-no-such-file-or-directory Fix for the "zlib.h: No such file or directory" error
https://bioinformatics.cvr.ac.uk/short-command-lines-for-manipulation-fastq-and-fasta-sequence-files/ Short command lines for manipulating FASTQ and FASTA sequence files
https://www.automateexcel.com/formulas/remove-numbers-from-text/ Remove numbers from a text string in Excel. See: "SUBSTITUTE Function Formula" setting.
https://en.wikibooks.org/wiki/Next_Generation_Sequencing_(NGS)/Bioinformatics_from_the_outside#The_command_line Next Generation Sequencing (NGS)/Bioinformatics from the outside