This repository provides reproducible information on how compressible different sequences are under different data compressors.

Data Compressor | Repository | Description |
---|---|---|
bsc-m03 v0.2.1 | code | article |
bzip2 1.0.8 | code | article |
DMCompress | code | article |
GeCo2 | code | article |
GeCo3 | code | article |
JARVIS2 | code | article |
JARVIS3 | code | under review |
lzma 5.2.5 | code | article |
MemRGC | code | article |
MFCompress | code | article |
NAF | code | article |
paq8l | code | article |
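
A minimal sketch of how compressibility can be measured for any compressor in the table that reads standard input and writes standard output. It assumes that BPS, the metric the benchmark results are sorted by, stands for bits per symbol, and that a raw sequence file stores one byte per symbol. The helper name `bps` is hypothetical and is not one of this repository's scripts.

```shell
#!/usr/bin/env bash
# Sketch: bits per symbol (BPS) of a raw sequence file under a stream compressor.
# `bps` is a hypothetical helper, not part of this repository's scripts.
# Usage: bps <file> <command that compresses stdin to stdout...>
bps() {
  local file="$1"
  shift
  local n_symbols compressed_bytes

  # Assumption: one byte per symbol (raw sequence, no headers or newlines).
  n_symbols=$(( $(wc -c < "$file") ))
  if [ "$n_symbols" -eq 0 ]; then
    echo "bps: empty or unreadable input: $file" >&2
    return 1
  fi

  # Run the compressor on the file and count the bytes it emits.
  compressed_bytes=$(( $("$@" < "$file" | wc -c) ))

  # BPS = compressed bits / number of symbols.
  awk -v c="$compressed_bytes" -v n="$n_symbols" \
    'BEGIN { printf "%.4f\n", 8 * c / n }'
}
```

For example, `bps sample.seq bzip2 -9 -c` would report the BPS of a placeholder file `sample.seq` under bzip2. For a four-letter DNA alphabet, a value below 2 means the compressor does better than a plain 2-bit encoding of A, C, G, and T.
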

Change directory, grant execute permissions, and run the main script:

    cd scripts/
    chmod +x *.sh
    ./Main.sh


Alternatively, run the steps individually:

    #
    ./InstallTools.sh # install listed compressors, GTO, and AlcoR
    ./DownloadFASTA.sh # downloads FASTA files
    ./GetCassava.sh # gunzip cassava files
    ./GetAlcoRFASTA.sh # simulates and stores 2 synthetic FASTA sequences
    ./FASTA2seq.sh # cleans FASTA files and stores raw sequence files
    ./DownloadDNAcorpus.sh # download raw sequences from a balanced sequence corpus
    ./GetDSinfo.sh # map sequences into their ids, sorted by size; view sequences info
    #
    ./RunTestsExample.sh # run bench
    ./ProcessBenchRes.sh # sort results by BPS and time
    ./Plot.sh # plot sorted results

Or, to benchmark only selected sequences (here chromosome Y and Escherichia coli):

    #
    ./InstallTools.sh # install listed compressors, GTO, and AlcoR
    ./DownloadFASTA.sh -id NC_000024.1 -id NC_000913.3 # downloads CY and Escherichia Coli FASTA files
    ./FASTA2seq.sh # cleans FASTA files and stores raw sequence files
    ./GetDSinfo.sh # map sequences into their ids, sorted by size; view sequences info
    #
    ./RunTestsExample.sh # run bench
    ./ProcessBenchRes.sh # sort results by BPS and time
    ./Plot.sh # plot sorted results

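
The cleaning step performed by `FASTA2seq.sh` can be pictured roughly as below. This is only a sketch under assumptions: the helper name `fasta_to_seq` is hypothetical, and the choice to keep only A, C, G, and T (dropping N and other IUPAC codes) is an assumption that the actual script may not share.

```shell
#!/usr/bin/env bash
# Sketch: reduce a FASTA stream to a raw nucleotide sequence.
# `fasta_to_seq` is a hypothetical helper; FASTA2seq.sh may work differently.
# Reads FASTA on stdin, writes the raw sequence on stdout.
fasta_to_seq() {
  # 1. Drop header lines (those starting with '>').
  # 2. Uppercase lowercase (soft-masked) bases.
  # 3. Delete everything that is not A, C, G, or T, including newlines.
  grep -v '^>' | tr 'acgt' 'ACGT' | tr -cd 'ACGT'
}
```

A call such as `fasta_to_seq < input.fa > input.seq` (placeholder file names) would then leave a file with one byte per base and no line breaks.
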

Each script lists its implemented features when run with the -h flag:

    ./Main.sh -h
    ./CleanCandDfiles.sh -h
    ./DownloadDNAcorpus.sh -h
    ./DownloadFASTA.sh -h
    ./FASTA2seq.sh -h
    ./GetAlcorFASTA.sh -h
    ./GetCassava.sh -h
    ./GetDSinfo.sh -h
    ./InstallTools.sh -h
    ./Plot.sh -h
    ./ProcessBenchRes.sh -h
    ./Run.sh -h
    ./RunTestsExample.sh -h