Benchmark

Bo Han edited this page Aug 8, 2014 · 9 revisions

Benchmark of piPipes

This document presents the running time and disk space used by piPipes.

Details

small RNA-seq

We randomly sampled N million reads (N = 1, 4, 7, ..., 25) from an unpublished HiSeq SE50 small RNA-seq library with 27,990,838 reads (23,712,713 mappable to dm3) and ran the piPipes small RNA pipeline with 8 CPUs. Duration was determined with date (one timestamp before and one after each run) and disk space with du.

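# Sample 1, 4, 7, ..., 25 million reads in turn.
for i in `seq 1 3 26`; do 
  # Subsample ${i} million reads with a random seed and compress them.
  seqtk sample -s$((RANDOM%100)) $SMALLRNAFQ ${i}000000 | \
  gzip > ${i}M.fq.gz && \
  # Record the start time, run the pipeline, then record the end time.
  date > ${i}.time && \
  piPipes small \
    -i ${i}M.fq.gz \
    -g dm3 \
    -o ${i}M.out && \
  date >> ${i}.time && \
  # Record the total size of the output directory, then clean up.
  du -sh ${i}M.out > ${i}.size && \
  rm -rf ${i}M.out ${i}M.fq.gz
done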
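
Each N.time file written by the loop above holds two date lines, one from before the run and one from after. A minimal sketch of turning such a file into an elapsed time in seconds follows. The function name elapsed_seconds is hypothetical and not part of piPipes, and the sketch assumes GNU date, whose -d option can parse the default date output in the C locale.

```shell
# elapsed_seconds FILE (hypothetical helper, not part of piPipes):
# print the number of seconds between the first and last timestamp
# lines of FILE, as written by `date` before and after a run.
# Assumes GNU date; BSD/macOS date does not support -d in this form.
elapsed_seconds() {
  local start end
  start=$(date -d "$(head -n 1 "$1")" +%s) || return 1
  end=$(date -d "$(tail -n 1 "$1")" +%s) || return 1
  echo $((end - start))
}
```

For example, `elapsed_seconds 4.time` would report the wall-clock duration of the 4-million-read run.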

RNA-seq

We randomly sampled N million reads (N = 1 to 15) from an unpublished HiSeq PE100 RNA-seq library with 15,963,640 reads (430,376 mapped to rRNA and 14,710,380 mappable to dm3) and ran the piPipes rna pipeline with 8 CPUs. Duration was determined with date (one timestamp before and one after each run) and disk space with du.

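for i in `seq 1 15`; do 
  # Use the same seed for both mates so that the sampled reads stay paired.
  seed=$((RANDOM%100))
  seqtk sample -s$seed $RNASEQFQ1 ${i}000000 | \
    gzip > ${i}M.r1.fq.gz && \
  seqtk sample -s$seed $RNASEQFQ2 ${i}000000 | \
    gzip > ${i}M.r2.fq.gz && \
  # Record the start time, run the pipeline, then record the end time.
  date > ${i}.time && \
  piPipes rna -l ${i}M.r1.fq.gz -r ${i}M.r2.fq.gz -g dm3 -o ${i}M.out && \
  date >> ${i}.time && \
  # Record the total size of the output directory, then clean up.
  du -sh ${i}M.out > ${i}.size && \
  rm -rf ${i}M.out ${i}M.r1.fq.gz ${i}M.r2.fq.gz 
done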
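
Once the runs finish, each N.size file holds a single du line (size, a tab, then the directory name). A small sketch of collecting these into one table, sorted by the number of reads, is given below. The function name summarize_sizes is hypothetical, and the sketch assumes all N.size files sit in one directory.

```shell
# summarize_sizes DIR (hypothetical helper, not part of piPipes):
# print "reads<TAB>size" for every N.size file in DIR, sorted by N.
# Each N.size file is expected to hold one line of `du` output,
# i.e. the size and the directory name separated by a tab.
summarize_sizes() {
  local f n
  for f in "$1"/*.size; do
    [ -e "$f" ] || continue            # skip if the glob matched nothing
    n=$(basename "$f" .size)           # number of millions of reads
    printf '%sM\t%s\n' "$n" "$(cut -f1 "$f")"
  done | sort -n
}
```

Calling `summarize_sizes .` in the benchmark directory prints one row per run, which is convenient for plotting output size against input size.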