RUM doesn't know it stopped #162

Open
asowalsky opened this issue Jan 22, 2013 · 11 comments
@asowalsky

I have a pair of FASTQ files from paired-end sequencing that had some adapter contamination, so I trimmed them. But since RUM doesn't like FASTQ files whose reads have different lengths (HUGE ISSUE), I padded all the reads with N's and @'s, roughly as sketched below. Before trimming, RUM had no problem (just poor mapping).
After trimming, RUM stalls after step 16. I run this on a cluster; the job is still running, but RUM thinks it is stopped.
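
For reference, the padding amounts to something like the following minimal Perl sketch (not the actual script used; the '@' pad quality character and the 50 bp target length are assumptions taken from this thread):

#!/usr/bin/env perl
# Hypothetical sketch of the padding described above: bring every read up to
# a fixed length by appending 'N' to the sequence line and '@' to the quality
# line. Assumes well-formed 4-line FASTQ records on STDIN.
use strict;
use warnings;

my $target_len = 50;   # assumed original read length

while (my $header = <STDIN>) {
    my $seq  = <STDIN>;
    my $plus = <STDIN>;
    my $qual = <STDIN>;
    die "Truncated FASTQ record at line $.\n" unless defined $qual;
    chomp($seq, $qual);
    $seq  .= 'N' x ($target_len - length $seq)  if length($seq)  < $target_len;
    $qual .= '@' x ($target_len - length $qual) if length($qual) < $target_len;
    print $header, $seq, "\n", $plus, $qual, "\n";
}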

Here's the screen output:

RUM Version v2.0.4

[ASCII-art RUM/UPENN banner, garbled in this capture, omitted]
  ____________________________________________________________
- The RNA-Seq Unified Mapper (RUM) Pipeline has been initiated -

I'm going to try to figure out how much RAM you have. If you see some error
messages here, don't worry, these are harmless.
Use of uninitialized value $params{"message"} in print at /usr/share/perl5/Log/Log4perl/Appender/File.pm line 245.
Use of uninitialized value $params{"message"} in print at /usr/share/perl5/Log/Log4perl/Appender/File.pm line 245.

It seems like you have 94.00 Gb of RAM on your machine. Unless you have too
much other stuff running, RAM should not be a problem.

Processing as paired-end data
Saving job configuration
If this is a big job, you should keep an eye on the rum_errors*.log files
in the output directory. If all goes well they should be empty. You can
also run "/home/as459/RUM_mapper/bin/rum_runner status -o
/scratch/as459/trimmed_S11-49946_RNA/G3" to check the status of the job.

Preprocessing

Processing as paired-end data
Reformatting reads file... please be patient.
Splitting fastq file into 8 chunks with separate reads and quals
*** Note: I am going to report alignments of length 35, based on a read
length of 50 . If you want to change the minimum size of alignments
reported, use the --min-length option

Processing in 8 chunks

Checking that files were split properly

And here's what happens when I poll the status:

Preprocessing

XXXXXXXX Split input files

Processing in 8 chunks

XXXXXXXX 1. Run Bowtie on genome
XXXXXXXX 2. Run Bowtie on transcriptome
XXXXXXXX 3. Merge unique mappers together
XXXXXXXX 4. Merge non-unique mappers together
XXXXXXXX 5. Make unmapped reads file for blat
XXXXXXX 6. Run BLAT
XXXXXXX 7. Merge bowtie and blat results
XXXXXXX 8. Clean up RUM files
XXXXXXX 9. Produce RUM_Unique
XXXXXXX 10. Sort RUM_Unique by location
XXXXXXX 11. Sort cleaned non-unique mappers by ID
XXXXXXX 12. Remove duplicates from NU
XXXXXXX 13. Create SAM file
XXXXXXX 14. Create non-unique stats
XXXXXXX 15. Sort RUM_NU
XXXXXXX 16. Generate quants

Postprocessing

  1. Merge RUM_NU files
  2. Make non-unique coverage
  3. Merge RUM_Unique files
  4. Compute mapping statistics
  5. Make unique coverage
  6. Finish mapping stats
  7. Merge SAM headers
  8. Concatenate SAM files
  9. Merge quants
  10. make_junctions
  11. Sort junctions (all, bed) by location
  12. Sort junctions (all, rum) by location
  13. Sort junctions (high-quality, bed) by location
  14. Get inferred internal exons
  15. Quantify novel exons

All error log files are empty. That's good.

RUM is not running.

Any ideas what is going on?

@delagoya

Couple of questions:

What sort of cluster are you running on? A grid-engine cluster? If so, did you use the --qsub flag to submit?

@asowalsky
Author

I am not splitting the chunks over multiple nodes. I am using LSF, which supports MPI, but I didn't try that here -- just splitting into 8 chunks and requesting 8 cores on the same node. RUM finished successfully with the same configuration on untrimmed reads, so I am wondering whether the trimming or padding messed up RUM.

I mentioned the cluster because I submit the job and wait for an email telling me it finished, but after 48 h the time limit expired and RUM had not finished.
I checked the log subdirectory for this job: there are 8 log files (1 for each chunk), and the 8th file is considerably smaller than the rest. Files 1-7 conclude with something like:

2013/01/21 08:25:37 clarinet001-255 19222   INFO    RUM.Workflow    FINISH  Generate quants
2013/01/21 08:25:37 clarinet001-255 19222   INFO    RUM.Workflow    Cleaning up after a workflow step

File 8's log ends with:

2013/01/21 02:43:25 clarinet001-255 32015   INFO    RUM.Script.ParseBlatOut Parsing output from blat and mdust
2013/01/21 02:46:36 clarinet001-255 32015   INFO    RUM.Script.ParseBlatOut Done parsing output; waiting for blat process 32247

So it decided to stop early?
All of the _errors logs are empty. rum.log says it split into 8 chunks but only finished 7, consistent with these files. Why would it stop on the last chunk?

In the tmp directory is a file called preproc-error-log which contains:

Use of uninitialized value $line in scalar chomp at /home/as459/RUM_mapper/bin/../bin/parsefastq.pl line 229, <$_[...]> line 191262302.
Use of uninitialized value $line in string eq at /home/as459/RUM_mapper/bin/../bin/parsefastq.pl line 230, <$_[...]> line 191262302.
ERROR: in script parsefastq.pl: something is wrong, the forward file seems to end with an incomplete record...

Could this be the culprit?

@mdelaurentis
Contributor

Can you zip or tar up the log directory and send it to me? Also, if you can
send any scripts that you used to kick off the job, that might be helpful.

Thanks,

Mike


@safisher

There is something odd in your output. Note that you have 8 chunks, but steps 6 onward report only 7 chunks.

XXXXXXXX 5. Make unmapped reads file for blat
XXXXXXX 6. Run BLAT


@greggrant
Contributor

But since RUM doesn't like FQ files with different sizes (HUGE ISSUE) I
padded all the reads with N's and @'s.

Thank you for your feedback. Agreed, this is a major limitation at this
point, and we'll fix it soon. In my testing, however, I get far better
performance trimming rather than padding. It sounds counter-intuitive, but
that's what happens.


@mdelaurentis
Contributor

Yes, you may want to check the ends of your input files. What do you see if
you run "tail" on the forward and reverse files? It seems like one of them
may be truncated.
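
For example, a minimal Perl stand-in for "tail -4" on each mate file (file names are placeholders, not from this job):

#!/usr/bin/env perl
# Print the last four lines (one FASTQ record) of each mate file without
# loading the whole file into memory. File names are illustrative.
use strict;
use warnings;

for my $path ('reads_1.fq', 'reads_2.fq') {
    open my $fh, '<', $path or die "Can't open $path: $!\n";
    my @last;
    while (my $line = <$fh>) {
        push @last, $line;
        shift @last if @last > 4;
    }
    close $fh;
    print "== tail of $path ==\n", @last;
}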


@greggrant
Contributor

RUM used to do an upfront integrity check that the two files were exactly
the same size. If that is no longer the case, it should be reinstated.
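
Such a check could be as simple as the following sketch (not RUM's actual code; file names are illustrative):

#!/usr/bin/env perl
# Sketch of an upfront paired-file integrity check: require each FASTQ's line
# count to be a multiple of 4, and require both mates to contain the same
# number of records. Not RUM's own implementation.
use strict;
use warnings;

sub fastq_records {
    my ($path) = @_;
    open my $fh, '<', $path or die "Can't open $path: $!\n";
    my $lines = 0;
    $lines++ while <$fh>;
    close $fh;
    die "$path: $lines lines is not a multiple of 4\n" if $lines % 4;
    return $lines / 4;
}

my ($fwd, $rev) = @ARGV == 2 ? @ARGV : ('reads_1.fq', 'reads_2.fq');
my ($nf, $nr)   = (fastq_records($fwd), fastq_records($rev));
die "Record counts differ: $fwd has $nf, $rev has $nr\n" if $nf != $nr;
print "OK: both files contain $nf records\n";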


@asowalsky
Author

I tried trimming first, but then the FQ files were not the same size
(probably due to cutadapt).


@asowalsky
Author

You found the problem! My post-trim padding script (really a PE fix script)
truncated the quality scores for the last entry.

I never had an issue with BWA so I never padded before. I can fix this --
thank you!


@asowalsky
Author

Or at least verify that the FQ files are formed correctly. In my case, the
last entry was missing a quality score.
A correctly formed FQ file has a number of lines that is divisible by 4 with
no remainder.
My troublesome input files had 515535726 lines, which, divided by 4, leaves
a remainder of 2 (i.e., 128883931.5).
My repaired input files now have 515535728 lines, which IS evenly divisible
by 4 with no remainder.
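
A record-level validator along these lines would catch the truncated entry directly; for example (a sketch, not anything shipped with RUM):

#!/usr/bin/env perl
# Walk a FASTQ file strictly four lines at a time, so a final record that is
# missing its quality line (the failure mode found in this thread) is
# reported immediately. Illustrative only.
use strict;
use warnings;

my $path = shift or die "usage: $0 reads.fq\n";
open my $fh, '<', $path or die "Can't open $path: $!\n";
while (my $header = <$fh>) {
    my $seq  = <$fh>;
    my $plus = <$fh>;
    my $qual = <$fh>;
    die "$path: incomplete final record (near line $.)\n" unless defined $qual;
    die "$path: bad header at line $.\n" unless $header =~ /^@/;
    chomp(my $s = $seq);
    chomp(my $q = $qual);
    die "$path: sequence/quality length mismatch at line $.\n"
        unless length($s) == length($q);
}
print "$path looks well-formed\n";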


@mdelaurentis
Contributor

It looks like the preprocessing script found that the files ended with an
incomplete record, because it wrote a message indicating that to the log
file, but somehow that error didn't propagate up and terminate the job
right away. I'll look into why that didn't happen.

I think the check that makes sure the forward and reverse FQ files are the
same size is still in place. You said that RUM initially failed because
they were different sizes, correct?
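
The usual shape of that fix is to check the child's exit status wherever the preprocessing step is launched; a sketch (not RUM's actual driver code, and the command line is a placeholder):

#!/usr/bin/env perl
# Sketch of propagating a preprocessing failure: run the child step, inspect
# Perl's $? afterwards, and abort the job instead of carrying on.
use strict;
use warnings;

my @cmd = ('perl', 'parsefastq.pl', 'reads_1.fq');   # placeholder invocation
system(@cmd);
if ($? == -1) {
    die "Failed to launch '@cmd': $!\n";
}
elsif ($? & 127) {
    die sprintf("Preprocessing killed by signal %d; aborting job\n", $? & 127);
}
elsif (my $status = $? >> 8) {
    die "Preprocessing exited with status $status; aborting job\n";
}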

