Which parameter can be used to filter the scallop result? #35

Huangyizhong · 2021-11-08T13:16:08Z

Hi there,
Scallop is good software for assembling Illumina data, and it gave me many transcripts that other tools could not. However, when I ran ORFfinder on the Scallop results to predict ORFs, I found many transcripts without the classical splice sites (GT-AG, GC-AG, or AT-AC). As shown in picture 1, the Scallop results differ from the other data: many Scallop transcripts do not use classical splice sites. Are there any parameters that can be used to filter them out? Also, as shown in picture 2, the transcript looks very strange!
Thanks so much!
Sincerely
Yizhong Huang

shaomingfu · 2021-11-08T20:05:14Z

Hi Yizhong,

Re question 1: Scallop uses exactly the splice sites reported by the aligner. So far it does not contain any model or parameter to detect or filter out poorly supported non-canonical splice sites. We will probably add such a feature in future releases. For now, you may try: 1, check whether your aligner, such as STAR or HISAT2, provides parameters to control splice sites, and/or 2, write a script of your own to filter the transcripts assembled by Scallop.
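A minimal sketch of option 2, for illustration only. It assumes the genome is available in memory as a dict mapping chromosome name to sequence (in practice it would be loaded from a FASTA file), that the GTF contains exon records carrying a transcript_id attribute, and that only GT-AG, GC-AG, and AT-AC introns should be kept. All function names here are hypothetical and are not part of Scallop. Single-exon transcripts have no introns and are kept; transcripts with an unknown strand are kept if either orientation is canonical.

```python
import re
from collections import defaultdict

# Donor/acceptor dinucleotide pairs treated as canonical (assumption from the thread).
CANONICAL = {("GT", "AG"), ("GC", "AG"), ("AT", "AC")}
_COMP = str.maketrans("ACGTacgt", "TGCAtgca")


def revcomp(seq):
    """Reverse-complement a DNA string."""
    return seq.translate(_COMP)[::-1]


def introns(exons):
    """Turn (start, end) exon pairs (1-based, inclusive) into intron intervals."""
    exons = sorted(exons)
    return [(a_end + 1, b_start - 1)
            for (_, a_end), (b_start, _) in zip(exons, exons[1:])]


def motif(genome, chrom, start, end, strand):
    """Return the (donor, acceptor) dinucleotides of one intron."""
    seq = genome[chrom]
    left = seq[start - 1:start + 1].upper()   # first two intron bases, + strand
    right = seq[end - 2:end].upper()          # last two intron bases, + strand
    if strand == "-":
        return revcomp(right), revcomp(left)
    return left, right


def is_canonical(genome, chrom, strand, exons):
    """True if every intron is canonical in at least one allowed orientation."""
    strands = [strand] if strand in ("+", "-") else ["+", "-"]
    return any(
        all(motif(genome, chrom, s, e, st) in CANONICAL for s, e in introns(exons))
        for st in strands
    )


def filter_gtf(lines, genome):
    """Return the GTF lines of transcripts whose introns are all canonical."""
    records = defaultdict(list)  # transcript_id -> all of its GTF lines
    exons = defaultdict(list)    # transcript_id -> exon intervals
    info = {}                    # transcript_id -> (chrom, strand)
    for line in lines:
        if line.startswith("#") or not line.strip():
            continue
        fields = line.rstrip("\n").split("\t")
        match = re.search(r'transcript_id "([^"]+)"', fields[8])
        if not match:
            continue
        tid = match.group(1)
        records[tid].append(line)
        info[tid] = (fields[0], fields[6])
        if fields[2] == "exon":
            exons[tid].append((int(fields[3]), int(fields[4])))
    kept = []
    for tid, recs in records.items():
        chrom, strand = info[tid]
        if is_canonical(genome, chrom, strand, exons[tid]):
            kept.extend(recs)
    return kept
```

To use it, read the Scallop GTF with open(...), pass the lines and the genome dict to filter_gtf, and write the returned lines to a new file. This checks only the intron boundary bases; it does not look at how many reads support each junction.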

Re question 2: the assembled transcripts seem strange to me too. Is this sample strand-specific? If so, did you specify library-type when running Scallop?

Best,
Mingfu

Huangyizhong · 2021-11-09T02:32:12Z

@shaomingfu Thanks so much for your quick reply! It is a pity that Scallop has no parameter to filter the splice sites. I have checked the human annotation file using gffread, and almost all of its transcripts use canonical splice sites. Maybe I can use gffread to filter the Scallop transcripts directly. How can I choose a proper read-count threshold to filter out the undesired transcripts? As shown in picture 1, the Scallop transcript has two more bases (CT) than the other data. The sample with the strange transcript I attached is not strand-specific; how should I deal with it? I just ran Scallop as follows:
${scallop} -i ${bam[$PBS_ARRAYID]} -o ${output}/${NAME}_scallop.gtf
Thanks again for your kind help.
Sincerely
Yizhong Huang
