CCS Header File #21
Comments
Hi Danilo, We did not define our own CCS header files. The header file used by Scallop-LR is obtained by concatenating the header lines (i.e., the lines starting with >) of the full-length and non-full-length .fasta files. (An example of such a header file is available here: https://github.com/Kingsford-Group/scallop/tree/isoseq.) These full-length and non-full-length .fasta files (usually named isoseq_flnc.fasta and isoseq_nfl.fasta) are produced by running the PacBio SMRT Link software. Best,
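The concatenation described in the reply above can be sketched in a few lines of Python. This is only an illustration based on that description, not a script from the Scallop-LR authors: the function name `write_ccs_header_file` and the output name `ccs_headers.txt` are hypothetical, and it assumes the header lines should be copied verbatim, including the leading `>`.

```python
from typing import Iterable


def write_ccs_header_file(fasta_paths: Iterable[str], out_path: str) -> int:
    """Copy every FASTA header line (a line starting with '>') to out_path.

    Files are processed in the order given, so the result is the
    concatenation of the header lines of each input file.
    Returns the number of header lines written.
    """
    count = 0
    with open(out_path, "w") as out:
        for fasta_path in fasta_paths:
            with open(fasta_path) as fasta:
                for line in fasta:
                    if line.startswith(">"):
                        # Normalise the line ending so each header ends in "\n",
                        # even if it is the last line of a file without one.
                        out.write(line.rstrip("\r\n") + "\n")
                        count += 1
    return count


# Example call (assumes the usual SMRT Link output names mentioned above;
# "ccs_headers.txt" is a hypothetical output name):
# write_ccs_header_file(["isoseq_flnc.fasta", "isoseq_nfl.fasta"], "ccs_headers.txt")
```

Compare the result against the example header file linked above before passing it to Scallop-LR, since the exact expected format is not specified in this thread.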
Hi Danilo, In our GitHub repository “Long-Read Transcript Assembly Analysis” (https://github.com/Kingsford-Group/lrassemblyanalysis), we provide scripts to generate classified CCS reads (full-length and non-full-length CCS reads) from PacBio raw reads, as well as a script that automatically generates the CCS header file and runs Scallop-LR. Please refer to the “Analyze a BioSample-based Dataset with Iso-Seq Analysis, Scallop-LR, and StringTie” section of the README in that repository for detailed descriptions of how to run these scripts. However, if you start with existing classified CCS reads and would like to run Scallop-LR on them, you can simply use the following command to create the CCS header file (and then input the CCS header file and the alignments of the classified CCS reads into Scallop-LR): Best,
Hi Minghu and Laura, Best,
Hi Feng, I would recommend using SMRT Link v5.1.0. Best,
Hi there,
Hi Yizhong, I looked at the link where we originally downloaded SMRT Link v5.1.0, but it looks like PacBio has removed that download link. Unfortunately, I could not find a current download site for SMRT Link v5.1.0 by searching the internet. I did find the PacBio SMRT Link v5.1.0 Archives (https://www.pacb.com/asset_tags/smrt-link-v5-1-0/), which do not seem to contain a download link; however, there is a “Contact Us” (or “Ask a Question”) button that opens a short form you can submit to ask PacBio where to download SMRT Link v5.1.0. I think SMRT Link v6.0.0 also saves non-full-length reads in its output. About the primers: we use the Classify tool in SMRT Link (v5.1.0) to obtain full-length and non-full-length CCS reads, and the Classify tool also removes primers from the reads during classification. So the full-length and non-full-length CCS reads given as input to Scallop-LR no longer contain primers. The Classify tool further classifies full-length reads as either artificial-concatemer chimeric reads or non-chimeric reads, and it outputs only the full-length non-artificial-concatemer reads. So the full-length CCS reads input to Scallop-LR are non-artificial-concatemer reads. Best,
Hi,
Is there documentation on how the CCS header file required for scallop-lr needs to be formatted?