"Download GFF3" omits Parent and resets score and source #427

Open
dariober opened this issue Aug 27, 2024 · 0 comments

dariober commented Aug 27, 2024

As of commit 22db7bd, exporting (downloading) features to a GFF3 file has the following issues:

  • The source and score columns are reset to the missing value (.), and their original values are moved into the attributes column as gff_source and gff_score.
  • The Parent attribute is omitted.
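The two symptoms can be stated as a check on a single feature line before and after export. The following is only a minimal sketch, not part of Apollo: the function names parse_gff3_line and export_problems are hypothetical, and the parser assumes plain key=value attributes without handling URL-escaped characters.

```python
def parse_gff3_line(line):
    """Split one GFF3 feature line into its nine columns and an attribute dict."""
    cols = line.rstrip("\n").split("\t")
    if len(cols) != 9:
        raise ValueError(f"expected 9 tab-separated columns, got {len(cols)}")
    attrs = {}
    if cols[8] != ".":
        for pair in cols[8].split(";"):
            if pair:
                key, _, value = pair.partition("=")
                attrs[key] = value
    return cols, attrs


def export_problems(original, exported):
    """List which symptoms from this issue appear in one exported line (hypothetical helper)."""
    o_cols, o_attrs = parse_gff3_line(original)
    e_cols, e_attrs = parse_gff3_line(exported)
    problems = []
    if o_cols[1] != e_cols[1]:  # column 2 is the source
        problems.append("source changed")
    if o_cols[5] != e_cols[5]:  # column 6 is the score
        problems.append("score changed")
    if "Parent" in o_attrs and "Parent" not in e_attrs:
        problems.append("Parent omitted")
    return problems


# The EDEN.1 mRNA line from tiny.fasta.gff3, before and after export
original = "ctgA\texample\tmRNA\t100\t200\t.\t+\t.\tID=EDEN.1;Parent=EDEN;Name=EDEN.1;Note=Eden splice form 1;Index=1"
exported = "ctgA\t.\tmRNA\t100\t200\t.\t+\t.\tgff_source=example;gff_id=EDEN.1;Index=1;Name=EDEN.1;Note=Eden splice form 1"
print(export_problems(original, exported))  # ['source changed', 'Parent omitted']
```

The score of this particular line is already missing in the input, so only the source and Parent symptoms are reported for it.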

For example, the test file packages/apollo-cli/test_data/tiny.fasta.gff3 is:

ctgA	example	contig	1	50	1234567	.	.	Name=ctgA;multivalue=val1,val2,val3;aKey=Q;aNumber=987
ctgA	example	BAC	10	20	.	.	.	ID=b101.2;Name=b101.2;Note=Fingerprinted BAC with end reads
ctgA	example	SNP	10	30	0.987	.	.	ID=FakeSNP1;Name=FakeSNP;Note=This is a fake SNP that should appear at 1000 with length 1
ctgA	example	gene	100	200	.	+	.	ID=EDEN;Name=EDEN;Note=protein kinase
ctgA	example	mRNA	100	200	.	+	.	ID=EDEN.1;Parent=EDEN;Name=EDEN.1;Note=Eden splice form 1;Index=1
ctgB	someExample	contig	1	50	.	.	.	Name=SomeContig
ctgC	example	gene	100	200	.	+	.	ID=MyGene
ctgC	example	mRNA	100	200	.	+	.	ID=MyGene.1;Parent=MyGene
ctgC	example	CDS	100	170	.	+	.	ID=MyExon.1;Parent=MyGene.1
ctgC	example	gene	150	250	.	+	.	ID=AnotherGene
ctgC	example	mRNA	150	250	.	+	.	ID=mRNA.1;Parent=AnotherGene
ctgC	example	CDS	150	201	.	+	.	ID=CDS.1;Parent=mRNA.1

When exported, it becomes:

ctgA	.	contig	1	50	.	.	.	gff_score=1234567;gff_source=example;multivalue=val1,val2,val3;aKey=Q;aNumber=987;Name=ctgA
ctgB	.	contig	1	50	.	.	.	gff_source=someExample;Name=SomeContig
ctgA	.	BAC	10	20	.	.	.	gff_source=example;gff_id=b101.2;Name=b101.2;Note=Fingerprinted BAC with end reads
ctgA	.	SNP	10	30	.	.	.	gff_score=0.987;gff_source=example;gff_id=FakeSNP1;Name=FakeSNP;Note=This is a fake SNP that should appear at 1000 with length 1
ctgA	.	gene	100	200	.	+	.	gff_source=example;gff_id=EDEN;Name=EDEN;Note=protein kinase
ctgA	.	mRNA	100	200	.	+	.	gff_source=example;gff_id=EDEN.1;Index=1;Name=EDEN.1;Note=Eden splice form 1
ctgC	.	gene	100	200	.	+	.	gff_source=example;gff_id=MyGene
ctgC	.	mRNA	100	200	.	+	.	gff_source=example;gff_id=MyGene.1
ctgC	.	CDS	100	170	.	+	.	gff_source=example;gff_id=MyExon.1
ctgC	.	gene	150	250	.	+	.	gff_source=example;gff_id=AnotherGene
ctgC	.	mRNA	150	250	.	+	.	gff_source=example;gff_id=mRNA.1
ctgC	.	CDS	150	201	.	+	.	gff_source=example;gff_id=CDS.1
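Until the export is fixed, the displaced values could be moved back into their columns by post-processing the downloaded file. This is a rough sketch of such a stopgap, not a fix for the exporter: restore_columns is a hypothetical helper, and mapping gff_id back to ID is an assumption based only on the output above. The Parent attribute cannot be recovered this way because it is absent from the export.

```python
def restore_columns(line):
    """Move gff_source/gff_score back into columns 2 and 6 and rename gff_id to ID.

    Assumption: gff_source, gff_score and gff_id carry the original source,
    score and ID values, as the exported example suggests.
    """
    cols = line.rstrip("\n").split("\t")
    if len(cols) != 9:
        raise ValueError(f"expected 9 tab-separated columns, got {len(cols)}")
    kept = []
    for pair in cols[8].split(";"):
        key, _, value = pair.partition("=")
        if key == "gff_source":
            cols[1] = value
        elif key == "gff_score":
            cols[5] = value
        elif key == "gff_id":
            kept.append(f"ID={value}")
        elif pair:
            kept.append(pair)  # every other attribute is left untouched
    cols[8] = ";".join(kept) if kept else "."
    return "\t".join(cols)


# The exported FakeSNP1 line from above
exported = "ctgA\t.\tSNP\t10\t30\t.\t.\t.\tgff_score=0.987;gff_source=example;gff_id=FakeSNP1;Name=FakeSNP"
print(restore_columns(exported))
```

For the sample line this yields example in the source column, 0.987 in the score column, and ID=FakeSNP1;Name=FakeSNP as attributes.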

See also #203.

Status: Backlog