Annotation of data in query coordinates #1571

Open
rneher opened this issue Feb 23, 2025 · 0 comments · May be fixed by #1578
Labels
t:feat Type: request of a new feature, functionality, enchancement


rneher commented Feb 23, 2025

Submitting sequences to databases like NCBI often requires genome annotations. Since we align sequences to a well-annotated reference sequence, we can 'lift' this annotation over to the query sequences.

Essentially, for each feature in the annotation, we would record the begin and end coordinates of the feature (and its subfeatures) on the query sequence.

This could, for example, happen alongside the extraction here:

pub fn extract_cds_from_aln(seq_aln: &[Nuc], cds: &Cds, coord_map_global: &CoordMapGlobal) -> Vec<Nuc> {

This would require adding a map aln_to_qry here:

ref_to_aln_table: Vec<NucAlnGlobalPosition>,

(could happen via the function make_aln_to_ref_map)
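
To make the idea concrete, here is a minimal sketch of what an aln_to_qry map and the lifting step might look like. It is not the project's implementation: the names `make_aln_to_qry_map` and `lift_range_to_qry` are hypothetical, the aligned query row is assumed to be a byte slice with `b'-'` marking gaps (rather than the `Nuc` type), and the feature range is assumed to have already been converted from reference to alignment coordinates via the existing ref_to_aln_table.

```rust
/// For each alignment column, the 0-based query position of that column,
/// or `None` where the aligned query row has a gap.
/// Hypothetical helper: the byte-slice input and the `b'-'` gap symbol are
/// assumptions made for illustration.
pub fn make_aln_to_qry_map(qry_aln: &[u8]) -> Vec<Option<usize>> {
    let mut qry_pos = 0;
    qry_aln
        .iter()
        .map(|&c| {
            if c == b'-' {
                None
            } else {
                let pos = qry_pos;
                qry_pos += 1;
                Some(pos)
            }
        })
        .collect()
}

/// Lifts a half-open alignment range `[begin, end)` to a half-open query
/// range. Returns `None` when every column of the range is a gap in the
/// query, i.e. the feature is entirely deleted.
pub fn lift_range_to_qry(
    aln_to_qry: &[Option<usize>],
    begin: usize,
    end: usize,
) -> Option<(usize, usize)> {
    let end = end.min(aln_to_qry.len());
    if begin >= end {
        return None;
    }
    // Query positions covered by the range, skipping gap columns.
    let mut covered = aln_to_qry[begin..end].iter().flatten();
    let first = *covered.next()?;
    let last = covered.last().copied().unwrap_or(first);
    Some((first, last + 1))
}
```

A feature that is partially deleted in the query is simply shortened to the first and last covered positions here; whether such features should instead be flagged or dropped is a design decision this sketch leaves open.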

In the output, the simplest thing would probably be to stream the new GFF entries into a common GFF file where the chromosome (seqid) column contains the index/ID of the sequence. One could also write a separate file for each sequence, but that can generate a lot of files when run on a large dataset.
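
The single-file option could look roughly like the sketch below, which appends one GFF3 line per lifted feature to any writer. The `LiftedFeature` struct and both function names are hypothetical. The sketch also assumes 0-based half-open input coordinates, leaves the source column as `.`, hard-codes phase `0` for CDS features, and does no escaping of reserved characters in the ID or attributes; all of these are simplifications.

```rust
use std::io::{self, Write};

/// A feature lifted to query coordinates (0-based, half-open).
/// Hypothetical struct for illustration only.
pub struct LiftedFeature<'a> {
    pub seq_id: &'a str,
    pub feature_type: &'a str,
    pub begin: usize,
    pub end: usize,
    pub strand: char,
    pub name: &'a str,
}

/// Writes the GFF3 version pragma; call once at the top of the shared file.
pub fn write_gff3_header<W: Write>(out: &mut W) -> io::Result<()> {
    writeln!(out, "##gff-version 3")
}

/// Writes one feature as a 9-column, tab-separated GFF3 line.
/// GFF3 coordinates are 1-based and inclusive, hence `begin + 1` and `end`.
pub fn write_gff3_record<W: Write>(out: &mut W, f: &LiftedFeature<'_>) -> io::Result<()> {
    // Simplification: phase 0 for every CDS, "." for everything else.
    let phase = if f.feature_type == "CDS" { "0" } else { "." };
    writeln!(
        out,
        "{}\t.\t{}\t{}\t{}\t.\t{}\t{}\tName={}",
        f.seq_id,
        f.feature_type,
        f.begin + 1,
        f.end,
        f.strand,
        phase,
        f.name
    )
}
```

Because records are written one at a time to a generic writer, the same code can serve a shared file, a per-sequence file, or an in-memory buffer.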

@rneher rneher added good first issue Good for newcomers help wanted Extra attention is needed needs triage Mark for review and label assignment t:feat Type: request of a new feature, functionality, enchancement labels Feb 23, 2025
@ivan-aksamentov ivan-aksamentov removed good first issue Good for newcomers help wanted Extra attention is needed needs triage Mark for review and label assignment labels Feb 23, 2025
@ivan-aksamentov ivan-aksamentov linked a pull request Mar 9, 2025 that will close this issue