Issue with preparing TxDB and GRanges object from GFF file #85

Open
benjytan88 opened this issue Jul 14, 2023 · 4 comments

Comments

@benjytan88

Hi,
I am having some trouble making a gene annotation object (TxDb or GRanges) for CHM13 from the GFF file.
When making the TxDb object, I got the following warning messages:

> txdb <- getTxDb(organism = "Homo sapiens", file = "./ref_genome/chm13v2.0_RefSeq_Liftoff_v5.1.gff3")
Import genomic features from the file as a GRanges object ... OK
Prepare the 'metadata' data frame ... OK
Make the TxDb object ... OK
Warning messages:
1: In .extract_transcripts_from_GRanges(tx_IDX, gr, mcols0$type, mcols0$ID,  :
  some transcripts have no "transcript_id" attribute ==> their name ("tx_name" column
  in the TxDb object) was set to NA
2: In .extract_transcripts_from_GRanges(tx_IDX, gr, mcols0$type, mcols0$ID,  :
  the transcript names ("tx_name" column in the TxDb object) imported from the
  "transcript_id" attribute are not unique
3: In .find_exon_cds(exons, cds) :
  The following transcripts have exons that contain more than one CDS (only the first
  CDS was kept for each exon): NM_001134939.1, NM_001172437.2, NM_001184961.1,
  NM_001301020.1, NM_001301302.1, NM_001301371.1, NM_002537.3, NM_004152.3,
  NM_015068.3, NM_016178.2

Is this normal? I also noticed that there are quite a lot of NAs in the table, and I am having difficulty converting it to a GRanges object.

I think the annotation would be very useful to everyone else, so would it be possible for your team to host a TxDB / GRanges object of the annotation here?

Thank you very much.

@njaupan

njaupan commented Nov 1, 2023

Hi, any update? I am looking for the same TxDB and GRanges files for the T2T genome.

@chrarnold

I also ran into this. Can someone please comment? I wonder whether it is related to the known issue of the GFF3 file sometimes missing a "Parent" entry.
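
One way to check this hypothesis, and the "no transcript_id attribute" warning quoted above, might be to count how many transcript-level records in the GFF3 lack a given attribute. The following is only a minimal sketch in base R: `count_missing_attr` is a hypothetical helper (not part of GenomicFeatures or any other package), the feature types `mRNA` and `transcript` are an assumption about how transcripts are labelled in the file, and the small example file and its IDs are made up for illustration.

```r
# Hypothetical helper (not from any package): count transcript-level GFF3
# records whose attribute column (column 9) lacks a given key.
# ASSUMPTION: transcripts are typed "mRNA" or "transcript" in column 3.
count_missing_attr <- function(gff_path, attr,
                               types = c("mRNA", "transcript")) {
  lines <- readLines(gff_path)
  # Drop empty lines and comment/directive lines (they start with '#').
  lines <- lines[nzchar(lines) & !startsWith(lines, "#")]
  fields <- strsplit(lines, "\t", fixed = TRUE)
  type  <- vapply(fields, `[`, character(1), 3)  # feature type
  attrs <- vapply(fields, `[`, character(1), 9)  # key=value;key=value...
  tx_attrs <- attrs[type %in% types]
  # A key is present if "key=" starts the field or follows a ';'.
  has_attr <- grepl(paste0("(^|;)", attr, "="), tx_attrs)
  c(total = length(tx_attrs), missing = sum(!has_attr))
}

# Made-up example file; replace example_gff with the real GFF3 path.
example_gff <- tempfile(fileext = ".gff3")
writeLines(c(
  "##gff-version 3",
  "chr1\tLiftoff\tgene\t1\t1000\t.\t+\t.\tID=gene1",
  "chr1\tLiftoff\tmRNA\t1\t1000\t.\t+\t.\tID=tx1;Parent=gene1;transcript_id=TX_A",
  "chr1\tLiftoff\ttranscript\t1\t1000\t.\t+\t.\tID=tx2;Parent=gene1",
  "chr1\tLiftoff\tmRNA\t1\t1000\t.\t+\t.\tID=tx3;transcript_id=TX_B"
), example_gff)

print(count_missing_attr(example_gff, "transcript_id"))
print(count_missing_attr(example_gff, "Parent"))
```

If the `missing` count for `Parent` were zero on the real file while the count for `transcript_id` were not, that would suggest the NA transcript names come from absent `transcript_id` attributes rather than from missing parents. This is a diagnostic idea only, not a confirmed explanation of the warnings.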

@arangrhie
Collaborator

Hello, could you provide a few examples of how the TxDB and GRanges files are used? I'd like to compare those from hg38 annotations to see how they differ.

@axxxxx08

Hi, I encountered the same warning messages when creating the TxDb and GRanges objects, and I was following the example provided in this link:
https://github.com/Bioconductor/GenomicFeatures/issues/65
