Ability to filter and export a taxonomy in e.g. NCBI taxdump style #112
Comments
Hi James, it's quite easy.
Step 1: preparing taxids in the subset tree
Step 2: extracting data of needed nodes
Checking it. Since there are only two leaves here, we just dump the whole tree
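As a rough illustration of the two steps, independent of taxonkit's own commands, here is a minimal Python sketch. It assumes the standard NCBI taxdump layout (fields separated by tab-pipe-tab, taxid in the first column, parent taxid in the second column of nodes.dmp). The function names (read_parents, collect_lineage_taxids, filter_dmp, subset_taxdump) are made up for this example and are not part of taxonkit.

```python
from pathlib import Path


def read_parents(nodes_path):
    """Map each taxid to its parent taxid, read from a nodes.dmp file."""
    parents = {}
    with open(nodes_path) as handle:
        for line in handle:
            fields = line.split("\t|\t")
            parents[fields[0].strip()] = fields[1].strip()
    return parents


def collect_lineage_taxids(parents, leaf_taxids):
    """Step 1: gather every taxid on the path from each leaf up to the root."""
    keep = set()
    for taxid in leaf_taxids:
        taxid = str(taxid)
        # Stop as soon as we reach a taxid that is already kept. The root
        # (taxid 1) is its own parent in NCBI dumps, so the walk always ends.
        while taxid not in keep:
            if taxid not in parents:
                raise KeyError(f"taxid {taxid} not found in nodes.dmp")
            keep.add(taxid)
            taxid = parents[taxid]
    return keep


def filter_dmp(src_path, dst_path, keep):
    """Step 2: copy only the rows whose first column is a kept taxid."""
    with open(src_path) as src, open(dst_path, "w") as dst:
        for line in src:
            if line.split("\t|", 1)[0].strip() in keep:
                dst.write(line)


def subset_taxdump(src_dir, dst_dir, leaf_taxids):
    """Write reduced nodes.dmp and names.dmp for the given leaf taxids."""
    src_dir, dst_dir = Path(src_dir), Path(dst_dir)
    dst_dir.mkdir(parents=True, exist_ok=True)
    parents = read_parents(src_dir / "nodes.dmp")
    keep = collect_lineage_taxids(parents, leaf_taxids)
    for name in ("nodes.dmp", "names.dmp"):
        filter_dmp(src_dir / name, dst_dir / name, keep)
    return keep
```

For example, subset_taxdump("taxdump", "custom_taxdmp", [9606]) would write the root-to-human lineage into custom_taxdmp/, assuming the full dump has been unpacked into a directory called taxdump. This sketch only handles nodes.dmp and names.dmp; other files in the dump, such as merged.dmp and delnodes.dmp, are not touched.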
Hi @shenwei356 ! This is absolutely perfect! This generated exactly what I needed and solved a problem I've been stuck on for the last couple of days. This should absolutely be added as a tutorial.
You are quite right, that is indeed rather easy. I think I was just thinking about how taxonkit works in the wrong way (and indeed
Thank you very very much!
If it helps, I think the following tutorial title or 'description', or a variant of it, would match the searches I had been trying:
Title: 'Filtering or subsetting taxdmp files to make a custom taxdmp'
But of course, just a suggestion :)
Please feel free to close if not in scope of the tool
I have been trying to make a very small version of the NCBI taxonomy with just a few species for test-data purposes (metagenomic database building).
However I've been struggling to find a tool that can efficiently build such very small versions of the NCBI taxonomy.
In an ideal world, what I am looking for is a tool to which I can supply the NCBI taxonomy and a few taxon IDs (e.g. at species level), and which will then export, in the same NCBI taxdmp format, just the taxonomic 'lineages' from root to the specified taxon IDs.
Then I can use the resulting nodes.dmp and names.dmp in any tool that accepts NCBI taxonomy as input.
e.g. if I borrow the taxonkit cli system, assuming the full NCBI taxdmp files downloaded from the NCBI FTP are already in TAXONKIT_DB
Where custom_taxdmp/ would contain
And the two example files would contain the contents of the following attached files (which I 'manually' reconstructed with a horrible bash script)
(too big to post)
custom_taxdmp.zip
I hope my request makes sense, and I would like to think this tool would be a suitable place for such functionality.