Add ingest workflow #11

Open
victorlin opened this issue Feb 4, 2025 · 3 comments · May be fixed by #15

Comments

@victorlin
Member

victorlin commented Feb 4, 2025

Add an ingest workflow to the repository, based on the pathogen-repo-guide and the tutorial in the docs.

Potential data sources:

  1. NCBI GenBank – I will start with this
  2. Pathoplexus
  3. artic-network/ebolavirus – a curated subset of accessions from (1)?
  4. Fauna – was used for https://nextstrain.org/ebola, but not high priority.

At the very least, this will need a lot of work to label division, which doesn't seem to be listed often on GenBank (see OR084909 or KU296360).
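To get a sense of how much division labeling is needed, one option is to list the records whose division field is empty once metadata has been fetched. The snippet below is only a rough sketch under assumptions: the column names `accession` and `division`, the tab-separated layout, and the example rows are all made up for illustration and are not taken from GenBank or from any existing workflow.

```python
import csv
import io


def missing_division(metadata_tsv, id_column="accession", division_column="division"):
    """Return the IDs of records whose division field is empty or absent.

    The column names are assumptions for this sketch, not a fixed schema.
    """
    reader = csv.DictReader(io.StringIO(metadata_tsv), delimiter="\t")
    missing = []
    for row in reader:
        # A missing column or a short row yields None, so normalize to "".
        value = (row.get(division_column) or "").strip()
        if not value:
            missing.append(row[id_column])
    return missing


# Made-up rows for illustration only; these are not real records.
example = (
    "accession\tcountry\tdivision\n"
    "A1\tCountryX\t\n"
    "A2\tCountryY\tRegionZ\n"
)
print(missing_division(example))
```

The resulting list could then feed a manual annotation file or a lookup against other sources.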

@victorlin victorlin changed the title from "ingest" to "Add ingest workflow" Feb 4, 2025
This was referenced Feb 4, 2025
@huddlej
Contributor

huddlej commented Feb 6, 2025

It would be really cool if a Pathoplexus-based workflow could serve as the starting point for a docs.nextstrain.org tutorial on how to create a Nextstrain build with Pathoplexus.

@jameshadfield
Member

2018-20 DRC ZEBOV data is in the INRB's GitHub

@victorlin victorlin linked a pull request Feb 7, 2025 that will close this issue
1 task
@victorlin
Member Author

I briefly explored ingest from Pathoplexus. Summarized as an unmerged commit: 99ec7ad

While it mostly works, I'll stick with GenBank for now. My main motivation was to see if the Pathoplexus endpoint came with more metadata compared to NCBI datasets ingest, but I think it's about the same.
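For anyone who wants to repeat the comparison, a quick way to check whether one source carries more metadata than another is to diff the column headers of the two metadata tables. This is a minimal sketch and not what the unmerged commit does: it assumes both sources have already been exported as tab-separated text with a header row, and the field names in the example are hypothetical.

```python
import csv
import io


def header_columns(tsv_text):
    """Return the set of column names from the first row of a TSV string."""
    reader = csv.reader(io.StringIO(tsv_text), delimiter="\t")
    return set(next(reader, []))


def compare_columns(tsv_a, tsv_b):
    """Report which columns are unique to each table and which are shared."""
    cols_a = header_columns(tsv_a)
    cols_b = header_columns(tsv_b)
    return {
        "only_a": sorted(cols_a - cols_b),
        "only_b": sorted(cols_b - cols_a),
        "shared": sorted(cols_a & cols_b),
    }


# Hypothetical headers for illustration; not the real fields of either source.
source_a = "accession\tcountry\tdate\n"
source_b = "accession\tcountry\tdate\thost\n"
print(compare_columns(source_a, source_b))
```

Comparing headers only shows which fields exist; how often each field is actually filled in would need a per-column count on top of this.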
