Split "kmerseek search" into "search-create-query-sketch", "search-get-query-kmers", "search-do-search", "search-get-matching-kmers" #8

Open
olgabot opened this issue Feb 14, 2025 · 0 comments
Labels
Python Only involves writing Python code

olgabot commented Feb 14, 2025

Similar to #7, this would split kmerseek search, currently a convenient one-liner in the CLI, into at least 6 separate subcommands. Each subcommand would call a separate Sourmash/Branchwater CLI and/or custom code:

  1. search-create-query-sketch: sourmash scripts manysketch (uses as many CPUs as available, low-medium memory)
  2. search-get-query-kmers: sourmash sig kmers (takes a long time, but only 1 CPU and low memory)
  3. search-do-search: sourmash scripts manysearch (uses as many CPUs as available, low-medium memory)
  4. search-get-matching-kmers: Custom code, most likely kmerseek.search.KmerseekResults.join_query_target_kmers(), KmerseekResults.join_results_kmers(), KmerseekResults.stitch_kmers_per_gene()
  5. search-get-uniprot-domains: [TODO: the Uniprot domain search needs to be implemented first; see "Get Uniprot domains from overlapping regions found in k-mers" #6]
  6. search-output-per-gene-results: Custom code, most likely kmerseek.search.KmerseekResults.show_results_per_gene()
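To make the proposed split concrete, here is a minimal sketch of how the six subcommands could be registered. It uses the standard-library `argparse` purely for illustration; the issue does not say which CLI framework kmerseek uses. `SEARCH_STEPS`, `build_parser`, and `describe` are hypothetical names, and no handler actually runs sourmash or kmerseek code.

```python
import argparse

# Hypothetical registry: subcommand name -> (underlying CLI from the list above, help text).
# A tool of None means "custom kmerseek code" or "not implemented yet".
SEARCH_STEPS = {
    "search-create-query-sketch": (["sourmash", "scripts", "manysketch"], "Sketch the query"),
    "search-get-query-kmers": (["sourmash", "sig", "kmers"], "Extract query k-mers"),
    "search-do-search": (["sourmash", "scripts", "manysearch"], "Run the search"),
    "search-get-matching-kmers": (None, "Join query and target k-mers (custom code)"),
    "search-get-uniprot-domains": (None, "TODO: depends on issue #6"),
    "search-output-per-gene-results": (None, "Show results per gene (custom code)"),
}


def build_parser():
    """Build a parser with one subcommand per search step."""
    parser = argparse.ArgumentParser(prog="kmerseek")
    subparsers = parser.add_subparsers(dest="step", required=True)
    for name, (tool, help_text) in SEARCH_STEPS.items():
        sub = subparsers.add_parser(name, help=help_text)
        # Remember which external tool (if any) this step would wrap.
        sub.set_defaults(tool=tool)
    return parser


def describe(argv):
    """Return a one-line description of what the chosen step would call."""
    args = build_parser().parse_args(argv)
    if args.tool is None:
        return f"{args.step}: custom kmerseek code"
    return f"{args.step}: wraps `{' '.join(args.tool)}`"


if __name__ == "__main__":
    for step in SEARCH_STEPS:
        print(describe([step]))
```

Keeping the steps in one ordered mapping would also let the existing one-liner `kmerseek search` stay available as a thin wrapper that runs each step in sequence, though that design is an assumption rather than something decided in this issue.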

Notice that each command is named search-VERB-something -- I like this style for clarity. Also taking suggestions on naming, the hardest problem!
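If the search-VERB-something style is adopted, it could be checked mechanically. The snippet below is only a sketch: the regular expression and the `split_name` helper are assumptions made for illustration, not part of kmerseek.

```python
import re

# Assumed pattern for the naming style: "search-", then a verb, then one or
# more hyphen-separated lowercase words.
NAME_PATTERN = re.compile(r"^search-(?P<verb>[a-z]+)-(?P<rest>[a-z]+(?:-[a-z]+)*)$")

PROPOSED_NAMES = [
    "search-create-query-sketch",
    "search-get-query-kmers",
    "search-do-search",
    "search-get-matching-kmers",
    "search-get-uniprot-domains",
    "search-output-per-gene-results",
]


def split_name(name):
    """Return (verb, rest) if the name follows the style, else None."""
    match = NAME_PATTERN.match(name)
    if match is None:
        return None
    return match.group("verb"), match.group("rest")


if __name__ == "__main__":
    for name in PROPOSED_NAMES:
        print(name, "->", split_name(name))
```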
