How to save an SgkitSampleData instance, e.g. for running the CLI #925
Comments
I guess another possibility would be to provide an input CSV file (or Zarr) which is formatted with |
Good point. This could even be a .npz file with the appropriately named variables. I guess that's basically the same as another zarr file. Probably best if @benjeffery weighs in with what he thinks would work best.
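For illustration, the .npz idea could look roughly like the sketch below. It only shows NumPy's round trip of named arrays. The variable names `site_mask` and `ancestral_allele` and the helper functions are hypothetical, not part of tsinfer.

```python
import os
import tempfile

import numpy as np


def save_overrides(path, **arrays):
    """Write the named override arrays to a single .npz file."""
    np.savez(path, **arrays)


def load_overrides(path):
    """Read every array back into a plain dict keyed by variable name."""
    with np.load(path) as npz:
        return {name: npz[name] for name in npz.files}


# Hypothetical variable names; a real CLI would need to document the ones it expects.
npz_path = os.path.join(tempfile.mkdtemp(), "overrides.npz")
save_overrides(
    npz_path,
    site_mask=np.array([False, True, False]),
    ancestral_allele=np.array(["A", "C", "T"]),
)
overrides = load_overrides(npz_path)
```

A CLI could then look up each expected name in `overrides` and fall back to the data in the Zarr file when a name is absent.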
Here's a snippet from the docs I am trying to write
Thoughts about how to specify on the command-line to use |
A JSON or yaml config file specifying inference parameters?
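A minimal sketch of the config-file idea, using only the standard library. Everything named here is an assumption made for illustration: the `--config` flag, the program name, and the parameter keys are not existing tsinfer CLI options.

```python
import argparse
import json
import os
import tempfile


def load_config(path):
    """Read inference parameters from a JSON file into a dict."""
    with open(path) as f:
        return json.load(f)


def parse_args(argv):
    """Parse a samples path plus an optional (hypothetical) --config file."""
    parser = argparse.ArgumentParser(prog="infer-sketch")
    parser.add_argument("samples")
    parser.add_argument("--config", default=None)
    args = parser.parse_args(argv)
    params = load_config(args.config) if args.config else {}
    return args.samples, params


# Write an example config; the keys are made up for this sketch.
config_path = os.path.join(tempfile.mkdtemp(), "params.json")
with open(config_path, "w") as f:
    json.dump({"site_mask_name": "variant_my_mask", "num_threads": 4}, f)

samples, params = parse_args(["data.vcz", "--config", config_path])
```

Keeping the parameters in one file means the command line stays short, and the same file can be reused across runs.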
It appears that it's not possible to save an SgkitSampleData instance to a path. I'm therefore not sure how I might run the CLI on a zarr file if I've specified bespoke masks, ancestral alleles, or whatever via numpy arrays (see #923).
I guess the easy way around this is to have a function that saves all the information such as bespoke masks / ancestral alleles into the zarr file (or make a copy of it if the original zarr is read-only?), and allow the CLI to run directly on that modified zarr:
A more complex possibility for the user is to have the CLI accept the same parameters as tsinfer.SgkitSampleData(...), but then we might want to allow either numpy files or names in the .vcz file, which seems a bit icky, e.g.