Feature request: Python tools for obs sequences #742
Comments
I worked through the Quickstart guide as a fairly naive xxPyxx user. I don't have Python on my laptop, so I had to decide whether to install it or find it somewhere else. Summary of the CLI efforts: partially successful. [These instructions would have been more helpful to me.]
Adapting Quickstart to NCAR's JupyterHub.
JupyterHub complaint:
I saved the notebook into '~raeder/from_works_no_pix.ipynb'.
Another JupyterHub complaint: I opened 'from_works_no_pix.ipynb' and the pictures appeared (!).
A third JupyterHub complaint:
For future developers:
Use case
For CROCODILE: Python-based tools for observation space diagnostics. These might be useful more generally for DART, so I am adding this issue to track them.
Is your feature request related to a problem?
Originally for CROCODILE obs space diagnostic plotting in the Python ecosystem, but the ability to examine obs sequences in a dataframe in a Jupyter notebook (or Python tool of your choice) is quite helpful, e.g. for finding duplicates in obs sequences, looking at output from obs converters, subsetting observations (in space, time, or by X), and splitting and joining obs sequences.
There is no need to run obs_diag to bin observations; you can read the obs_sequence into a dataframe directly.
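As a rough illustration of the subsetting, splitting, and joining mentioned above, here is a minimal sketch using plain pandas. It assumes the observations are already in a dataframe; the column names and values are hypothetical stand-ins, not the pyDARTdiags schema.

```python
import pandas as pd

# Hypothetical stand-in for an obs sequence already read into a dataframe.
# Column names and values are assumptions made for illustration only.
obs = pd.DataFrame({
    "type": ["TEMPERATURE", "SALINITY", "TEMPERATURE", "SALINITY"],
    "latitude": [10.0, 12.5, -30.0, 45.0],
    "longitude": [200.0, 210.0, 150.0, 300.0],
    "time": pd.to_datetime([
        "2020-01-01 00:00", "2020-01-01 06:00",
        "2020-01-02 00:00", "2020-01-02 12:00",
    ]),
    "observation": [25.1, 35.0, 18.4, 34.2],
})

# Subset in space: keep observations between 23.5S and 23.5N.
tropics = obs[obs["latitude"].between(-23.5, 23.5)]

# Subset in time: keep observations before 2020-01-02.
first_day = obs[obs["time"] < pd.Timestamp("2020-01-02")]

# Split into one dataframe per observation type.
by_type = {name: group for name, group in obs.groupby("type")}

# Join the pieces back together, ordered by time.
rejoined = (
    pd.concat(list(by_type.values()), ignore_index=True)
    .sort_values("time")
    .reset_index(drop=True)
)
```

The same boolean-mask pattern works for subsetting "by X": swap in whichever column is of interest.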
Example finding duplicates:
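A minimal sketch of duplicate detection with plain pandas, assuming the obs sequence is already in a dataframe. The columns, the values, and the choice of which columns define a "duplicate" are hypothetical; a real obs sequence may need a different key.

```python
import pandas as pd

# Hypothetical observations; rows 0 and 1 are exact repeats of each other.
obs = pd.DataFrame({
    "type": ["TEMPERATURE", "TEMPERATURE", "SALINITY", "TEMPERATURE"],
    "latitude": [10.0, 10.0, 12.5, -30.0],
    "longitude": [200.0, 200.0, 210.0, 150.0],
    "time": pd.to_datetime([
        "2020-01-01 00:00", "2020-01-01 00:00",
        "2020-01-01 06:00", "2020-01-02 00:00",
    ]),
    "observation": [25.1, 25.1, 35.0, 18.4],
})

# Columns assumed to identify an observation uniquely (an assumption).
key = ["type", "latitude", "longitude", "time", "observation"]

# keep=False marks every member of a duplicate group, not just the repeats.
dupes = obs[obs.duplicated(subset=key, keep=False)]

# Keep the first occurrence of each observation and drop the rest.
deduped = obs.drop_duplicates(subset=key, keep="first")
```

Inspecting `dupes` before dropping anything shows which rows collide, which helps decide whether the key is too loose or too strict.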

Describe your preferred solution
https://github.com/NCAR/pyDARTdiags. See issues for various notes and docs for documentation.
https://pypi.org/project/pydartdiags/ (but recommend you do a local editable pip install if you are developing this or playing with it)
BUYER BEWARE, this is bleeding edge.
Describe any alternatives you have considered
Currently using pandas, which seems OK (I naively tried loading 20 GB obs sequences one after the other, and it actually worked on my Mac). We probably need to think about big-big data tools for going larger (and maybe faster).
Also keeping notes on other observation tools (NCAR/pyDARTdiags#4).