Consistency in offline tools to abort if a file to be read doesn't exist and abort if a file created already exists... #1622

Open
ekluzek opened this issue Jan 31, 2022 · 5 comments
Labels
code health improving internal code structure to make easier to maintain (sustainability) enhancement new capability or improved behavior of existing capability priority: low Background task that doesn't need to be done right away. usability Improve or clarify user-facing options


@ekluzek
Collaborator

ekluzek commented Jan 31, 2022

I'm elevating this to a discussion. I think our offline tools should first check that every file to be read exists. If a file doesn't exist, the tool should abort with an error stating that the file is missing. This is helpful because the error a tool otherwise produces when it runs into a missing file can be hard to decipher.

Similarly, if a tool is about to create a file and doing so would overwrite an existing file, it should check for the existing file and die with an error alerting the user that the file exists. The user can then delete or rename the file to allow creation of the new one. This is in contrast to CTSM, which will just overwrite files. For CTSM that's helpful, because the list of files is long and the overwrite can happen after running for some time. For offline tools, where the number of output files is small, I think aborting makes more sense.
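As a rough illustration of the two checks proposed above, here is a minimal sketch. The helper names (`abort`, `check_input_exists`, `check_output_absent`) are hypothetical and are not existing CTSM functions.

```python
import os
import sys


def abort(message):
    """Stop the tool with a short, readable error instead of a traceback."""
    # sys.exit with a string prints it to stderr and exits with status 1.
    sys.exit("ERROR: " + message)


def check_input_exists(path):
    """Abort if a file that is about to be read does not exist."""
    if not os.path.isfile(path):
        abort(f"Input file does not exist: {path}")


def check_output_absent(path):
    """Abort if a file that is about to be created already exists."""
    if os.path.exists(path):
        abort(
            f"Output file already exists: {path}\n"
            "Delete or rename it, then rerun."
        )
```

A tool would call `check_input_exists` just before opening each input and `check_output_absent` just before writing each output, so the user sees a one-line message naming the offending file.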

If we agree to this behavior we can add this to:

https://github.com/ESCOMP/CTSM/blob/master/doc/design/python_script_user_interface.rst

It is a user-interface behavior that it's good to have a convention around.

@ekluzek
Collaborator Author

ekluzek commented Jan 31, 2022

Here's an error I saw while testing subset_data that is difficult to decipher.

INFO: ----------------------------------------------------------------------
INFO: Creating DATM files at 263.38956, 39.1082
23
78
Traceback (most recent call last):
  File "/glade/u/apps/ch/opt/python/3.7.9/gnu/9.1.0/pkg-library/20201220/lib/python3.7/site-packages/xarray/backends/file_manager.py", line 199, in _acquire_with_cache_info
    file = self._cache[self._key]
  File "/glade/u/apps/ch/opt/python/3.7.9/gnu/9.1.0/pkg-library/20201220/lib/python3.7/site-packages/xarray/backends/lru_cache.py", line 53, in __getitem__
    value = self._cache[key]
KeyError: [<class 'netCDF4._netCDF4.Dataset'>, ('/glade/p/cgd/tss/CTSM_datm_forcing_data/atm_forcing.datm7.GSWP3.0.5d.v1.c170516/Solar/clmforc.GSWP3.c2011.0.5x0.5.Solr.2018-01.nc',), 'r', (('clobber', True), ('diskless', False), ('format', 'NETCDF4'), ('persist', False))]

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/glade/work/erik/ctsm_worktrees/neon/tools/site_and_regional/subset_data", line 42, in <module>
    main()
  File "/glade/work/erik/ctsm_worktrees/neon/tools/site_and_regional/../../python/ctsm/subset_data.py", line 572, in main
    subset_point(args, file_dict)
  File "/glade/work/erik/ctsm_worktrees/neon/tools/site_and_regional/../../python/ctsm/subset_data.py", line 469, in subset_point
    nl_datm)
  File "/glade/work/erik/ctsm_worktrees/neon/tools/site_and_regional/../../python/ctsm/site_and_regional/single_point_case.py", line 616, in create_datm_at_point
    self.extract_datm_at(infile[idx], out_f)
  File "/glade/work/erik/ctsm_worktrees/neon/tools/site_and_regional/../../python/ctsm/site_and_regional/single_point_case.py", line 524, in extract_datm_at
    f_in = self.create_1d_coord(file_in, "LONGXY", "LATIXY", "lon", "lat")
  File "/glade/work/erik/ctsm_worktrees/neon/tools/site_and_regional/../../python/ctsm/site_and_regional/base_case.py", line 111, in create_1d_coord
    f_in = xr.open_dataset(filename)
  File "/glade/u/apps/ch/opt/python/3.7.9/gnu/9.1.0/pkg-library/20201220/lib/python3.7/site-packages/xarray/backends/api.py", line 572, in open_dataset
    store = opener(filename_or_obj, **extra_kwargs, **backend_kwargs)
  File "/glade/u/apps/ch/opt/python/3.7.9/gnu/9.1.0/pkg-library/20201220/lib/python3.7/site-packages/xarray/backends/netCDF4_.py", line 364, in open
    return cls(manager, group=group, mode=mode, lock=lock, autoclose=autoclose)
  File "/glade/u/apps/ch/opt/python/3.7.9/gnu/9.1.0/pkg-library/20201220/lib/python3.7/site-packages/xarray/backends/netCDF4_.py", line 314, in __init__
    self.format = self.ds.data_model
  File "/glade/u/apps/ch/opt/python/3.7.9/gnu/9.1.0/pkg-library/20201220/lib/python3.7/site-packages/xarray/backends/netCDF4_.py", line 373, in ds
    return self._acquire()
  File "/glade/u/apps/ch/opt/python/3.7.9/gnu/9.1.0/pkg-library/20201220/lib/python3.7/site-packages/xarray/backends/netCDF4_.py", line 367, in _acquire
    with self._manager.acquire_context(needs_lock) as root:
  File "/glade/u/apps/ch/opt/python/3.7.9/gnu/9.1.0/lib/python3.7/contextlib.py", line 112, in __enter__
    return next(self.gen)
  File "/glade/u/apps/ch/opt/python/3.7.9/gnu/9.1.0/pkg-library/20201220/lib/python3.7/site-packages/xarray/backends/file_manager.py", line 187, in acquire_context
    file, cached = self._acquire_with_cache_info(needs_lock)
  File "/glade/u/apps/ch/opt/python/3.7.9/gnu/9.1.0/pkg-library/20201220/lib/python3.7/site-packages/xarray/backends/file_manager.py", line 205, in _acquire_with_cache_info
    file = self._opener(*self._args, **kwargs)
  File "netCDF4/_netCDF4.pyx", line 2357, in netCDF4._netCDF4.Dataset.__init__
  File "netCDF4/_netCDF4.pyx", line 1925, in netCDF4._netCDF4._ensure_nc_success
FileNotFoundError: [Errno 2] No such file or directory: b'/glade/p/cgd/tss/CTSM_datm_forcing_data/atm_forcing.datm7.GSWP3.0.5d.v1.c170516/Solar/clmforc.GSWP3.c2011.0.5x0.5.Solr.2018-01.nc'
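
In the traceback above, the missing file is only discovered while a list of DATM input files is being processed. One possible way to surface the problem earlier is to verify the whole list before any processing starts. This is only a sketch; `find_missing_files` and `require_all_inputs` are hypothetical names, not part of subset_data.

```python
import os
import sys


def find_missing_files(paths):
    """Return the subset of paths that are not existing regular files."""
    return [p for p in paths if not os.path.isfile(p)]


def require_all_inputs(paths):
    """Abort before any processing starts if an input file is missing."""
    missing = find_missing_files(paths)
    if missing:
        listing = "\n".join(f"  {p}" for p in missing)
        sys.exit(f"ERROR: {len(missing)} input file(s) not found:\n{listing}")
```

Reporting every missing file at once means the user does not have to rerun the tool repeatedly to find them one by one.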

@billsacks
Member

I agree - both of these are good ideas. Thanks for starting this discussion.

@ekluzek ekluzek added the next this should get some attention in the next week or two. Normally each Thursday SE meeting. label Feb 9, 2022
@ekluzek
Collaborator Author

ekluzek commented Feb 9, 2022

@negin513 had a great point: another option would be to add an override option "-o" that allows existing files to be overwritten.
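
A minimal sketch of how such an option could look with `argparse` follows. Only the short flag "-o" comes from the suggestion above; the long name `--overwrite`, the `--outfile` argument, and the function names are assumptions made for illustration.

```python
import argparse
import os
import sys


def get_parser():
    """Build a parser for a hypothetical offline tool."""
    parser = argparse.ArgumentParser(description="Hypothetical offline tool")
    parser.add_argument("--outfile", required=True, help="File to create")
    parser.add_argument(
        "-o",
        "--overwrite",  # long name is an assumption, not from the discussion
        action="store_true",
        help="Allow existing output files to be overwritten",
    )
    return parser


def resolve_output(args):
    """Return the output path, aborting if it exists and -o was not given."""
    if os.path.exists(args.outfile) and not args.overwrite:
        sys.exit(
            f"ERROR: Output file already exists: {args.outfile}\n"
            "Delete or rename it, or rerun with -o to overwrite."
        )
    return args.outfile
```

With this design the default stays safe (abort on an existing file), and overwriting is something the user has to ask for explicitly.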

@billsacks billsacks removed the next this should get some attention in the next week or two. Normally each Thursday SE meeting. label Feb 17, 2022
@ekluzek ekluzek added enhancement new capability or improved behavior of existing capability next this should get some attention in the next week or two. Normally each Thursday SE meeting. code health improving internal code structure to make easier to maintain (sustainability) and removed discussion labels Aug 14, 2024
@ekluzek
Collaborator Author

ekluzek commented Aug 14, 2024

Adding the next label here to discuss whether we should go through the tools and ensure this behavior holds, and to map out a plan for the steps to make it happen.

@samsrabin samsrabin added this to the ctsm6.0.0 (code freeze) milestone Sep 5, 2024
@samsrabin samsrabin added priority: low Background task that doesn't need to be done right away. and removed next this should get some attention in the next week or two. Normally each Thursday SE meeting. labels Sep 5, 2024
@wwieder
Contributor

wwieder commented Sep 5, 2024

This is a usability issue that may be nice for the release, but needs to be discussed later?

@samsrabin samsrabin added the usability Improve or clarify user-facing options label Sep 5, 2024

4 participants