Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Generalization classes for FITS cutouts #136

Open
wants to merge 6 commits into
base: main
Choose a base branch
from

Conversation

snbianco
Copy link
Collaborator

@snbianco snbianco commented Jan 23, 2025

This PR implements the generalized architecture for creating cutouts from FITS files. It implements Cutout, ImageCutout, and FITSCutout. It also changes the functions in cutouts.py to call the aforementioned classes and updates the test suite with a new file, test_FITSCutout.py.

Here is a class diagram of the new generalized architecture, though it may not be fully updated: https://innerspace.stsci.edu/display/MASTC/Class+Diagram

Considerations:

  • While implementing this new generalized architecture, I'm focusing on restructuring the code rather than adding features or making significant changes. The existing API should have no breaking changes, for now.
  • Ideally, FITSCutout will eventually use the _get_cutout_data method defined in ImageCutout that uses astropy.nddata.Cutout2D. Unfortunately, there is an issue with using the section attribute of an ImageHDU object that I've put in a PR to Astropy to fix. We could choose to use data instead of section, but this significantly worsens performance. I'm not sure when Astropy's next release will be, so for now, I think we will have to leave in the code that cuts out the data array and modifies the WCS for the cutout.
  • The new test class looks VERY similar to test_cutouts.py. The main difference is that test_FITSCutout.py uses the FITSCutout class directly while test_cutouts.py uses the fits_cut and img_cut functions. I've essentially copy-pasted the old code and added some assertions and cases to increase coverage.

@snbianco snbianco marked this pull request as ready for review January 28, 2025 01:45
Copy link
Collaborator

@AlexReedy AlexReedy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is looking good to me!

Copy link
Contributor

@havok2063 havok2063 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This overall looks like a good start to me. I think one thing to consider during this refactor is where spectral cutouts might fit in here. Slicing a spectrum in wavelength is fairly trivial but perhaps astrocut could provide some helper functions for efficient bulk slicing.

Or if we enable cutouts for IFU cubes, for JWST and SDSS MaNGA, we might want the option to cut both in spatial and spectral directions via this tool.

Comment on lines +284 to +294
def _get_cutout_wcs(self, img_wcs: WCS, cutout_lims: np.ndarray) -> WCS:
"""
Starting with the full image WCS and adjusting it for the cutout WCS.
Adjusts CRPIX values and adds physical WCS keywords.

Parameters
----------
img_wcs : `~astropy.wcs.WCS`
WCS for the image the cutout is being cut from.
cutout_lims : `numpy.ndarray`
The cutout pixel limits in an array of the form [[ymin,ymax],[xmin,xmax]]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some FITS products use gwcs, e.g some JWST miri or nirspec cubes. They actually store it within an asdf extension. And since ASDF files are basically dictionary trees, we might eventually have asdf files that store a FITS wcs as a dictionary.

In case we want to add support for JWST cube cutouts in the future, I wonder if we want to abstract out the WCS logic into their own classes, for fits-wcs/gwcs, that could be used by either FitsCutout or ASDFCutout? Not necessarily something to do for this PR, but maybe a follow-up?

Comment on lines +129 to +131
def img_cut(input_files: List[Union[str, Path, S3Path]], coordinates: Union[SkyCoord, str],
cutout_size: Union[int, np.ndarray, Quantity, List[int], Tuple[int]] = 25, stretch: str = 'asinh',
minmax_percent: Optional[List[int]] = None, minmax_value: Optional[List[int]] = None,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you add a "Returns" entry in the docstring? The name img_cut seems a bit confusing to me. Its primary difference from fits_cut is to make color images from FITS? img_cut sounds like I'm making cutouts from non-fits images, like jpeg.

With the abstraction into FITSCutout are these two functions still needed? The primary user endpoint seems to be FITSCutout and both just call that class with different options.

Comment on lines +25 to +28
if request.param == 'SPOC':
return create_test_imgs('SPOC', 50, 6, dir_name=tmpdir)
else:
return create_test_imgs('TICA', 50, 6, dir_name=tmpdir)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could this be simplified to return create_test_imgs(request.param, 50, 6, dir_name=tmpdir)?

Comment on lines +34 to +40
if request.param == 'SPOC':
return create_test_imgs('SPOC', 50, 1, dir_name=tmpdir,
basename="img_badsip_{:04d}.fits", bad_sip_keywords=True)[0]
else:
return create_test_imgs('TICA', 50, 1, dir_name=tmpdir,
basename="img_badsip_{:04d}.fits", bad_sip_keywords=True)[0]

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same with this, create_test_imgs(request.param, ..).

assert new_dir in cutout_files[0]
assert path.exists(new_dir) # new directory should now exist

cutout_hdulist.close()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since cutout_hdulist is being iterated over, should this be within the loop to ensure each instance gets closed properly?

Comment on lines +335 to +338
# Cutout each input file
for file in self._input_files:
self._cutout_file(file)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did astrocut ever use multiprocessing for handling batch cutouts? Given some of the roman reqs and the tests you did before on mass cutouts, we might want to create optional modes of batch processing files. Not something for this PR, but maybe something for later?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants