Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve compile_header function #513

Closed
wants to merge 2 commits into from
Closed

Conversation

kelle
Copy link
Collaborator

@kelle kelle commented Jun 14, 2024

Improvements to make the compile_header function more user friendly.

@kelle kelle requested a review from SherelynA June 14, 2024 19:31
except KeyError:
exposure_time = None
logging.warning("No exposure time")
object_name = input("REQUIRED: Please input a name for the object: ")
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not a fan of the input function for several reasons, but here are two main points:

  1. Testing this function is now more complex- you have to simulate a variable number of user inputs and while there are surely methods in pytest to do so, it'll require more setup.
  2. The flow of this function is not at all consistent. If values exist, no inputs are needed, sometimes some inputs are needed but not others. Automating this, if you want to loop through a list, will become nearly impossible.

The original implementation, which I see you commented out around line 50 is better (though can be tweaked)- identify all missing keywords from your input dictionary and fail if any of them are missing.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The target user of this function (at the moment) is a high school student or undergrad who is using the spectrum for a different purpose. I want a function which will hold their hand through putting together our highest priority header keywords.

  • I have used the input() function for what I consider the highest priority keywords.
  • For keywords which they can likely figure out, e.g., Instrument , I've also used the input() function but labeled them as OPTIONAL.
  • for keywords which may have existed in the original header, I've just pulled them from the original header with no user interaction.

If you have another suggestion for how to do the hand holding, I'm open!

I also wasn't planning on doing too much testing in our CI...I was gonna let the students do the testing. As they find things which don't work, we can modify the function and add tests?

I also recognize that this function is not mission critical to SIMPLE -- the database will still work with heterogenous spectra files -- so I'm ok if this function is not to the same standard as our ingest code.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

all that being said, this could be a useful script in astrodb_utils at some point, especially since it's striving to be consistent with SpectrumDM 1.2. We could rename it compile_ivoa_spectrum_dm_header. And I'm working on doing the same thing for light curves at the minute. So maybe making it high quality code is a good idea....

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One thing that occurs to me is that we can split this into two (or more) functions:

  1. one function that takes a dictionary and compiles the header and
  2. one that validates and creates the dictionary

That later function can be one using the input() calls (and we can add helper functions if needed), but the compile header function is more streamlined and can be tested or automated more readily.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think two functions is a very good idea. I think I may also go ahead and develop this in astrodb_utils instead of here in SIMPLE.

Comment on lines 180 to +184
try:
title = original_header_dict["title"] # trim so header wraps nicely
header.set("TITLE", title, "Data set title")
except KeyError:
pass
logging.warning("No title")
title = None
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As an alternative to so many try..except blocks, consider using something like:

title = original_header_dict.get("title")
if title is not None:
   header.set("TITLE", title, "Data set title")

The python dictionary's get method automatically provides a default value of None if the key is not present: https://docs.python.org/3.10/library/stdtypes.html#dict.get

@kelle kelle marked this pull request as draft June 17, 2024 20:57
@kelle kelle closed this Jul 20, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants