Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve compile_header function #513

Closed
wants to merge 2 commits into from
Closed
Changes from 1 commit
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
173 changes: 90 additions & 83 deletions simple/utils/spectra_convert.py
Original file line number Diff line number Diff line change
@@ -1,17 +1,20 @@
import numpy as np
import logging
from pandas import to_datetime
# from pandas import to_datetime
from astropy.time import Time
from datetime import date
import astropy.io.fits as fits
from astropy.table import Table
import astropy.units as u
import dateparser
from astrodb_utils.photometry import assign_ucd

logger = logging.getLogger("SIMPLE")


def compile_header(wavelength_data, **original_header_dict):
"""Creates a header from a dictionary of values.
TODO: Check that RA and Dec are in degrees

wavelength_data: an array of wavelengths

Expand Down Expand Up @@ -41,13 +44,13 @@ def compile_header(wavelength_data, **original_header_dict):
"observatory",
]

keywords_given = list(original_header_dict.keys())
# keywords_given = list(original_header_dict.keys())

for key in keywords_given:
if key not in required_keywords:
raise Exception(f"Not expecting keyword: {key}. Add manually.")
else:
pass
# for key in keywords_given:
# if key not in required_keywords:
# raise Exception(f"Not expecting keyword: {key}. Add manually.")
# else:
# pass

history = original_header_dict["history"]

Expand All @@ -56,81 +59,59 @@ def compile_header(wavelength_data, **original_header_dict):
header = fits.Header()
header.set("EXTNAME", "PRIMARY", "name of this extension")

try:
header.set("VOCLASS", original_header_dict["voclass"], "VO Data Model")
except KeyError:
logging.warning("No VO Data Model")
# try:
# header.set("VOCLASS", original_header_dict["voclass"], "VO Data Model")
# except KeyError:
# logging.warning("No VO Data Model")

# Target Data
# REQUIRED Target Data
try:
header.set(
"OBJECT", original_header_dict["object_name"], "Name of observed object"
)
except KeyError:
logging.warning("No object name")

try:
ra = original_header_dict["RA"]
header.set("RA", ra, "[deg] Pointing position")
except KeyError:
logging.warning("No RA")
ra = None
try:
dec = original_header_dict["dec"]
header.set("DEC", dec, "[deg] Pointing position")
except KeyError:
logging.warning("No DEC")
dec = None

try:
time = (
Time(to_datetime(original_header_dict["start_time"])).jd
+ Time(to_datetime(original_header_dict["stop_time"])).jd
) / 2
header.set("TMID", time, "[d] MJD mid expsoure")
except KeyError:
time = None
logging.warning("No time")

try:
exposure_time = original_header_dict["exposure_time"]
header.set("TELAPSE", exposure_time, "[s] Total elapsed time")
except KeyError:
exposure_time = None
logging.warning("No exposure time")
object_name = input("REQUIRED: Please input a name for the object: ")
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not a fan of the input function for several reasons, but here are two main points:

  1. Testing this function is now more complex- you have to simulate a variable number of user inputs and while there are surely methods in pytest to do so, it'll require more setup.
  2. The flow of this function is not at all consistent. If values exist, no inputs are needed, sometimes some inputs are needed but not others. Automating this, if you want to loop through a list, will become nearly impossible.

The original implementation, which I see you commented out around line 50 is better (though can be tweaked)- identify all missing keywords from your input dictionary and fail if any of them are missing.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The target user of this function (at the moment) is a high school student or undergrad who is using the spectrum for a different purpose. I want a function which will hold their hand through putting together our highest priority header keywords.

  • I have used the input() function for what I consider the highest priority keywords.
  • For keywords which they can likely figure out, e.g., Instrument , I've also used the input() function but labeled them as OPTIONAL.
  • for keywords which may have existed in the original header, I've just pulled them from the original header with no user interaction.

If you have another suggestion for how to do the hand holding, I'm open!

I also wasn't planning on doing too much testing in our CI...I was gonna let the students do the testing. As they find things which don't work, we can modify the function and add tests?

I also recognize that this function is not mission critical to SIMPLE -- the database will still work with heterogenous spectra files -- so I'm ok if this function is not to the same standard as our ingest code.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

all that being said, this could be a useful script in astrodb_utils at some point, especially since it's striving to be consistent with SpectrumDM 1.2. We could rename it compile_ivoa_spectrum_dm_header. And I'm working on doing the same thing for light curves at the minute. So maybe making it high quality code is a good idea....

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One thing that occurs to me is that we can split this into two (or more) functions:

  1. one function that takes a dictionary and compiles the header and
  2. one that validates and creates the dictionary

That later function can be one using the input() calls (and we can add helper functions if needed), but the compile header function is more streamlined and can be tested or automated more readily.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think two functions is a very good idea. I think I may also go ahead and develop this in astrodb_utils instead of here in SIMPLE.

header.set("OBJECT", object_name, "Name of observed object")

try:
time_start = Time(to_datetime(original_header_dict["start_time"])).jd
header.set("TSTART", time_start, "[d] MJD start time")
telescope = original_header_dict["telescope"]
header.set("TELESCOP", telescope, "name of telescope")
except KeyError:
time_start = None
telescope = input("REQURIED: Please input the name of the telescope: ")
header.set("TELESCOP", telescope, "name of telescope")

try:
time_stop = Time(to_datetime(original_header_dict["stop_time"])).jd
header.set("TSTOP", time_stop, "[d] MJD stop time")
instrument = original_header_dict["instrument"]
header.set("INSTRUME", instrument, "name of instrument")
except KeyError:
time_stop = None
instrument = input("REQUIRED: Please input the name of the instrument: ")
header.set("INSTRUME", instrument, "name of instrument")

try:
obs_location = original_header_dict["observatory"]
header.set("OBSERVAT", obs_location, "name of observatory")
ra = original_header_dict["RA"]
header.set("RA_OBJ", ra, "[deg] Right Ascension of object")
except KeyError:
obs_location = None
logging.warning("No observatory")
ra = input(
"REQUIRED: Please input the right ascension of the object in degrees: "
)
header.set("RA_OBJ", ra, "[deg] Right Ascension of object")

try:
telescope = original_header_dict["telescope"]
header.set("TELESCOP", telescope, "name of telescope")
dec = original_header_dict["dec"]
header.set("DEC_OBJ", dec, "[deg] Declination of object")
except KeyError:
telescope = None
logging.warning("No telescope")
dec = input("REQUIRED: Please input the declination of the object in degrees: ")
header.set("DEC_OBJ", dec, "[deg] Declination of object")

try:
instrument = original_header_dict["instrument"]
header.set("INSTRUME", instrument, "name of instrument")
obs_date = dateparser.parse(original_header_dict["obs_date"]).strftime(
"%Y-%m-%d"
)
header.set("DATE-OBS", obs_date, "date of the observation")
except KeyError:
instrument = None
logging.warning("No instrument")
obs_date = input("REQUIRED: Please input the date of the observation: ")
obs_date = dateparser.parse(obs_date).strftime("%Y-%m-%d")
header.set("DATE-OBS", obs_date, "date of the observation")

# Wavelength info
w_units = wavelength_data.unit
Expand All @@ -139,50 +120,75 @@ def compile_header(wavelength_data, **original_header_dict):
width = (w_max - w_min).astype(np.single)
w_mid = ((w_max + w_min) / 2).astype(np.single)

header.set("SPEC_VAL", w_mid, f"[{w_units}] Characteristic spec coord")
header.set("SPEC_BW", width, f"[{w_units}] Width of spectrum")
header.set("TDMIN1", w_min, f"[{w_units}] Starting wavelength")
header.set("TDMAX1", w_max, f"[{w_units}] Ending wavelength")

try:
bandpass = original_header_dict["bandpass"]
header.set("SPECBAND", bandpass, "SED.bandpass")
except KeyError:
bandpass = None
logging.warning("No bandpass")
bandpass = assign_ucd(w_mid * wavelength_data.unit)
header.set("SPECBAND", bandpass, "SED.bandpass")

# OPTIONAL Header keywords
try:
header.set("SPEC_VAL", w_mid, f"[{w_units}] Characteristic spec coord")
exposure_time = original_header_dict["exposure_time"]
header.set("TELAPSE", exposure_time, "[s] Total elapsed time")
except KeyError:
pass
exposure_time = input("Please input the exposure time in seconds: ")
if exposure_time != "":
header.set("TELAPSE", exposure_time, "[s] Total elapsed time")

header.set("SPEC_BW", width, f"[{w_units}] Width of spectrum")
header.set("TDMIN1", w_min, f"[{w_units}] Starting wavelength")
header.set("TDMAX1", w_max, f"[{w_units}] Ending wavelength")
try:
time_start = Time(dateparser.parse(original_header_dict["start_time"])).jd
header.set("TSTART", time_start, "[d] MJD start time")
except KeyError:
time_start = None

try:
aperture = original_header_dict["aperture"]
header.set("APERTURE", aperture, "[arcsec] slit width")
time_stop = Time(dateparser.parse(original_header_dict["stop_time"])).jd
header.set("TSTOP", time_stop, "[d] MJD stop time")
except KeyError:
aperture = None
logging.warning("No aperature")
time_stop = None

try:
obs_date = to_datetime(original_header_dict["obs_date"]).strftime("%Y-%m-%d")
header.set("DATE-OBS", obs_date, "date of the observation")
time = (
Time(dateparser.parse(original_header_dict["start_time"])).jd
+ Time(dateparser.parse(original_header_dict["stop_time"])).jd
) / 2
header.set("TMID", time, "[d] MJD mid expsoure")
except KeyError:
time = None

try:
obs_location = original_header_dict["observatory"]
header.set("OBSERVAT", obs_location, "name of observatory")
except KeyError:
print(KeyError)
obs_date = None
logging.warning("No observation date")
obs_location = None

try:
aperture = original_header_dict["aperture"]
header.set("APERTURE", aperture, "[arcsec] slit width")
except KeyError:
aperture = input("OPTIONAL: Please input the slitwidth in arcseconds: ")
if aperture != "":
header.set("APERTURE", aperture, "[arcsec] slit width")

# Publication Information
try:
title = original_header_dict["title"] # trim so header wraps nicely
header.set("TITLE", title, "Data set title")
except KeyError:
pass
logging.warning("No title")
title = None
Comment on lines 180 to +184
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As an alternative to so many try..except blocks, consider using something like:

title = original_header_dict.get("title")
if title is not None:
   header.set("TITLE", title, "Data set title")

The python dictionary's get method automatically provides a default value of None if the key is not present: https://docs.python.org/3.10/library/stdtypes.html#dict.get


try:
header.set("AUTHOR", original_header_dict["author"], "Authors of the data")
except KeyError:
pass
logging.warning("No author")
author = input("OPTIONAL: Please input the original authors of the data: ")
if author != "":
header.set("AUTHOR", author, "Authors of the data")

try:
header.set("VOREF", original_header_dict["bibcode"], "Bibcode of dataset")
Expand All @@ -197,19 +203,20 @@ def compile_header(wavelength_data, **original_header_dict):
try:
header.set("VOPUB", original_header_dict["VOPUB"], "VO Publisher")
except KeyError:
logging.warning("No VO Publisher")
pass

try:
header.set("COMMENT", comment)
except KeyError:
comment = None
pass

try:
header.set("HISTORY", history)
except KeyError:
pass

header.set("DATE", date.today().strftime("%Y-%m-%d"), "Date of file creation")
header.set("CREATOR", "simple.spectra.convert_to_fits.py", "FITS file creator")

return header

Expand Down