-
Notifications
You must be signed in to change notification settings - Fork 8
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
add enum checking for attributes #557
Conversation
@sherjeelshabih please use this as a starting point. |
@lukaspie Please check the warnings for NXxps, they are probably relevant, similar as for NXmpes_arpes. |
@rettigl thanks for the pointer. Indeed, the I started making them |
I added a small check for discussion. I now add list types (based on our convention) as enumeration objects in the validation tree. I tested with passing a list [0, 0, 1] and that works as expected now. If there are more examples, I can have a look. |
@sherjeelshabih Thanks for the update! With this branch now, I got new errors because I was using values for enumerated attributes that were not in the list. I updated pynxtools-xps, so now the tests are passing again. However, I was expecting that I still get a similar warning as before (see above) since the thee |
I checked the combination of this PR and #554. This seems to work more or less. The vectors are correctly checked for the enums, and a warning is given if the dtype is wrong (ints passed for NX_CHAR). However, the checking of list of strings seems to not work anymore now, as example the axes attribute in data. I pass a tuple with ("angular0", "angular1", "energy"), and it produces these errors:
@RubelMozumder This might be due to that tuples are not considered in #554, and also maybe not here? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There are a few issues with this after the last commit, see comments. For me, the combination of this branch before the last commit, and #554 seems to work fine.
if isinstance(value, list): | ||
return all(is_valid_data_field(v, nxdl_type, path) for v in value), output_value | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This produces one warning per element in a list, if the datatype is not correct. @RubelMozumder had a different solution for this in #554.
"ISO8601": (str), | ||
"NX_BINARY": (bytes, bytearray, np.byte, np.ubyte, np.ndarray), | ||
"NX_BOOLEAN": (bool, np.ndarray, np.bool_), | ||
"NX_CHAR": (str, np.ndarray, np.chararray), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This still allows to write numeric arrays to an NX_CHAR, which is not correct. This issue is also solved by #554
try: | ||
if accepted_types[0] is bool and isinstance(value, str): | ||
value = convert_str_to_bool_safe(value) | ||
if value is None: | ||
raise ValueError | ||
output_value = accepted_types[0](value) | ||
except ValueError: | ||
collector.collect_and_log( | ||
path, ValidationProblem.InvalidType, accepted_types, nxdl_type | ||
) | ||
return False, value |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would keep the boolean string to bool conversion, this certainly makes sense and We could even think about doing this for all datatypes with safe literal conversion, as you did for the enums.
@sanbrock This here adds lists as possible elements of enumerations for the dataconverter. However, read_nexus apparently still does not support this. In the debug log, I don't get any warning or so about a wrong enum:
Nomad, on the other hand, reports a warning when parsing the file:
Do you know where to fix this? |
Closed in favor of #554 |
Possible fix for #556