-
Notifications
You must be signed in to change notification settings - Fork 194
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Featurizers missing prechecking #385
Comments
Note that returning a NaN for precheck could be considered as neither failing or passing a precheck, e.g., if the precheck test was skipped for a large structure |
@computron Should prechecks have the ability to be skipped though? AFAIK they can be all be super fast operations that (1) pass (2) fail or (3) throw an error. If (3) happens the user should look into it. For example: def precheck(self, structure): # for some NN featurizer
if len(structure.sites) > 250:
warnings.warn(some_warning)
...
# returns True or False, regardless of whether the warning shows or not Or is there some other precheck mechanism you had in mind? |
At the very least precheck skipping should be discouraged. Let's not introduce this unless we feel it is really necessary. Based on your comment I got the sense that some large structures could not be prechecked quickly. If that's the case it might be better to throw NaN versus have the precheck take forever. |
Yes - I think all featurizers can have a precheck that never skips. I don't see any operations where just checking if the featurizer will take a long time will itself take a long time. Edit: I see in my comment where confusion could come from, changed it |
While #384 introduced 3 prechecks for featurizers, there are many more featurizers that could benefit from a precheck. This issue can be used for tracking these featurizers
CationProperty
,OxidationStates
,ElectronAffinity
,ElectronegativityDiff
,IonProperty
: all-metal entries' oxidation states are ill-defined, so they could be considered invalidSiteStatsFingerprint
): Throw warnings (not neccessarily fail precheck) when large structures are prechecked, cuz featurizing them is gonna take forever (prechecks should be very fast regardless)The text was updated successfully, but these errors were encountered: