-
Notifications
You must be signed in to change notification settings - Fork 616
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat(api): add FieldsNotFoundError #10412
base: main
Are you sure you want to change the base?
Conversation
2fae0d4
to
8be679f
Compare
ibis/expr/types/structs.py
Outdated
@@ -205,7 +205,7 @@ def __getitem__(self, name: str) -> ir.Value: | |||
KeyError: 'foo_bar' | |||
""" | |||
if name not in self.names: | |||
raise KeyError(name) | |||
raise FieldNotFoundError(self, name, self.names) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it's OK to be breaking here and not raise a KeyError anymore?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
KeyError seems semantically a little wrong. I think KeyError should be for collections with a dynamic set of keys, such as a vanilla python dict
. But structs have a static set of keys, so FieldNotFoundError, as a subclass of AttributeError, seems better to me.
8be679f
to
c5f58d5
Compare
@cpcloud does this look like an idea worth considering? Any behavior requirements that you think I would need to make an implementation satisfy? |
@cpcloud should I invest more time in this or do you think this whole idea is not the right direction? |
I'm not opposed to this but I think the behavior should be different for attribute versus getitem syntax. It all comes down to what we believe the user's intent is, to the best of our knowledge. For attribute misspellings, I think we should leave them as plain old For |
That sounds like good behavior to me to return an AttributeError for There is one bit of nuance on attribute access though. If I do In pseudocode: # Raised on t.bogus
# Includes helpful suggestion "did you mean <best match among all columns AND attributes/methods>"
class TableAttributeError(AttributeError, IbisError): ...
# Raised on my_struct.bogus
# Includes helpful suggestion "did you mean <best match among all fields AND attributes/methods>"
# Maybe should get consolidated with TableAttributeError
class StructAttributeError(AttributeError, IbisError): ...
# Raised on table_or_struct["bogus", "bogus2"] and table_or_struct.select(_.bogus, _.bogus2) and my_struct["bogus"]
# Includes helpful suggestion "did you mean <best match among ONLY columns/fields>"
# Note that this contains ALL not found fields.
# I think that would be simpler to just have one error, and not one for single FieldNotFound.
# Should this inherit from AttributeError and/or KeyError?
# I think AttributeError since 1. these are statically known and 2. then a user can `catch AttributeError`
# and have all of these errors caught.
class FieldsNotFoundError(AttributeError, IbisError): ... What do you think of this spec? |
So I understand your goals, is your reasoning here so that 1. our maintenance is easier or 2. so we don't accidentally do the wrong thing for the user? |
I think that spec is too complex for now. Let's start with plain |
Sounds good, thanks, I'll try to update this implementation and let's see what we think about it! |
c5f58d5
to
66273f4
Compare
956a6a1
to
6b0d0db
Compare
Not sure what is causing the postgres failure but I think it's unrelated. Details
|
def __init__( | ||
self, | ||
container: object, | ||
names: str | Iterable[str], |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not totally sold on the names of these arguments/attributes.
""" | ||
if name not in self.names: | ||
raise KeyError(name) | ||
raise FieldsNotFoundError(self, name, self.names) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We didn't have a test for Struct.__getitem
before, but if we had, then we would see that this change is breaking. Before Struct["foo_bar"]
raised a KeyError, now it raises a FieldsNotFoundError. I think something that derives from AttributeError is more semantically correct: we know what fields are present at compile time, whereas I think of KeyError I think of for when you have a runtime-missing key. IDK if this is standard.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
EDIT: now FieldsNotFoundError derives ONLY from IbisError, and neither KeyError nor AttributeError.
"bogus", | ||
("bogus", False), | ||
ibis.desc("bogus"), | ||
_.bogus, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FieldsNotFoundError is a subclass of AttributeError, so this is only breaking for the test cases that looked for IbisTypeError.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
EDIT: now that I made FieldsNotFound NOT a subclass of Attribute error, this is now breaking for all of these test cases.
As you can see in the updated tests, this change is potentially breaking for users who are trying to catch IbisTypeError. But, it does unify these two errors so they only need to catch a FieldsNotFoundError, so that is a plus. I think this breakage is OK. We could make FieldsNotFoundError subclass from BOTH AttributeError and IbisTypeError? IDK, I could see how table.bogus is a sort of TypeError. |
ibis/common/exceptions.py
Outdated
@@ -45,6 +48,37 @@ class RelationError(ExpressionError): | |||
"""RelationError.""" | |||
|
|||
|
|||
class FieldsNotFoundError(AttributeError, IbisError): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why did you decide to subclass AttributeError
here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This was because it originally was a FieldNotFoundError
(singular), so it made more sense in the sense of "you tried to lookup some statically known thing on this object that doesn't exist". But now that it stores multiple missing fields, the heirarchy doesn't make sense, so I removed AttributeError as a base class, now it is just inherits from IbisError.
return self.limit(limit, offset=offset) | ||
except com.FieldsNotFoundError: | ||
# the raised FieldsNotFoundError contains the Op, but we want | ||
# to only expose the Expr. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's avoid storing unused state on exceptions. This can have unintended consequences for object lifetimes due to exceptions keeping stack frames around for tracebacks.
Please remove any unused fields from your new exception types.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh wow, I'd never thought about how that might effect lifetimes, that's really interesting. Sounds good, I am removing them. I originally had them because in an earlier version I did actually need to keep track of them, but now I don't need that, so I'll stop storing them.
I have been getting sick of typing some_table_or_struct.field_that_doesnt_exist_or_has_a_small_typo and then getting a useless error message. This PR makes that UX much better.
6b0d0db
to
434fe52
Compare
I have been getting sick of typing
some_table_or_struct.field_that_doesnt_exist_or_has_a_small_typo
and then getting a useless error message. It also is annoying to dosome_table.select(doesnt_exist, also_doesnt_exist)
, and you only get an error for the very first one. It would be better if you got back info on ALL the values that failed to get resolved.This PR makes that UX much better.
Still need to add tests, but I wanted to get this up here for some initial thoughts before I invested more time. Is this something we want to pursue?