Arrow write_parquet removes .internal.selfref, data.table warning message not helpful #6737
Comments
The key is in the help of Arrow's read_parquet
Hi @rikivillalba - good catch with
Mmm, no. Perhaps earlier arrow versions restored all attributes and newer ones do not save/restore those starting with a dot. However, .internal.selfref cannot be "saved" because it is a true pointer (i.e. not serializable), so loading a data.table directly won't work without a warning unless you use data.table native methods, i.e. setDT, as.data.table, or fread. data.table checks whether .internal.selfref is OK, prints the warning if it is not, and corrects it.
this sounds like a regression that arrow should fix
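For anyone landing here, a minimal sketch of the round trip and the setDT workaround mentioned above. It assumes both data.table and arrow are installed; the column names and the temporary file are made up for illustration, and whether the attribute is actually missing after reading depends on the arrow version in use.

```r
library(data.table)
library(arrow)

# Hypothetical example data; any data.table would do.
dt <- data.table(id = 1:3, value = c(0.5, 1.5, 2.5))
path <- tempfile(fileext = ".parquet")

# Round trip through parquet using arrow rather than data.table's own I/O.
write_parquet(dt, path)
dt2 <- read_parquet(path)

# The self-reference is a pointer, so it may be absent after reading
# (this depends on the arrow version).
has_selfref_before <- !is.null(attr(dt2, ".internal.selfref", exact = TRUE))

# setDT() re-establishes the data.table internals by reference.
setDT(dt2)
has_selfref_after <- !is.null(attr(dt2, ".internal.selfref", exact = TRUE))

# After setDT(), := can add a column by reference as usual.
dt2[, doubled := value * 2]
```

Calling setDT() (or as.data.table()) straight after read_parquet() is a workaround on the data.table side only; it does not change what arrow writes to the file.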
@rikivillalba, I shared your observations on As far as this issue re the warning message is concerned, you said:
Would it be worth updating the warning message to clarify this?
The bug can be replicated as follows (I'm on Windows 11, using version 4.4.2 of R). It is new behaviour as of arrow 17.0.
What was happening took effort to track down, because it was not obvious to me that writing a data.table to a file and reading it back was covered by the warning message. An added complication was that I was using targets, which called write/read parquet in the background because I'd chosen to save my targets as parquet files.
I have reported the bug to arrow, here. I debated whether to cross-post, but given the request in the warning message itself, decided to. Please delete/close if this cross-posting was ill-advised.