Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

RDG::local_to_global_id should be an arrow::Array, not a arrow::ChunkedArray #57

Open
witchel opened this issue Jan 29, 2021 · 4 comments

Comments

@witchel
Copy link
Contributor

witchel commented Jan 29, 2021

See PR #55.

@roshandathathri roshandathathri changed the title PropertyFileGraph::local_to_globa_vector should be an array, not a chunkedarray PropertyFileGraph::local_to_global_vector should be an array, not a chunkedarray Jan 29, 2021
@arthurp
Copy link
Contributor

arthurp commented Apr 16, 2021

Code changes have renamed the stuff involved, but this issue still exists:

const std::shared_ptr<arrow::ChunkedArray>& local_to_global_id() const {

@arthurp arthurp changed the title PropertyFileGraph::local_to_global_vector should be an array, not a chunkedarray RDG::local_to_global_id should be an arrow::Array, not a arrow::ChunkedArray Apr 16, 2021
@witchel
Copy link
Contributor Author

witchel commented Apr 16, 2021

Way to stay on top of things! I spent some effort trying to convert local_to_global_id to an Array, but RDG mostly deals with ChunkedArrays, which it must to read/write Arrow tables. At that point, I figured too much consistency was the hobgoblin of little minds. But I thought maybe the root problem was my laziness, so I kept the comment and the issue open.

@arthurp
Copy link
Contributor

arthurp commented Apr 16, 2021

I think some cases we convert ChunkedArray to Array in PropertyGraph by failing if there is more than one chunk. (We definitely used to.) I don't know as this will scale all that well, because chunk arrays can be longer than unchunked arrays. But depending on requirements we might be able to change PropertyGraph to use an Array while allowing RDG to still use a ChunkedArray.

PS: I try to procrastinate productively, so I will scan old issues and update them. ;-)

@witchel
Copy link
Contributor Author

witchel commented Apr 16, 2021

Well, Unchunk in ArrowInterchange.h in libsupport will correctly convert from a ChunkedArray to an Array, but as you say, there are length restrictions on Arrays that could be problematic.

I think I read that Euler would check ever-increasing number ranges for primes when he had free time. Maybe that was Newton. Anyway, thank god for Netflix.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants