Reduce memory usage of make_pmh_tables? #28

Open
cbeck88 opened this issue Dec 8, 2017 · 3 comments

@cbeck88 (Collaborator) commented Dec 8, 2017

Hi,

So, it occurred to me that the memory usage of the buckets array may be a bottleneck in the make_pmh_tables function.

The reason I say that is:

  • The paper that was referred to in the blog post says that the randomized PMH algorithm, in the random hash function model, is supposed to take linear (or nearly linear?) time almost surely

  • Yet, in this line, we allocate about n^2 memory on the stack to hold the buckets, and zero-initialize all of it:

    // Step 1: Place all of the keys into buckets
    cvector<bucket, M> buckets;
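If I'm reading the surrounding code right, the bucket type is itself a fixed-capacity cvector, which is where the quadratic blow-up comes from. Paraphrasing from memory (not the exact frozen source):

    // Each bucket reserves room for up to M key indices, so M buckets of
    // capacity M amounts to roughly M^2 slots, all zero-initialized when
    // `buckets` is constructed.
    using bucket = cvector<std::size_t, M>;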

I actually asked a stackoverflow question about this (naively) in the process of trying to sort it out: https://stackoverflow.com/questions/47700071/time-complexity-of-allocating-uninitialized-memory-on-the-stack-in-a-constexpr-f

When I was (earlier) trying to figure out how to make this compile on gcc 5.4, one of the problems was that gcc 5 has a hard time mutating arrays at compile time; it tends to get confused and give up on the constexpr evaluation. So I was trying to figure out how to build the buckets without that kind of mutation, in a more pure-functional way. One approach I thought of, instead of having M "dynamically-sized" mutable buckets, is the following (rough sketch after the list):

  • First compute a list of all N hash pairs: that is, all N pairs of the form ([index in original array], [hash of that item % M])
  • Sort this list of pairs, using the second component (the hash) as the sorting criterion. Now items that hash to the same bucket are adjacent.
  • Extract from the sorted list a list of M "ranges" corresponding to the buckets, a range being essentially a pair of iterators. The later parts of the algorithm then iterate over the list of ranges.
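
Roughly what I have in mind, as a standalone sketch. None of these names (index_hash, range, bucket_ranges, make_bucket_ranges) come from frozen, and `hash` here just stands in for the seeded hash the table actually uses:

    #include <cstddef>

    // Sketch of the "hash, sort, slice into ranges" idea.
    struct index_hash { std::size_t index; std::size_t hash; };
    struct range { std::size_t begin; std::size_t end; };

    template <std::size_t N, std::size_t M>
    struct bucket_ranges {
        index_hash pairs[N];   // (index in original array, hash % M), sorted by hash
        range      buckets[M]; // buckets[b] = half-open range into `pairs`
    };

    template <std::size_t M, class T, std::size_t N, class Hash>
    constexpr bucket_ranges<N, M> make_bucket_ranges(const T (&items)[N], Hash hash)
    {
        bucket_ranges<N, M> out{};

        // Step 1: record (index, hash % M) for every key -- O(N) time and memory.
        for (std::size_t i = 0; i < N; ++i) {
            out.pairs[i].index = i;
            out.pairs[i].hash  = hash(items[i]) % M;
        }

        // Step 2: sort by the hash component so keys that land in the same
        // bucket become adjacent. Insertion sort keeps the sketch short and
        // C++14-constexpr friendly; a constexpr merge sort would give the
        // O(N log N) bound.
        for (std::size_t i = 1; i < N; ++i)
            for (std::size_t j = i; j > 0 && out.pairs[j - 1].hash > out.pairs[j].hash; --j) {
                index_hash tmp = out.pairs[j - 1];
                out.pairs[j - 1] = out.pairs[j];
                out.pairs[j]     = tmp;
            }

        // Step 3: one pass over the sorted list to emit a [begin, end) range
        // per bucket -- O(N + M). Empty buckets get empty ranges.
        std::size_t pos = 0;
        for (std::size_t b = 0; b < M; ++b) {
            std::size_t begin = pos;
            while (pos < N && out.pairs[pos].hash == b)
                ++pos;
            out.buckets[b].begin = begin;
            out.buckets[b].end   = pos;
        }
        return out;
    }

The later parts of the algorithm would then iterate over out.buckets and index into out.pairs, instead of touching real per-bucket containers.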

I think that

  • This can be done at compile time (certainly on gcc 6+, I've given up on gcc 5 :) )
  • It would only use O(N + M) memory and O(N log N + M) time, instead of O(N) placement time but O(M^2) memory (plus the time to zero-initialize it)
  • It might speed up the later step where we sort the buckets by size, since we would just sort the ranges, and a swap then moves two O(1)-sized ranges rather than two O(M)-sized buckets.

Another idea would be to make the cvector leave its entries uninitialized when it is constructed. I'm not really sure how that works in constexpr, or whether there would actually be savings.

Do you think this sorting approach makes sense at a high level? I might try to implement it later. :)

@serge-sans-paille (Owner) commented

Yeah, the O(n^2) is an issue. It's only paid at compile time, but it should still be avoided. It's too late for me to think clearly about this, but maybe we don't need that much space when combined with your hash selection criteria: if a bucket is oversized, we could try another seed or something like that? Preallocate N buckets of k elements (k being a fixed constant) and cope with it?
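
Something like this, maybe (a rough sketch: fixed_buckets, place_into_buckets and BucketCap are made-up names, hash(key, seed) stands in for frozen's seeded hash, and the outer seed-retry loop is not shown):

    #include <cstddef>

    // Cap each bucket at BucketCap entries and signal overflow so the caller
    // can retry the whole placement with another seed (the existing do-while
    // loop around the table construction would drive that retry).
    template <std::size_t M, std::size_t BucketCap>
    struct fixed_buckets {
        std::size_t items[M][BucketCap]; // indices of keys landing in each bucket
        std::size_t sizes[M];            // used slots per bucket
        bool overflowed;                 // true => pick a different seed and retry
    };

    template <std::size_t M, std::size_t BucketCap, class T, std::size_t N, class SeededHash>
    constexpr fixed_buckets<M, BucketCap>
    place_into_buckets(const T (&keys)[N], SeededHash hash, std::size_t seed)
    {
        fixed_buckets<M, BucketCap> out{}; // zero-initialized: O(M * BucketCap), not O(M^2)
        for (std::size_t i = 0; i < N; ++i) {
            std::size_t b = hash(keys[i], seed) % M;
            if (out.sizes[b] == BucketCap) { // bucket full: give up and ask for a reseed
                out.overflowed = true;
                return out;
            }
            out.items[b][out.sizes[b]] = i;
            ++out.sizes[b];
        }
        return out;
    }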

@cbeck88 (Collaborator, Author) commented Dec 9, 2017

Yeah, we can do some measurements and see; that might be the simplest way to fix it. I'm just not sure what the max bucket size needs to be. The second phase can probably tolerate several medium-sized buckets, just not too many. If the restriction is too tight, we will time out in the do-while loop.
