fastrecord #9
Comments
@joinr if the record improvements are backward compatible with current
It would, but that's a tall order IME :) Much easier to sort things out in a library IMO. Going upstream would require patching and going through JIRA, which is now a bit more closed off, and there's typically a fairly high burden-of-proof standard as well; for the moment this is a pretty easy way to explore possible performance opts without formal patching, review, waiting, etc.

The other problem is that the implementation enables some changes to existing semantics: allowing pure static fields is new, and some of the operations you'd expect on a value that passes a map check behave differently.

There's also the general disinterest in the nanos and microbenchmarks, and a general dubiousness about these kinds of differences IME. Some of that is valid, where microscopic performance does not hold at large (or, as in the case with Zach Tellman's implementation of faster small vectors called tuples, they led to slowdowns in other unforeseen use cases, due to the JVM being confused by the proliferation of types and de-optimizing at megamorphic call sites).

Maybe if the thing is demonstrably better at no risk, it'd make sense to go through the motions of submitting a patch (probably a bit different than the rewriting I did here, though). I think the ideas are there (like improving typical lookups) to provide a simpler, focused optimization of the existing defrecord.
Another case I came across here is fast hash calculations for records
That makes sense. Would be particularly useful if they are used as keys in collections. That was actually an initial optimization I ran up against; the ICFPC code was using a pre-computed lookup table of record coordinate pairs like {:x x :y y} to index into precomputed neighborhoods (a memoized mapping to a vector). The hashing was way slow. That led to other observations... I think it would be useful to re-implement the clojure.core stuff directly and extend this (to get more control, e.g. over hashing and the like); my implementation is an intentional piggy-back hack on top of that infrastructure.
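Roughly the access pattern in question, sketched with made-up names (Coord and neighbors are illustrative, not the actual ICFPC code):

```clojure
;; Hypothetical sketch of the access pattern described above, with made-up
;; names; not the actual ICFPC code. A record is used as the key into a
;; memoized neighborhood table, so every lookup pays the record's hash cost.
(defrecord Coord [x y])

(def neighbors
  (memoize
   (fn [c]
     ;; precompute the 4-neighborhood of a coordinate as a vector
     [(->Coord (dec (:x c)) (:y c))
      (->Coord (inc (:x c)) (:y c))
      (->Coord (:x c) (dec (:y c)))
      (->Coord (:x c) (inc (:y c)))])))

;; memoize caches on its argument, so each call hashes the Coord record;
;; slow record hashing makes this whole path slow.
(neighbors (->Coord 1 2))
```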
Seems reasonable.
Map hashing, yeah. I ran into some corner cases implementing that stuff correctly. You can probably just rip out the hash computation from the existing defrecord and elide the bit where it includes the unordered hash of the extmap. I think it would be unrolling hashUnorderedColl, not too bad. I believe I did that for some stuff in fastmath patches and other things.
I think the more unpleasant part of it would be implementing the hashing of MapEntry, which extends APersistentVector.
Example from fast-math:
You can cover up the nasty bits like he did by using a utility fn (or inline all the things if you want). |
Probably just re-use clojure's generic hash function (this example is optimized for doubles specifically). |
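For the record case, a rough sketch of the unrolled hash for assumed fields :x and :y might look like the following; it mirrors the unordered-collection hashing that hashUnorderedColl performs while skipping the extmap, but it's just a sketch, not code from fastmath or this repo:

```clojure
;; Hypothetical sketch, not code from fastmath or fastrecord: an unrolled map
;; hash for a record with fixed fields :x and :y. Each entry hashes like a
;; MapEntry, i.e. an ordered [k v] pair; the entry hashes are summed with int
;; arithmetic and then mixed with the entry count, as hashUnorderedColl does.
(defn point-hasheq [x y]
  (mix-collection-hash
   (unchecked-add-int
    (hash-ordered-coll [:x x])
    (hash-ordered-coll [:y y]))
   2)) ;; entry count
```

If unrolled correctly, this should agree with Clojure's map hash for the two-field case while never touching an extmap.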
Building off of #8, we have a better implementation for records. During my optimization exercise for the ICFPC 2019 contest, I ran into a couple of peculiar performance phenomena with records. Records generally did fine with direct field lookups, but other map behaviors (particularly things invoking valAt) tended to perform worse than array-maps. Looking up keys outside the fields also performed worse, and by a far larger margin than array-maps.
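For concreteness, the kinds of lookups being compared look roughly like this (illustrative only; names and forms are mine, not the benchmark code):

```clojure
;; Illustrative only; not the actual benchmark code from the ICFPC work.
(defrecord Point [x y])
(def p (->Point 1 2))

(.x ^Point p)       ;; direct field access: a plain field read, fast
(:x p)              ;; keyword lookup: goes through the record's valAt
(get p :x)          ;; clojure.core/get: extra dispatch before valAt is reached
(get p :z :missing) ;; key outside the fields: falls through to the extmap
```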
It turns out, records have overhead on lookup in a couple of areas: the generic path first has to work out that this is a Clojure persistent map, and clojure.core/get invokes a bunch of dispatch overhead before the record's own lookup is ever reached. These costs are found throughout the implementation, particularly in valAt, which is used a bunch.
Fortunately, we can hack the defrecord implementation to use more efficient code paths with identical semantics. A better implementation would be to rewrite the defrecord macro and emitters from the ground up, but I use code walking and transforms here to define a faster record variant: the fastrecord. fastrecord gets us back to efficient lookups, and even regains our performance for non-static fields. In my use case, recovering that ~3.3x improvement for looking up non-static keys ended up being important. I added an IFn implementation in this case, which uses the optimized valAt pathways, but that doesn't have to be the case (and indeed deviates from defrecord). One option would be detecting user-supplied IFn implementations and using those instead, otherwise providing map-lookup semantics for records. There are other improvements to make, including better hashing for primitive fields, but this is a step in the right direction (I think).
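As a rough illustration of the direction, here is a hand-rolled sketch (not the actual fastrecord expansion; PointLike and its fields are made up): field lookups become identical? checks against the known keywords, misses fall through to the extension map, and the added IFn support just reuses the same valAt path.

```clojure
;; Hand-rolled sketch of the idea, not the actual fastrecord macro expansion:
;; field lookups are identical? checks on the known keywords, misses fall
;; through to the extension map, and IFn invocation reuses the same path.
(deftype PointLike [x y __extmap]
  clojure.lang.ILookup
  (valAt [this k] (.valAt this k nil))
  (valAt [_ k not-found]
    (cond
      (identical? k :x) x
      (identical? k :y) y
      :else (get __extmap k not-found)))
  clojure.lang.IFn
  (invoke [this k] (.valAt this k nil))
  (invoke [this k not-found] (.valAt this k not-found)))
```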
[edit] Added a fix for empty dynamic map.