[Cache] Consider increasing concurrency level from 1 to a potentially higher/default value #2557
Thanks @sgup432. It's probably worth checking out. Traditionally, the cache does not require much write throughput as opposed to read throughput, which is why a concurrency level of 1 is set. However, I don't see how this could hurt. wdyt @kotwanikunal
FYI - it actually impacts read throughput too. Guava splits the maximum weight between partitions, so that might have been the reason if there was a fear of large entries, or perhaps to have strict LRU for testing. Caffeine doesn't partition the eviction policy and doesn't need a concurrency level. It does default to async eviction (since user functions have an unknown cost; fine to disable) and adds randomness to eviction to prevent deterministic attacks, so testing should focus on behavior instead of state, if that was an original concern.
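The capacity-splitting effect described above can be sketched with a toy model (plain JDK only; class and method names are hypothetical, and this is not Guava's actual implementation): with a maximum weight N split across S segments, each segment only gets N/S, so any single entry heavier than N/S can never stay cached.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Toy model of Guava-style segmentation: total maxWeight is divided evenly
// across `concurrencyLevel` segments, so no entry heavier than
// maxWeight / concurrencyLevel can ever fit.
class SegmentedWeightCache {
    private final List<LinkedHashMap<String, Integer>> segments = new ArrayList<>();
    private final long perSegmentMax;

    SegmentedWeightCache(long maxWeight, int concurrencyLevel) {
        this.perSegmentMax = maxWeight / concurrencyLevel;
        for (int i = 0; i < concurrencyLevel; i++) {
            segments.add(new LinkedHashMap<>(16, 0.75f, true)); // access-order LRU
        }
    }

    private LinkedHashMap<String, Integer> segmentFor(String key) {
        return segments.get(Math.floorMod(key.hashCode(), segments.size()));
    }

    void put(String key, int weight) {
        LinkedHashMap<String, Integer> seg = segmentFor(key);
        seg.put(key, weight);
        // Evict LRU entries until the segment is back under its share of the weight.
        while (totalWeight(seg) > perSegmentMax && !seg.isEmpty()) {
            Iterator<String> it = seg.keySet().iterator();
            it.next();
            it.remove();
        }
    }

    boolean contains(String key) {
        return segmentFor(key).containsKey(key);
    }

    private static long totalWeight(Map<String, Integer> seg) {
        long w = 0;
        for (int v : seg.values()) w += v;
        return w;
    }
}
```

With `maxWeight = 100` and 4 segments, an entry of weight 60 evicts itself immediately (60 > 100/4); with a single segment the same entry fits.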
I think you bring up a good point @sgup432. And thanks for your inputs @ben-manes. I can mainly comment on how we use the cache: it acts as a representative structure for the off-heap memory objects. The reason we use a cache is that we utilize the baked-in TTL + LRU mechanisms. We could potentially achieve the latter with just a PQ and a custom Comparator. In fact, with #2015 we recently added a force-eviction mechanism to keep these structures in sync with the off-heap memory and avoid OOM issues on nodes. That solution limits cache entries to 1 thread at a time in a cache miss + overflow scenario.

@ben-manes @sgup432 I was reading through the Caffeine docs and it seems well suited for high-throughput use cases. Do you have any suggestions on how we can work around the use case described above? I was inclined towards a custom PQ/tracking mechanism, as a cache seems like overkill for our use case. Also, we cannot segment our off-heap memory to mirror a segmented cache: in certain scenarios a single graph file can consume the entire off-heap memory.
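For reference, the baked-in TTL + LRU behavior mentioned above can be approximated with just the JDK. This is a minimal, hypothetical sketch (names and parameters are illustrative, not the plugin's code) using a `LinkedHashMap` in access order rather than the PQ-with-Comparator variant:

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Minimal LRU + TTL tracker using only the JDK, as a sketch of an
// alternative to a full cache library.
class LruTtlTracker<K, V> {
    private static final class Entry<V> {
        final V value;
        final long writtenAtNanos;
        Entry(V value, long now) { this.value = value; this.writtenAtNanos = now; }
    }

    private final int maxEntries;
    private final long ttlNanos;
    private final LinkedHashMap<K, Entry<V>> map;

    LruTtlTracker(int maxEntries, long ttlNanos) {
        this.maxEntries = maxEntries;
        this.ttlNanos = ttlNanos;
        // accessOrder=true gives LRU iteration order for free;
        // removeEldestEntry handles size-based eviction.
        this.map = new LinkedHashMap<>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<K, Entry<V>> eldest) {
                return size() > LruTtlTracker.this.maxEntries;
            }
        };
    }

    synchronized void put(K key, V value) {
        map.put(key, new Entry<>(value, System.nanoTime()));
    }

    synchronized V get(K key) {
        Entry<V> e = map.get(key);
        if (e == null) return null;
        if (System.nanoTime() - e.writtenAtNanos > ttlNanos) {
            map.remove(key); // expired: lazy TTL eviction on read
            return null;
        }
        return e.value;
    }

    synchronized int size() { return map.size(); }
}
```

Note the coarse `synchronized` here is effectively concurrency level 1; the same force-eviction bottleneck question would apply to any such custom structure.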
I am not sure how keeping the concurrency level at 1 actually solves this. Firstly, I always thought Guava cleanup happens on the same thread that is writing to the cache, after checking the size/time criteria, and not asynchronously on a different thread? Even if it doesn't, the concurrency level and the cache doing its cleanup logic asynchronously are not related in any way IMO. Maybe @ben-manes can comment on that. Additionally, the force-eviction logic should ideally be agnostic of the underlying concurrency level and should work with multiple partitions as well?
Not sure if I get this entirely. Considering you are using the on-heap cache to track memory in off-heap, with multiple segments, you would give each segment = …
I'd have recommended OHC, but it is no longer maintained. It is a native off-heap cache previously used by Cassandra, which now uses a custom allocator. I don't know enough, but I think they use Caffeine to wrap off-heap buffers, likely similar to your cache (ChunkCache or SlabAllocator).
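The pattern being described — an on-heap map indexing off-heap data — can be sketched with JDK direct buffers. This is a hypothetical illustration of the idea only, not how ChunkCache (or the k-NN plugin) is actually implemented:

```java
import java.nio.ByteBuffer;
import java.util.concurrent.ConcurrentHashMap;

// Sketch of the "on-heap index over off-heap data" pattern: the map entries
// live on the heap, but the bulk of each value is a direct (off-heap) buffer.
class OffHeapStore {
    private final ConcurrentHashMap<String, ByteBuffer> index = new ConcurrentHashMap<>();

    void put(String key, byte[] data) {
        ByteBuffer buf = ByteBuffer.allocateDirect(data.length); // off-heap allocation
        buf.put(data);
        buf.flip();
        index.put(key, buf);
    }

    byte[] get(String key) {
        ByteBuffer buf = index.get(key);
        if (buf == null) return null;
        byte[] out = new byte[buf.remaining()];
        buf.duplicate().get(out); // duplicate() so concurrent readers don't share position
        return out;
    }
}
```

A real cache layered on top of this would track the direct-buffer sizes as entry weights, which is essentially the accounting problem this issue is about.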
I think you are referring to your … Caffeine is layered above Java 8's ConcurrentHashMap, which was rewritten to use hash-bin locks, where a bin is typically 1-3 entries, so larger caches have higher write concurrency. This allowed them to add compute methods that are linearizable. Caffeine's …
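The per-bin locking mentioned here is what makes `ConcurrentHashMap.compute` atomic per key. A quick JDK-only demonstration (nothing k-NN-specific): many threads doing read-modify-write on the same key lose no updates, without any global lock.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class ComputeDemo {
    // Runs `threads` writers, each doing `perThread` compute() calls on the
    // same key. compute() locks only that key's hash bin, so each
    // read-modify-write is atomic.
    static int parallelIncrements(int threads, int perThread) throws InterruptedException {
        ConcurrentHashMap<String, Integer> counts = new ConcurrentHashMap<>();
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        for (int t = 0; t < threads; t++) {
            pool.submit(() -> {
                for (int i = 0; i < perThread; i++) {
                    counts.compute("hits", (k, v) -> v == null ? 1 : v + 1);
                }
            });
        }
        pool.shutdown();
        pool.awaitTermination(1, TimeUnit.MINUTES);
        return counts.get("hits");
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(parallelIncrements(8, 10_000)); // no lost updates
    }
}
```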
@sgup432 I think you missed the "cannot segment" part: they want one entry to be able to use all of the cache's capacity. Guava's N/S behavior (each of S segments gets only N/S of the maximum weight) hurts here, but that is not the case in Caffeine.
Ah yeah. In that case segmenting would hurt them.
What I was referring to is that, in terms of cache access behavior, we have a bigger problem than the concurrency level. Until we find a better approach than force eviction, no amount of concurrency will help if we have a forcing bottleneck.
Yes. I was referring to the removal listener and not the eviction itself.
This is nice. We could potentially try exploring this.
I noticed that QuantizationCache, which uses a Guava cache as its underlying implementation, uses a concurrency level of 1 here.
The same goes for NativeCacheManager and ModelCache.
I wanted to know whether this is intentional.
The concurrency level internally controls the number of partitions within the cache, where each partition holds its own read/write lock. With concurrencyLevel=1, you can essentially have only one thread writing to the cache at a time.
With multiple writers, you will see degraded read/write performance.
It would be better to increase it (to 4, 16, etc.) or use the default, which is 4. A quick micro-benchmark would confirm this.
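The suggested micro-benchmark could be sketched roughly like this: a toy model of lock striping with plain JDK locks (not Guava itself; all names are illustrative, and a real measurement should use JMH). `stripes=1` models concurrencyLevel=1, where every writer contends on one lock.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantLock;

public class StripeBench {
    // N writer threads inserting disjoint keys into maps guarded by
    // `stripes` locks. Returns the total number of entries inserted.
    static int run(int stripes, int threads, int opsPerThread) throws InterruptedException {
        ReentrantLock[] locks = new ReentrantLock[stripes];
        Map<Integer, Integer>[] maps = new HashMap[stripes]; // unchecked, fine for a sketch
        for (int i = 0; i < stripes; i++) {
            locks[i] = new ReentrantLock();
            maps[i] = new HashMap<>();
        }
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        for (int t = 0; t < threads; t++) {
            final int id = t;
            pool.submit(() -> {
                for (int i = 0; i < opsPerThread; i++) {
                    int key = id * opsPerThread + i;     // disjoint keys per thread
                    int s = Math.floorMod(key, stripes); // pick a stripe
                    locks[s].lock();
                    try {
                        maps[s].put(key, key);
                    } finally {
                        locks[s].unlock();
                    }
                }
            });
        }
        pool.shutdown();
        pool.awaitTermination(1, TimeUnit.MINUTES);
        int total = 0;
        for (Map<Integer, Integer> m : maps) total += m.size();
        return total;
    }

    public static void main(String[] args) throws InterruptedException {
        for (int stripes : new int[] {1, 4, 16}) {
            long t0 = System.nanoTime();
            int n = run(stripes, 8, 100_000);
            System.out.printf("stripes=%-2d entries=%d elapsedMs=%d%n",
                    stripes, n, (System.nanoTime() - t0) / 1_000_000);
        }
    }
}
```

With enough writer threads, the single-stripe run should show visibly higher elapsed time due to lock contention; the exact ratio is machine-dependent.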
If you're looking for even better performance, give Caffeine a try—it's faster 😁