Usage-based caching of filters can be ineffective without tuning #16031
+1, I've been experiencing similar issues and have been looking for this setting for a while!
Related question: I know the usage policy (including the underlying buffer that stores the history) is scoped to the shard, but isn't the actual usage accounting done at the segment level? If that's true, the mismatch adds a bit of complexity/variability, as the runtime behavior will vary with the count of (large enough) segments -- in effect, segment count will shorten the effective history size. And that, in turn, alters the significance of the thresholds. So, as segments come and go, the caching behavior will subtly (or not so subtly) change.
We are also experiencing a very similar issue. In our case it's not many different filters that overwhelm the LRU cache, but a single query with what could be considered a very heavy terms query in a filter context (thousands of terms on one field; think of a permissions-filtering scenario) that simply refuses to get cached. Setting the aforementioned configuration option on the nodes causes it to be cached, dropping query latency from 15 seconds to less than 1 second.

@dbaggott to be fair, historically the query cache has represented the cache of query results, not of the filters being used. This seems to still be the case: https://www.elastic.co/guide/en/elasticsearch/reference/current/query-cache.html. Pre 2.x there was a filter cache, which was either removed (since it's apparently not managed by ES anymore, but by Lucene) or hidden (since the user has no control over it anymore). Be that as it may, I'm pretty sure the current visibility into filter caching behavior is zero. When Elasticsearch moved to automatic decision making on filter caching, following Lucene's path, I suspected such errors would arise - no automatic decision process is perfect. An all-or-nothing outcome like the one the OP and we are seeing is a bit disastrous. The best solution would be to bring back at least some user control over filter caching, a la cache key, which is probably the most powerful feature ES's query DSL had (@bleskes, we discussed this shortly not long ago..). I'm hoping the ES / Lucene folks will realise some level of control over caches should be given to users - for error scenarios or for advanced usages.
@synhershko, the documentation and the terminology are all a little confusing given that, as you alluded to, queries and filters were merged, but the "query cache" I'm referring to -- and the one in the documentation you mentioned -- is for *filters*. Note the following (my emphasis):
> This cache is not to be confused with the shard request cache.

As for visibility into the cache where filters are stored, there is good visibility into node-level stats via the node stats API (see the sample request just below). I'm guessing your problem is exactly the problem of the history being too short for your use case. A query in a filter context would be cached according to the same usage policy I'm referring to, and each of your multiple terms is probably taking up a slot in the history buffer -- so the history never sees the same term more than once and caching is effectively disabled under the default policy.
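A minimal way to pull those per-node counters (hit/miss counts, cache size, evictions), assuming the 2.x node stats endpoint with the query_cache index metric:

```
GET /_nodes/stats/indices/query_cache
```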
I am all for improving how query caching works, but first I would like to make sure that we are not jumping too quickly to conclusions. If you have identified requests that are much slower in 2.1 than in 1.7, would you be able to share them and capture hot threads while they are running (several times if possible) so that we can get an idea of what the bottleneck is?
No, actual usage accounting is done at the shard level too.
@jpountz, I'll gather documentation in support of the conclusion. Is the conceptual problem clear from my description, though? It's probably also easy to write a unit test against the default UsageTrackingFilterCachingPolicy that illustrates the problem case, and then any query with a sufficiently complex filter context is going to trigger the problem. Also, just to make sure I understand:
If there is a shard with 3 segments that are large enough to satisfy the segment policy, how many times will a given filter be counted in the usage history?
Yes, it is clear to me. If you have queries that involve hundreds of filters, then nothing will be cached. But there has to be a limit anyway, so you will always be able to construct queries that always bypass the cache. I am not denying that the query cache might be the problem, but I would like to make sure it is in your case. There are so many things that happen at search time; I have been surprised many times at how the actual cause of a slowdown was very different from my initial expectations, in spite of the fact that understanding slow queries is something that I do all the time.

The problem with this cache is that it does not behave like a regular cache, which can only make things faster. On the contrary, many queries leverage on-disk skip lists in order to only read the information that they are interested in. By caching them, we are forced to read every single matching document so that we can build a cached entry. So the worst thing that this cache could do would be to cache matching docs and then never reuse them. Unfortunately this happened a lot in 1.x, hence the pickiness of the new cache about getting evidence of reuse before caching a filter.
Thanks for the explanations (and, w/r/t the shard-level vs segment-level question, I'm assuming you meant the shard level). Here's some additional information surrounding what we're seeing.

In a cluster running 2.1.1, we configured it so that all but one node had the index.queries.cache.everything setting enabled. The one node without the "cache everything" behavior showed dramatically higher CPU for the entire duration, as seen in the graph below (the drop-off at the end is when load stopped).

Here are some hot threads for the one node without the cache-everything behavior:
Here's the general style of the filter portion of the queries that struggle without the cache everything behavior. There are further variations on this but I think this captures the core problem:
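Roughly like this (a simplified, illustrative sketch rather than one of our real queries; only the items.fieldA field and the 1, 2, 3 values come from the description below, the rest is hypothetical):

```json
{
  "bool": {
    "filter": [
      {
        "nested": {
          "path": "items",
          "query": {
            "terms": { "items.fieldA": [1, 2, 3] }
          }
        }
      }
    ]
  }
}
```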
It's the "1, 2, 3" portion of the nested "items.fieldA" that varies. The actual values are arbitrary and can range from 1 value to 100s of values. In practice, there's probably a couple hundred different combinations of values (some with a few values some with 100s) that represent the bulk of the variants. However, many more combinations are requested with less frequency. This is the core of the problem for us, the 256 history is just too short to allow the relevant filters to be cached and it's only when, by luck, there's a series of searches with a small number of values that are repeated across searches that we get any caching! And we probably never get much caching of the "non-leaf" filters. Finally, I don't have the query_stats data handy but without the "cache everything" behavior, we see a few hundred filters cached and a very high miss ratio. With the "cache everything" behavior, we see many 1000s of filters cached and a very high hit ratio. Let me know if that's sufficient information or if anything is unclear. Thanks! |
Woops indeed. Thanks for the info. The hot threads suggest that the bottleneck is reading posting data from disk. Could you report how much memory your nodes have, how much of it is given to the JVM, and how large your data directory is? I am thinking that maybe you did not give enough memory to the filesystem cache. Since the hit ratio of the filter cache is high when caching everything, the same should be true for the filesystem cache. Could you also check whether putting the index store on mmapfs makes a difference?
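(For example -- assuming the store type is what gets switched here -- something along these lines in elasticsearch.yml, or as an index setting at creation time on 2.x; check the store module docs for your exact version:)

```yaml
# sketch: use the mmap-based directory for index files instead of the default
index.store.type: mmapfs
```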
Interesting. Memory: 64 GB, with 30 GB given to the JVM (and nothing else of note running on the box). I'll need to get back to you re the mmapfs setting; I can't readily test that.

I should mention that we're under a constant stream of index updates right now, as we're adding additional data to every document. Prior to enabling the "cache everything" behavior, we initially suspected the relatively heavier indexing and the higher percentage of deleted documents were to blame for performance. However, we suspended the indexer for a couple of hours and it didn't help with performance, so we dismissed that as a concern. Similarly, we removed the deleted documents (via a force merge) but that didn't help either.
Also, I said "box" but these are EC2 instances and the data is on EBS gp2 (300 / 3000) volumes -- that has obvious ramifications on disk access... |
Your filesystem cache is almost the size of your data directory, so I would expect the filesystem cache to do a very good job at caching posting data. The fact that caching everything helps and that you are seeing a high hit rate means that many filters are reused, which should also help even with no query caching at all, since the same data would be read from disk over and over again and should stay in the filesystem cache. Another data point that could be interesting would be to know whether you notice a difference if you force posting data to be loaded into the fs cache (for instance, by cat-ing the index files to /dev/null so they are pulled into the page cache).
The hot threads show that the terms queries with more than 16 terms are the bottleneck here. When there are more than 16 terms, the terms query populates a bit set with the matching docs and then returns a Scorer over this bit set. This is done in isolation (the other parts of the query are not taken into account when the bit set is populated). Your results with the cache-all heuristic indicate that you have a lot of queries that share the filter part (the same terms are used). Is that the expected behavior?

One thing to notice here is that the big terms queries act exactly as if they were cached: the bit set contains the matching documents of the terms query alone, and the only difference is that the bit set is not added to the cache at the end :(. This means that the cost of caching this query vs not caching it would be very small in terms of computation.
... this could help only if the non-filter part of the query returns few results. Another thing: in the cache there is a distinction between costly and cheap queries. Costly queries are cached when we see them twice in the 256-entry history, whereas normal ones are cached after 5 appearances. It could be helpful to add terms queries with more than 16 terms to the list of costly queries (@jpountz?); it seems that they are not.
Admittedly I did not check this possibility: my assumption was that if the terms query execution was the bottleneck, then the hot threads would point either to the code that decodes postings lists or to the code that builds the DocIdSet. But here all stacks are pointing to disk reads, even though they should be almost free since the filters appear to be reused, which confuses me a bit. This is why I suggested testing with mmapfs. That said, the suggestion to cache large terms filters more aggressively sounds good to me.
Thank you so much for the feedback and suggestions. I'm still working on setting up a proper testing environment so that I can perform ad hoc tests to my heart's content, so I don't have feedback on your suggestions. Yet.
Yes, that is definitely expected. The actual values of the terms are arbitrary and can range from 1 value to 100s of values. In practice, when you analyze the stream of incoming queries you see a couple hundred distinct combinations of terms (some combinations with just a few terms, some with 100s) that represent the vast bulk of the variants. So heavy reuse is expected, and a very high hit ratio is exactly what I want/see with the "cache all" behavior.
At what point are the other parts of the filter taken into account?
Can you clarify what you mean by this? Particularly the very last statement. Are you saying that the individual bit sets for each term alone are (potentially) cached, and so the calculation of the overall bit set can be derived (relatively) cheaply from the individual (hopefully) cached bit sets?
The delta will vary considerably depending on the exact nature of the query. But, in general, the larger the "big terms" portion of the filter, the more likely it is that the delta will be large. So, if this is a core problem, maybe it could still be a net win. What are the downsides to breaking up the large terms filter when the query portion does NOT return few results? I assume there must be some, otherwise it would be implemented as an internal optimization?

Ok, so the avenues of investigation to understand/improve the performance of the non-cached filters are:
That sounds good to me too although, even if it's applicable to the performance of my query stream, it wouldn't help my particular use case much, if at all, given that the history size is too small to accurately sample my particular filter stream...

The performance suggestions are awesome (thank you!) and deserve to be worked through as the first-tier problem -- I'd much rather have performance improved to the point where caching wasn't necessary than rely on caching. Or, put another way, I don't want caching to cover up fundamental performance issues!

That being said, even if my use case no longer has a need for filter caching, there's still a fundamental issue with usage tracking of filters: i.e., in order to detect repeated usage, the sampling methodology must accommodate the amount of variance in the data stream. Otherwise, filter reuse won't be detected properly. The challenge (as I'm sure you know) is to solve that problem in a generic fashion that doesn't expose a bunch of configuration complexity and doesn't require a priori understanding of your query complexity (which is prone to changing anyway). I can imagine a time-based rolling window implementation would be more forgiving and therefore closer to "one size fits all". Reservoir sampling might be helpful. I can also imagine a variation of the current history-based approach that simply resizes the history length dynamically based on the complexity of the filters, the idea being that if you never see any repeated filters (or, more realistically, the percentage of repeated filters is below some threshold) then the history length is increased. The various thresholds for caching would then be expressed as percentages instead of in absolute terms...
Just after. Suppose you have a terms query with more than 16 terms inside a boolean query with two other clauses. First the bit set with the matching docs for the terms query is built, and after that the bit set is used in conjunction with the two other clauses to build the final set of matching docs. This is done to avoid the explosion of boolean clauses when the number of terms is big.
No, the overall bit set for the terms query is computed, which is exactly how the cache would work if the terms query were added to the cache: the cache would build the bit set of matching docs for the terms query alone. What I am saying is that terms queries with more than 16 terms build this bit set even if the query is not cached. Caching a query has the downside of evaluating that query in isolation, so if another part of your query reduces the number of matching docs, that is not taken into account. This is why, when a query enters the cache, the response time can be much higher. For terms queries with more than 16 terms this extra work is done every time anyway, so the cost of adding the query to the cache would be the same in terms of computation (and only computation). If this is not clear, please read the javadocs of org.apache.lucene.queries.TermsQuery, which explain how terms queries are built.
It's not exactly a downside, but in that case the current handling of big terms queries may be faster. The order of magnitude you're looking for is something like 1000 times fewer results -- e.g., the terms query alone returns 1M results while the query part alone returns 1000 results.
Well, you must consider that your big terms query (more than 16 terms) only counts for one entry in the cache, and not one per individual term. This is why more aggressive caching of big terms queries may solve your problem without changing the size of the window.
@jimferenczi, thank you for the clarifications -- they are helpful! And I hear you about the more aggressive caching of big terms queries; it may help more than I think. But I'm still suspicious that my query stream is too diverse for the history size; ultimately, that's an empirical question and I haven't done the analysis to answer it... I'll get back to this thread w/r/t the suggestions...
We have just upgraded a cluster from 1.4.1 to 2.1, and it seems this is also a problem for us. We did expect issues with the upgrade since we had a lot of queries being explicitly cached. Anyway, with 2 clusters that have the exact same configuration and data and handle exactly the same queries, the cluster running 2.1 shows CPU usage 30% higher than 1.4.1. Part of this could be attributed to the extra effort at indexing time (doc_values), but "pausing" indexing for a while didn't really change things significantly. We will give it a try with index.queries.cache.everything, even though I'm afraid this is not a long-term solution. For now, the query cache for one of my data nodes (running 2.1) looks like this:
Will update this on Monday after having index.queries.cache.everything enabled for some time. @jpountz Could I gather any more relevant data for you? I do have both 1.4.1 and 2.1 running in parallel, so if you would like to compare something let me know :)
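(To be explicit about what we're enabling -- a sketch, assuming the setting goes into elasticsearch.yml on each data node, as the issue description below suggests:)

```yaml
# sketch: bypass the usage-tracking policy and cache every filter,
# relying on LRU eviction only (undocumented setting, 2.x)
index.queries.cache.everything: true
```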
Here is my use case: I am running Elasticsearch as the advertisement-targeting backend of an ad server. There are ad campaigns with multiple targeting parameters like age, carrier, device name, gender, OS, region and so on. Usually I have under 5000 campaigns and the index size is less than 10 MB. It is a very read-heavy environment, but none of my filters is cached because of the small number of documents. It would be nice if I could customize the caching policy.
@lmenezes Thanks for offering to help. Something that I am interested in would be to know what hot threads report on the 2.x cluster, to try to get an idea of what the bottleneck is.

@SeoJueun The fact that none of your filters get cached suggests that you have fewer than 10k documents in your index. Queries should be very fast anyway on such a small index, shouldn't they?

Notes to self about what to do/investigate next about this issue:
@jpountz I'm working with @lmenezes :) First of all, we set index.queries.cache.everything as mentioned above.

This is a small list of traits that our expensive queries have:
As requested I collected some hot_threads from our data nodes:
Thanks!
@jpountz As you said, queries are quite fast with a small index, but I can get better performance with filter caching even if the index is small. I ran an experiment in our production environment to compare performance between index.queries.cache.everything true and false: the true version comes with about 30% less CPU consumption than the false version. Since the index is really small, our Elasticsearch cluster is not memory-bound but CPU-bound, so it would be really nice if I could use the available memory for filter caching.
@SeoJueun actually, we also use it for ads. We just haven't upgraded this particular cluster yet, but I share the concerns in this case too :)
For the record, here's one change that should help already: #16851. |
Are queries executing faster again in 2.3 with the fixes of @jpountz? Has anyone upgraded and seen improvements? We are still running with the cache-everything setting in production, but I'm a bit unsure about just making the switch without this setting in case things would be slow again.
Ok upgraded to 2.3.1 and removed index.queries.cache.everything: true. |
@jrots you are welcome! Thank you for the feedback, I will close this issue now. |
There have been reports that the query cache did not manage to speed up search requests when the query includes a large number of different sub queries since a single request may manage to exhaust the whole history (256 queries) while the query cache only starts caching queries once they appear multiple times in the history (#16031). On the other hand, increasing the size of the query cache is a bit controversial (#20116) so this pull request proposes a different approach that consists of never caching term queries, and not adding them to the history of queries either. The reasoning is that these queries should be fast anyway, regardless of caching, so taking them out of the equation should not cause any slow down. On the other hand, the fact that they are not added to the cache history anymore means that other queries have greater chances of being cached.
Explicitly mention that term queries are not cached. Refer to this: elastic/elasticsearch#16031. Because this doc talks extensively about term queries and then introduces caching, readers are misled into thinking that term queries are in fact cached. Perhaps it's best to put it up front.
@jpountz - Would it make sense to state upfront in the documentation that term queries are not cached now? See this link: https://www.elastic.co/guide/en/elasticsearch/guide/current/_finding_exact_values.html
For some use cases, the default settings of UsageTrackingFilterCachingPolicy are inappropriate. It would be ideal if those settings were configurable. For example, the default configuration includes a history that looks at the last 256 filters. In the extreme case, if you had a stream of fairly complicated filters where the total number of distinct filters was significantly greater than 256, then you'd never get caching of any filters!
In a real world example, we just went through an upgrade from ES 1.7 to 2.1.1 and experienced pretty severe degradation of response times. Analysis of the query_cache indicated that a) the cache was relatively sparsely populated; b) the miss ratio was very high. Our particular use case was not well handled b/c we filter documents using arbitrary boolean combinations of ~1k terms. As best as I could tell, the limited 256 history size meant that we were only caching the very most common terms. Inefficient caching combined with a fairly large document count led to increased response times across the board and very painful response times under heavier load.
Tellingly, what rescued performance was completely bypassing the usage-based cache via the undocumented index.queries.cache.everything: true setting and relying purely on the LRU nature of the cache to keep the most useful filters around. Tuning the usage policy would be great. And, arguably, documented support of a policy that simply relies on LRU to determine the useful filters would also be a good idea. (It did not escape my attention that the index.queries.cache.everything setting appears to have only been added for testing purposes!)