Remove usage of cluster level setting for circuit breaker #2567
Conversation
Force-pushed from 3eaf8b3 to 6365da6
        }
    }
};
this.threadPool.scheduleWithFixedDelay(runnable, TimeValue.timeValueSeconds(CB_TIME_INTERVAL), ThreadPool.Names.GENERIC);
threadPool.scheduleWithFixedDelay(runnable, TimeValue.timeValueSeconds(CB_TIME_INTERVAL), ThreadPool.Names.GENERIC);
I could go even a step further and move this to the native cache manager to simplify things more.
I made this change - I think it's fairly straightforward and completely removes the potential circular dependency.
@Override
public Boolean get() {
    try {
        KNNCircuitBreakerTrippedResponse knnCircuitBreakerTrippedResponse = client.execute(
TODO: This will somehow have to be made async, which will complicate things.
I made this change. It's a bit messy - will clean up a bit.
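As a rough sketch of what the async version could look like, using OpenSearch's ActionListener callback style instead of a blocking `Supplier<Boolean>` (the action and request class names here just mirror the snippet above and are assumptions, not necessarily the final ones):

```java
// Hypothetical async variant of the blocking get() above: the caller hands in a
// listener and the transport call completes it once the response arrives.
void isCircuitBreakerTripped(Client client, ActionListener<Boolean> listener) {
    client.execute(
        KNNCircuitBreakerTrippedAction.INSTANCE,     // assumed action singleton
        new KNNCircuitBreakerTrippedRequest(),       // assumed request type
        ActionListener.wrap(
            response -> listener.onResponse(response.isTripped()),  // assumed accessor
            listener::onFailure
        )
    );
}
```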
Force-pushed from df687f7 to ecd1318
@jmazanec15 - I like the overall idea of getting rid of the cluster-level breaker and relying on the node-level breakers. It will get us closer to the backpressure constructs: https://opensearch.org/blog/shard-indexing-backpressure-in-opensearch/
@@ -500,4 +486,21 @@ private void startMaintenance(Cache<String, NativeMemoryAllocation> cacheInstanc
    maintenanceTask = threadPool.scheduleWithFixedDelay(cleanUp, interval, ThreadPool.Names.MANAGEMENT);
}

private void circuitBreakerUpdater() {
    Runnable runnable = () -> {
I think the previous approach was directly in the codepath. Are we moving to runnable tasks because we want a mechanism to unset the CB?
I would rather have it as a side effect than a constant monitor.
This code was actually in the KNNCircuitBreaker class: https://github.com/opensearch-project/k-NN/blob/main/src/main/java/org/opensearch/knn/index/KNNCircuitBreaker.java#L60-L109. I moved it to the cache manager because that removes the circular dependency between the two. However, the logic remains consistent.
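For reference, a rough sketch of the kind of logic that updater runs, with stand-in names (the `cacheCapacityReached` flag, the size/limit accessors, and the unset-percentage getter are placeholders for the real fields and accessors in the cache manager and KNNSettings):

```java
// Sketch only, not the exact plugin code: periodically check whether the native
// memory cache has dropped back below the unset threshold and, if so, clear the
// node-level breaker flag.
private void circuitBreakerUpdater() {
    Runnable runnable = () -> {
        if (cacheCapacityReached.get()) {                                 // placeholder AtomicBoolean flag
            long currentSizeKb = getCacheSizeInKilobytes();               // placeholder accessor
            long limitKb = getCircuitBreakerLimitInKilobytes();           // placeholder accessor
            double unsetPercentage = getCircuitBreakerUnsetPercentage();  // placeholder accessor
            // Unset the breaker once usage is at or below the configured percentage of the limit.
            if (currentSizeKb <= limitKb * unsetPercentage / 100.0) {
                cacheCapacityReached.set(false);
            }
        }
    };
    threadPool.scheduleWithFixedDelay(runnable, TimeValue.timeValueSeconds(CB_TIME_INTERVAL), ThreadPool.Names.GENERIC);
}
```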
@Override
public ActionListener<Void> setupContext(KNNStatFetchContext knnStatFetchContext, ActionListener<Void> actionListener) {
    // If there are any nodes in the cluster before 3.0, then we need to fall back to checking the CB
    if (minVersionSupplier.get().compareTo(Version.V_3_0_0) < 0) {
nit: can we use the comparison helper from the Version class directly? minVersionSupplier.get().before(Version.V_3_0_0)
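For reference, the two forms are equivalent here (a tiny illustration; `before` is the org.opensearch.Version helper for a strict "older than" check, worth double-checking against the version of core in use):

```java
Version minVersion = minVersionSupplier.get();

// Both express "the oldest node in the cluster is older than 3.0.0".
boolean viaCompareTo = minVersion.compareTo(Version.V_3_0_0) < 0; // current code
boolean viaHelper = minVersion.before(Version.V_3_0_0);           // suggested helper
```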
@kotwanikunal Why do you say that? We only trip per node.
@kotwanikunal Oh I see. Yeah, I figured that'd be a breaking change and so couldn't remove it without deprecation. Hence, the stats update. I might be mistaken.
I think it makes sense. We missed the deprecation train for this one. We have to wait until 4.0 now 😢
src/main/java/org/opensearch/knn/plugin/stats/KNNStatFetchContext.java (outdated review thread, resolved)
Force-pushed from eca9e08 to 4cf10b8
@kotwanikunal @shatejas Can you take a look again when you get the chance? The stats are similar, except that instead of a new transport action, I just reuse the stats transport action and make a single node stats call before making the complete stats call.
@jmazanec15 can you add an example of what the new stats output will look like? Also, if a user needs to know whether a CB is triggered, how will they identify it? With this new logic, should I assume that if the CB is triggered for a node, all indexing requests on that node will fail? If so, what happens when that node hosts a replica: will only the replica ingestion fail?
There won't be any change to the stats API interface. The only change is how we set the cluster-level circuit_breaker_triggered stat. Before, we were just reading it from the setting. Now, we send a request to all nodes to get the cache_capacity_reached stat, and if any are true, then we return true (see the sketch below).
Yes, if a node has its CB tripped (instead of the cluster), we will fail the request. This change is https://github.com/opensearch-project/k-NN/pull/2567/files#diff-9dad8c6bb50a281c98d7cee29de3491e73a561f530d5e0a029d7deb860cf1698. It gets called here: https://github.com/opensearch-project/k-NN/blob/main/src/main/java/org/opensearch/knn/index/mapper/KNNVectorFieldMapper.java#L690. So we would be unable to parse the index request, and the replica would fail.
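A minimal sketch of that derivation, assuming each per-node stats response is exposed as a map of stat name to value (the helper name and response shape are assumptions; only cache_capacity_reached comes from the PR itself):

```java
// Hypothetical helper: the cluster-level circuit_breaker_triggered stat is true
// if any node reports cache_capacity_reached = true in its stats response.
private boolean isClusterCircuitBreakerTriggered(List<Map<String, Object>> perNodeStats) {
    return perNodeStats.stream()
        .anyMatch(stats -> Boolean.TRUE.equals(stats.get("cache_capacity_reached")));
}
```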
// Adding the stats makes sure that we don't recurse infinitely.
dependentStats.forEach(knnStatsRequest::addStat);
client.execute(KNNStatsAction.INSTANCE, knnStatsRequest, ActionListener.wrap(knnStatsResponse -> {
    knnNodeStatAggregation = new KNNNodeStatAggregation(knnStatsResponse.getNodes());
Is this action a singleton, or is a new action object created per request? My concern is whether this aggregation object will remain the same when two threads have requests in parallel.
Good catch. Will associate it with the request.
fixed
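For clarity, a sketch of the per-request fix being described (hypothetical: `knnStatFetchContext.setNodeStatAggregation` stands in for however the request-scoped context ends up holding the aggregation):

```java
// Sketch: build the aggregation inside the listener and attach it to the
// request-scoped context instead of a shared member field, so parallel stats
// requests cannot overwrite each other's state.
dependentStats.forEach(knnStatsRequest::addStat);
client.execute(KNNStatsAction.INSTANCE, knnStatsRequest, ActionListener.wrap(knnStatsResponse -> {
    KNNNodeStatAggregation aggregation = new KNNNodeStatAggregation(knnStatsResponse.getNodes());
    knnStatFetchContext.setNodeStatAggregation(aggregation); // assumed setter on KNNStatFetchContext
    actionListener.onResponse(null);
}, actionListener::onFailure));
```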
Thanks @jmazanec15 for providing the info. I am a bit concerned now: if the replica fails, then indexing is done only partially, and other cases can come up. How are we thinking about those cases?
This will be consistent with the case where the JVM circuit breaker for a node is tripped and the request cannot be sent to the replica node. I don't think it will fail indexing on the primary.
It should be consistent irrespective of where it is triggered. Jack's changes focus on changing the tracking mechanism and not the overall behavior, from what I can think of. Today, if a circuit breaker is triggered before the replica shard indexes, we would fall into the same scenario.
Yes, so if the CB is tripped during indexing on the replica action, it will retry the operation (https://github.com/opensearch-project/OpenSearch/blob/f6d6aa61e5039e4c6143cc25a71c3e448572dd33/server/src/main/java/org/opensearch/action/support/replication/ReplicationOperation.java#L329), but other than that, it will fail the action in the same way. So this would be consistent with how core handles CBs.
Confirmed locally that the behavior will be that it fails the replica shard.
We could also change the exception KnnCircuitBreakerException to CircuitBreakingException so that the expected retries happen. This is done here for snapshots: https://github.com/opensearch-project/OpenSearch/blob/f6d6aa61e5039e4c6143cc25a71c3e448572dd33/server/src/main/java/org/opensearch/action/admin/cluster/snapshots/status/TransportSnapshotsStatusAction.java#L710
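A minimal illustration of that swap, assuming the core CircuitBreakingException constructor that takes a message and a durability (the guard condition and message are placeholders):

```java
// Sketch: throw the core breaker exception so replication treats the failure as a
// retryable circuit-breaker trip rather than an arbitrary mapper exception.
if (nativeMemoryCircuitBreakerTripped) {                  // placeholder condition
    throw new CircuitBreakingException(
        "[knn] native memory circuit breaker is tripped", // placeholder message
        CircuitBreaker.Durability.TRANSIENT               // memory pressure is expected to clear
    );
}
```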
@jmazanec15 thanks for the deep-dive. Can we make the behavior as close as possible to the CB exceptions for text-based indices?
@jmazanec15 in the stats API response, will we now see the node-level CB value as well as the top-level CB value? Can you paste a response from your testing?
Sure, the problem is that any exception thrown in the mapper gets wrapped:
Sure, there is no change in this.
Yes, I think we should do this.
Would it be better to have the node-level CB info too, since the CB is now moved from the top level to the node level? Thoughts?
Description
Simplification of circuit breaker management. Before, we periodically updated the circuit breaker cluster setting to provide a global circuit breaker. However, this is inconvenient and not actually useful.
This change removes that logic and trips only based on the node-level circuit breaker. For k-NN stats, in order to report the cluster-level circuit breaker, a node stats transport call with just the cache_capacity_reached stat is made to check all of the nodes. This isn't super efficient, but it is only made for stats calls, so it is not on the critical path.
Overall, this greatly simplifies the memory management code. However, it does mean that the cluster setting "knn.circuit_breaker.triggered" will no longer have an effect. I think that is okay for 3.0. Users interested in this info can get it from the stats API.
Related Issues
Resolves #1607
Check List
Commits are signed per the Developer Certificate of Origin (DCO) using --signoff.
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following the Developer Certificate of Origin and signing off your commits, please check here.