Currently, the server truncates an activity failure if it exceeds 4KB (the default threshold). This makes sense when there are many pending activities (thousands, for example), but it does not make sense when there is only one activity.
A better solution is to truncate only if the aggregated failure size across all pending activities exceeds some larger threshold.
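The aggregated-threshold idea could look roughly like the following. This is a minimal sketch, not the server's implementation: the `pendingActivity` type, the `truncateIfOverBudget` function, and the 64KB budget are all hypothetical names and values chosen for illustration.

```go
package main

import (
	"fmt"
	"strings"
)

// pendingActivity is a hypothetical stand-in for a pending activity record.
type pendingActivity struct {
	ID          string
	LastFailure string
}

// truncateIfOverBudget leaves every failure untouched while the combined
// size of all failures fits within totalLimit bytes. Only when the total
// exceeds the budget is each failure cut down to an equal share of it.
// Note: this slices bytes, so it may cut a multi-byte UTF-8 character.
func truncateIfOverBudget(acts []pendingActivity, totalLimit int) {
	total := 0
	for _, a := range acts {
		total += len(a.LastFailure)
	}
	if len(acts) == 0 || total <= totalLimit {
		return
	}
	share := totalLimit / len(acts)
	for i := range acts {
		if len(acts[i].LastFailure) > share {
			acts[i].LastFailure = acts[i].LastFailure[:share]
		}
	}
}

func main() {
	const budget = 64 * 1024 // assumed aggregate threshold, for illustration only

	// A single activity with an 8KB failure fits the budget and is kept whole.
	single := []pendingActivity{{ID: "a1", LastFailure: strings.Repeat("x", 8192)}}
	truncateIfOverBudget(single, budget)
	fmt.Println(len(single[0].LastFailure)) // prints 8192

	// 2000 activities with 4KB failures exceed the budget, so each is cut.
	many := make([]pendingActivity, 2000)
	for i := range many {
		many[i].LastFailure = strings.Repeat("x", 4096)
	}
	truncateIfOverBudget(many, budget)
	fmt.Println(len(many[0].LastFailure)) // prints 32
}
```

An equal share per activity is the simplest policy; a real implementation might instead truncate only the largest failures, or keep the per-activity limit as a floor.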
We enforce a fixed 2KB activity failure size limit for each activity. This has a couple of issues:
The limit may be too low if a workflow has only 1 or 2 activities. We have received requests from clusters before asking for a higher limit.
The limit is too high when a workflow has lots of pending activities (the maximum is 2k pending activities), causing the entire mutable state size to reach its limit and the workflow to be terminated.
When combined with other things like buffered events, the total mutable state size may reach the limit and get the workflow terminated.
Some ideas:
A total failure size limit across activities.
When the mutable state (ms) size reaches the limit, before directly terminating the workflow, see if we can flush buffered events or truncate activity failure messages.
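The second idea, reclaiming space before terminating, might be sketched as below. Everything here is an assumption made for illustration: the `mutableState` struct, the `reclaim` function, and the modelling of "flushing buffered events" as simply freeing their bytes do not reflect the server's actual types or behavior.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// mutableState is a hypothetical, simplified model of workflow mutable state.
type mutableState struct {
	BaseSize       int      // bytes that cannot be reclaimed
	BufferedEvents int      // bytes held by buffered events
	Failures       []string // last failure message of each pending activity
}

// size returns the modelled total size in bytes.
func (ms *mutableState) size() int {
	n := ms.BaseSize + ms.BufferedEvents
	for _, f := range ms.Failures {
		n += len(f)
	}
	return n
}

// reclaim tries to bring the state under limit before the caller resorts to
// terminating the workflow. It reports whether the state now fits.
func reclaim(ms *mutableState, limit, minFailure int) bool {
	if ms.size() <= limit {
		return true
	}
	// Step 1: flush buffered events (modelled here as freeing their bytes).
	ms.BufferedEvents = 0
	over := ms.size() - limit
	if over <= 0 {
		return true
	}
	// Step 2: truncate failures, largest first, never below minFailure bytes.
	idx := make([]int, len(ms.Failures))
	for i := range idx {
		idx[i] = i
	}
	sort.Slice(idx, func(a, b int) bool {
		return len(ms.Failures[idx[a]]) > len(ms.Failures[idx[b]])
	})
	for _, i := range idx {
		f := ms.Failures[i]
		cut := len(f) - minFailure
		if over <= 0 || cut <= 0 {
			break // done, or the remaining failures are already small enough
		}
		if cut > over {
			cut = over
		}
		ms.Failures[i] = f[:len(f)-cut]
		over -= cut
	}
	return over <= 0
}

func main() {
	ms := &mutableState{
		BaseSize:       1000,
		BufferedEvents: 500,
		Failures:       []string{strings.Repeat("x", 3000), strings.Repeat("y", 200)},
	}
	fmt.Println(reclaim(ms, 2000, 100)) // prints true
	fmt.Println(len(ms.Failures[0]))    // prints 800
}
```

If `reclaim` returns false, the caller would fall back to the existing behavior of terminating the workflow.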