This will be unblocked once the auto-scaling is done in #50
That will establish a notion of "queue pressure". When there is queue pressure, the code will enforce that a new worker is being scaled up.
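A minimal sketch of that enforcement rule, assuming made-up names (WorkerPool, start_worker, and enforce_scale_up are placeholders here, not anything from the actual code or from #50):

```python
from dataclasses import dataclass, field


@dataclass
class WorkerPool:
    max_workers: int = 4
    workers: list = field(default_factory=list)
    scaling_up: bool = False

    def start_worker(self) -> None:
        # Placeholder for the real scale-up mechanism that #50 introduces.
        self.scaling_up = True


def enforce_scale_up(pool: WorkerPool, queued_tasks: list) -> None:
    """When there is queue pressure, make sure a new worker is on its way up."""
    queue_pressure = bool(queued_tasks)
    if queue_pressure and not pool.scaling_up and len(pool.workers) < pool.max_workers:
        pool.start_worker()
```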
Priority will modify both:
when a task can be considered to contribute to queue pressure
what workers can be considered for task placement
The goal of the 2nd point would be to reserve extra workers that never process low-priority tasks. An expression of the policy for a low-priority task might be:
The task cannot take the last remaining worker
If the task is queued, this does not constitute queue pressure (this task sitting in the queue does not cause scale-up events); see the sketch below
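A hedged sketch of how those two policy points could be expressed; the option names (reserve_workers, trigger_scale_up) are the ones proposed in the comment below and do not exist today:

```python
from dataclasses import dataclass


@dataclass
class TaskPolicy:
    reserve_workers: int = 0       # free workers that must be left untouched
    trigger_scale_up: bool = True  # does queueing this task create queue pressure?


def can_place(policy: TaskPolicy, free_workers: int) -> bool:
    # With reserve_workers=1, a low-priority task cannot take the last remaining worker.
    return free_workers > policy.reserve_workers


def contributes_pressure(policy: TaskPolicy) -> bool:
    # With trigger_scale_up=False, this task sitting in the queue causes no scale-up events.
    return policy.trigger_scale_up
```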
Current behavior is, more or less, that all tasks are treated as highest priority.
I was just writing documentation with @Alex-Izquierdo about the bind argument, and I am realizing how cleanly this will form a "group" of task options for capacity management.
on_duplicate - exists now; allows task shedding
trigger_scale_up - True/False; if set, this task sitting in the queue will trigger scaling up of the worker pool. Setting it to False allows low-priority tasks to be "patient": they sit around and wait for capacity rather than growing the pool. Less clear: as a corollary, these may also be the last tasks selected from the queue to run, which further minimizes scale-ups.
reserve_workers - an int from 0 to max_workers. A value of 1 means this task is not allowed to take the last available worker. Likewise, 2 means it can only run if there are 3 free workers, leaving 2 in reserve, and so on. (A usage sketch of these options follows.)
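For illustration only, here is how the option group might read at a call site. The @task decorator below is a stand-in defined inline (not the library's real decorator), and the task name and option values are made up:

```python
def task(**options):
    # Stand-in decorator that just records the proposed capacity-management options.
    def decorator(fn):
        fn.task_options = options
        return fn
    return decorator


@task(
    on_duplicate="discard",   # exists now: allows task shedding
    trigger_scale_up=False,   # proposed: a "patient" task that waits for capacity
    reserve_workers=1,        # proposed: never take the last available worker
)
def cleanup_old_records():
    """A low-priority housekeeping task that should never crowd out other work."""
```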
There is an interesting interaction between these, and we would need to write some "example" stories. For example, you could have a task that refuses to take the last worker but still causes scale-up. So you always run the task (it is important and time-sensitive), but it won't render the dispatcher temporarily unresponsive.
Internally, we need workers to send replies for control-and-reply communication. The reply sends a single message and should be fast, but it is extremely time-sensitive. So this "built-in" task would leave no reserve workers and would likely be set to trigger scale-up. Both cases are sketched below.
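Those two stories written out, reusing the stand-in decorator (redefined here so the snippet stands alone); all names and values are illustrative assumptions, not settled defaults:

```python
def task(**options):
    # Same stand-in decorator as in the previous sketch, not the real API.
    def decorator(fn):
        fn.task_options = options
        return fn
    return decorator


@task(
    trigger_scale_up=True,   # important and time-sensitive: grow the pool if needed
    reserve_workers=1,       # but never take the last worker, so the dispatcher stays responsive
)
def important_but_polite():
    """Always runs, yet never renders the dispatcher temporarily unresponsive."""


@task(
    trigger_scale_up=True,   # built-in control-and-reply work must run as soon as possible
    reserve_workers=0,       # may take the last worker; the reply is a single, fast message
)
def send_control_reply():
    """Internal reply used for control-and-reply communication."""
```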