Rework local cache to address "not yet computed a rollout plan" issue. #48

DFINITYManu · 2024-09-25T14:24:57Z

The rollout dashboard keeps a local cache of all tasks it has seen, because
retrieving all task instances from Airflow is expensive. We perform the full
retrieval only during the first loop; in subsequent runs, we retrieve only
tasks that have been updated or modified after the last check's timestamp.
To this, we also add tasks that have started after the last check.
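
As a rough illustration of that refresh policy (not the code in this PR), the sketch below does a full fetch on the first loop and an incremental one afterwards. The names `CachedTask`, `TaskCache` and `refresh` are hypothetical, timestamps are plain seconds, and a slice stands in for the Airflow API.

```rust
use std::collections::HashMap;

/// Hypothetical, simplified task instance (timestamps in seconds).
#[derive(Clone, Debug, PartialEq)]
pub struct CachedTask {
    pub task_id: String,
    pub start_date: Option<u64>,
    pub updated_at: u64,
}

/// Hypothetical local cache keyed by task ID.
#[derive(Default)]
pub struct TaskCache {
    tasks: HashMap<String, CachedTask>,
    last_check: Option<u64>,
}

impl TaskCache {
    /// Refresh the cache; `airflow` stands in for the remote API.
    /// Returns how many task instances were (notionally) fetched.
    pub fn refresh(&mut self, airflow: &[CachedTask], now: u64) -> usize {
        let fetched: Vec<&CachedTask> = match self.last_check {
            // First loop: nothing cached yet, so fetch everything.
            None => airflow.iter().collect(),
            // Later loops: only tasks updated or started after the last check.
            Some(since) => airflow
                .iter()
                .filter(|t| t.updated_at > since || t.start_date.map_or(false, |s| s > since))
                .collect(),
        };
        let count = fetched.len();
        for task in fetched {
            self.tasks.insert(task.task_id.clone(), task.clone());
        }
        self.last_check = Some(now);
        count
    }

    /// Number of task instances currently cached.
    pub fn len(&self) -> usize {
        self.tasks.len()
    }
}
```

In this sketch a task that is both updated and started after the last check is inserted only once, because the cache is keyed by task ID.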

In addition, we now linearize the tasks, in case more than one of the
retrieved task lists contains the same task and that task was updated
between requests. Linearization involves picking, for each task instance,
the object with the latest date (be it execution, start or end date).
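
A minimal sketch of such a linearization step, assuming a simplified `TaskInstance` with numeric timestamps and a `(dag_run_id, task_id)` key (both are illustrative assumptions, not the types used in `frontend_api.rs`):

```rust
use std::collections::HashMap;

/// Hypothetical, simplified task instance (timestamps in seconds).
#[derive(Clone, Debug, PartialEq)]
pub struct TaskInstance {
    pub dag_run_id: String,
    pub task_id: String,
    pub execution_date: Option<u64>,
    pub start_date: Option<u64>,
    pub end_date: Option<u64>,
    pub state: String,
}

impl TaskInstance {
    /// Latest of the execution, start and end dates (`None` sorts lowest).
    fn latest_date(&self) -> Option<u64> {
        self.execution_date.max(self.start_date).max(self.end_date)
    }
}

/// Merge several retrieved lists, keeping one object per task instance:
/// the one with the latest date. On a tie, the later list wins.
pub fn linearize(lists: Vec<Vec<TaskInstance>>) -> Vec<TaskInstance> {
    let mut newest: HashMap<(String, String), TaskInstance> = HashMap::new();
    for task in lists.into_iter().flatten() {
        let key = (task.dag_run_id.clone(), task.task_id.clone());
        let replace = newest
            .get(&key)
            .map_or(true, |seen| task.latest_date() >= seen.latest_date());
        if replace {
            newest.insert(key, task);
        }
    }
    let mut out: Vec<TaskInstance> = newest.into_values().collect();
    // Sort for a deterministic result; HashMap iteration order is arbitrary.
    out.sort_by(|a, b| (&a.dag_run_id, &a.task_id).cmp(&(&b.dag_run_id, &b.task_id)));
    out
}
```

With this approach the result does not depend on which request returned the stale copy, as long as the fresher copy carries a later date.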

Furthermore, if the rollout plan is retrieved but is empty even though the
schedule task is complete, we retrieve it again. This prevents the odd
error where the task has completed but the XCom associated with the task
(containing the plan) is not yet saved to the database (or at least it
looks that way, because we're racing to get the value right after the task
finished, but the value is not yet stably inserted into the database).
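
The decision logic could look roughly like the following sketch. `TaskState`, `PlanStatus`, `resolve_plan` and the `fetch_xcom` closure are hypothetical names introduced for illustration; the real server talks to Airflow rather than calling a closure.

```rust
/// Hypothetical, reduced set of states for the schedule task.
#[derive(Clone, Copy, Debug, PartialEq)]
pub enum TaskState {
    Running,
    Success,
}

/// Hypothetical outcome of looking up the rollout plan.
#[derive(Debug, PartialEq)]
pub enum PlanStatus {
    /// A non-empty plan is available.
    Ready(String),
    /// The schedule task has not finished, so no plan is expected yet.
    NotYetComputed,
    /// The task finished but the plan is still empty; ask again next loop.
    RetryNextLoop,
}

/// Decide whether the cached plan is usable or must be fetched again.
/// `fetch_xcom` stands in for the request that reads the plan XCom.
pub fn resolve_plan<F>(
    schedule_state: TaskState,
    cached_plan: Option<String>,
    mut fetch_xcom: F,
) -> PlanStatus
where
    F: FnMut() -> Option<String>,
{
    // A non-empty cached plan needs no further requests.
    if let Some(plan) = cached_plan.filter(|p| !p.trim().is_empty()) {
        return PlanStatus::Ready(plan);
    }
    // No plan is expected before the schedule task completes.
    if schedule_state != TaskState::Success {
        return PlanStatus::NotYetComputed;
    }
    // Task complete but plan empty: the XCom may not have landed yet, so re-query.
    match fetch_xcom().filter(|p| !p.trim().is_empty()) {
        Some(plan) => PlanStatus::Ready(plan),
        None => PlanStatus::RetryNextLoop,
    }
}
```

The key design point is that an empty plan is never cached as final while the schedule task is reported complete.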

Finally, this PR parallelizes the multiple requests that take place when
the task retrieval is performed. This reduces incremental update time to
roughly half of what it used to be.
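
To illustrate why running the requests concurrently roughly halves the wall-clock time when there are two requests of similar duration, here is a standard-library-only sketch using scoped threads. The function name `fetch_in_parallel` is hypothetical, and the actual server may well use async tasks instead of threads.

```rust
use std::thread;

/// Run two independent queries concurrently and return both results.
/// `updated` and `started` stand in for the "updated since last check"
/// and "started since last check" requests.
pub fn fetch_in_parallel<T, A, B>(updated: A, started: B) -> (T, T)
where
    T: Send,
    A: FnOnce() -> T + Send,
    B: FnOnce() -> T,
{
    thread::scope(|s| {
        // First query runs on a scoped worker thread...
        let updated_handle = s.spawn(updated);
        // ...while the second runs on the current thread.
        let started_result = started();
        let updated_result = updated_handle
            .join()
            .expect("updated-tasks query panicked");
        (updated_result, started_result)
    })
}
```

The total time then approaches that of the slower request rather than the sum of both.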

rollout-dashboard/server/src/frontend_api.rs

DFINITYManu added 2 commits September 12, 2024 14:21

Server API doc link fixed.

21c4c1c

DFINITYManu requested a review from a team as a code owner September 25, 2024 14:24

sasa-tomic reviewed Sep 25, 2024

rollout-dashboard/server/src/frontend_api.rs (review comment, outdated and resolved)

sasa-tomic approved these changes Sep 25, 2024

Comment fix.

4916ee5

DFINITYManu merged commit 452aec7 into main Sep 25, 2024
6 checks passed

DFINITYManu deleted the requery-schedule branch September 25, 2024 14:38
