Skip to content

Commit

Permalink
Remove delay-based backpressure in favor of explicit queue limits (#1515
Browse files Browse the repository at this point in the history
)

Right now, we have a backpressure delay that's a function of `(job
count, bytes in flight)`.

The actual queue length is set implicitly by the shape of that function
(e.g. its gain) and the actual downstairs IO time. If a downstairs slows
down due to load, then the queue length has to grow to compensate!

Having an implicitly-defined queue length is tricky, because it's not
well controlled. Queue length also affects latency in subtle ways, so
directly controlling the queue length would be helpful.

This PR removes the delay-based backpressure implementation in favor of
a simple pair of semaphores: there are a certain number of job and byte
permits available, and the `Guest` has to acquire them before sending a
job to the `Upstairs`.

In other words, writes will be accepted as fast as possible _until_ we
run out of permits; we will then shift to a one-in, one-out operation
mode where jobs have to be completed by the downstairs before a new job
is submitted. The maximum queue length (either in jobs or bytes) is well
known and set by global constants.

Architecturally, we replace the `BackpressureGuard` with a new
`IOLimitGuard`, which is claimed by the `Guest` and stored in relevant
`BlockOp` variants (then moved into the `DownstairsIO`). It still uses
RAII to automatically release permits when dropped, and we still
manually drop it early (once a job is complete on a particular
downstairs).
  • Loading branch information
mkeeter authored Nov 12, 2024
1 parent 61b1df7 commit a6624f1
Show file tree
Hide file tree
Showing 13 changed files with 551 additions and 680 deletions.
8 changes: 0 additions & 8 deletions cmon/src/main.rs
Original file line number Diff line number Diff line change
Expand Up @@ -41,7 +41,6 @@ enum DtraceDisplay {
Replaced,
ExtentLiveRepair,
ExtentLimit,
Backpressure,
NextJobId,
JobDelta,
DsDelay,
Expand All @@ -61,7 +60,6 @@ impl fmt::Display for DtraceDisplay {
DtraceDisplay::Replaced => write!(f, "replaced"),
DtraceDisplay::ExtentLiveRepair => write!(f, "extent_live_repair"),
DtraceDisplay::ExtentLimit => write!(f, "extent_under_repair"),
DtraceDisplay::Backpressure => write!(f, "backpressure"),
DtraceDisplay::NextJobId => write!(f, "next_job_id"),
DtraceDisplay::JobDelta => write!(f, "job_delta"),
DtraceDisplay::DsDelay => write!(f, "ds_delay"),
Expand Down Expand Up @@ -229,9 +227,6 @@ fn print_dtrace_header(dd: &[DtraceDisplay]) {
DtraceDisplay::ExtentLimit => {
print!(" {:>4}", "EXTL");
}
DtraceDisplay::Backpressure => {
print!(" {:>5}", "BAKPR");
}
DtraceDisplay::NextJobId => {
print!(" {:>7}", "NEXTJOB");
}
Expand Down Expand Up @@ -348,9 +343,6 @@ fn print_dtrace_row(d_out: Arg, dd: &[DtraceDisplay], last_job_id: &mut u64) {
DtraceDisplay::ExtentLimit => {
print!(" {:4}", d_out.ds_extent_limit);
}
DtraceDisplay::Backpressure => {
print!(" {:>5}", d_out.up_backpressure);
}
DtraceDisplay::NextJobId => {
print!(" {:>7}", d_out.next_job_id);
}
Expand Down
3 changes: 1 addition & 2 deletions tools/dtrace/single_up_info.d
Original file line number Diff line number Diff line change
Expand Up @@ -41,7 +41,7 @@ crucible_upstairs*:::up-status
* I'm not very happy about this, but if we don't print it all on one
* line, then multiple sessions will clobber each others output.
*/
printf("%8s %17s %17s %17s %5s %5s %9s %5s %10s %5s %5s %5s %5s %5s %5s %5s %5s %5s %5s %5s %5s %5s %5s %5s\n",
printf("%8s %17s %17s %17s %5s %5s %5s %10s %5s %5s %5s %5s %5s %5s %5s %5s %5s %5s %5s %5s %5s %5s %5s\n",

substr(session_id, 0, 8),

Expand All @@ -62,7 +62,6 @@ crucible_upstairs*:::up-status
* Job ID delta and backpressure
*/
json(copyinstr(arg1), "ok.next_job_id"),
json(copyinstr(arg1), "ok.up_backpressure"),
json(copyinstr(arg1), "ok.write_bytes_out"),

/*
Expand Down
3 changes: 1 addition & 2 deletions tools/dtrace/sled_upstairs_info.d
Original file line number Diff line number Diff line change
Expand Up @@ -46,7 +46,7 @@ crucible_upstairs*:::up-status
* we don't print it all on one line, then multiple sessions will
* clobber each others output.
*/
printf("%5d %8s %17s %17s %17s %5s %5s %9s %5s %10s %5s %5s %5s %5s %5s %5s %5s %5s %5s %5s %5s %5s %5s %5s %5s %5s %5s %5s\n",
printf("%5d %8s %17s %17s %17s %5s %5s %5s %10s %5s %5s %5s %5s %5s %5s %5s %5s %5s %5s %5s %5s %5s %5s %5s %5s %5s %5s\n",

pid,
substr(session_id, 0, 8),
Expand All @@ -62,7 +62,6 @@ crucible_upstairs*:::up-status

/* Job ID and backpressure */
json(copyinstr(arg1), "ok.next_job_id"),
json(copyinstr(arg1), "ok.up_backpressure"),
json(copyinstr(arg1), "ok.write_bytes_out"),

/* In progress jobs on the work list for each downstairs */
Expand Down
5 changes: 2 additions & 3 deletions tools/dtrace/upstairs_info.d
Original file line number Diff line number Diff line change
Expand Up @@ -21,7 +21,7 @@ tick-1s
{
printf("%6s ", "PID");
printf("%17s %17s %17s", "DS STATE 0", "DS STATE 1", "DS STATE 2");
printf(" %5s %5s %9s %5s", "UPW", "DSW", "JOBID", "BAKPR");
printf(" %5s %5s %9s", "UPW", "DSW", "JOBID");
printf(" %10s", "WRITE_BO");
printf(" %5s %5s %5s", "IP0", "IP1", "IP2");
printf(" %5s %5s %5s", "D0", "D1", "D2");
Expand Down Expand Up @@ -49,10 +49,9 @@ crucible_upstairs*:::up-status
printf(" %5s", json(copyinstr(arg1), "ok.ds_count"));

/*
* Job ID and backpressure
* Job ID and outstanding bytes
*/
printf(" %9s", json(copyinstr(arg1), "ok.next_job_id"));
printf(" %5s", json(copyinstr(arg1), "ok.up_backpressure"));
printf(" %10s", json(copyinstr(arg1), "ok.write_bytes_out"));

/*
Expand Down
Loading

0 comments on commit a6624f1

Please sign in to comment.