
vecindex: add pacer for vector index operations #137734

Merged · 1 commit · Jan 18, 2025

Conversation

andy-kimball
Contributor

The pacer will limit the rate of foreground insert and delete operations in the vector index such that background split and merge operations can keep up. It does this by setting the allowed ops/sec and then delaying operations that would otherwise exceed this limit.

This PR introduces the pacer, but does not change the vector index to use it. That will come in a later PR. However, this PR does simulate usage of the pacer under different interesting conditions, including edge cases.

Epic: CRDB-42943

Release note: None
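
At a high level, the pacer is a token bucket: tokens refill at the allowed ops/sec, each insert or delete consumes one token, and operations that would drive the bucket into debt are asked to wait. Below is a minimal sketch of that idea; the names (simplePacer, onOp, currentTokens) are illustrative assumptions, not the pacer's actual API.

package sketch

import "time"

// simplePacer illustrates rate-based pacing only; the real pacer also adjusts
// allowedOpsPerSec based on the size of the background fixup queue.
type simplePacer struct {
    allowedOpsPerSec float64
    currentTokens    float64
    lastOpAt         time.Time
}

// onOp is called for each insert or delete and returns how long the caller
// should wait so that the long-run rate stays at or below allowedOpsPerSec.
func (p *simplePacer) onOp(now time.Time) time.Duration {
    // Refill tokens for the time elapsed since the last operation.
    elapsed := now.Sub(p.lastOpAt).Seconds()
    p.lastOpAt = now
    p.currentTokens += elapsed * p.allowedOpsPerSec
    if p.currentTokens > 1 {
        p.currentTokens = 1 // cap the burst size at one token
    }

    // Each operation consumes one token; a negative balance is "debt".
    p.currentTokens--
    if p.currentTokens >= 0 {
        return 0
    }
    // Delay long enough to pay off the debt at the allowed rate.
    return time.Duration(float64(time.Second) * -p.currentTokens / p.allowedOpsPerSec)
}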

@andy-kimball andy-kimball requested a review from a team as a code owner December 19, 2024 00:55

blathers-crl bot commented Dec 19, 2024

Your pull request contains more than 1000 changes. It is strongly encouraged to split big PRs into smaller chunks.

🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is dev-inf.

@cockroach-teamcity
Member

This change is Reviewable

Collaborator

@DrewKimball DrewKimball left a comment


Reviewed 7 of 12 files at r1, all commit messages.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @andy-kimball and @mw5h)


pkg/sql/vecindex/pacer.go line 19 at r1 (raw file):

// size is below this threshold, throttling will still occur if the queue size
// is increasing.
const allowedQueuedFixups = 5

This number seems low to me. I'm assuming it's just a placeholder value for testing, is that right?

Also, should these numbers eventually be proportional to GOMAXPROCS?


pkg/sql/vecindex/pacer.go line 134 at r1 (raw file):

	p.mu.totalOpCount++

	// Compute elapsed time since the last query.

Do we really need to do all this work on every insert/delete? Could we short-circuit in most cases after increasing the op count? For example, if the elapsed time is less than 1 second (or some other interval), we could return immediately without updating p.mu.lastOpAt. Maybe we'd also need to compare queuedFixups against lastQueuedFixups or something, but same idea.

It also seems like the calculations below are very vulnerable to floating-point errors when the duration is short. While clamping values might prevent the pacer from making crazy decisions in that situation, we probably want to avoid getting there in the first place as much as possible.

I'm also suspicious overall of making all this logic happen on every insert or delete. Do we really need to make pacing decisions here? What if we did this instead:

  • Have the pacer run token adjustment logic every second (or a similar small interval).
  • Have OnInsertOrDelete record just the needed statistics, totalOpCount, and maybe an opCount for the current second.
  • As before, have OnInsertOrDelete subtract one token, and delay as needed.
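
A rough sketch of what that split could look like (purely illustrative; the type and field names here are assumptions, not code from this PR):

package sketch

import (
    "sync"
    "time"
)

// tickingPacer separates pacing decisions from the per-operation path: a
// background loop recomputes the allowed rate once per second, while the
// per-operation call only records stats, takes a token, and delays.
type tickingPacer struct {
    mu struct {
        sync.Mutex
        totalOpCount     int
        queuedFixups     int
        allowedOpsPerSec float64
    }
}

// run is the background adjustment loop.
func (p *tickingPacer) run(stop <-chan struct{}) {
    ticker := time.NewTicker(time.Second)
    defer ticker.Stop()
    for {
        select {
        case <-ticker.C:
            p.mu.Lock()
            // Recompute allowedOpsPerSec here from the recorded stats and
            // the current fixup queue size.
            p.mu.Unlock()
        case <-stop:
            return
        }
    }
}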

pkg/sql/vecindex/pacer.go line 205 at r1 (raw file):

		return 0
	}
	return time.Duration(float64(time.Second) * -p.mu.currentTokens / p.mu.allowedOpsPerSec)

Should we enforce bounds on the delay?

Contributor Author

@andy-kimball andy-kimball left a comment


Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @DrewKimball and @mw5h)


pkg/sql/vecindex/pacer.go line 19 at r1 (raw file):

Previously, DrewKimball (Drew Kimball) wrote…

This number seems low to me. I'm assuming it's just a placeholder value for testing, is that right?

Also, should these numbers eventually be proportional to GOMAXPROCS?

I'd like to make targetQueuedFixups proportional to actual ops/sec, as that's what drives fixups. However, I want to wait until I have the "real" testing setup so I can tune these values. I also plan to add some cluster settings so we can change these on the fly.


pkg/sql/vecindex/pacer.go line 134 at r1 (raw file):

Previously, DrewKimball (Drew Kimball) wrote…

Do we really need to do all this work on every insert/delete? Could we short-circuit in most cases after increasing the op count? For example, if the elapsed time is less than 1 second (or some other interval), we could return immediately without updating p.mu.lastOpAt. Maybe we'd also need to compare queuedFixups against lastQueuedFixups or something, but same idea.

It also seems like the calculations below are very vulnerable to floating-point errors when the duration is short. While clamping values might prevent the pacer from making crazy decisions in that situation, we probably want to avoid getting there in the first place as much as possible.

I'm also suspicious overall of making all this logic happen on every insert or delete. Do we really need to make pacing decisions here? What if we did this instead:

  • Have the pacer run token adjustment logic every second (or a similar small interval).
  • Have OnInsertOrDelete record just the needed statistics, totalOpCount, and maybe an opCount for the current second.
  • As before, have OnInsertOrDelete subtract one token, and delay as needed.

I've changed the logic to contain a fast path if the token bucket has at least one token in it. Also, I now only recalculate allowed ops/sec when the size of the fixup queue changes, or if it's been at least one second. This seems to work pretty well.

Regarding floating point math, I've been pretty careful to ensure that the algorithms are numerically stable. Even if durations are short, everything should work well. While short durations can cause high rates, we then adjust ops/sec according to the duration, so that has the effect of "canceling out" the high rate (i.e. we multiply and then divide by the same number). The clamping is just an extra safeguard in case there are bugs.
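
For reference, the fast/slow path split looks roughly like this (a simplified sketch, not the actual code; sketchPacer and slowPath are stand-in names):

package sketch

import "time"

type sketchPacer struct {
    currentTokens    float64
    lastQueuedFixups int
    lastRateUpdate   time.Time
}

func (p *sketchPacer) onInsertOrDelete(now time.Time, queuedFixups int) time.Duration {
    // Fast path: a token is available and nothing suggests the allowed rate
    // needs to change, so just consume the token and return immediately.
    if p.currentTokens >= 1 &&
        queuedFixups == p.lastQueuedFixups &&
        now.Sub(p.lastRateUpdate) < time.Second {
        p.currentTokens--
        return 0
    }
    return p.slowPath(now, queuedFixups)
}

// slowPath stands in for the full logic: recalculate allowed ops/sec, refill
// tokens for the elapsed time, consume a token, and return any needed delay.
func (p *sketchPacer) slowPath(now time.Time, queuedFixups int) time.Duration {
    p.lastRateUpdate = now
    p.lastQueuedFixups = queuedFixups
    return 0
}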


pkg/sql/vecindex/pacer.go line 205 at r1 (raw file):

Previously, DrewKimball (Drew Kimball) wrote…

Should we enforce bounds on the delay?

There's a built-in bound: the delay can't be longer than 1 second * number of waiting operations. I've also added a way for the caller to report canceled operations so that we know if a "waiting" operation is no longer waiting.
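
To make that bound concrete (numbers assumed for illustration): the delay formula quoted above is debt / allowedOpsPerSec seconds, and each operation that has to wait contributes at most one token of debt. With allowedOpsPerSec = 100 and four tokens of debt, the computed delay is 4/100s = 40ms; more generally, delay <= numWaiting / allowedOpsPerSec, which works out to at most one second per waiting operation so long as the allowed rate never drops below one op/sec.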


pkg/sql/vecindex/pacer.go line 200 at r2 (raw file):

	p.mu.delayed = false

	// Calculate the desired rate of change in the fixup queue size over the next

I've updated the algorithm to hopefully make it clearer. It first calculates how much we want to change the queue size, then calculates how much it's actually changing. The difference is the net change we want to achieve, and we adjust the allowed ops/sec to try to achieve it.
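
In pseudo-Go, the shape of the calculation is roughly as follows (the constants and scaling are assumptions for illustration, not the tuned values from this PR):

package sketch

// adjustOpsPerSec illustrates the three steps described above: desired queue
// change, actual queue change, and the net change used to nudge the rate.
func adjustOpsPerSec(allowedOpsPerSec, actualQueueSizeRate float64, queuedFixups int) float64 {
    const targetQueuedFixups = 5 // assumed target queue size
    const maxQueueSizeRate = 5.0 // assumed clamp on queue growth/shrink rate

    // 1. How fast do we want the queue size to change?
    var desiredQueueSizeRate float64
    if queuedFixups > targetQueuedFixups {
        // Queue is too big: drain it gradually rather than all at once.
        desiredQueueSizeRate = -float64(queuedFixups-targetQueuedFixups) / 10
    }

    // 2. Net change = what we want minus what is already happening.
    netQueueSizeRate := desiredQueueSizeRate - actualQueueSizeRate

    // 3. Scale a bounded ops/sec step by the relative size of the net change.
    deltaOpsPerSec := allowedOpsPerSec * 0.1 // at most a 10% step (assumed)
    deltaOpsPerSec *= netQueueSizeRate / maxQueueSizeRate
    return allowedOpsPerSec + deltaOpsPerSec
}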

@andy-kimball andy-kimball force-pushed the pacing branch 2 times, most recently from 862e4b8 to 9e63d05 on January 14, 2025 16:36
Collaborator

@DrewKimball DrewKimball left a comment


:lgtm: Nice!

Reviewed 6 of 9 files at r2, 9 of 9 files at r3, all commit messages.
Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @andy-kimball and @mw5h)


pkg/sql/vecindex/pacer_test.go line 142 at r3 (raw file):

			totalOpCount++
			delay = p.OnInsertOrDelete(queuedFixups)
			// Keep ops/sec <= 1000 (not counting noise).

Would it be interesting to simulate a constant arrival rate?


pkg/sql/vecindex/pacer_test.go line 161 at r3 (raw file):

			fixupsInterval := crtime.Mono(time.Second / time.Duration(fixupsPerSec))
			fixupsInterval = crtime.Mono(float64(fixupsInterval) * (1 + noiseFactor))
			nextFixup = max(now+fixupsInterval, 1)

Should the max be applied to the interval instead of the time?


pkg/sql/vecindex/pacer.go line 182 at r3 (raw file):

// since the last call to updateOpsPerSecLocked. Allowing sub-second elapsed
// increments allows the pacer to be significantly more responsive.
func (p *pacer) updateOpsPerSecLocked(elapsed time.Duration, queuedFixups int) {

Since the mu field is gone now, should we change this to just updateOpsPerSec, or alternatively, add the Locked suffix to all the pacer methods? It'd also be nice to mention that pacer is not thread-safe, unless the follow-up is just going to reintroduce the mutex.


pkg/sql/vecindex/pacer.go line 191 at r3 (raw file):

	var desiredQueueSizeRate float64
	if queuedFixups > targetQueuedFixups {
		// If the fixup queue is too large, reduce it a gradual rate that's

nit: reduce it a -> reduce it at a


pkg/sql/vecindex/pacer.go line 243 at r3 (raw file):

	// Scale the change in ops/sec by the magnitude of desired change with respect
	// to the max allowed change.
	deltaOpsPerSec *= netQueueSizeRate / maxQueueSizeRate

nit: consider changing this line to:

deltaOpsPerSec *= math.Abs(netQueueSizeRate / maxQueueSizeRate)

so that it's only updating the magnitude like the comment says. Then, we could have the netQueueSizeRate < 0 case above do this instead to make sure deltaOpsPerSec has the correct sign:

deltaOpsPerSec = deltaOpsPerSec/deltaFactor - deltaOpsPerSec

pkg/sql/vecindex/pacer.go line 274 at r3 (raw file):

	// statistical clustering or a backlog of fixups that is suddenly added to
	// the queue).
	p.queueSizeRate = max(min(p.queueSizeRate, maxQueueSizeRate), -maxQueueSizeRate)

Since this is just our estimate for the size rate, should we avoid clamping here and instead rely on the clamping when calculating the desired rate?

Contributor Author

@andy-kimball andy-kimball left a comment


Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @DrewKimball and @mw5h)


pkg/sql/vecindex/pacer.go line 182 at r3 (raw file):

Previously, DrewKimball (Drew Kimball) wrote…

Since the mu field is gone now, should we change this to just updateOpsPerSec, or alternatively, add the Locked suffix to all the pacer methods? It'd also be nice to mention that pacer is not thread-safe, unless the follow-up is just going to reintroduce the mutex.

I renamed the methods. My plan is to use the fixupProcessor's mutex. I added a comment to pacer saying that it's not thread-safe.


pkg/sql/vecindex/pacer.go line 243 at r3 (raw file):

Previously, DrewKimball (Drew Kimball) wrote…

nit: consider changing this line to:

deltaOpsPerSec *= math.Abs(netQueueSizeRate / maxQueueSizeRate)

so that it's only updating the magnitude like the comment says. Then, we could have the netQueueSizeRate < 0 case above do this instead to make sure deltaOpsPerSec has the correct sign:

deltaOpsPerSec = deltaOpsPerSec/deltaFactor - deltaOpsPerSec

I changed the initial calculation of deltaOpsPerSec to be negative in the case that it needs to decrease. I then changed the scaling function to be:

deltaOpsPerSec *= math.Abs(netQueueSizeRate) / maxQueueSizeRate

Hopefully that's a less subtle way of getting a negative delta.


pkg/sql/vecindex/pacer.go line 274 at r3 (raw file):

Previously, DrewKimball (Drew Kimball) wrote…

Since this is just our estimate for the size rate, should we avoid clamping here and instead rely on the clamping when calculating the desired rate?

I clamped here as well because this stores the EMA (exponential moving average). Say that was calculated to be a really large number, maybe even +Inf; I don't want that to poison all future calculations of the rate. In other words, I want to ensure that both the actual rate and the net rate are clamped.
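
Roughly what that looks like (a sketch; the smoothing factor and clamp value are assumptions):

package sketch

// updateQueueSizeRate keeps an exponentially weighted estimate of how fast
// the fixup queue is growing or shrinking, clamped so that a single extreme
// sample (or +Inf) cannot poison later estimates.
func updateQueueSizeRate(prevRate, sampleRate float64) float64 {
    const alpha = 0.5            // assumed smoothing factor
    const maxQueueSizeRate = 5.0 // assumed clamp, fixups per second
    rate := alpha*sampleRate + (1-alpha)*prevRate
    return max(min(rate, maxQueueSizeRate), -maxQueueSizeRate)
}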


pkg/sql/vecindex/pacer_test.go line 142 at r3 (raw file):

Previously, DrewKimball (Drew Kimball) wrote…

Would it be interesting to simulate a constant arrival rate?

You mean if the pacer returns a delay, but before that operation has waited out the delay, another operation arrives? If that's what you mean, it's not too easy to simulate that in this loop. But I did add a new TestMultipleArrival test to explicitly test that case. The delay given to each subsequent arrival should be proportional to the amount of debt in the token bucket, in order to enforce the overall ops/sec rate.
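
For concreteness, a worked example (numbers assumed, not taken from the test): with allowedOpsPerSec = 10 and an empty bucket, the first arrival leaves one token of debt and is delayed 100ms; a second arrival before that delay elapses leaves two tokens of debt and is delayed 200ms; a third is delayed 300ms. Completions end up spaced roughly 100ms apart, preserving the 10 ops/sec limit.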


pkg/sql/vecindex/pacer_test.go line 161 at r3 (raw file):

Previously, DrewKimball (Drew Kimball) wrote…

Should the max be applied to the interval instead of the time?

Done.

@andy-kimball andy-kimball force-pushed the pacing branch 5 times, most recently from e25911e to d791989 on January 18, 2025 18:51
@andy-kimball
Contributor Author

bors r=drewkimball

@craig craig bot merged commit 758fe6a into cockroachdb:master Jan 18, 2025
22 checks passed
@andy-kimball andy-kimball deleted the pacing branch January 18, 2025 21:10