Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Bandwidth recommendations for direct copy #2884

Open
wants to merge 5 commits into
base: main
Choose a base branch
from

Conversation

drbasic
Copy link
Collaborator

@drbasic drbasic commented Jan 20, 2025

Продолжение #2368
Здесь делается ограничение по скорости со стороны ДискАгента.

  1. Дискагент на старте зачитывает файл от инфры со скоростью интерфейса.
  2. Дискагент считает со скольки девайсов за последнюю секунду производилось копирование.
  3. Дискагент рассчитывает сколько полосы может занять копирование одного девайса, для этого он делит всю полосу, отведенную под копирование, на всех клиентов, ограничивая каждого клиента сверху величиной MaxDeviceBandwidthMiB
  4. В ответе TEvDirectCopyBlocksResponse отправляется рекомендуемая полоса, которую может использовать клиент для этого девайса.

Copy link
Contributor

Note

This is an automated comment that will be appended during run.

🔴 linux-x86_64-relwithdebinfo: some tests FAILED for commit 6d531ee.

TESTS PASSED ERRORS FAILED SKIPPED MUTED?
5826 5821 0 5 0 0

@drbasic drbasic force-pushed the users/drbasic/recommended-bandwidth branch from 1f10e5e to 102b8ad Compare January 20, 2025 12:12
Copy link
Contributor

Note

This is an automated comment that will be appended during run.

🔴 linux-x86_64-relwithdebinfo: some tests FAILED for commit 102b8ad.

TESTS PASSED ERRORS FAILED SKIPPED MUTED?
5827 5814 0 7 6 0

@drbasic drbasic force-pushed the users/drbasic/recommended-bandwidth branch from 7302424 to ad8643b Compare January 21, 2025 04:52
Copy link
Contributor

Note

This is an automated comment that will be appended during run.

🟢 linux-x86_64-relwithdebinfo: all tests PASSED for commit ad8643b.

TESTS PASSED ERRORS FAILED SKIPPED MUTED?
6332 6332 0 0 0 0

@drbasic drbasic added the blockstore Add this label to run only cloud/blockstore build and tests on PR label Jan 21, 2025
Copy link
Contributor

Note

This is an automated comment that will be appended during run.

🔴 linux-x86_64-relwithdebinfo: some tests FAILED for commit 076c215.

TESTS PASSED ERRORS FAILED SKIPPED MUTED?
3583 3582 0 1 0 0

@drbasic drbasic force-pushed the users/drbasic/recommended-bandwidth branch from 076c215 to 26a408f Compare January 21, 2025 10:47
@drbasic drbasic marked this pull request as ready for review January 21, 2025 10:54
Copy link
Contributor

Note

This is an automated comment that will be appended during run.

🟢 linux-x86_64-relwithdebinfo: all tests PASSED for commit 26a408f.

TESTS PASSED ERRORS FAILED SKIPPED MUTED?
3584 3584 0 0 0 0

message TDiskAgentThrottlingConfig
{
// Host limits.
optional string InfraThrottlingConfigPath = 1;
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Поле InfraThrottlingConfigPath уже есть в настройках тротлера NBS.
Если файл не получилось прочитать будет использован дефолт DefaultNetworkMbitThroughput

optional double DirectCopyBandwidthFraction = 3;

// Maximum bandwidth for one device in MiB/s.
optional uint64 MaxDeviceBandwidthMiB = 4;
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Нужно ограничить нагрузку на один девайс, потому что если сеть быстрая, то просядут клиентские записи в девайс, куда проводится копирование.

Rack
);
Rack,
HostPerformanceProfile.NetworkMbitThroughput);
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

если дискагент создается в процессе NBS то берем полосу сетевого интерфейса, которую зачитал тротлер NBSа. Чтобы не читать файл еще раз.

////////////////////////////////////////////////////////////////////////////////

std::optional<NJson::TJsonValue> ReadJsonFile(
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@@ -20,6 +20,7 @@ class TDirectCopyActor final
const NActors::TActorId Source;
const TRequestInfoPtr RequestInfo;
const NProto::TDirectCopyBlocksRequest Request;
const ui64 RecommendedBandwidth;
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

RecommendedBandwidth заполняется при создании актора. Он подмешивает это в ответ перед отправкой клиенту.

@@ -71,6 +72,8 @@ class TDiskAgentActor final
// Pending WaitReady requests
TDeque<TPendingRequest> PendingRequests;

TBandwidthCalculator BandwidthCalculator {*AgentConfig};
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

BandwidthCalculator один на сервер.

TInstant now)
{
DeviceLastRequest[deviceUUID] = now;
ClearHistory(now - WindowDuration);
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. забываем девайсы, которые не трогали больше секунды.
  2. считаем сколько у нас девайсов к которым есть запросы и делим полосу поровну


if (msg->AllZeroes) {
ChangedRangesMap.MarkNotChanged(msg->Range);
}
TimeoutCalculator->SetRecommendedBandwidth(msg->RecommendedBandwidth);
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

после получения рекомендации сохраняем ее в TimeoutCalculator и он будет использовать ее при расчетах таймаутов между запросами.

@@ -77,6 +77,13 @@ FileDevices: {
BlockSize: 4096
}

ThrottlingConfig: {
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

для упрощения локальной разработки

Copy link
Contributor

Note

This is an automated comment that will be appended during run.

🔴 linux-x86_64-relwithdebinfo: some tests FAILED for commit 6ae8204.

TESTS PASSED ERRORS FAILED SKIPPED MUTED?
3584 3583 0 1 0 0

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
blockstore Add this label to run only cloud/blockstore build and tests on PR
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants