[2.1] spa: make read/write queues configurable #15696

Merged

robn
Member

@robn robn commented Dec 21, 2023

Backporting #15675 to 2.1.

Note that master has removed ZTI_BATCH and added ZTI_SYNC. This PR matches 2.1, that is, it accepts batch instead of sync. Comments and manpages have been updated accordingly.

Motivation and Context

We are finding that as customers get larger and faster machines (hundreds of cores, large NVMe-backed pools) they keep hitting relatively low performance ceilings. Our profiling work almost always finds that they're running into bottlenecks on the SPA IO taskqs. Unfortunately there's often little we can advise at that point, because there are very few ways to change behaviour without patching.

Description

This commit adds two load-time parameters zio_taskq_read and zio_taskq_write that can configure the READ and WRITE IO taskqs directly.
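As a rough illustration of what such a parameter value might look like, here is a minimal Python sketch of a parser for it. It is not the code from this PR. It assumes (this PR text does not spell it out) that each value is four whitespace-separated entries, one per queue (issue, issue_high, intr, intr_high), and that each entry is `fixed,<taskqs>,<threads>`, `scale`, `batch`, or `null`, as in a value like `fixed,1,8 null scale null`. The names `TaskqSpec` and `parse_taskq_param` are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional

# Assumed queue order for one IO type (not stated in this PR text).
QUEUE_NAMES = ("issue", "issue_high", "intr", "intr_high")
# Assumed modes that take no arguments; 2.1 accepts "batch" rather than "sync".
SIMPLE_MODES = {"scale", "batch", "null"}


@dataclass
class TaskqSpec:
    """Hypothetical parsed form of one queue entry."""
    queue: str
    mode: str
    taskqs: Optional[int] = None   # only set for "fixed"
    threads: Optional[int] = None  # only set for "fixed"


def parse_taskq_param(value: str) -> List[TaskqSpec]:
    """Parse a value such as "fixed,1,8 null scale null" (assumed format)."""
    entries = value.split()
    if len(entries) != len(QUEUE_NAMES):
        raise ValueError(f"expected {len(QUEUE_NAMES)} entries, got {len(entries)}")

    specs = []
    for queue, entry in zip(QUEUE_NAMES, entries):
        mode, *args = entry.split(",")
        if mode == "fixed":
            if len(args) != 2:
                raise ValueError(f"{queue}: fixed needs two numbers, got {entry!r}")
            taskqs, threads = (int(a) for a in args)  # int() raises ValueError on junk
            if taskqs < 1 or threads < 1:
                raise ValueError(f"{queue}: counts must be positive in {entry!r}")
            specs.append(TaskqSpec(queue, mode, taskqs, threads))
        elif mode in SIMPLE_MODES and not args:
            specs.append(TaskqSpec(queue, mode))
        else:
            raise ValueError(f"{queue}: unrecognised entry {entry!r}")
    return specs


if __name__ == "__main__":
    for spec in parse_taskq_param("fixed,1,8 null scale null"):
        print(spec)
```

Checking a candidate value like this before putting it into a module configuration can catch simple typos early. The manpage and source remain the authority on what the module actually accepts.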

This achieves two goals: it gives operators (and those that support them) a way to tune things without requiring a custom build of OpenZFS, which is often not possible, and it lets us easily try different config variations in a variety of environments to inform the development of better defaults for these kinds of systems.

Because tuning the IO taskqs really requires a fairly deep understanding of how IO in ZFS works, and generally isn't needed without a pretty serious workload and an ability to identify bottlenecks, only minimal documentation is provided. It's expected that anyone using this is going to have the source code there as well.

Sponsored-by: Klara, Inc.
Sponsored-by: Wasabi Technology, Inc.

How Has This Been Tested?

This PR is a further backport of #15695 and has been compiled and sanity checked only. However, the "original" version of this was developed at a customer site against 2.1 and has seen hours of testing, so I feel pretty confident about it. Still, let's see what the other PR and CI here shake out.

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Performance enhancement (non-breaking change which improves efficiency)
  • Code cleanup (non-breaking change which makes code smaller or more readable)
  • Breaking change (fix or feature that would cause existing functionality to change)
  • Library ABI change (libzfs, libzfs_core, libnvpair, libuutil and libzfsbootenv)
  • Documentation (a change to man pages or other documentation)

Checklist:

@robn robn changed the title spa: make read/write queues configurable [2.1] spa: make read/write queues configurable Dec 21, 2023
@behlendorf behlendorf added the Status: Code Review Needed Ready for review and testing label Dec 21, 2023
We are finding that as customers get larger and faster machines
(hundreds of cores, large NVMe-backed pools) they keep hitting
relatively low performance ceilings. Our profiling work almost always
finds that they're running into bottlenecks on the SPA IO taskqs.
Unfortunately there's often little we can advise at that point, because
there are very few ways to change behaviour without patching.

This commit adds two load-time parameters `zio_taskq_read` and
`zio_taskq_write` that can configure the READ and WRITE IO taskqs
directly.

This achieves two goals: it gives operators (and those that support
them) a way to tune things without requiring a custom build of OpenZFS,
which is often not possible, and it lets us easily try different config
variations in a variety of environments to inform the development of
better defaults for these kinds of systems.

Because tuning the IO taskqs really requires a fairly deep understanding
of how IO in ZFS works, and generally isn't needed without a pretty
serious workload and an ability to identify bottlenecks, only minimal
documentation is provided. It's expected that anyone using this is going
to have the source code there as well.

Signed-off-by: Rob Norris <[email protected]>
Sponsored-by: Klara, Inc.
Sponsored-by: Wasabi Technology, Inc.
@robn robn force-pushed the spa-taskq-parameterise-2.1 branch from 949f9f9 to 199ba5c Compare December 21, 2023 23:00
@robn
Member Author

robn commented Dec 21, 2023

Updated.

@behlendorf behlendorf added Status: Accepted Ready to integrate (reviewed, tested) and removed Status: Code Review Needed Ready for review and testing labels Dec 22, 2023
@behlendorf behlendorf merged commit 12a031a into openzfs:zfs-2.1.15-staging Dec 22, 2023
6 of 11 checks passed
robn added a commit to robn/zfs that referenced this pull request Jan 11, 2024
robn added a commit to robn/zfs that referenced this pull request Jan 12, 2024
behlendorf pushed a commit that referenced this pull request Jan 16, 2024
allanjude pushed a commit to KlaraSystems/zfs that referenced this pull request May 21, 2024
Missed in openzfs#15696, backporting openzfs#15675.

Signed-off-by: Rob Norris <[email protected]>
(cherry picked from commit 437d598)