Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add RAFT wrappers around current_device_resource functions #2424

Closed

Conversation

harrism
Copy link
Member

@harrism harrism commented Aug 28, 2024

Fixes #2423

Adds local RAFT wrappers around rmm::mr::get/set_current_device_resource and resource_ref aliases. The idea is to localize changes needed to keep up with RMM refactoring.

The wrappers are added in cpp/include/raft/core/resource/device_memory_resource.hpp

I've marked this PR as breaking because it breaks the ABI, however the API is compatible.

@github-actions github-actions bot added the cpp label Aug 28, 2024
@harrism harrism added feature request New feature or request breaking Breaking change and removed cpp labels Aug 28, 2024
@harrism harrism changed the title Fea/raft_current_resource_wrappers Add RAFT wrappers around current_device_resource functions Aug 28, 2024
@github-actions github-actions bot added the cpp label Aug 28, 2024
@@ -375,7 +375,7 @@ rmm::mr::cuda_memory_resource cuda_mr;
// set the initial size to half of the free device memory
auto init_size = rmm::percent_of_free_device_memory(50);
rmm::mr::pool_memory_resource<rmm::mr::cuda_memory_resource> pool_mr{&cuda_mr, init_size};
rmm::mr::set_current_device_resource(&pool_mr); // Updates the current device resource pointer to `pool_mr`
raft::resource::set_current_device_resource(&pool_mr); // Updates the current device resource pointer to `pool_mr`
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add raft/core/resource/device_memory_resource.hpp to includes of this code example

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

Copy link
Contributor

@achirkin achirkin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd suggest to put the new resource helpers somewhere into a detail namespace to discourage the use of these outside of where it is strictly necessary within raft.
Otherwise it goes against raft principle of keeping the program state locally in a raft handle by introducing the global state functions in the api.
In the docs, where we refer to rmm::mr::get_current_device_resource(), I'd rather keep the rmm namespaces, possibly adjusting them to reflect the upcoming API changes in the rmm - again because raft is not supposed to provide this API.

@harrism
Copy link
Member Author

harrism commented Sep 16, 2024

I'd suggest to put the new resource helpers somewhere into a detail namespace

This would severely limit the utility of this PR. I have a dependent PR in each RAPIDS repo that depends on RAFT (cuVS, cuML, cumlprims_mg, cuGraph) to use these RAFT wrappers rather than directly using RMM's functions.

@achirkin
Copy link
Contributor

I'm sorry that this limits the utility of the PR, but, in my opinion, this plainly goes against the whole design of raft resources.
Other reviewers may disagree.

@harrism
Copy link
Member Author

harrism commented Sep 16, 2024

Perhaps you can point me to the design document?

@cjnolet
Copy link
Member

cjnolet commented Sep 18, 2024

Perhaps you can point me to the design document?

We do outline what @achirkin is talking about in our developer guide. It may sound like something that was done in haste but it was actually very intentional and done for a specific purpose.

We've always relied on RMM's APIs to be used outside of RAFT (and outside of cuML, cuGraph, etc...) so that we could merely honor whatever the user has configured with RMM. More recently, we did add a capability for a user to set a different RMM memory resource for temporary memory allocations (and an additionl one for smaller vs larger allocations), but those features still continue to follow the constraint that RAFT maintains no global state of its own.

It's dangerous for us to do that because the underlying computational functions now have potential interdependencies that we have to manage and make guarantees with respect to RAFT itself, which now means RAFT needs to be concerned with thread safety.

This is a separation of concerns- RAFT, cuML, cuGraph, cuOpt, and others have intentionally avoided doing memory management things in libraries outside of RMM because RMM Is supposed to be responsible for that. If you put this wrapper inside RMM itself then we'd still get the benefits of defining it somewhere upstream (in this case within the RMM namespace) so everyone downstream can use it. Is there any reason we can't just alias this inside of RMM in the meantime?

@harrism
Copy link
Member Author

harrism commented Sep 18, 2024

Aliasing within RMM would just be duplicating something that is already in RMM. The point was to localize changes in non-RMM libraries to a single function rather than have to change every function as we go through this churn. It is painful to modify every function in every RMM-dependent library when RMM changes.

@harrism harrism closed this Sep 18, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
breaking Breaking change cpp feature request New feature or request
Projects
Status: Done
Development

Successfully merging this pull request may close these issues.

[FEA] Add RAFT wrappers for RMM current_device_resource functions.
4 participants