-
Notifications
You must be signed in to change notification settings - Fork 3
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Updates for CUDA versions >= 12 #87
Conversation
…e code through nvcc for consistence with the thrust and cub ABI changes
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This compiles for me on cuda 12.6 and runs a basic regression simulation. I have not tried on 11.x again to make sure we are backwards compatible. Anyone else tried this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a nice maintenance change, forced by the new cuda compile strategy, but I think it also makes the codebase simpler.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right. The goal here was to get the implementation out of the header where it required NVTX-specific structures.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I assume this is still for testing?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks like cruft to me.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are we planning on stylistically removing this everywhere? Fine if so and I'll start when I touch any other code.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not really stylistic. In most compilers, using the "" first checks your local directory, and if it doesn't find a match then moves on to check the system paths. Using <> starts the search with system headers. NVTX.H
should always be local. So I would say: use "" when we really mean local directory.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a reason extern int cudaGlobalDevice;
now comes earlier, or just convenience?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, cudaGlobalDevice
should be defined independently of whether nvcc
is guiding the compile. It's not about earlier but getting it out of the __NVCC__
block.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just checking -- is this true for all cuda >12? Or just 12.4 and up?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
From the release notes, it's true for all of CUDA 12. It could be that some of this was deprecated for a few point releases before finally not working at 12.4.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Clean fix here, thanks!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, that's old cruft from emacs.
Somewhat related, GPU runners are now available (see here), so I could imagine working this compile into CI so that we have some advance notice of issues. |
Signed-off-by: Georgia Stuart <[email protected]>
Rework cuda file handling for >= 12
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've been working with this on cuda 12.6, everything seems to be working as expected. Anything left to do before merge?
@@ -53,6 +53,7 @@ public: | |||
//! Finish and clean-up (caching data necessary for restart) | |||
virtual void finish() {} | |||
|
|||
// #if HAVE_LIBCUDA==1 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
// #if HAVE_LIBCUDA==1 |
I think we can merge this after quickly checking that a compile with CUDA 11.x still works. I'm a little nervous that this PR is really a stop-gap measure. In the end, we should probably move the whole compile to C++-20 modules to avoid this |
Okay, compiles on CUDA < 12. It's a wrap. |
Summary of updates
nvcc
. Many thanks to Georgia Stuart who contributed theCMakeLists.txt
recipes for doing this!