
OpenCL support for CPU: AMD CPUs not showing up as OpenCL devices #1010

Closed

goofyseeker311 opened this issue Oct 18, 2024 · 5 comments

goofyseeker311 commented Oct 18, 2024

Question

Why are AMD Ryzen 5000 series CPUs not showing up as OpenCL devices on Windows 11?
NVIDIA GPUs and AMD iGPUs show up just fine in the CLDemo Java program. Where is the issue?

Self-answer: downloading and installing the Intel OpenCL Runtime for CPU works for AMD CPUs too.
https://www.intel.com/content/www/us/en/developer/articles/technical/intel-cpu-runtime-for-opencl-applications-with-sycl-support.html
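For reference, a minimal device-enumeration sketch (assuming the LWJGL 3 CL10 bindings; the class name ListCLDevices is made up here) that can be used to check whether a CPU device appears after the runtime is installed:

import org.lwjgl.PointerBuffer;
import org.lwjgl.opencl.CL;
import org.lwjgl.system.MemoryStack;
import org.lwjgl.system.MemoryUtil;

import java.nio.ByteBuffer;
import java.nio.IntBuffer;

import static org.lwjgl.opencl.CL10.*;

public class ListCLDevices {
	public static void main(String[] args) {
		CL.create(); // loads the OpenCL ICD loader
		try (MemoryStack stack = MemoryStack.stackPush()) {
			IntBuffer count = stack.mallocInt(1);
			clGetPlatformIDs(null, count);
			PointerBuffer platforms = stack.mallocPointer(count.get(0));
			clGetPlatformIDs(platforms, (IntBuffer)null);

			for (int p = 0; p < platforms.capacity(); p++) {
				long platform = platforms.get(p);
				// CPU devices only show up once a CPU OpenCL runtime (e.g. the Intel
				// runtime linked above) has registered itself with the ICD loader.
				if (clGetDeviceIDs(platform, CL_DEVICE_TYPE_ALL, null, count) != CL_SUCCESS) continue;
				PointerBuffer devices = stack.mallocPointer(count.get(0));
				clGetDeviceIDs(platform, CL_DEVICE_TYPE_ALL, devices, (IntBuffer)null);
				for (int d = 0; d < devices.capacity(); d++) {
					ByteBuffer name = stack.malloc(256);
					PointerBuffer size = stack.mallocPointer(1);
					clGetDeviceInfo(devices.get(d), CL_DEVICE_NAME, name, size);
					System.out.println(MemoryUtil.memASCII(name, (int)size.get(0) - 1));
				}
			}
		}
	}
}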

goofyseeker311 commented Oct 20, 2024

A second question: why are simple OpenCL math calculations (plain multiplication and float4[] matrix multiplication) in Java/LWJGL 2-4x slower on NVIDIA discrete GPUs than even on AMD CPUs and iGPUs? NVIDIA's OpenCL on CUDA running the same OpenCL program is merely about as fast as auto-vectorized Java code on the CPU. Only the time taken by clEnqueueNDRangeKernel() and clFinish() is counted; all data is pre-uploaded and clFinish() is called before the benchmark run starts.
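For context, the timing methodology described above looks roughly like this (a sketch, not code from this thread; clQueue, clKernel and globalWorkSize are assumed to be set up and all buffers already uploaded):

CL12.clFinish(clQueue); // drain any pending work before starting the clock
long t0 = System.nanoTime();
CL12.clEnqueueNDRangeKernel(clQueue, clKernel, 1, null, globalWorkSize, null, null, null);
CL12.clFinish(clQueue); // block until the kernel has actually completed
long t1 = System.nanoTime();
float milliseconds = (t1 - t0) / 1000000.0f;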

goofyseeker311 commented Oct 20, 2024

What is wrong with OpenCL/CUDA here? It gets about 1/1000 of the floating point throughput it should: say 2 GFLOPS instead of 0.7-2 TFLOPS on the CPU, and 20 GFLOPS instead of 20 TFLOPS on the GPU. The kernels are plain C = A * B float multiplications over arrays, or float4 array multiplications with a matrix-shaped array.
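As a sanity check on those numbers, the GFLOP/s figure for an element-wise C = A * B kernel can be derived like this (a sketch with made-up sizes; milliseconds is the kernel time measured as in the snippet above):

long elements = 64L * 1024 * 1024;              // e.g. 64M floats in C = A * B (hypothetical size)
long flops = elements;                          // one multiply per element
double gflops = flops / (milliseconds / 1000.0) / 1.0e9;
System.out.printf("%.2f GFLOP/s%n", gflops);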

goofyseeker311 commented Oct 21, 2024

How can you get a long-typed event, from a PointerBuffer, to be used for event profiling of an enqueued NDRange kernel? There is no overload of clGetEventProfilingInfo that takes a PointerBuffer event, just the long event type, while clEnqueueNDRangeKernel only accepts a PointerBuffer event, not a long.

In other words, how can you do kernel start/end runtime profiling from LWJGL?

Spasi commented Oct 21, 2024

Hey @goofyseeker311,

The cl_event * event parameter of clEnqueueNDRangeKernel is an output parameter. If you pass a PointerBuffer there, then when the call returns, a cl_event value will have been written to it. Example code:

PointerBuffer pe = ...; // cl_event *
clEnqueueNDRangeKernel(..., pe);

long e = pe.get(0); // cl_event
clGetEventProfilingInfo(e, ...);

goofyseeker311 commented Oct 21, 2024

Yes. (So how do you get the profiling start/end times out of the event, instead of using the code below?)

Never mind, somehow I was not able to get that pe.get(0) approach working before; not sure what I did wrong.

Previous code looked like this:

PointerBuffer event = clStack.mallocPointer(1);
if (CL12.clEnqueueNDRangeKernel(clQueue, clKernel, dimensions, null, globalWorkSize, null, null, event)==CL12.CL_SUCCESS) {
	long ctimestart = System.nanoTime();
	CL12.clWaitForEvents(event);
	long ctimeend = System.nanoTime();
	float ctimedif = (ctimeend-ctimestart)/1000000.0f;
}

Edit: new code looks like this:

PointerBuffer event = clStack.mallocPointer(1);
if (CL12.clEnqueueNDRangeKernel(clQueue, clKernel, dimensions, null, globalWorkSize, null, null, event)==CL12.CL_SUCCESS) {
	CL12.clWaitForEvents(event);
	long eventLong = event.get(0);
	long[] ctimestart = {0};
	long[] ctimeend = {0};
	CL12.clGetEventProfilingInfo(eventLong, CL12.CL_PROFILING_COMMAND_START, ctimestart, (PointerBuffer)null);
	CL12.clGetEventProfilingInfo(eventLong, CL12.CL_PROFILING_COMMAND_END, ctimeend, (PointerBuffer)null);
	float ctimedif = (ctimeend[0]-ctimestart[0])/1000000.0f;
}
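One detail not shown in the snippets above: clGetEventProfilingInfo only returns valid timestamps (in nanoseconds) if the command queue was created with profiling enabled; otherwise it reports CL_PROFILING_INFO_NOT_AVAILABLE. A sketch of that queue setup, with clContext and clDevice assumed to exist:

long clQueue = CL12.clCreateCommandQueue(clContext, clDevice, CL12.CL_QUEUE_PROFILING_ENABLE, (IntBuffer)null);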

Spasi closed this as completed Oct 23, 2024