[BUG] Segmentation fault while reading bandwidth in Ice Lake after reboot #469
Comments
There seems to be a cascade of problems here. When you add a raw event string instead of a performance group, the LIKWID library appends the fixed counters to the end of the event string, depending on the architecture. Intel Icelake introduces a new fixed counter When I got my hands on an Icelake SP system the first time, I experienced that the Can you please supply some further information:
Use your "fixing perf command" here, or install LIKWID in two different folders, one in accessdaemon mode and one in perf_event mode. I will try to spot the difference in the LIKWID execution. If it works for
I can't reboot the node right now. I hope this is enough. If not, I will come back once the node has been rebooted.
From Access Daemon installation:
From Perf Event installation:
Here I see two problems in the different backends:
So, I don't know why it works if you use perf_event once. Either it generally enables
I was able to reproduce it on one of our Icelake nodes.
This does not solve the problems with failing RDPMC instructions after reboot. All possible checks are passed but we get an illegal instruction. The real check would be the status of the
Describe the bug
Segmentation fault while reading bandwidth in Ice Lake node 8352Y after reboot.
To Reproduce
LIKWID is compiled with ACCESSMODE = accessdaemon or direct. If I switch to perf_event, it runs perfectly. Moreover, after switching back to accessdaemon or direct, it keeps working until the node is rebooted. Is it possible that I'm missing some initialization?
Additional context
GDB output:
We load likwid library using dlopen, and the "likwid." is the struct containing the functions.
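The original snippets did not survive on this page, so the following is only a minimal sketch of the dlopen pattern described above, not the reporter's actual code. The struct name `likwid_api_t`, the helper `likwid_load`, and the choice of which functions to resolve are assumptions made for illustration; the resolved symbol names are public LIKWID C API functions.

```c
#include <dlfcn.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical function table; plays the role of the "likwid." struct. */
typedef struct {
    void *handle;
    int (*topology_init)(void);
    int (*perfmon_init)(int nrThreads, const int *threadsToCpu);
    int (*perfmon_addEventSet)(const char *eventCString);
    int (*perfmon_setupCounters)(int groupId);
    int (*perfmon_startCounters)(void);
    int (*perfmon_stopCounters)(void);
    double (*perfmon_getResult)(int groupId, int eventId, int threadId);
    void (*perfmon_finalize)(void);
} likwid_api_t;

/* Resolve one symbol into a function-pointer slot (POSIX dlsym idiom). */
static int load_sym(void *handle, const char *name, void **slot)
{
    *slot = dlsym(handle, name);
    return *slot != NULL ? 0 : -1;
}

/*
 * Load the library at `path` and fill `api`.
 * Returns 0 on success, -1 if dlopen fails, -2 if a symbol is missing.
 * On failure, `api` is left zeroed so no stale pointer can be called.
 */
int likwid_load(const char *path, likwid_api_t *api)
{
    memset(api, 0, sizeof(*api));

    void *h = dlopen(path, RTLD_NOW | RTLD_LOCAL);
    if (h == NULL)
        return -1;

    if (load_sym(h, "topology_init", (void **)&api->topology_init) ||
        load_sym(h, "perfmon_init", (void **)&api->perfmon_init) ||
        load_sym(h, "perfmon_addEventSet", (void **)&api->perfmon_addEventSet) ||
        load_sym(h, "perfmon_setupCounters", (void **)&api->perfmon_setupCounters) ||
        load_sym(h, "perfmon_startCounters", (void **)&api->perfmon_startCounters) ||
        load_sym(h, "perfmon_stopCounters", (void **)&api->perfmon_stopCounters) ||
        load_sym(h, "perfmon_getResult", (void **)&api->perfmon_getResult) ||
        load_sym(h, "perfmon_finalize", (void **)&api->perfmon_finalize)) {
        memset(api, 0, sizeof(*api));
        dlclose(h);
        return -2;
    }

    api->handle = h;
    return 0;
}
```

With a table like this, checking the return value of every call made through it (for example `api.perfmon_init(...)` and `api.perfmon_addEventSet(...)`) before reading any counters helps narrow down which step fails first.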
Our initialization code:
Opening events:
And the reading: