Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[BUG] Segmentation fault while reading bandwidth in Ice Lake after reboot #469

Open
uenomad opened this issue May 31, 2022 · 7 comments
Open
Labels

Comments

@uenomad
Copy link

uenomad commented May 31, 2022

Describe the bug
Segmentation fault while reading bandwidth in Ice Lake node 8352Y after reboot.

To Reproduce
Is compiled with ACCESSMODE = accessdaemon or direct. If I switch to perf_event, it will run perfectly. Moreover, going back to accessdaemon or direct will be fixed until rebooting the node. Is it possible I'm missing some initialization?

Additional context
GDB output:

0x000015554ae847f3 in __rdpmc (cpu_id=0, counter=1073741827, value=0x1555547bab90) at ./src/access_x86_rdpmc.c:75
75	./src/access_x86_rdpmc.c: No such file or directory.
(gdb) bt
#0  0x000015554ae847f3 in __rdpmc (cpu_id=0, counter=1073741827, value=0x1555547bab90) at ./src/access_x86_rdpmc.c:75
#1  0x000015554ae8549a in access_x86_rdpmc_read (cpu_id=0, reg=780, data=0x1555547bab90) at ./src/access_x86_rdpmc.c:322
#2  0x000015554af37331 in access_x86_msr_read (cpu_id=0, reg=780, data=0x1555547bab90) at ./src/access_x86_msr.c:204
#3  0x000015554ae81a71 in access_x86_read (dev=MSR_DEV, cpu_id=0, reg=780, data=0x1555547babd0) at ./src/access_x86.c:83
#4  0x000015554af4b9f2 in HPMread (cpu_id=0, dev=MSR_DEV, reg=780, data=0x1555547bac28) at ./src/access.c:211
#5  0x000015554af0b5c1 in perfmon_readCountersThread_icelake (thread_id=0, eventSet=0x15554c0d9480) at ./src/includes/perfmon_icelake.h:1735
#6  0x000015554af1606b in __perfmon_readCounters (groupId=0, threadId=0) at ./src/perfmon.c:2662
#7  0x000015554af165b7 in perfmon_readCounters () at ./src/perfmon.c:2703
#8  0x000000000040e9d0 in likwid_events_read (id=0x63d020 <events>, ctrs=0x15554c0fba70) at likwid.c:231

We load likwid library using dlopen, and the "likwid." is the struct containing the functions.
Our initialization code:

    setenv("LIKWID_FORCE", "1", 1);
    appenv("PATH", LIKWID_PATH "/sbin");
    //likwid.perfmon_setVerbosity(1);
    if (likwid.topology_init() < 0) {
        return -1;
    }
    info = likwid.get_cpuInfo();
    topo = likwid.get_cpuTopology();
    if((cpus = (int *) calloc(topo->numHWThreads, sizeof(int))) == NULL) {
        return -1;
    }
    for (i = 0; i < topo->numHWThreads; i++) {
        cpus[i] = topo->threadPool[i].apicId;
    }
    if (likwid.perfmon_init(topo->numHWThreads, cpus) < 0) {
        return -1;
    }

Opening events:

static char *evs_names = \
                           "CAS_COUNT_ALL:MBOX0C0," \
                           "CAS_COUNT_ALL:MBOX1C0," \
                           "CAS_COUNT_ALL:MBOX2C0," \
                           "CAS_COUNT_ALL:MBOX3C0," \
                           "CAS_COUNT_ALL:MBOX4C0," \
                           "CAS_COUNT_ALL:MBOX5C0," \
                           "CAS_COUNT_ALL:MBOX6C0," \
                           "CAS_COUNT_ALL:MBOX7C0";
    if ((id->gid = likwid.perfmon_addEventSet(evs_names)) < 0) {
        return -1;
    }
    if (likwid.perfmon_setupCounters(id->gid) < 0) {
        return -1;
    }
    if (likwid.perfmon_startCounters() < 0) {
        return -1;
    }

And the reading:

    if (likwid.perfmon_readCounters() < 0) {
        return -1;
    }
    for (e = j = 0; e < id->evs_count; e++) {
        for (i = 0; i < topo->numHWThreads; i++, j++) {
            ctrs[j] = likwid.perfmon_getLastResult(id->gid, e, i);
        }
    }
@uenomad uenomad added the bug label May 31, 2022
@TomTheBear
Copy link
Member

There seem to be a cascade of problems here. When you add a raw event string instead of a performance group, the LIKWID library adds the fixed counters at the end of the event string depending on the architecture. Intel Icelake introduces a new fixed counter TOPDOWN_SLOTS:FIXC3 (register 0x30C = access_x86_rdpmc_read (..., reg=780, ...)). Since this is a MSR which can be read throught the RDPMC instruction, it uses it as a faster access method for the read. This instruction seems to fails on your system. It might be that the first perf usage activates the rdpmc instruction (it's just a MSR bit flip but it has to be done in kernel-space). Normally, LIKWID checks the usability of the instruction and otherwise falls back to the rdmsr way. This does not seem to work for you.

When I got my hands on an Icelake SP system the first time, I experienced that the RDPMC instruction failed for the FIXC3 counter, so exactly your problem. I try to investigate that.

Can you please supply some further information:

$ grep -E "model|family|stepping" /proc/cpuinfo | sort -u
$ likwid-perfctr -C 0 -g TOPDOWN_SLOTS:FIXC3 -V 3 hostname
$ perf ....
$ likwid-perfctr -C 0 -g TOPDOWN_SLOTS:FIXC3 -V 3 hostname

Use your "fixing perf command" here or install LIKWID in two different folders with accessdaemon and perf_event mode. I try to spot the difference in the LIKWID execution. If it works for likwid-perfctr, we have to check your code but at the moment, I don't see a problem with your approach.

@uenomad
Copy link
Author

uenomad commented Jun 3, 2022

I can't reboot the node now. I hope is enough. If not, I will come back when the node is rebooted.

# grep -E "model|family|stepping" /proc/cpuinfo | sort -u
cpu family	: 6
model		: 106
model name	: Intel(R) Xeon(R) Platinum 8352Y CPU @ 2.20GHz
stepping	: 6

From Access Daemon installation:

# ./likwid-perfctr -C 0 -g TOPDOWN_SLOTS:FIXC3 -V 3 hostname
--------------------------------------------------------------------------------
CPU name:	Intel(R) Xeon(R) Platinum 8352Y CPU @ 2.20GHz
CPU type:	Intel Icelake SP processor
CPU clock:	2.19 GHz
CPU family:	6
CPU model:	106
CPU short:	ICX
CPU stepping:	6
CPU features:	FP ACPI MMX SSE SSE2 HTT TM RDTSCP VMX EIST TM2 SSSE FMA SSE4.1 SSE4.2 AES AVX RDRAND HLE AVX2 AVX512 RDSEED SSE3 
CPU arch:	x86_64
--------------------------------------------------------------------------------
PERFMON version:			5
PERFMON number of counters:		8
PERFMON width of counters:		48
PERFMON number of fixed counters:	4
--------------------------------------------------------------------------------
[likwid-pin] Main PID -> hwthread 0 - OK
DEBUG - [HPMinit:98] Adjusting functions for x86 architecture in daemon mode
DEBUG - [access_x86_rdpmc_init:152] Test for RDPMC for PMC counters returned 1
DEBUG - [access_x86_rdpmc_init:158] Test for RDPMC for FIXED instruction counter returned -1
DEBUG - [access_x86_rdpmc_init:164] Test for RDPMC for FIXED core cycles counter returned -1
DEBUG - [access_x86_rdpmc_init:170] Test for RDPMC for FIXED reference cycle counter returned -1
DEBUG - [access_x86_rdpmc_init:175] Test for RDPMC for FIXED slots counter returned -1
DEBUG - [access_client_startDaemon:141] Starting daemon /hpc/opt/likwid-multi/likwid-ad/sbin/likwid-accessD
DEBUG - [access_client_startDaemon:189] Still waiting for socket /tmp/likwid-30207 for CPU 0...
DEBUG - [access_client_startDaemon:189] Still waiting for socket /tmp/likwid-30207 for CPU 0...
DEBUG - [access_client_startDaemon:189] Still waiting for socket /tmp/likwid-30207 for CPU 0...
DEBUG - [access_client_startDaemon:189] Still waiting for socket /tmp/likwid-30207 for CPU 0...
DEBUG - [access_client_startDaemon:189] Still waiting for socket /tmp/likwid-30207 for CPU 0...
DEBUG - [access_client_startDaemon:189] Still waiting for socket /tmp/likwid-30207 for CPU 0...
DEBUG - [access_client_startDaemon:189] Still waiting for socket /tmp/likwid-30207 for CPU 0...
DEBUG - [access_client_startDaemon:189] Still waiting for socket /tmp/likwid-30207 for CPU 0...
DEBUG - [access_client_startDaemon:201] Successfully opened socket /tmp/likwid-30207 to daemon for CPU 0
DEBUG - [HPMaddThread:143] Adding CPU 0 to access module
DEBUG - [access_client_read:345] Got error 'failed to read/write register' from access daemon reading reg 0x63A at CPU 0
DEBUG - [power_init:414] Deactivating policy register for RAPL domain PP0
DEBUG - [access_client_read:345] Got error 'failed to read/write register' from access daemon reading reg 0x641 at CPU 0
DEBUG - [power_init:367] RAPL domain PP1 not supported
DEBUG - [access_client_read:345] Got error 'failed to read/write register' from access daemon reading reg 0x65C at CPU 0
DEBUG - [power_init:379] Deactivating limit register for RAPL domain PLATFORM
Executing: hostname
DEBUG - [perfmon_addEventSet:2181] Currently 1 groups of 2 active
DEBUG - [perfgroup_customGroup:630] Creating custom group for event string TOPDOWN_SLOTS:FIXC3
DEBUG - [perfmon_addEventSet:2362] Added event TOPDOWN_SLOTS for counter FIXC3 to group 0
DEBUG - [perfmon_addEventSet:2362] Added event INSTR_RETIRED_ANY for counter FIXC0 to group 0
DEBUG - [perfmon_addEventSet:2362] Added event CPU_CLK_UNHALTED_CORE for counter FIXC1 to group 0
DEBUG - [perfmon_addEventSet:2362] Added event CPU_CLK_UNHALTED_REF for counter FIXC2 to group 0
DEBUG - [perfmon_setupCounterThread_icelake:717] FREEZE_PMC_AND_FIXED [0] Register 0x38F , Flags: 0x0 
DEBUG - [perfmon_setupCounterThread_icelake:922] SETUP_FIXED [0] Register 0x38D , Flags: 0x2222 
--------------------------------------------------------------------------------
DEBUG - [perfmon_startCountersThread_icelake:961] CLEAR_FIXED [0] Device 0 Register 0x30C , Flags: 0x0 
DEBUG - [perfmon_startCountersThread_icelake:961] CLEAR_FIXED [0] Device 0 Register 0x309 , Flags: 0x0 
DEBUG - [perfmon_startCountersThread_icelake:961] CLEAR_FIXED [0] Device 0 Register 0x30A , Flags: 0x0 
DEBUG - [perfmon_startCountersThread_icelake:961] CLEAR_FIXED [0] Device 0 Register 0x30B , Flags: 0x0 
DEBUG - [perfmon_startCountersThread_icelake:1232] CLEAR_PMC_AND_FIXED_OVERFLOW [0] Register 0x390 , Flags: 0xC000000F00000000 
DEBUG - [perfmon_startCountersThread_icelake:1234] UNFREEZE_PMC_AND_FIXED [0] Register 0x38F , Flags: 0xF00000000 
ice2760
DEBUG - [perfmon_stopCountersThread_icelake:1370] FREEZE_PMC_AND_FIXED [0] Register 0x38F , Flags: 0x0 
DEBUG - [access_x86_rdpmc_read:321] Read FIXED slots counter with RDPMC instruction with index 0x40000003
DEBUG - [perfmon_stopCountersThread_icelake:1410] STOP_FIXED [0] Register 0x30C , Flags: 0x189F98 
DEBUG - [perfmon_stopCountersThread_icelake:1410] STOP_FIXED [0] Register 0x309 , Flags: 0x3B27B 
DEBUG - [perfmon_stopCountersThread_icelake:1410] STOP_FIXED [0] Register 0x30A , Flags: 0x4ECBC 
DEBUG - [perfmon_stopCountersThread_icelake:1410] STOP_FIXED [0] Register 0x30B , Flags: 0x3DF60 
--------------------------------------------------------------------------------
Group 1: Custom
+-----------------------+---------+--------------+
|         Event         | Counter |  HWThread 0  |
+-----------------------+---------+--------------+
|  Runtime (RDTSC) [s]  |   TSC   | 5.093438e-04 |
|     TOPDOWN_SLOTS     |  FIXC3  |      1613720 |
|   INSTR_RETIRED_ANY   |  FIXC0  |       242299 |
| CPU_CLK_UNHALTED_CORE |  FIXC1  |       322748 |
|  CPU_CLK_UNHALTED_REF |  FIXC2  |       253792 |
+-----------------------+---------+--------------+

DEBUG - [perfmon_finalizeCountersThread_icelake:2116] SHOW_CTL [0] Device 0 Register 0x38D , Flags: 0x2222 
DEBUG - [perfmon_finalizeCountersThread_icelake:2118] CLEAR_CTL [0] Device 0 Register 0x38D , Flags: 0x0 
DEBUG - [perfmon_finalizeCountersThread_icelake:2124] CLEAR_CTR [0] Device 0 Register 0x30C , Flags: 0x0 
DEBUG - [perfmon_finalizeCountersThread_icelake:2116] SHOW_CTL [0] Device 0 Register 0x38D , Flags: 0x0 
DEBUG - [perfmon_finalizeCountersThread_icelake:2118] CLEAR_CTL [0] Device 0 Register 0x38D , Flags: 0x0 
DEBUG - [perfmon_finalizeCountersThread_icelake:2124] CLEAR_CTR [0] Device 0 Register 0x309 , Flags: 0x0 
DEBUG - [perfmon_finalizeCountersThread_icelake:2116] SHOW_CTL [0] Device 0 Register 0x38D , Flags: 0x0 
DEBUG - [perfmon_finalizeCountersThread_icelake:2118] CLEAR_CTL [0] Device 0 Register 0x38D , Flags: 0x0 
DEBUG - [perfmon_finalizeCountersThread_icelake:2124] CLEAR_CTR [0] Device 0 Register 0x30A , Flags: 0x0 
DEBUG - [perfmon_finalizeCountersThread_icelake:2116] SHOW_CTL [0] Device 0 Register 0x38D , Flags: 0x0 
DEBUG - [perfmon_finalizeCountersThread_icelake:2118] CLEAR_CTL [0] Device 0 Register 0x38D , Flags: 0x0 
DEBUG - [perfmon_finalizeCountersThread_icelake:2124] CLEAR_CTR [0] Device 0 Register 0x30B , Flags: 0x0 
DEBUG - [perfmon_finalizeCountersThread_icelake:2131] CLEAR_GLOBAL_OVF [0] Register 0x390 , Flags: 0xC000000F00000000 
DEBUG - [perfmon_finalizeCountersThread_icelake:2133] CLEAR_GLOBAL_CTRL [0] Register 0x38F , Flags: 0x0 
DEBUG - [HPMfinalize:170] Removing CPU 0 from access module

From Perf Event installation:

# ./likwid-perfctr -C 0 -g TOPDOWN_SLOTS:FIXC3 -V 3 hostname
--------------------------------------------------------------------------------
CPU name:	Intel(R) Xeon(R) Platinum 8352Y CPU @ 2.20GHz
CPU type:	Intel Icelake SP processor
CPU clock:	2.19 GHz
CPU family:	6
CPU model:	106
CPU short:	ICX
CPU stepping:	6
CPU features:	FP ACPI MMX SSE SSE2 HTT TM RDTSCP VMX EIST TM2 SSSE FMA SSE4.1 SSE4.2 AES AVX RDRAND HLE AVX2 AVX512 RDSEED SSE3 
CPU arch:	x86_64
--------------------------------------------------------------------------------
PERFMON version:			5
PERFMON number of counters:		8
PERFMON width of counters:		48
PERFMON number of fixed counters:	4
--------------------------------------------------------------------------------
[likwid-pin] Main PID -> hwthread 0 - OK
Executing: hostname
DEBUG - [perfmon_addEventSet:2181] Currently 1 groups of 2 active
DEBUG - [perfgroup_customGroup:630] Creating custom group for event string TOPDOWN_SLOTS:FIXC3
DEBUG - [perfmon_addEventSet:2362] Added event TOPDOWN_SLOTS for counter FIXC3 to group 0
DEBUG - [perfmon_addEventSet:2362] Added event INSTR_RETIRED_ANY for counter FIXC0 to group 0
DEBUG - [perfmon_addEventSet:2362] Added event CPU_CLK_UNHALTED_CORE for counter FIXC1 to group 0
DEBUG - [perfmon_addEventSet:2362] Added event CPU_CLK_UNHALTED_REF for counter FIXC2 to group 0
DEBUG - [perfmon_setupCountersThread_perfevent:691] SETUP_FIXED [0] Register 0x0 , Flags: 0x1 
DEBUG - [perfmon_setupCountersThread_perfevent:853] perf_event_open: cpu_id=0 pid=30342 flags=0
DEBUG - [perfmon_setupCountersThread_perfevent:691] SETUP_FIXED [0] Register 0x1 , Flags: 0x0 
DEBUG - [perfmon_setupCountersThread_perfevent:853] perf_event_open: cpu_id=0 pid=30342 flags=0
DEBUG - [perfmon_setupCountersThread_perfevent:691] SETUP_FIXED [0] Register 0x2 , Flags: 0x9 
DEBUG - [perfmon_setupCountersThread_perfevent:853] perf_event_open: cpu_id=0 pid=30342 flags=0
--------------------------------------------------------------------------------
DEBUG - [perfmon_startCountersThread_perfevent:909] RESET_COUNTER [0] Register 0x0 , Flags: 0x0 
DEBUG - [perfmon_startCountersThread_perfevent:922] START_COUNTER [0] Register 0x0 , Flags: 0x0 
DEBUG - [perfmon_startCountersThread_perfevent:909] RESET_COUNTER [0] Register 0x0 , Flags: 0x0 
DEBUG - [perfmon_startCountersThread_perfevent:922] START_COUNTER [0] Register 0x0 , Flags: 0x0 
DEBUG - [perfmon_startCountersThread_perfevent:909] RESET_COUNTER [0] Register 0x0 , Flags: 0x0 
DEBUG - [perfmon_startCountersThread_perfevent:922] START_COUNTER [0] Register 0x0 , Flags: 0x0 
ice2760
DEBUG - [perfmon_stopCountersThread_perfevent:945] FREEZE_COUNTER [0] Register 0x3 , Flags: 0x0 
DEBUG - [perfmon_stopCountersThread_perfevent:949] READ_COUNTER [0] Register 0x3 , Flags: 0x39370 
DEBUG - [perfmon_stopCountersThread_perfevent:974] RESET_COUNTER [0] Register 0x3 , Flags: 0x0 
DEBUG - [perfmon_stopCountersThread_perfevent:945] FREEZE_COUNTER [0] Register 0x4 , Flags: 0x0 
DEBUG - [perfmon_stopCountersThread_perfevent:949] READ_COUNTER [0] Register 0x4 , Flags: 0x49344 
DEBUG - [perfmon_stopCountersThread_perfevent:974] RESET_COUNTER [0] Register 0x4 , Flags: 0x0 
DEBUG - [perfmon_stopCountersThread_perfevent:945] FREEZE_COUNTER [0] Register 0x5 , Flags: 0x0 
DEBUG - [perfmon_stopCountersThread_perfevent:949] READ_COUNTER [0] Register 0x5 , Flags: 0x397E0 
DEBUG - [perfmon_stopCountersThread_perfevent:974] RESET_COUNTER [0] Register 0x5 , Flags: 0x0 
--------------------------------------------------------------------------------
Group 1: Custom
+-----------------------+---------+--------------+
|         Event         | Counter |  HWThread 0  |
+-----------------------+---------+--------------+
|  Runtime (RDTSC) [s]  |   TSC   | 7.135542e-04 |
|     TOPDOWN_SLOTS     |  FIXC3  |            0 |
|   INSTR_RETIRED_ANY   |  FIXC0  |       234352 |
| CPU_CLK_UNHALTED_CORE |  FIXC1  |       299844 |
|  CPU_CLK_UNHALTED_REF |  FIXC2  |       235488 |
+-----------------------+---------+--------------+

@TomTheBear
Copy link
Member

Here I see two problems in den different backends:

  • The accessdaemon backend checks the RDPMC access for FIXC3 and it fails (-1) but it uses the RDPMC methods afterwards despite the check (flag == 1). Interestingly the counter seems to return values.
    DEBUG - [access_x86_rdpmc_init:175] Test for RDPMC for FIXED slots counter returned -1
    DEBUG - [access_x86_rdpmc_read:321] Read FIXED slots counter with RDPMC instruction with index 0x40000003
    
  • In the perf_event backend, the FIXC3 counter is not handled properly. LIKWID uses the perf event definitions (PERF_COUNT_HW_INSTRUCTIONS, PERF_COUNT_HW_CPU_CYCLES, ...) for the fixed-purpose events but the kernel does not provide one for TOPDOWN_SLOTS yet (checked 2-3 months ago). That's why you see four added events but only three setups (register 0x3 is missing and returns 0 as result):
    DEBUG - [perfmon_addEventSet:2362] Added event TOPDOWN_SLOTS for counter FIXC3 to group 0
    DEBUG - [perfmon_addEventSet:2362] Added event INSTR_RETIRED_ANY for counter FIXC0 to group 0
    DEBUG - [perfmon_addEventSet:2362] Added event CPU_CLK_UNHALTED_CORE for counter FIXC1 to group 0
    DEBUG - [perfmon_addEventSet:2362] Added event CPU_CLK_UNHALTED_REF for counter FIXC2 to group 0
    DEBUG - [perfmon_setupCountersThread_perfevent:691] SETUP_FIXED [0] Register 0x0 , Flags: 0x1 
    DEBUG - [perfmon_setupCountersThread_perfevent:691] SETUP_FIXED [0] Register 0x1 , Flags: 0x0 
    DEBUG - [perfmon_setupCountersThread_perfevent:691] SETUP_FIXED [0] Register 0x2 , Flags: 0x9 
    
    I remember to wrote that fix for FIXC3 already but it is still in one of my local branches.

So, I don't know why it works if you use perf_event once. Either is generally enables RDPMC and does not deactivate it again (I would expect the kernel to deactivate it) or it is a different effect.

@TomTheBear
Copy link
Member

I was able to reproduce it on one of our Icelake nodes.

@TomTheBear
Copy link
Member

TOPDOWN_SLOTS for perf_event: 4fff949

@TomTheBear
Copy link
Member

Fixes for RDPMC: 18cc5c3 and 1b6daa2

@TomTheBear
Copy link
Member

This does not solve the problems with failing RDPMC instructions after reboot. All possible checks are passed but we get an illegal instruction. The real check would be the status of the CR4.PCE bit, but the CR4 register is only readable in kernel-space.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

2 participants