Skip to content

Commit

Permalink
small fix
Browse files Browse the repository at this point in the history
  • Loading branch information
dendibakh committed Feb 22, 2024
1 parent cac787c commit 0bf2cdc
Showing 1 changed file with 1 addition and 1 deletion.
Original file line number Diff line number Diff line change
Expand Up @@ -125,7 +125,7 @@ Let's confirm it with Intel [SDE](https://www.intel.com/content/www/us/en/develo
To prevent compiler interference in our experiment, we disabled vectorization and unrolling optimizations, so that each version has only one hot loop with exactly 7 assembly instructions. We use this to uniformly compare memory footprint intervals. Instead of time intervals, we use intervals measured in machine instructions. The command line we used to collect memory footprint with SDE, along with the part of its output, is shown in [@lst:MemFootprintMatmul]. Notice we use the `-fp_icount 28K` option which indicates measuring memory footprint for each interval of 28K instructions. This value is specifically chosen because it matches one iteration of the inner loop in "before" and "after" cases: `4K inner loop iterations * 7 instructions = 28K`.
By default, SDE measures footprint in cache lines (64 KB), but it can also measure it in memory pages (4KB on x86). We combined the output and put it side by side. Also, a few non-relevant columns were removed from the output. The first column `PERIOD` marks the start of a new interval of 28K instructions. The difference between each period is 28K instructions. The column `LOAD` tells how many cache lines were accessed by load instructions. Recall from the previous discussion, the same cache line accessed twice counts only once. Similarly, the column `STORE` tells how many cache lines were stored. The column `CODE` counts accessed cache lines that contain instructions that were executed during that period. Finally, `NEW` counts cache lines touched during a period, that were not seen before by the program.
By default, SDE measures footprint in cache lines (64 bytes), but it can also measure it in memory pages (4KB on x86). We combined the output and put it side by side. Also, a few non-relevant columns were removed from the output. The first column `PERIOD` marks the start of a new interval of 28K instructions. The difference between each period is 28K instructions. The column `LOAD` tells how many cache lines were accessed by load instructions. Recall from the previous discussion, the same cache line accessed twice counts only once. Similarly, the column `STORE` tells how many cache lines were stored. The column `CODE` counts accessed cache lines that contain instructions that were executed during that period. Finally, `NEW` counts cache lines touched during a period, that were not seen before by the program.
Important note before we proceed: the memory footprint reported by SDE does not equal to utilized memory bandwidth. It is because it doesn't account for whether a memory operation was served from cache or memory.
Expand Down

0 comments on commit 0bf2cdc

Please sign in to comment.