Add ability for pointer chasing benchmarks #500
We wanted to measure the performance of pointer chasing on different architectures and noted that this is currently not directly possible with likwid-bench. Thus, this PR adds a directive to the pseudo-assembly parser(s) that allows a memory stream to be initialized with two new methods (the current default is to initialize with "all 1", which this PR does not change). Both methods have in common that the resulting values in the stream can be used for register-indirect and register-indexed addressing. The two methods are:
- INDEX_STRIDE: follows the method used in [1, Sec. 3.B.1] to initialize a stream with (index + stride) % stream_size.
- LINKED_LIST: creates a circularly linked list where each "virtual" list entry occupies a configurable number of bytes, which can be used to ensure that one jumps over cache lines between list elements. Note that the "pointers" to the next list element are randomly arranged, but the initialization ensures that a traversal of the created list covers the whole stream, so there are no "shortcuts".

The PR also refactors the stream initialization code to reduce code duplication for memory initialization. It also syncs the memory initialization to use off_t instead of int as the datatype for the offset, which is what the declaration of the allocation struct uses.

There are some drawbacks currently:
- INIT statement, currently rendering them to always load the same address. So one either chooses a different default for this case or reasons about a general solution to control these values/benchmark parameters.
- TYPE INT uses... (surprise) int under the hood. This might become a limitation when one wants to use larger arrays. Also, the initialization may produce bad values when the stream size grows beyond 2 GB.

So there is still stuff to elaborate on...
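
For readers unfamiliar with the two initialization schemes described above, here is a rough illustration in Python. It is not the C code from this PR. The function names, the `seed` parameter, and the choice to count sizes in stream elements rather than bytes are assumptions made only for this sketch.

```python
import random


def init_index_stride(n, stride):
    """Element i holds the index to visit next: (i + stride) % n."""
    return [(i + stride) % n for i in range(n)]


def init_linked_list(n, elems_per_entry, seed=0):
    """Link all "virtual" entries into one randomly ordered cycle.

    Each entry spans elems_per_entry stream elements; only the first
    element of an entry stores the index of the next entry (hypothetical
    layout chosen for this sketch).
    """
    num_entries = n // elems_per_entry
    order = list(range(num_entries))
    random.Random(seed).shuffle(order)
    stream = [0] * n
    # Walking the shuffled order and closing it at the end yields a single
    # cycle, so a traversal cannot take a "shortcut".
    for pos, entry in enumerate(order):
        nxt = order[(pos + 1) % num_entries]
        stream[entry * elems_per_entry] = nxt * elems_per_entry
    return stream


def chase(stream, start=0):
    """Follow the stored indices until returning to start; count the hops."""
    idx, visited = stream[start], 1
    while idx != start:
        idx = stream[idx]
        visited += 1
    return visited


if __name__ == "__main__":
    print(chase(init_index_stride(16, 3)))    # stride coprime to the size
    print(chase(init_index_stride(16, 4)))    # stride divides the size
    print(chase(init_linked_list(64, 8)))     # 8 entries of 8 elements each
```

In this sketch, an INDEX_STRIDE-style traversal only visits every element when the stride and the stream size share no common factor, whereas the linked-list construction visits every entry by design.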