Add ability for pointer chasing benchmarks #500

Open
wants to merge 3 commits into
base: master

Conversation

christgau
Contributor

We wanted to measure the performance of pointer chasing on different architectures and noticed that this is currently not directly possible with likwid-bench. This PR therefore adds a directive to the pseudo assembly parser(s) that initializes a memory stream with one of two new methods (the current default, init with "all 1", is not changed by this PR). With both methods, the resulting values in the stream can be used for register-indirect and register-indexed addressing. The two methods are:

  1. INDEX_STRIDE follows the method used in [1, Sec. 3.B.1] to init a stream with (index + stride) % stream_size
  2. LINKED_LIST creates a circularly linked list where each "virtual" list entry occupies a configurable amount of bytes which can be used to ensure that one jumps over cache lines between list elements. Note that "pointers" to the next list element are randomly arranged but the initialization ensures that a traversal of the created list will cover the whole stream, such that there are no "shortcuts".
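The two initialization schemes can be sketched in plain C as follows. This is only an illustration and not the code from this PR: the function names, the `int64_t` element type, and the use of Sattolo's algorithm with `rand()` to obtain a single random cycle are all assumptions made for the example. The block size is expressed in elements rather than bytes to keep the sketch short.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* INDEX_STRIDE (hypothetical helper): element i holds (i + stride) % n,
 * i.e. the index of the element to load next. */
static void init_index_stride(int64_t *stream, size_t n, size_t stride)
{
    assert(n > 0);
    for (size_t i = 0; i < n; i++)
        stream[i] = (int64_t)((i + stride) % n);
}

/* LINKED_LIST (hypothetical helper): the stream is split into n_blocks
 * "virtual" entries of block_elems elements each. The first element of a
 * block stores the index of the first element of its successor block.
 * The remaining elements of each block are left untouched. */
static void init_linked_list(int64_t *stream, size_t n_blocks,
                             size_t block_elems, unsigned seed)
{
    assert(n_blocks > 0 && block_elems > 0);
    size_t *next = malloc(n_blocks * sizeof *next);
    assert(next != NULL);
    for (size_t i = 0; i < n_blocks; i++)
        next[i] = i;

    /* Sattolo's algorithm: swapping only with a strictly smaller index
     * yields a permutation that is one single cycle, so a traversal has
     * no "shortcuts". rand() % i is slightly biased; fine for a sketch. */
    srand(seed);
    for (size_t i = n_blocks - 1; i > 0; i--) {
        size_t j = (size_t)rand() % i;
        size_t tmp = next[i];
        next[i] = next[j];
        next[j] = tmp;
    }

    for (size_t b = 0; b < n_blocks; b++)
        stream[b * block_elems] = (int64_t)(next[b] * block_elems);
    free(next);
}

/* Chase the stored indices starting at element 0 and count the hops
 * needed to return to element 0 (gives up after `limit` hops). */
static size_t cycle_length(const int64_t *stream, size_t limit)
{
    size_t hops = 0;
    int64_t pos = 0;
    do {
        pos = stream[pos];
        hops++;
    } while (pos != 0 && hops < limit);
    return hops;
}
```

Calling `cycle_length` on a stream filled by `init_linked_list` should return `n_blocks`, which is a quick way to check that the traversal covers every list entry. For `init_index_stride`, the chain only covers the whole stream when the stride and the stream length are coprime.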

The PR also refactors the stream initialization code to reduce code duplication. In addition, it makes the memory initialization use off_t instead of int as the datatype for the offset, which matches the declaration in the allocation struct.

There are some drawbacks currently:

  1. Both the stride and the linked list item/block size are currently read from the pseudo assembly file. It would be nice to control them from the "outside", e.g. by a command line argument to likwid-bench. This was attempted during development, but the approach caused crashes when the tests/benchmarks are "burned" into likwid-bench, as they appear to be read-only in that case (didn't dig too deep here). The strided benchmark files currently lack the optional stride in the INIT statement, so they always load the same address. So one either chooses a different default for this case or works out a general solution to control these values/benchmark parameters.
  2. The benchmarks use 32-bit offsets only, mainly due to the fact that the TYPE INT uses... (surprise) int under the hood. This might become a limitation when one wants to use larger arrays. Also the initialization may produce bad values when the stream size grows beyond 2 GB.
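The 32-bit limitation can be illustrated with a small check. This is a sketch under assumptions, not code from likwid-bench: the helper name is hypothetical, and it simply computes the strided index in 64-bit arithmetic and reports whether the result would still fit into a signed 32-bit value.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: computes (index + stride) % n in 64-bit arithmetic.
 * Returns 1 and stores the result in *out if it fits into an int32_t,
 * returns 0 (leaving *out untouched) if a 32-bit offset cannot hold it. */
static int next_index_int32(uint64_t index, uint64_t stride, uint64_t n,
                            int32_t *out)
{
    assert(n > 0 && out != 0);
    /* Reduce both operands first so the sum cannot wrap around. */
    uint64_t next = (index % n + stride % n) % n;
    if (next > (uint64_t)INT32_MAX)
        return 0;
    *out = (int32_t)next;
    return 1;
}
```

With a small stream the helper succeeds, while a stream with more than `INT32_MAX` entries can produce indices that a 32-bit `int` cannot represent, which is where a wider type such as `int64_t` or `off_t` becomes necessary.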

So there is still stuff to elaborate on...

@TomTheBear
Member

Thanks for the PR. This has been on my to-do list for a long time.

Refactoring int to off_t (and size_t) internally is probably a good idea.

Addressing your drawbacks:

  1. As you noticed, there are two ways of kernel generation in likwid-bench: one at build time and one at runtime. The build-time generation bakes the kernels into the binary in a read-only fashion. The other one may allow adaptations based on command line parameters. I'm fully aware that the baked-in kernels cause problems for some use cases, and the best solution would be to split the kernel definitions into two parts. The static kernels (without any change at runtime) can still be baked in, while the dynamic kernels are installed to some folder at make install and we add a new search path for it to likwid-bench. It already searches for kernels in $HOME/.bench/<arch>.
  2. The INT data type is not used by any kernel at the moment. So, adapting the data type in likwid-bench shouldn't be a problem. The only issue is that we have to document it because the size might change. At the moment, you would use 4 Bytes in the INT kernels but with int64, we have to document that it's 8 Bytes now.

Unfortunately, I decided to drop the development of the current likwid-bench and put my efforts into a new version with multi-dimensional array support and a more flexible kernel creation pipeline. If you are interested I invite you to the currently separate and private repository, I would value your input.

@christgau
Contributor Author

Thanks for your feedback and for providing some more details.

I'd be happy to support your efforts in rewriting likwid-bench by integrating the pointer chasing capabilities or related stuff.

@TomTheBear
Member

I would love to merge this for version 5.3 but the problem with the baked-in kernels is not resolved yet.


2 participants