[BLAS] Add BLAS ARM performance libraries backend. #629
base: develop
Signed-off-by: Augustin Degomme <[email protected]> Co-authored-by: Nicolas Bouton <[email protected]> Co-authored-by: Romain Dolbeau <[email protected]>
Thank you for this contribution! I have a few initial comments/questions.
LGTM overall!
Looks good! Just to double-check: you have tested both the Netlib backend and the Arm Performance Libraries backend on ARM hardware?
Indeed, see the attached log for the Netlib one (on an Arm Neoverse V2 platform).
Good catch, it should be fixed now. Thanks!
Description
This adds support for aarch64 CPUs through an Arm Performance Libraries backend, covering the BLAS domain for now (LAPACK to come later).
Most functions are supported natively; some batch functions are implemented directly.
It also enables the Netlib backend on aarch64 CPUs.
This has been tested on Neoverse N1, V1, and V2 CPUs, with the dpcpp compiler and the pocl backend.
100% tests passed, 0 tests failed out of 1960
128 tests are skipped because the corresponding features are unimplemented or unsupported: omatadd/copy, batch, and int8/bfloat16.
AdaptiveCpp has also been tested: the backend compiles and runs with it.
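
For readers unfamiliar with what "implemented directly" can mean for a batch routine, here is a minimal sketch, not the code from this PR. It assumes a strided batch GEMM can be expressed as a loop over a single-matrix GEMM. The names `gemm_single` and `gemm_batch_strided` are hypothetical, and the naive column-major `gemm_single` merely stands in for whatever call the underlying CPU library provides.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a single-matrix GEMM from a CPU BLAS library.
// Column-major, no transposition: C = alpha * A * B + beta * C,
// with A of size m x k, B of size k x n, and C of size m x n.
void gemm_single(std::int64_t m, std::int64_t n, std::int64_t k, double alpha,
                 const double* a, std::int64_t lda,
                 const double* b, std::int64_t ldb, double beta,
                 double* c, std::int64_t ldc) {
    assert(lda >= m && ldb >= k && ldc >= m);
    for (std::int64_t j = 0; j < n; ++j) {
        for (std::int64_t i = 0; i < m; ++i) {
            double acc = 0.0;
            for (std::int64_t p = 0; p < k; ++p) {
                acc += a[i + p * lda] * b[p + j * ldb];
            }
            // Follow the usual BLAS convention: C is not read when beta == 0.
            c[i + j * ldc] = (beta == 0.0) ? alpha * acc
                                           : alpha * acc + beta * c[i + j * ldc];
        }
    }
}

// Hypothetical strided batch GEMM built directly on top of gemm_single:
// matrix number i of each operand starts i * stride elements after the first.
void gemm_batch_strided(std::int64_t m, std::int64_t n, std::int64_t k, double alpha,
                        const double* a, std::int64_t lda, std::int64_t stride_a,
                        const double* b, std::int64_t ldb, std::int64_t stride_b,
                        double beta,
                        double* c, std::int64_t ldc, std::int64_t stride_c,
                        std::int64_t batch_size) {
    assert(batch_size >= 0);
    for (std::int64_t i = 0; i < batch_size; ++i) {
        gemm_single(m, n, k, alpha,
                    a + i * stride_a, lda,
                    b + i * stride_b, ldb, beta,
                    c + i * stride_c, ldc);
    }
}
```

The loop-based form trades potential batch-level optimizations for simplicity: each batch entry reuses the single-matrix path unchanged.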
log.txt